# 01_Basics

## 01_01_Matrices and Algebra fundamentals

### Mathematical Objects

The basic mathematical objects we need to understand as data scientists are shown in the following picture:
![matrix-image](https://miro.medium.com/max/700/0*sjDnWS2QtJUa0Gy8.png)

The two primary mathematical entities that are of interest in linear algebra are the vector and the matrix. They are examples of a more general entity known as a tensor. Tensors possess an order (or rank), which determines the number of dimensions in an array required to represent it.
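
To make the idea of order concrete, here is a minimal sketch that represents each object as a nested Python list and counts the nesting depth. The helper name `tensor_order` and all example values are assumptions made for illustration (only the scalar 24 comes from the graphic), and in practice a numerical library such as NumPy would usually hold these arrays.

```python
def tensor_order(obj):
    """Return the nesting depth of a list-based array (0 for a plain number)."""
    order = 0
    while isinstance(obj, list):
        order += 1
        obj = obj[0]  # assumes a non-empty, regularly shaped nesting
    return order


scalar = 24                                       # 0th-order: no index needed
vector = [2, -8, 7]                               # 1st-order: one index
matrix = [[1, 2, 3], [4, 5, 6]]                   # 2nd-order: two indices
tensor3 = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]    # 3rd-order: three indices

orders = [tensor_order(t) for t in (scalar, vector, matrix, tensor3)]
print(orders)  # [0, 1, 2, 3]
```

The order equals the number of indices needed to pick out a single number, which is why a scalar has order 0.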

#### [Scalar](https://en.wikipedia.org/wiki/Scalar_(mathematics)):

__Scalars__ are single numbers and are an example of a __0th-order tensor__.  

In mathematics it is necessary to describe the set of values to which a scalar belongs.  The notation  $x \in \mathbb{R}$ states that the (lowercase) scalar value $x$ is an element of (or member of) the set of real-valued numbers, $\mathbb{R}$.

There are various sets of numbers of interest within machine learning. $\mathbb{N}$ represents the set of positive integers ($1,2,3,...$). $\mathbb{Z}$ represents the integers, which include positive, negative and zero values. $\mathbb{Q}$ represents the set of rational numbers, which may be expressed as a fraction of two integers.
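
As a rough analogy, these sets map onto Python's built-in types. The sketch below is an illustration rather than an exact correspondence: `int` models $\mathbb{Z}$ (and hence $\mathbb{N}$), `fractions.Fraction` models $\mathbb{Q}$ exactly, and `float` is only a finite-precision stand-in for $\mathbb{R}$. The variable names are chosen for this example.

```python
from fractions import Fraction

n = 3                 # a positive integer, n in N
z = -3                # an integer, z in Z
q = Fraction(3, 4)    # a rational number, q in Q, stored exactly
x = 0.75              # a float: finite-precision approximation of a real number

# Fractions stay exact, whereas floats are rounded binary approximations.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
float_sum = 0.1 + 0.2

print(exact_sum == Fraction(3, 10))  # True
print(float_sum == 0.3)              # False
```

The last line shows why it matters which set a scalar belongs to: a float cannot represent every real (or even every rational) number exactly.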

In the graphic above, the example of a scalar is the single number 24.

#### [Vector](https://en.wikipedia.org/wiki/Vector_(mathematics_and_physics)):

__Vectors__ are ordered arrays of single numbers and are an example of a __1st-order tensor__. Vectors are members of objects known as __vector spaces__.

A __vector space__ can be thought of as the entire collection of all possible vectors of a particular length (or dimension). The three-dimensional real-valued vector space, denoted by $\mathbb{R}^3$, is often used to represent our real-world notion of three-dimensional space mathematically.

More formally, a vector space is an $n$-dimensional [Cartesian product](https://en.wikipedia.org/wiki/Cartesian_product) of a set with itself, along with proper definitions of how to add vectors and multiply them by scalar values. If all of the scalars in a vector are real-valued then the notation $\boldsymbol{x} \in \mathbb{R}^n$ states that the (boldface lowercase) vector value $\boldsymbol{x}$ is a member of the $n$-dimensional vector space of real numbers, $\mathbb{R}^n$.

Sometimes it is necessary to identify the _components_ of a vector explicitly. The $i$th scalar element of a vector is written as $x_i$. Notice that this is non-bold lowercase since the element is a scalar. An $n$-dimensional vector itself can be explicitly written using the following notation:

$$
\begin{equation}
\boldsymbol{x}=\begin{bmatrix}
  \kern4pt x_1 \kern4pt \\
  \kern4pt x_2 \kern4pt \\
  \kern4pt \vdots \kern4pt \\
  \kern4pt x_n \kern4pt
\end{bmatrix}
\end{equation}
$$
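
The component notation can be sketched in Python as follows. Note that the mathematical index $i$ starts at 1 while Python lists start at 0, so the hypothetical helper `component` shifts the index; the vector values are made up for illustration.

```python
x = [4.0, -1.5, 2.0]   # illustrative 3-dimensional vector
n = len(x)             # the dimension n


def component(vec, i):
    """Return x_i using the 1-based index of the mathematical notation."""
    return vec[i - 1]


print(n)                # 3
print(component(x, 1))  # 4.0  (x_1)
print(component(x, n))  # 2.0  (x_n)
```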

Given that __scalars__ exist to represent values why are vectors necessary? One of the primary use cases for vectors is to represent physical quantities that have both a magnitude and a direction. Scalars are only capable of representing magnitudes.

![vector-representing](https://www.mathsisfun.com/algebra/images/vector-mag-dir.svg)

For instance __scalars__ and __vectors__ encode the difference between the _speed_ of a car and its _velocity_. The velocity contains not only its speed but also its direction of travel. It is not difficult to imagine many more physical quantities that possess similar characteristics such as gravitational and electromagnetic forces or wind velocity.
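
The difference between speed and velocity can be illustrated with a small sketch. The velocity values and the choice of axes (east and north) are assumptions made for this example; the speed is taken to be the Euclidean length of the velocity vector.

```python
import math

velocity = [3.0, 4.0]            # metres per second, (east, north), illustrative
reversed_velocity = [-3.0, -4.0]  # same car driving the opposite way

speed = math.hypot(*velocity)                    # magnitude only: a scalar
reversed_speed = math.hypot(*reversed_velocity)
direction = [v / speed for v in velocity]        # unit vector: direction only

print(speed)           # 5.0
print(reversed_speed)  # 5.0
print(direction)       # [0.6, 0.8]
```

Both cars have the same speed, but their velocity vectors differ, which is exactly the information a single scalar cannot carry.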

In machine learning vectors often represent _feature vectors_, with their individual components specifying how important a particular feature is. Such features could include relative importance of words in a text document, the intensity of a set of pixels in a two-dimensional image or historical price values for a cross-section of financial instruments.
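
As one possible illustration of a feature vector for text, the sketch below uses relative word frequency as a simple stand-in for word importance. The vocabulary, the document and the weighting scheme are all assumptions for this example; real systems often use more refined weightings.

```python
from collections import Counter

vocabulary = ["matrix", "vector", "tensor"]   # hypothetical feature set
document = "a vector is a matrix and a matrix is a tensor"

counts = Counter(document.split())
total = sum(counts[word] for word in vocabulary)

# One component per vocabulary word: its share among the counted words.
features = [counts[word] / total for word in vocabulary]

print(features)  # [0.5, 0.25, 0.25]
```

Each position of `features` always refers to the same word, so the order of the components carries meaning, just as in the definition of a vector.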

> Simply expressed, a __Vector__ is an ordered array of numbers and can be in __a row or a column__. A Vector has just a __single index__, which can point to a specific value within the Vector. For example, V2 refers to the second value within the Vector, which is -8 in the initial graphic above.

#### [Matrices](https://en.wikipedia.org/wiki/Matrix_(mathematics)):

__Matrices__ are rectangular arrays consisting of numbers and are an example of __2nd-order tensors__. 

If $m$ and $n$ are positive integers, that is $m,n \in \mathbb{N}$, then the $m \times n$ matrix contains $mn$ numbers, with $m$ rows and $n$ columns.

If all of the __scalars__ in a __matrix__ are real-valued then the matrix is denoted with an uppercase boldface letter, such as $\boldsymbol{A} \in \mathbb{R}^{m \times n}$. That is, the matrix lives in an $m \times n$-dimensional real-valued vector space. Hence matrices are really vectors that are just written in a two-dimensional table-like manner.

Its components are now identified by two indices $i$ and $j$. $i$ represents the index to the matrix row, while $j$ represents the index to the matrix column. Each component of $\boldsymbol{A}$ is identified by $a_{ij}$.

The full $m \times n$ matrix can be written as:

$$
\begin{equation}
\boldsymbol{A}=\begin{bmatrix}
  \kern4pt a_{11} & a_{12} & a_{13} & \ldots & a_{1n} \kern4pt \\
  \kern4pt a_{21} & a_{22} & a_{23} & \ldots & a_{2n} \kern4pt \\
  \kern4pt a_{31} & a_{32} & a_{33} & \ldots & a_{3n} \kern4pt \\
  \kern4pt \vdots & \vdots & \vdots & \ddots & \vdots \kern4pt \\
  \kern4pt a_{m1} & a_{m2} & a_{m3} & \ldots & a_{mn} \kern4pt \\
\end{bmatrix}
\end{equation}
$$

It is often useful to abbreviate the full matrix component display into the following expression:

$$
\begin{equation}
\boldsymbol{A} = [a_{ij}]_{m \times n}
\end{equation}
$$

Where $a_{ij}$ is referred to as the $(i,j)$-element of the __matrix__ $\boldsymbol{A}$. The subscript of $m \times n$ can be dropped if the dimension of the matrix is clear from the context.
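
A minimal sketch of this notation, assuming a matrix stored as a list of rows. The helper `element` and the matrix values are hypothetical, and the helper shifts the 1-based indices $i$ and $j$ to Python's 0-based indexing.

```python
A = [
    [1, 2, 3],
    [4, 5, 6],
]  # illustrative matrix with m = 2 rows and n = 3 columns

m = len(A)      # number of rows
n = len(A[0])   # number of columns


def element(mat, i, j):
    """Return a_ij, with i the 1-based row index and j the 1-based column index."""
    return mat[i - 1][j - 1]


print((m, n))            # (2, 3)
print(m * n)             # 6 numbers in total
print(element(A, 2, 3))  # 6  (a_23: second row, third column)
```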

Note that a _column vector_ is a size $m \times 1$ matrix, since it has $m$ rows and 1 column. Unless otherwise specified all vectors will be considered to be column vectors.

__Matrices__ represent a type of function known as a [linear map](https://en.wikipedia.org/wiki/Linear_map). Based on rules that will be outlined in subsequent articles, it is possible to define multiplication operations between matrices or between matrices and vectors. Such operations are immensely important across the physical sciences, quantitative finance, computer science and machine learning.

__Matrices__ can encode geometric operations such as rotation, reflection and transformation. Thus if a collection of vectors represents the vertices of a three-dimensional geometric model in [Computer Aided Design](https://en.wikipedia.org/wiki/Computer-aided_design) software then multiplying these vectors individually by a pre-defined [rotation matrix](https://en.wikipedia.org/wiki/Rotation_matrix) will output new vectors that represent the locations of the rotated vertices. This is the basis of modern 3D computer graphics.
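
The rotation case can be sketched as follows. This is only an illustration under stated assumptions: the vertices are three made-up points, the rotation is by 90 degrees about the z-axis, and `matvec` and `rotation_z` are helper names invented for this example.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def rotation_z(theta):
    """Rotation matrix for a counter-clockwise turn of theta radians about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [c, -s, 0.0],
        [s, c, 0.0],
        [0.0, 0.0, 1.0],
    ]


# Hypothetical model vertices: the unit points on the x, y and z axes.
vertices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

R = rotation_z(math.pi / 2)
rotated = [matvec(R, v) for v in vertices]  # rotate each vertex individually

for original, new in zip(vertices, rotated):
    print(original, "->", [round(c, 6) for c in new])
```

Up to floating-point rounding, the x-axis point moves onto the y-axis, the y-axis point moves onto the negative x-axis, and the point on the rotation axis stays where it is.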

In __deep learning__, neural network weights are stored as __matrices__, while feature inputs are stored as __vectors__. Formulating the problem in terms of linear algebra allows compact handling of these computations. By casting the problem in terms of __tensors__ and utilising the machinery of linear algebra, rapid training times on modern GPU hardware can be obtained.
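
As a rough sketch of what this looks like, the weighted sums of one network layer can be written as a single matrix-vector product. The weights `W`, the input `x` and the helper `matvec` are hypothetical, and bias terms and activation functions are left out to keep the example small.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


# Hypothetical layer with 3 input features and 2 output units.
W = [
    [0.5, -1.0, 2.0],   # weights feeding output unit 1
    [1.5, 0.0, -0.5],   # weights feeding output unit 2
]
x = [2.0, 1.0, 0.5]     # feature input vector

z = matvec(W, x)        # all weighted sums computed in one operation

print(z)  # [1.0, 2.75]
```

Each row of `W` produces one output, so a $2 \times 3$ weight matrix maps a 3-dimensional feature vector to a 2-dimensional result.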


> Summarized in simple words, a __Matrix__ is an ordered 2D array of numbers and it has __two indices__. The first one points to the row and the second one to the column. For example, M23 refers to the value in the second row and the third column, which is 8 in the initial yellow graphic at the beginning. A Matrix can have __multiple rows and columns__. Note that a __Vector__ is also a Matrix, but with only one row or one column.

![matrix-image](https://upload.wikimedia.org/wikipedia/commons/b/bb/Matrix.svg)

#### [Tensors](https://en.wikipedia.org/wiki/Tensor):

The more general entity of a __tensor__ encapsulates the __scalar__, __vector__ and the __matrix__. It is sometimes necessary, both in the physical sciences and in machine learning, to make use of tensors whose order exceeds two.

In theoretical physics, and general relativity in particular, the [Riemann curvature tensor](https://en.wikipedia.org/wiki/Riemann_curvature_tensor) is a __4th-order tensor__ that describes the local curvature of [spacetime](https://en.wikipedia.org/wiki/Spacetime). In machine learning, and deep learning in particular, a __3rd-order tensor__ can be used to describe the intensity values of multiple channels (red, green and blue) from a two-dimensional image.

__Tensors__ will be identified in this series of posts via the boldface sans-serif notation, $\textsf{A}$. For a __3rd-order tensor__ elements will be given by $a_{ijk}$, whereas for a __4th-order tensor__ elements will be given by $a_{ijkl}$.
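
The image example can be sketched as a 3rd-order tensor stored as nested lists. The layout (row, column, channel), the tiny $2 \times 2$ image and the helper `element` are assumptions made for this illustration; image libraries may order the axes differently.

```python
# Hypothetical 2 x 2 RGB image: image[row][column] holds [red, green, blue].
image = [
    [[255, 0, 0], [0, 255, 0]],        # red pixel, green pixel
    [[0, 0, 255], [255, 255, 255]],    # blue pixel, white pixel
]

shape = (len(image), len(image[0]), len(image[0][0]))


def element(tensor, i, j, k):
    """Return a_ijk using 1-based indices (row i, column j, channel k)."""
    return tensor[i - 1][j - 1][k - 1]


print(shape)                    # (2, 2, 3)
print(element(image, 2, 1, 3))  # 255  (blue channel of the pixel in row 2, column 1)
print(element(image, 1, 1, 2))  # 0    (green channel of the red pixel)
```

Three indices are needed to reach a single intensity value, which is what makes this array a 3rd-order tensor.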

The following graphic should summarize and help to understand the relationships between scalar, vector, matrix and tensor.

![matrix-image](https://cdn-images-1.medium.com/max/880/1*WbLIc4-xIOfHiO2oWzimyA.png)

> __Tensor__ is the most general term for all of the concepts above: a Tensor is a multidimensional array, and depending on the __number of indices__ it has, it can be a __Vector__ or a __Matrix__. For example, a __first-order Tensor__ is a __Vector__ (1 index) and a __second-order Tensor__ is a __Matrix__ (2 indices). Tensors with 3 or more indices are called __Higher-Order Tensors__.

#### Sources: 
* [Scalars, Vectors, Matrices and Tensors - Linear Algebra for Deep Learning](https://www.quantstart.com/articles/scalars-vectors-matrices-and-tensors-linear-algebra-for-deep-learning-part-1/#:~:text=Vectors%20and%20Matrices,array%20required%20to%20represent%20it)
* [Basic Linear Algebra for Deep Learning](https://towardsdatascience.com/linear-algebra-for-deep-learning-f21d7e7d7f23)




### Operations

There are a number of basic operations that can be applied to modify matrices:

* [Addition](https://en.wikipedia.org/wiki/Matrix_addition)
* [Scalar Multiplication](https://en.wikipedia.org/wiki/Scalar_multiplication)
* [Transposition](https://en.wikipedia.org/wiki/Transpose)
* [Matrix Multiplication](https://en.wikipedia.org/wiki/Matrix_multiplication)
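
The operations listed above can be sketched in plain Python on small matrices stored as lists of rows. The function names and the example matrices are assumptions for illustration, and the functions do no shape checking; in practice a library such as NumPy would normally provide these operations.

```python
def add(A, B):
    """Element-wise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def scale(c, A):
    """Multiply every element of A by the scalar c."""
    return [[c * a for a in row] for row in A]


def transpose(A):
    """Swap rows and columns, so element (i, j) moves to (j, i)."""
    return [list(column) for column in zip(*A)]


def matmul(A, B):
    """Matrix product: entry (i, j) is the dot product of row i of A and column j of B."""
    columns_of_b = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in columns_of_b] for row in A]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]

print(add(A, B))      # [[1, 3], [4, 4]]
print(scale(2, A))    # [[2, 4], [6, 8]]
print(transpose(A))   # [[1, 3], [2, 4]]
print(matmul(A, B))   # [[2, 1], [4, 3]]
```

Addition and scalar multiplication act element by element, whereas matrix multiplication combines whole rows with whole columns.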

#### Sources: 
* [Matrix Algebra - Linear Algebra for Deep Learning (Part 2)](https://www.quantstart.com/articles/matrix-algebra-linear-algebra-for-deep-learning-part-2/)
* [Basic Linear Algebra for Deep Learning](https://towardsdatascience.com/linear-algebra-for-deep-learning-f21d7e7d7f23)

