In [1]:
import numpy as np

In [9]:
%%html
<style>
.pquote {
  text-align: left;
  margin: 40px 0 40px auto;
  width: 70%;
  font-size: 1.5em;
  font-style: italic;
  display: block;
  line-height: 1.3em;
  color: #5a75a7;
  font-weight: 600;
  border-left: 5px solid rgba(90, 117, 167, .1);
  padding-left: 6px;
}
.notes {
  font-style: italic;
  display: block;
  margin: 40px 10%;
}
img + em {
  text-align: center;
  display: block;
  color: gray;
  font-size: 0.9em;
  font-weight: 600;
}
</style>

$$
\newcommand\bs[1]{\boldsymbol{#1}}
$$

> This content is part of a series following chapter 2 on linear algebra from the [Deep Learning Book](http://www.deeplearningbook.org/) by Goodfellow, I., Bengio, Y., and Courville, A. (2016).

# Introduction
This first chapter provides a light introduction to scalars, vectors, matrices and tensors, and to some basic operations such as transposing and adding vectors and matrices. It closes with a word on broadcasting, which is how numpy handles operations between arrays of different shapes.

# 2.1 Scalars, Vectors, Matrices and Tensors

Let's start with some basic definitions:

<img src="images/scalar-vector-matrix-tensor.png" width="400" alt="An example of a scalar, a vector, a matrix and a tensor" title="Difference between a scalar, a vector, a matrix and a tensor">
<em>Difference between a scalar, a vector, a matrix and a tensor</em>

- A scalar is a single number.

$$a$$

- A vector is a sequence (an array) of scalars.

$$
\bs{x} =\begin{bmatrix}
    x_1 \\\\
    x_2 \\\\
    \vdots \\\\
    x_n
\end{bmatrix}
$$

- A matrix is a 2-D array. We often describe a matrix by its dimensions: a matrix $\bs{A}$ with $m$ rows and $n$ columns is an $m \times n$ matrix. Each element can be denoted $A_{i,j}$, where $i$ is the row index and $j$ the column index.

$$
\bs{A}=
\begin{bmatrix}
    A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\\\
    A_{2,1} & A_{2,2} & \cdots & A_{2,n} \\\\
    \vdots & \vdots & \ddots & \vdots \\\\
    A_{m,1} & A_{m,2} & \cdots & A_{m,n}
\end{bmatrix}
$$

- A tensor is an $n$-dimensional array with $n>2$.
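
As a quick check of these definitions, here is a minimal sketch using numpy's `ndim` attribute, which reports the number of dimensions of an array. The variable names (`scalar`, `vector`, `matrix`, `tensor`) and the values are made up for illustration.

```python
import numpy as np

# Illustrative values only; the names are not from the book.
scalar = np.array(5)                   # 0 dimensions
vector = np.array([1, 2, 3])           # 1 dimension
matrix = np.array([[1, 2], [3, 4]])    # 2 dimensions
tensor = np.zeros((2, 3, 4))           # 3 dimensions

for name, arr in [("scalar", scalar), ("vector", vector),
                  ("matrix", matrix), ("tensor", tensor)]:
    print(name, "ndim =", arr.ndim, "shape =", arr.shape)
```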

>#### A Note on Notation:
>We will follow the conventions used in the [Deep Learning Book](http://www.deeplearningbook.org/):

>- scalars are written in lowercase and italics. For instance: $a$
>- vectors are written in lowercase, italics and bold type. For instance: $\bs{x}$
>- matrices are written in uppercase, italics and bold. For instance: $\bs{X}$

## Creating a vector with numpy

You can create $n$-dimensional arrays with the `array()` function. We'll use that below.

Note that there's also the `matrix()` function, which always creates $2$-dimensional matrices. Its main draw is that a lot of useful matrix operations are available directly via object methods. The NumPy documentation no longer recommends `matrix`, though, so regular arrays are the safer choice.

To start, we will keep things simple and use the `array()` function.

First, we'll create a vector (i.e. a $1$-dimensional array):

In [3]:
x = np.array([1, 2, 3, 4])
x

array([1, 2, 3, 4])

## Creating a (3x2) matrix

You can use nested brackets within the `array()` function to create $2$-dimensional arrays (i.e. matrices):

In [5]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])
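
The subscripts in the math notation are 1-based, while numpy indices start at 0, so $A_{1,1}$ corresponds to `A[0, 0]`. A small sketch using the same matrix (the names `first`, `last`, `row2` and `col1` are chosen here for illustration):

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])

first = A[0, 0]    # A_{1,1} in the math notation
last = A[2, 1]     # A_{3,2}: third row, second column
row2 = A[1, :]     # the whole second row
col1 = A[:, 0]     # the whole first column

print(first, last)
print(row2, col1)
```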

### Shape
Numpy calls a matrix's dimensions its shape. For a $2$-dimensional array, the shape will tell you the number of rows and the number of columns. 

To find the shape of $\bs{A}$, you use the `shape` attribute of the numpy array object.

In [6]:
A.shape

(3, 2)

As we saw when we created the array, $\bs{A}$ has 3 rows and 2 columns. In other words, it's a $3 \times 2$ matrix.

Let's check the shape of our first vector:

In [44]:
x.shape

(4,)

As expected, $\bs{x}$ has only one dimension. The single number in its shape is the length of the array:

In [7]:
len(x)

4
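
For arrays with more than one dimension, `len()` only reports the size of the first axis (the number of rows for a matrix), while the `size` attribute counts every element. A short sketch, with the matrix redefined so the block runs on its own:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])

n_rows = len(A)        # size of the first axis only
n_dims = A.ndim        # number of axes
n_elements = A.size    # total number of elements (rows * columns)

print(n_rows, n_dims, n_elements)
```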

## Vector and Matrix Operations

### Transposition

With transposition you can convert a row vector to a column vector and vice versa:

<img src="images/vector-transposition.png" alt="Transposition of a vector" title="Vector transposition" width="200">
<em>Vector transposition</em>

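The transpose $\bs{A}^{\text{T}}$ of the matrix $\bs{A}$ is obtained by mirroring $\bs{A}$ across its main diagonal, so that rows become columns: $(\bs{A}^{\text{T}})_{i,j} = \bs{A}_{j,i}$. Here is the case of a square matrix (same number of columns and rows):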

<img src="images/square-matrix-transposition.png" alt="Transposition of a square matrix" title="Square matrix transposition" width="300">
<em>Square matrix transposition</em>

If the matrix is not square the idea is the same:

<img src="images/non-squared-matrix-transposition.png" alt="Transposition of a non-square matrix" title="Non square matrix transposition" width="300">
<em>Non-square matrix transposition</em>


The superscript $^\text{T}$ is used to denote the transposition of matrices.

$$
\bs{A}=
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}
$$

$$
\bs{A}^{\text{T}}=
\begin{bmatrix}
    A_{1,1} & A_{2,1} & A_{3,1} \\\\
    A_{1,2} & A_{2,2} & A_{3,2}
\end{bmatrix}
$$

A simple way to think about transposition is to look at what happens to the matrix's shape: what was ($m \times n$) becomes ($n \times m$) when you transpose it.

<img src="images/dimensions-transposition-matrix.png" alt="Dimensions of matrix transposition" title="Dimensions of matrix transposition" width="300">
<em>Dimensions of matrix transposition</em>

#### Create and transpose a matrix with numpy

In [10]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [11]:
A_t = A.T
A_t

array([[1, 3, 5],
       [2, 4, 6]])

Let's see how the dimensions changed:

In [12]:
A.shape

(3, 2)

In [13]:
A_t.shape

(2, 3)
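
One caveat worth illustrating: a $1$-dimensional numpy array has no row or column orientation, so `.T` leaves it unchanged. To get the row-to-column behaviour described for vectors, the array first needs a second axis. The sketch below uses `reshape` for that; the names `row` and `col` are chosen here for illustration.

```python
import numpy as np

x = np.array([1, 2, 3, 4])
print(x.T.shape)           # still 1-D: .T has no effect here

row = x.reshape(1, -1)     # explicit row vector, shape (1, 4)
col = row.T                # column vector, shape (4, 1)
print(row.shape, col.shape)

# Transposing twice gives back the original matrix.
A = np.array([[1, 2], [3, 4], [5, 6]])
print(np.array_equal(A.T.T, A))
```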

### Addition

<img src="images/matrix-addition.png" alt="Addition of two matrices" title="Addition of two matrices" width="300">
<em>Addition of two matrices</em>

Matrices can be added if they have the same shape:

$$\bs{A} + \bs{B} = \bs{C}$$

Each element of $\bs{A}$ is added to the corresponding element of $\bs{B}$:

$$\bs{A}_{i,j} + \bs{B}_{i,j} = \bs{C}_{i,j}$$

> Note that $i$ is the row index and $j$ the column index.

$$
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}+
\begin{bmatrix}
    B_{1,1} & B_{1,2} \\\\
    B_{2,1} & B_{2,2} \\\\
    B_{3,1} & B_{3,2}
\end{bmatrix}=
\begin{bmatrix}
    A_{1,1} + B_{1,1} & A_{1,2} + B_{1,2} \\\\
    A_{2,1} + B_{2,1} & A_{2,2} + B_{2,2} \\\\
    A_{3,1} + B_{3,1} & A_{3,2} + B_{3,2}
\end{bmatrix}
$$

Note that the shapes of $\bs{A}$, $\bs{B}$ and $\bs{C}$ are identical.

#### Matrix addition with numpy

With numpy you can add matrices just as you would add vectors or scalars.

In [23]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [24]:
B = np.array([[2, 5], [7, 4], [4, 3]])
B

array([[2, 5],
       [7, 4],
       [4, 3]])

In [25]:
# Add matrices A and B
C = A + B
C

array([[ 3,  7],
       [10,  8],
       [ 9,  9]])

In [26]:
# You can also perform matrix subtraction!
D = A - B
D

array([[-1, -3],
       [-4,  0],
       [ 1,  3]])
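
To see the same-shape requirement in action, the sketch below checks the element-wise rule on one entry and then tries to add a $3 \times 2$ matrix to a $2 \times 3$ one. Numpy raises a `ValueError` in that case because the shapes are incompatible. The helper name `try_add` is made up for this example.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[2, 5], [7, 4], [4, 3]])
C = A + B

# Element-wise rule: C[i, j] == A[i, j] + B[i, j]
print(C[1, 0] == A[1, 0] + B[1, 0])


def try_add(left, right):
    """Return left + right, or None if the shapes are incompatible."""
    try:
        return left + right
    except ValueError as err:
        print("cannot add:", err)
        return None


result = try_add(A, B.T)   # shapes (3, 2) and (2, 3)
```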

It is also possible to add a scalar to a matrix. This means adding this scalar to each element of the matrix.

$$
\alpha+ \begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}=
\begin{bmatrix}
    \alpha + A_{1,1} & \alpha + A_{1,2} \\\\
    \alpha + A_{2,1} & \alpha + A_{2,2} \\\\
    \alpha + A_{3,1} & \alpha + A_{3,2}
\end{bmatrix}
$$

#### Add a scalar to a matrix with numpy

In [27]:
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [29]:
# Add the scalar 4 to the matrix A
C = A + 4
C

array([[ 5,  6],
       [ 7,  8],
       [ 9, 10]])

In [30]:
# Subtract the scalar 4 from the matrix A
D = A - 4
D

array([[-3, -2],
       [-1,  0],
       [ 1,  2]])

## Broadcasting

Numpy can handle operations on arrays of different shapes. In such cases, the smaller array is extended (broadcast) to match the shape of the bigger one. The advantage is that this is done in `C` under the hood (like any vectorized operation in Numpy). In fact, we already used broadcasting above when we added a scalar to a matrix: the scalar was treated as an array of the same shape as $\bs{A}$.
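
A minimal sketch of that equivalence: adding the scalar `4` gives the same result as adding a full matrix of fours built with `np.full`. Numpy does not need to materialize that matrix when broadcasting; it is built explicitly here only for comparison, and the names `fours` and `same` are illustrative.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])

fours = np.full(A.shape, 4)   # explicit (3, 2) matrix filled with 4
same = np.array_equal(A + 4, A + fours)
print(same)
```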

Here is another generic example:

$$
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}+
\begin{bmatrix}
    B_{1,1} \\\\
    B_{2,1} \\\\
    B_{3,1}
\end{bmatrix}
$$

is equivalent to

$$
\begin{bmatrix}
    A_{1,1} & A_{1,2} \\\\
    A_{2,1} & A_{2,2} \\\\
    A_{3,1} & A_{3,2}
\end{bmatrix}+
\begin{bmatrix}
    B_{1,1} & B_{1,1} \\\\
    B_{2,1} & B_{2,1} \\\\
    B_{3,1} & B_{3,1}
\end{bmatrix}=
\begin{bmatrix}
    A_{1,1} + B_{1,1} & A_{1,2} + B_{1,1} \\\\
    A_{2,1} + B_{2,1} & A_{2,2} + B_{2,1} \\\\
    A_{3,1} + B_{3,1} & A_{3,2} + B_{3,1}
\end{bmatrix}
$$

where the ($3 \times 1$) matrix is expanded to the right shape ($3 \times 2$) by repeating its single column. Numpy does this automatically as long as the shapes are compatible: comparing dimensions from the last one backwards, each pair must either be equal or contain a $1$.

#### Adding two matrices of different shapes with numpy

In [19]:
A = np.array([[1, 2], [3, 4], [5, 6]])
A

array([[1, 2],
       [3, 4],
       [5, 6]])

In [20]:
B = np.array([[2], [4], [6]])
B

array([[2],
       [4],
       [6]])

In [21]:
# Broadcasting: B (3x1) is stretched to the shape of A (3x2)
C = A + B
C

array([[ 3,  4],
       [ 7,  8],
       [11, 12]])
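
The expansion numpy performs can be made visible with `np.broadcast_to`, which returns a read-only view of `B` stretched to the shape of `A`. The sketch below also shows a shape that cannot be broadcast: a 1-D array of length 3 is aligned with the last axis of `A` (length 2), so the addition fails. The names `B_expanded`, `row` and `failed` are chosen here for illustration.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
B = np.array([[2], [4], [6]])

B_expanded = np.broadcast_to(B, A.shape)   # shape (3, 2), column repeated
print(B_expanded)
print(np.array_equal(A + B, A + B_expanded))

# A 1-D array of length 2 lines up with the columns of A
# and is repeated for every row.
row = np.array([10, 20])
print(A + row)

# A 1-D array of length 3 does not match the last axis of A (length 2).
try:
    A + np.array([2, 4, 6])
    failed = False
except ValueError:
    failed = True
print(failed)
```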

# What's Next?
Next we'll learn about multiplying matrices and vectors. In that chapter, we'll mainly explore a concept known as the dot product. Then, we will see how to express a system of linear equations using matrix notation. This is a major prerequisite for subsequent chapters.

# References

- [Broadcasting in Numpy](https://docs.scipy.org/doc/numpy-1.13.0/user/basics.broadcasting.html)

- [Discussion on Arrays and matrices](https://stackoverflow.com/questions/4151128/what-are-the-differences-between-numpy-arrays-and-matrices-which-one-should-i-u)

- [Math is fun - Matrix introduction](https://www.mathsisfun.com/algebra/matrix-introduction.html)