## Deep Learning - Goodfellow et al.

We will be using these notebooks to review and work through Goodfellow et al.'s Deep Learning text.

### Chapter 2 - Linear Algebra

#### Scalars

A scalar is a single number, for example the slope of a line. When we introduce a scalar we state what kind of number it is: $s \in \mathbb{R}$ for a real-valued scalar such as a slope, or $n \in \mathbb{N}$ for a natural-number scalar such as the number of units.

#### Vectors

A vector is an array of numbers arranged in order. We identify each element by its position, or index, written as a subscript: if $\mathbf{x}$ is a vector, its $i$-th element is $x_{i}$. A vector with $n$ real-valued elements lies in $\mathbb{R}^n$ and can be read as a point in space, with each element giving the coordinate along a different axis.

In the example below we use [numpy](http://www.numpy.org/) to create array(s).

**Note: I use vector and array interchangeably.**

In [2]:
## Numpy representations

import numpy as np

# Straightforward creation of a vector of 5 zeroes.
# The shape (1,5) stores it as a 2-D array with a single row.

zeroes = np.zeros((1,5))
print(zeroes)

# np.random.rand(1,5) creates a 1 x 5 array by sampling values
# uniformly from the half-open interval [0, 1)

random_vals = np.random.rand(1,5)
print(random_vals)

[[ 0.  0.  0.  0.  0.]]
[[ 0.44097475  0.21059968  0.09870492  0.53145336  0.17612998]]
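
The arrays above have shape `(1, 5)`, that is, one row inside a 2-D array, which is why the output shows double brackets. As a minimal sketch (the names `x` and `x_row` are illustrative and not part of the original notebook), the same values can also be held in a 1-D array, which is closer to the textbook's plain vector. Note that NumPy indices start at 0, while the textbook's subscripts start at 1.

```python
import numpy as np

# A 1-D array: the closest NumPy analogue of a plain 5-element vector
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

# The same values stored as a single row of a 2-D array (shape 1 x 5),
# which is the layout np.zeros((1, 5)) and np.random.rand(1, 5) return
x_row = x.reshape(1, 5)

print(x.shape)      # (5,)
print(x_row.shape)  # (1, 5)

# NumPy indices start at 0, so the textbook's x_1 is x[0]
print(x[0])         # 1.0
print(x_row[0][0])  # 1.0
```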


##### Vector Sets

Sometimes we need to reference the indices of elements within a vector, and we can do this by defining a set $S = \{1,3,6\}$ which references $x_1$, $x_3$, and $x_6$, respectively.

We can then use the $-$ sign to index the complement of a set. For example, $\mathbf{x}_{-1}$ is the vector containing all elements of $\mathbf{x}$ _except_ $x_1$, and $\mathbf{x}_{-S}$ is the vector containing all elements of $\mathbf{x}$ except those whose indices are in the set $S$.



In [3]:
# Slicing lets us select a subset of a numpy array.
# zeroes has shape (1, 5), so zeroes[0] is its single row and
# [:2] selects the first 2 elements of that row

print("First two elements of zeroes vector: ", zeroes[0][:2])

# Just to prove this is the case I subset the random_vals array as well
print("First two elements of the random vector : ", random_vals[0][:2])

First two elements of zeroes vector:  [ 0.  0.]
First two elements of the random vector :  [ 0.44097475  0.21059968]


In [4]:
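# Can we subset with negative indices? Yes: a negative index in a slice
# counts from the end of the array, so [:-2] drops the last two elements.
# Note that this is not the same as the textbook's x_{-i} notation,
# which drops the element at position i.

print("All but last two elements of zeroes vector: ", zeroes[0][:-2])

print("All but last two elements of random_vals vector : ", random_vals[0][0:-2])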

All but last two elements of zeroes vector:  [ 0.  0.  0.]
All but last two elements of random_vals vector :  [ 0.44097475  0.21059968  0.09870492]
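
Negative slicing drops elements from the end of an array, which is different from the complement notation $x_{-S}$ described above. One possible way to sketch the set notation in NumPy is with integer-array indexing and `np.delete`; the array values and the names `idx`, `x_S`, `x_minus_S`, and `x_minus_1` are assumptions made for illustration.

```python
import numpy as np

x = np.array([10, 20, 30, 40, 50, 60])

# The set S = {1, 3, 6} uses 1-based positions, as in the text;
# subtract 1 to obtain NumPy's 0-based indices
S = [1, 3, 6]
idx = np.array(S) - 1

x_S = x[idx]                   # elements x_1, x_3, x_6
x_minus_S = np.delete(x, idx)  # all elements except those indexed by S
x_minus_1 = np.delete(x, 0)    # all elements except the first

print(x_S)        # [10 30 60]
print(x_minus_S)  # [20 40 50]
print(x_minus_1)  # [20 30 40 50 60]
```

`np.delete` returns a new array and leaves `x` unchanged, which matches the textbook reading of $x_{-S}$ as a separate vector.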


#### Matrices

* 2D array of numbers
* A bold typeface denotes a full matrix, $\textbf{A}$
* If a matrix has a height of $m$ and a width of $n$, we say that $\textbf{A} \in \mathbb{R}^{m \times n}$
* $A_{1,1}$ is the upper left entry of $\textbf{A}$ and $A_{m,n}$ is the bottom right entry
* ":" used to identify matrix cross-sections
    * $A_{i,:}$ = horizontal cross-section (row)
    * $A_{:,i}$ = vertical cross-section (column)
* To identify the elements of a matrix explicitly, we write them out as an array:

$$
\begin{bmatrix}
A_{1,1} & A_{1,2} \\
A_{2,1} & A_{2,2}
\end{bmatrix}
$$
* $f(\textbf{A})_{i,j}$ gives the $(i,j)$-th entry of the matrix obtained by applying the function $f$ to $\textbf{A}$

In [5]:
# Note the double brackets, like the arrays created above using numpy.
# A matrix is a generalization of a vector, and a tensor (more below)
# is a generalization of a matrix. numpy stores all of them as
# n-dimensional arrays that differ only in their number of axes,
# so a matrix is simply a 2-D array.

matrix_A = np.random.random((2,2))
print("matrix_A : \n\n", matrix_A, "\n")

# Then we can access a single entry of the matrix
print("Single bottom right entry of matrix_A : ", matrix_A[1][1])

matrix_A : 

[[ 0.52905214  0.46369121]
 [ 0.95629514  0.12338068]] 

Single bottom right entry of matrix_A :  0.123380683408


In [6]:
# Accessing rows and columns in numpy matches the textbook notation:
# a ":" on one side of the comma selects every entry along that axis.
# numpy indices start at 0, so index 0 is the first row or column.

print("The first row of matrix_A is : ", matrix_A[0,:])
print("The first column of matrix_A is : ", matrix_A[:,0])

The first row of matrix_A is :  [ 0.52905214  0.46369121]
The first column of matrix_A is :  [ 0.52905214  0.95629514]
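
The notation $f(\textbf{A})_{i,j}$ from the list above can be sketched the same way: apply the function to the whole matrix, then index the result. The matrix values and the choice of `np.sqrt` as $f$ are assumptions made purely for illustration.

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [9.0, 16.0]])

# Apply f (here: element-wise square root) to the whole matrix first,
# then read off a single entry of the result: f(A)_{i,j}
f_A = np.sqrt(A)

# The textbook's f(A)_{2,1} is f_A[1, 0] with 0-based indices
print(f_A[1, 0])  # 3.0

print(A[0, :])    # first row:    [1. 4.]
print(A[:, 0])    # first column: [1. 9.]
print(A.shape)    # (2, 2), i.e. height m = 2 and width n = 2
```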


#### Tensors

* An array with more than two axes
* General case: an array of numbers arranged on a regular grid with a variable number of axes
* A tensor is denoted $\textbf{A}$, like a matrix, but an element of a tensor with three axes is identified by three coordinates $(i,j,k)$:
    * $\textbf{A}_{i,j,k}$

In [31]:
# Example 3x3x3 tensor, using ints for illustrative purposes

tensor = np.random.randint(1, 10, (3,3,3))

print(tensor)

[[[4 5 1]
  [1 2 8]
  [4 7 2]]

 [[3 9 4]
  [6 6 3]
  [3 6 8]]

 [[3 7 9]
  [5 3 7]
  [2 4 8]]]
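
To connect the $(i,j,k)$ notation to NumPy indexing, here is a small sketch using a tensor with known entries instead of random ones. The name `T` and the shape `(2, 3, 4)` are illustrative choices, not taken from the text.

```python
import numpy as np

# A 2 x 3 x 4 tensor holding the integers 0..23
T = np.arange(24).reshape(2, 3, 4)

print(T.ndim)   # 3 axes
print(T.shape)  # (2, 3, 4)

# The textbook's A_{i,j,k} with (i, j, k) = (2, 1, 3) becomes
# T[1, 0, 2] after shifting to 0-based indices
print(T[1, 0, 2])  # 14

# Fixing the first index leaves a 3 x 4 matrix
print(T[0].shape)  # (3, 4)
```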


#### Transpose operation

* Mirror image of the matrix across the *main diagonal*
    * *main diagonal* starts in upper left hand corner of matrix and ends in bottom right corner
* Written as $\textbf{A}^{T}$ and defined as:

$$(\textbf{A}^{T})_{i,j} = \textbf{A}_{j,i}$$

* Vectors can be thought of as single-column matrices, so the transpose of a vector is a single-row matrix
* This lets us write a vector inline as $\textbf{x} = [x_{1},x_{2},x_{3}]^{T}$
* A scalar can be interpreted as a single-entry matrix, so a scalar is its own transpose
    * $a = a^{T}$

In [7]:
# The transpose is a bit hard(er) to see with a 2x2 matrix

print("Matrix_A before transpose operation is applied : \n\n", matrix_A, "\n")

print("Matrix_A after transpose operation is applied : \n\n", matrix_A.T, "\n")

Matrix_A before transpose operation is applied : 

[[ 0.52905214  0.46369121]
 [ 0.95629514  0.12338068]] 

Matrix_A after transpose operation is applied : 

[[ 0.52905214  0.95629514]
 [ 0.46369121  0.12338068]] 



In [8]:
# It's a bit easier to see the transposition with a 3x3 matrix and larger.
# Just throwing the 3x3 example in for completeness
matrix_B = np.random.random((3,3))

print("Matrix_B before transpose operation is applied : \n\n", matrix_B, "\n")

print("Matrix_B after transpose operation is applied : \n\n", matrix_B.T, "\n")

Matrix_B before transpose operation is applied : 

[[ 0.76359093  0.19536575  0.0759568 ]
 [ 0.70939379  0.8572922   0.92919314]
 [ 0.78102358  0.07331387  0.87375616]] 

Matrix_B after transpose operation is applied : 

[[ 0.76359093  0.70939379  0.78102358]
 [ 0.19536575  0.8572922   0.07331387]
 [ 0.0759568   0.92919314  0.87375616]] 



In [37]:
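# Transposing a single vector.
# The vector is stored as a 1 x 5 array (a single row),
# so its transpose is a 5 x 1 array (a single column)

vector = np.random.random((1,5))

print("Vector before transpose : \n\n", vector, "\n")

print("Vector after transpose : \n\n", vector.T, "\n")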

Vector before transpose : 

[[ 0.74287045  0.14115694  0.39784215  0.64221029  0.37196598]] 

Vector after transpose : 

[[ 0.74287045]
 [ 0.14115694]
 [ 0.39784215]
 [ 0.64221029]
 [ 0.37196598]] 
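
The definition $(\textbf{A}^{T})_{i,j} = \textbf{A}_{j,i}$ can be checked directly on a non-square matrix, where the change of shape makes the transpose easier to see. This is a minimal sketch with illustrative values; it also shows one NumPy caveat, which is that `.T` leaves a 1-D array unchanged because a 1-D array has no row or column orientation.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# (A^T)_{i,j} = A_{j,i}: a 2 x 3 matrix becomes 3 x 2
print(A.T.shape)             # (3, 2)
print(A.T[2, 0] == A[0, 2])  # True

# Transposing twice gives back the original matrix
print(np.array_equal(A.T.T, A))  # True

# Caveat: .T does nothing to a 1-D array
v = np.array([1, 2, 3])
print(v.T.shape)    # (3,)

# Reshape to an explicit column (3 x 1) to get a row (1 x 3) when transposed
col = v.reshape(-1, 1)
print(col.T.shape)  # (1, 3)
```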



#### Addition operation

* We can add matrices, as long as they have the same shape, by summing their corresponding elements

$$\textbf{C} = \textbf{A} + \textbf{B}$$

where

$$C_{i,j} = A_{i,j} + B_{i,j}$$

* We can also multiply a matrix by a scalar, or add a scalar to a matrix, by applying the operation to each element

$$\textbf{D} = a \cdot \textbf{B} + c$$

where 

$$D_{i,j} = a \cdot B_{i,j} + c$$

In [9]:
# For ease of illustration I created matrices with int entries
# for the addition operations

matrix_A_int = np.random.randint(1, 10, (3,3))
matrix_B_int = np.random.randint(1, 10, (3,3))

print("matrix_A_int : \n\n ", matrix_A_int, "\n")
print("matrix_B_int : \n\n ", matrix_B_int, "\n")

# The + operator adds the two numpy matrices element by element
matrix_C_int = matrix_A_int + matrix_B_int

print("matrix_C_int : \n\n ", matrix_C_int, "\n")

matrix_A_int : 

  [[5 8 1]
 [8 9 5]
 [3 1 6]] 

matrix_B_int : 

  [[2 8 1]
 [4 1 4]
 [4 1 9]] 

matrix_C_int : 

  [[ 7 16  2]
 [12 10  9]
 [ 7  2 15]] 



In [40]:
# We can also add a scalar, which is added to every element

print("matrix_A_int before scalar addition : \n\n", matrix_A_int, "\n")

print("matrix_A_int after adding a scalar to each element : \n\n", matrix_A_int + 2)

matrix_A_int before scalar addition : 

[[5 8 1]
 [8 9 5]
 [3 1 6]] 

matrix_A_int after adding a scalar to each element : 

[[ 7 10  3]
 [10 11  7]
 [ 5  3  8]]
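
The combined expression $\textbf{D} = a \cdot \textbf{B} + c$ from above can be sketched in the same way. The values of `B`, `a`, and `c` are chosen only for illustration. The last part shows what NumPy does when two arrays of incompatible shapes are added, here a 2 x 2 and a 3 x 3 array.

```python
import numpy as np

B = np.array([[1, 2],
              [3, 4]])
a, c = 3, 1  # illustrative scalar values

# D = a * B + c is applied to every element: D[i, j] = a * B[i, j] + c
D = a * B + c
print(D)
# [[ 4  7]
#  [10 13]]

# Check one entry against the element-wise definition
print(D[1, 0] == a * B[1, 0] + c)  # True

# A 2 x 2 and a 3 x 3 array cannot be added element by element,
# so NumPy raises a ValueError
try:
    B + np.ones((3, 3))
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)  # False
```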
