_CREDITS: (based on Brian Mann's notes and Jack Benedetto's notebook, plus some other cool stuff of my own)_

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

import numpy as np
from numpy.random import randn, randint
from vectorplotter import VectorPlotter

# 2. Matrices

An $n \times p$ matrix is an array of numbers with $n$ rows and $p$ columns:

$$
X =
  \begin{bmatrix}
    x_{11} & x_{12} & \cdots & x_{1p} \\
    x_{21} & x_{22} & \cdots & x_{2p} \\
    \vdots & \vdots & \ddots & \vdots \\
    x_{n1} & x_{n2} & \cdots & x_{np} 
  \end{bmatrix}
$$

$n$ = the number of subjects  
$p$ = the number of features

For the following $2 \times 3$ matrix
$$
X =
  \begin{bmatrix}
    1 & 2 & 3\\
    4 & 5 & 6
  \end{bmatrix}
$$

we can create it in Python using NumPy:

In [None]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
print(X.shape)  # (rows, columns)

## 2.1. Basic Properties
Let $X$ and $Y$ be matrices **of the same dimension $n \times p$**. Let $x_{ij}$ and $y_{ij}$, for $i=1,2,\ldots,n$ and $j=1,2,\ldots,p$, denote the entries of these matrices. Then

1. $X+Y$ is the matrix whose $(i,j)^{th}$ entry is $x_{ij} + y_{ij}$
2. $X-Y$ is the matrix whose $(i,j)^{th}$ entry is $x_{ij} - y_{ij}$
3. $aX$, where $a$ is any real number, is the matrix whose $(i,j)^{th}$ entry is $ax_{ij}$ 

In [None]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
Y = np.array([[7,8,9],[10,11,12]])
print(Y)
print(X+Y)

In [None]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
Y = np.array([[7,8,9],[10,11,12]])
print(Y)
print(X-Y)

In [None]:
X = np.array([[1,2,3],[4,5,6]])
print(X)
a=5
print(a*X)

In order to multiply two matrices, they must be _conformable_: the number of columns of the first matrix must equal the number of rows of the second matrix.

Let $X$ be a matrix of dimension $n \times k$ and let $Y$ be a matrix of dimension $k \times p$. Then the product $XY$ is a matrix of dimension $n \times p$ whose $(i,j)^{th}$ element is the dot product of the $i^{th}$ row of $X$ and the $j^{th}$ column of $Y$:

$$\sum_{s=1}^k x_{is}y_{sj} = x_{i1}y_{1j} + \cdots + x_{ik}y_{kj}$$
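
To make the summation concrete, here is a minimal sketch that applies the formula entry by entry and compares the result with NumPy's own product. The function name `matmul_by_definition` and the `_demo` variable names are made up for this illustration; in practice `np.dot` is the tool to use.

```python
import numpy as np

def matmul_by_definition(X, Y):
    """Multiply X (n x k) by Y (k x p) using the summation formula directly."""
    n, k = X.shape
    k2, p = Y.shape
    if k != k2:
        raise ValueError("not conformable: {} columns vs {} rows".format(k, k2))
    Z = np.zeros((n, p))
    for i in range(n):
        for j in range(p):
            # (i, j) entry: dot product of row i of X with column j of Y
            for s in range(k):
                Z[i, j] += X[i, s] * Y[s, j]
    return Z

X_demo = np.array([[2, 1, 0], [-1, 2, 3]])    # 2 x 3
Y_demo = np.array([[0, -2], [1, 2], [1, 1]])  # 3 x 2
Z_demo = matmul_by_definition(X_demo, Y_demo)
print(Z_demo)
print(np.allclose(Z_demo, X_demo.dot(Y_demo)))
```

The triple loop is far slower than `np.dot` and is shown only to mirror the formula term by term.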




In [None]:
X = np.array([[2,1,0],[-1,2,3]])
print(X)
Y = np.array([[0,-2],[1,2],[1,1]])
print(Y)

# Matrix multiply with dot operator
print(np.dot(X,Y))
print(X.dot(Y))

## 2.2. Additional Properties of Matrices
1. If $X$ and $Y$ are both $n \times p$ matrices,
then $$X+Y = Y+X$$
2. If $X$, $Y$, and $Z$ are all $n \times p$ matrices,
then $$X+(Y+Z) = (X+Y)+Z$$
3. If $X$, $Y$, and $Z$ are all conformable,
then $$X(YZ) = (XY)Z$$
4. If $X$ is of dimension $n \times k$ and $Y$ and $Z$ are of dimension $k \times p$, then $$X(Y+Z) = XY + XZ$$
5. If $X$ is of dimension $p \times n$ and $Y$ and $Z$ are of dimension $k \times p$, then $$(Y+Z)X = YX + ZX$$
6. If $a$ and $b$ are real numbers, and $X$ is an $n \times p$ matrix,
then $$(a+b)X = aX+bX$$
7. If $a$ is a real number, and $X$ and $Y$ are both $n \times p$ matrices,
then $$a(X+Y) = aX+aY$$
8. If $a$ is a real number, and $X$ and $Y$ are conformable, then
$$X(aY) = a(XY)$$
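
These identities can be spot-checked numerically. The sketch below uses small random integer matrices so the comparisons are exact; the shapes, the seed, and the names `P`, `Q`, `R`, `W` are arbitrary choices for this illustration. A passing check on one example is not a proof, only a sanity test.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the check is repeatable

# Illustrative shapes: P is 2 x 3, Q and R are 3 x 4, W is 4 x 2
P = rng.randint(-5, 5, size=(2, 3))
Q = rng.randint(-5, 5, size=(3, 4))
R = rng.randint(-5, 5, size=(3, 4))
W = rng.randint(-5, 5, size=(4, 2))
a, b = 2, -3

commutative_add = np.array_equal(Q + R, R + Q)                 # property 1
assoc = np.array_equal(P.dot(Q.dot(W)), P.dot(Q).dot(W))       # property 3
left_dist = np.array_equal(P.dot(Q + R), P.dot(Q) + P.dot(R))  # property 4
right_dist = np.array_equal((Q + R).dot(W), Q.dot(W) + R.dot(W))  # property 5
scalar_dist = np.array_equal((a + b) * P, a * P + b * P)       # property 6
scalar_pull = np.array_equal(P.dot(a * Q), a * P.dot(Q))       # property 8

print(all([commutative_add, assoc, left_dist, right_dist, scalar_dist, scalar_pull]))
```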

### Matrix Transpose

The transpose of an $n \times p$ matrix $X$ is the $p \times n$ matrix $X^T$ obtained by interchanging rows and columns, so that the $(i,j)^{th}$ entry of $X^T$ is $x_{ji}$:

$$
X^T =
  \begin{bmatrix}
    x_{11} & x_{21} & \cdots & x_{n1} \\
    x_{12} & x_{22} & \cdots & x_{n2} \\
    \vdots & \vdots & \ddots & \vdots \\
    x_{1p} & x_{2p} & \cdots & x_{np} 
  \end{bmatrix}
$$



In [None]:
print(X)
print(X.transpose())
print(X.T)  # shorthand for X.transpose()
print(X.shape)
print(X.T.shape)

### Properties of Transpose
1. Let $X$ be an $n \times p$ matrix and $c$ a real number, then 
$$(cX)^T = cX^T$$
2. Let $X$ and $Y$ be $n \times p$ matrices, then
$$(X \pm Y)^T = X^T \pm Y^T$$
3. Let $X$ be an $n \times k$ matrix and $Y$ be a $k \times p$ matrix, then
$$(XY)^T = Y^TX^T$$
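
The order reversal in property 3 is easy to see on a non-square example. The sketch below is only an illustration; the names `M` and `N` are chosen here to avoid clashing with the notebook's `X` and `Y`.

```python
import numpy as np

M = np.array([[2, 1, 0], [-1, 2, 3]])    # 2 x 3
N = np.array([[0, -2], [1, 2], [1, 1]])  # 3 x 2

lhs = M.dot(N).T      # (MN)^T, a 2 x 2 matrix
rhs = N.T.dot(M.T)    # N^T M^T, also 2 x 2
print(lhs)
print(rhs)
print(np.array_equal(lhs, rhs))

# The other order, M^T N^T, is a 3 x 3 matrix, so it cannot equal (MN)^T
wrong_order_shape = M.T.dot(N.T).shape
print(wrong_order_shape)
```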

### <span style="color:red">QUESTION: multiply matrices by hand</span>

$$ X =
\begin{bmatrix}
1 & 2 \\
0 & -1
\end{bmatrix}
\qquad
Y =
\begin{bmatrix}
-2 & 0 \\
-1 & 1
\end{bmatrix}$$

- what is $XY$ ?
- what is $YX$ ?
- what is $X^T Y^T$ ?
- what is $Y^T X^T$ ?

### Note: 

$$XY \neq YX$$ in general; matrix multiplication is not commutative.

In [None]:
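# X (2 x 3) and Y (3 x 2) are still the matrices from the matrix-multiplication
# cell above, not the 2 x 2 matrices of the question
print(X.dot(Y))  # 2 x 2
print(Y.dot(X))  # 3 x 3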

If $X$ and $Y$ are square matrices of the same dimension, then both products $XY$ and $YX$ exist and have the same shape, but even in this case there is no guarantee that the two products will be equal.
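
As a small sketch of the square case, the two matrices below (chosen arbitrarily for this illustration and named `S` and `T`) give different products depending on the order: multiplying by `T` on the right swaps the columns of `S`, while multiplying on the left swaps its rows.

```python
import numpy as np

S = np.array([[1, 2], [3, 4]])
T = np.array([[0, 1], [1, 0]])  # swaps columns (on the right) or rows (on the left)

ST = S.dot(T)
TS = T.dot(S)
print(ST)
print(TS)
print(np.array_equal(ST, TS))
```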

In [None]:
# Note that the regular multiply operator is just element-wise multiplication
print(X)
print(Y.transpose())
print(X*Y.transpose())

## 2.3. Vector in Matrix Form
A column vector is a matrix with $n$ rows and 1 column. To distinguish it from a general matrix $X$, it is usually denoted by a bold lowercase letter such as $\boldsymbol{x}$:

$$
\boldsymbol{x} =
  \begin{bmatrix}
    x_{1}\\
    x_{2}\\
    \vdots\\
    x_{n}
  \end{bmatrix}
$$

In NumPy, a vector entered as a flat list is one-dimensional (shape `(n,)`) and has no second dimension, so we reshape it to get an explicit column vector:

In [None]:
x = np.array([1,2,3,4])
print(x)
print(x.shape)

In [None]:
y = x.reshape(4,1)
z = x[:,np.newaxis]  # same result as reshape(4,1)
print(y)
print(z)
print(y.shape)
print(z.shape)

and a row vector is generally written as the transpose

$$\boldsymbol{x}^T = [x_1, x_2, \ldots, x_n]$$

In [None]:
x_T = y.transpose()
print(x_T)
print(x_T.shape)
print(x)

If we have two vectors $\boldsymbol{x}$ and $\boldsymbol{y}$ of the same length $n$, then the _dot product_ is given by matrix multiplication:

$$\boldsymbol{x}^T \boldsymbol{y} =   
    \begin{bmatrix} x_1& x_2 & \ldots & x_n \end{bmatrix}
    \begin{bmatrix}
    y_{1}\\
    y_{2}\\
    \vdots\\
    y_{n}
  \end{bmatrix}  =
  x_1y_1 + x_2y_2 + \cdots + x_ny_n$$
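
In NumPy the same number can be obtained either as a $1 \times 1$ matrix product of explicit row and column vectors, or as a plain scalar from one-dimensional arrays. The sketch below uses made-up values and the names `x_col` and `y_col` purely for illustration.

```python
import numpy as np

x_col = np.array([1, 2, 3, 4]).reshape(4, 1)   # column vector, shape (4, 1)
y_col = np.array([2, 0, -1, 1]).reshape(4, 1)  # column vector, shape (4, 1)

# Row vector times column vector: a 1 x 1 matrix
as_matrix_product = x_col.T.dot(y_col)
print(as_matrix_product)
print(as_matrix_product.shape)

# Dot product of the flattened 1-D arrays: a scalar
as_scalar = np.dot(x_col.ravel(), y_col.ravel())
print(as_scalar)
```

Both forms compute $1\cdot 2 + 2\cdot 0 + 3\cdot(-1) + 4\cdot 1 = 3$; only the shape of the result differs.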

### <span style="color:red">QUESTION: column vectors</span>

Consider the following matrix:

$$ A =
\begin{bmatrix}
1 & -2 \\
1 & 3
\end{bmatrix}$$

On a piece of paper, draw the vectors corresponding to the two columns of $A$.

Now consider the vector $x = \begin{bmatrix}
0.5 \\
1
\end{bmatrix}$

Draw that vector on the same axes. Then draw the vector resulting from the product $Ax$. Is there a relation between these vectors?

In [None]:
A = np.array([[1, -2],[1,3]])

print("A = {}".format(A))

u = np.array([[1],[0]])
v = np.array([[0],[1]])

print("u = {}".format(u))

print("v = {}".format(v))

print("A * u = {}".format(np.dot(A,u)))

print("A * v = {}".format(np.dot(A,v)))

In [None]:
col1_A = np.dot(A,u)
col2_A = np.dot(A,v)

vp = VectorPlotter(figsize=(5, 5), limits=[-5,5])

# A columns
vp.plot_vector(col1_A, color='red')
vp.plot_vector(col2_A, color='orange')

vp.show()

In [None]:
x = np.array([[0.5],[1]])

# start a fresh figure; x is drawn on it below, after the columns of A
vp = VectorPlotter(figsize=(5, 5), limits=[-5,5])

# A columns
vp.plot_vector(col1_A, color='red')
vp.plot_vector(col2_A, color='orange')

# x
vp.plot_vector(x, color='b')

# A times x
vp.plot_vector(np.dot(A,x), color='cyan')

# col1 times x[0]
vp.plot_vector(col1_A * x[0], color='gray')

# col2 times x[1]
vp.plot_vector(col2_A * x[1], orig=(col1_A * x[0]), color='gray')

vp.show()

## 2.4. The Identity Matrix and Matrix Inverse

The identity matrix $I$ is a square matrix with 1's along the diagonal and zeros elsewhere. Multiplying a matrix by a conformable identity matrix returns that matrix unchanged:
$$X = XI = IX$$

In [None]:
X = np.array([[1.,2,3],[4,5,6],[7,8,9]])
I = np.identity(3)
print(X)
print(I)
print(X.dot(I))
print(I.dot(X))

### Inverse of a Matrix

The inverse of a square $n \times n$ matrix $X$ is an $n \times n$ matrix $X^{-1}$ such that 

$$X^{-1}X = XX^{-1} = I$$

where $I$ is the $n \times n$ identity matrix. 

If such a matrix exists, then $X$ is said to be _invertible_ or _nonsingular_, otherwise $X$ is said to be _noninvertible_ or _singular_.

In [None]:
X = np.array([[1,2,3], [0,1,0], [-2, -1, 0]])
Y = np.linalg.inv(X)

print(X)
print(Y)
print(Y.dot(X))  # should be the identity, up to floating-point rounding

### Properties of Inverse
1. If $X$ is invertible, then $X^{-1}$ is invertible and
$$(X^{-1})^{-1} = X$$
2. If $X$ and $Y$ are both $n \times n$ invertible matrices, then $XY$ is invertible and
$$(XY)^{-1} = Y^{-1}X^{-1}$$
3. If $X$ is invertible, then $X^T$ is invertible and
$$(X^T)^{-1} = (X^{-1})^T$$
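
The three properties can be spot-checked with `np.linalg.inv`. This is only a sketch: `F` reuses the invertible matrix from the cell above, `G` is an arbitrary invertible matrix made up for the example, and `np.allclose` is used because the inverses are computed in floating point.

```python
import numpy as np

F = np.array([[1., 2, 3], [0, 1, 0], [-2, -1, 0]])  # determinant 6, so invertible
G = np.array([[2., 0, 0], [1, 1, 0], [0, 3, 1]])    # lower triangular, determinant 2

inv = np.linalg.inv

prop1 = np.allclose(inv(inv(F)), F)                     # (F^-1)^-1 = F
prop2 = np.allclose(inv(F.dot(G)), inv(G).dot(inv(F)))  # (FG)^-1 = G^-1 F^-1
prop3 = np.allclose(inv(F.T), inv(F).T)                 # (F^T)^-1 = (F^-1)^T

print(prop1)
print(prop2)
print(prop3)
```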

### Orthogonal Matrices

Let $X$ be an $n \times n$ matrix such that $X^TX = I$. Then $X$ is said to be orthogonal, which implies that $X^T=X^{-1}$.

This is equivalent to saying that the columns of $X$ are all orthogonal to each other (and have unit length).

In [None]:
X = np.array([[0.,1,0],[0,0,1],[1,0,0]])
print(X)
print(X.dot(X.T))

Multiplying two vectors each by an orthogonal matrix will not affect their dot product, so if
$$ X^T X = I $$
then
$$ (Xa)^T Xb = a^T X^T X b = a^T I b = a^T b $$
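
A quick numerical sketch of this invariance, using a $2 \times 2$ rotation matrix as the orthogonal matrix. The angle, the vectors, and the names `Rot`, `a_vec`, `b_vec` are arbitrary choices for the illustration.

```python
import numpy as np

theta = 0.3  # arbitrary angle in radians, chosen only for illustration
Rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])

a_vec = np.array([[1.0], [2.0]])
b_vec = np.array([[3.0], [-1.0]])

# A rotation matrix is orthogonal: Rot^T Rot = I
is_orthogonal = np.allclose(Rot.T.dot(Rot), np.identity(2))

before = a_vec.T.dot(b_vec)[0, 0]                   # a^T b
after = Rot.dot(a_vec).T.dot(Rot.dot(b_vec))[0, 0]  # (Rot a)^T (Rot b)

print(is_orthogonal)
print(before)
print(np.isclose(before, after))
```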


## 2.5. Matrix Equations

A system of equations of the form:
\begin{align*}
    a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\
    \vdots \hspace{1in} \vdots \\
    a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m 
\end{align*}
can be written as a matrix equation:
$$
A\mathbf{x} = \mathbf{b}
$$
and hence, when $A$ is square ($m = n$) and invertible, has the unique solution
$$
\mathbf{x} = A^{-1}\mathbf{b}
$$
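
As a sketch, here is a small $2 \times 2$ system solved two ways: by forming $A^{-1}$ explicitly, and with `np.linalg.solve`, which solves the system without computing the inverse and is generally the preferred route numerically. The system and the names `A_sys` and `b_sys` are made up for this example.

```python
import numpy as np

# 2*x1 + 1*x2 = 3
# 1*x1 + 3*x2 = 5
A_sys = np.array([[2.0, 1.0], [1.0, 3.0]])
b_sys = np.array([3.0, 5.0])

x_via_inverse = np.linalg.inv(A_sys).dot(b_sys)  # x = A^-1 b, as in the formula
x_via_solve = np.linalg.solve(A_sys, b_sys)      # same solution, no explicit inverse

print(x_via_inverse)
print(x_via_solve)
print(np.allclose(A_sys.dot(x_via_solve), b_sys))  # check that A x = b
```

If `A_sys` were singular, both calls would raise `numpy.linalg.LinAlgError`, since no unique solution exists.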

## 2.6. Eigenvectors and Eigenvalues

Let $A$ be an $n \times n$ matrix and $\boldsymbol{x}$ be an $n \times 1$ nonzero vector. An _eigenvalue_ of $A$ is a number $\lambda$ such that

$$A \boldsymbol{x} = \lambda \boldsymbol{x}$$


A vector $\boldsymbol{x}$ satisfying this equation is called an eigenvector associated with $\lambda$.

Eigenvectors and eigenvalues will play a huge role in matrix methods later in the course (PCA, SVD, NMF).
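
The defining equation can be checked for all eigenpairs at once. In the sketch below, `np.linalg.eig` returns the eigenvalues and a matrix whose columns are the corresponding eigenvectors; the symmetric matrix `Sym` and the other names are chosen only for this illustration.

```python
import numpy as np

Sym = np.array([[2.0, 1.0], [1.0, 2.0]])
eig_vals, eig_vecs = np.linalg.eig(Sym)

# Column j of eig_vecs is the eigenvector for eig_vals[j].
# Sym.dot(eig_vecs) applies Sym to every column; eig_vecs * eig_vals
# scales column j by eig_vals[j] (broadcasting along the last axis).
all_pairs_ok = np.allclose(Sym.dot(eig_vecs), eig_vecs * eig_vals)

print(eig_vals)
print(all_pairs_ok)
```

For this matrix the eigenvalues are 1 and 3; the order in which NumPy returns them is not guaranteed.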

In [None]:
X = np.array([[1, 1], [1, 2]])
vals, vecs = np.linalg.eig(X)
print(vals)
print(vecs)  # column j is the eigenvector for vals[j]

In [None]:
lam = vals[0]
vec = vecs[:,0]
print(X.dot(vec))
print(lam * vec)

### <span style="color:red">QUESTION: eigenvalues</span>

$A =
\begin{bmatrix}
0 & 1 \\
-2 & -3
\end{bmatrix}$

- The directions of the eigenvectors are given by $[1, -1]^T$ and $[-1, 2]^T$.
- What are the eigenvalues associated with these eigenvectors?