<h1><center> PPOLS564: Foundations of Data Science </center></h1>
<h3><center> Lecture 13 <br><br><font color='grey'> Matrix Operations </font></center></h3>

# Concepts For today:

- (Cont.) Matrix Multiplication 
- Matrix Addition & Subtraction 
- Transposing Matrices
- Different Types of Matrices
- Useful Statistics

## Note
In the following lectures, we'll explore linear algebra. Note that I'll be using some code to generate interactive visualizations for some concepts. To use this code yourself, two things must be true: (1) the `bokeh` module must be installed, and (2) the `visualize.py` script must be in the same directory as this notebook, and the Jupyter notebook must be launched from that location.

Finally, note that these lecture slides are intended to be supplementary to the lectures and readings.

In [1]:
import numpy as np
from visualize import LinearAlgebra as vla
plot = vla()

## Multiplying Matrices

What happens when we want to apply two transformations in sequence?

$$ f \circ g: \vec{x} \mapsto \vec{s} \mapsto \vec{y} $$

$$ g(\vec{x}) = \vec{s}$$

$$ f(\vec{s}) = \vec{y} $$

We can represent this as the multiplication of two (or more) matrices.

$$ f(g(\vec{x})) = \textbf{A}_{2x2}\textbf{B}_{2x2}  \vec{x}$$

That is, we transform $\vec{x}$ by $\textbf{B}$ and then transform that resulting vector by $\textbf{A}$ much as we would with the nested function $f(g(\vec{x}))$. 

In [2]:
A = np.array([[ 1., -1.],[ 0.,  3.]])
A

array([[ 1., -1.],
       [ 0.,  3.]])

In [3]:
B = np.array([[-3. ,  1. ],[ 0.5,  2.3]])
B

array([[-3. ,  1. ],
       [ 0.5,  2.3]])

In [4]:
a = np.array([1,2])
a

array([1, 2])

In [5]:
a_new = B.dot(a)
a_new

array([-1. ,  5.1])

In [6]:
A.dot(a_new)

array([-6.1, 15.3])

This is equivalent to...

In [7]:
A.dot(B).dot(a)

array([-6.1, 15.3])
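
The same comparison can be sketched with Python's `@` operator, which NumPy treats as matrix multiplication for arrays like these. The matrices are copied from the cells above; `np.allclose` is used instead of `==` because the entries are floating point.

```python
import numpy as np

A = np.array([[1., -1.], [0., 3.]])
B = np.array([[-3., 1.], [0.5, 2.3]])
a = np.array([1, 2])

# Apply B first, then A (two separate transformations).
step_by_step = A @ (B @ a)

# Combine A and B into a single matrix first, then apply it once.
combined = (A @ B) @ a

print(step_by_step)
print(combined)
print(np.allclose(step_by_step, combined))
```

Either route lands on the same vector, which is why the product $\textbf{A}\textbf{B}$ can stand in for "apply $\textbf{B}$, then $\textbf{A}$".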

#### Now visually

In [8]:
plot.graph(25,grid=True)
plot.vector(a)
plot.show()

In [9]:
plot.clear()
plot.change_basis(B)
plot.graph(25,grid=True)
plot.vector(a)
plot.show()

In [10]:
plot.clear()
plot.change_basis(A.dot(B))
plot.graph(25,grid=True)
plot.vector(a)
plot.show()

In [11]:
# Let's do this all in one go.
plot.clear().graph(25)
plot.vector(a)
plot.vector(A.dot(B).dot(a),add_color="purple")
plot.show()

### Multiplying two matrices by hand 

$$\textbf{A}_{mxn} = 
                     \begin{bmatrix}
                         \vec{a}_{1} & \vec{a}_{2} & \dots & \vec{a}_{n}\\
                     \end{bmatrix} = 
                    \begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n}  \\ 
                                     a_{21}  & a_{22} & \dots & a_{2n} \\
                                     \vdots & \vdots & \ddots & \vdots \\
                                     a_{m1}  & a_{m2} & \dots & a_{mn} \\
                     \end{bmatrix} $$
                     
$$\textbf{B}_{nxk} = 
                    \begin{bmatrix}
                         \vec{b}_{1} & \vec{b}_{2} & \dots & \vec{b}_{k}\\
                     \end{bmatrix}  = 
                    \begin{bmatrix} b_{11} & b_{12} & \dots & b_{1k}  \\ 
                                     b_{21}  & b_{22} & \dots & b_{2k} \\
                                     \vdots & \vdots & \ddots & \vdots \\
                                     b_{n1}  & b_{n2} & \dots & b_{nk} \\
                     \end{bmatrix}$$
                     
Note that to multiply two matrices, their inner dimensions must match: the number of columns of $\textbf{A}$ must equal the number of rows of $\textbf{B}$. Why? 

$$m \times \textbf{n}~~\textbf{n}\times k $$

Think of $\textbf{A}$ as performing a linear transformation on each column vector of $\textbf{B}$.

$$ \textbf{A}_{m \times \textbf{n}} \textbf{B}_{\textbf{n}\times k} = 
                     \begin{bmatrix}
                         \textbf{A} \vec{b}_{1} & \textbf{A} \vec{b}_{2} & \dots & \textbf{A} \vec{b}_{k}\\
                     \end{bmatrix}$$
                     
                     
$$
\begin{bmatrix}
 [a_{11} b_{11} + \dots + a_{1n} b_{n1} ] & [a_{11} b_{12} + \dots + a_{1n} b_{n2} ] & \dots & [a_{11} b_{1k} + \dots + a_{1n} b_{nk} ]\\
 \vdots & \vdots & \vdots & \vdots \\
 [a_{m1} b_{11} + \dots + a_{mn} b_{n1} ] & [a_{m1} b_{12} + \dots + a_{mn} b_{n2} ] & \dots & [a_{m1} b_{1k} + \dots + a_{mn} b_{nk} ]
\end{bmatrix}$$

$$ \textbf{A}_{m \times \textbf{n}} \textbf{B}_{\textbf{n}\times k} = \textbf{C}_{m\times k} $$
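
The column-by-column view can be sketched in NumPy as follows. The values are the same as in the worked example that follows; the variable names (`columns`, `C_by_columns`, `mismatch_raised`) are only illustrative.

```python
import numpy as np

A = np.array([[1, 3, 1], [0, 1, -1]])       # 2 x 3
B = np.array([[2, 1], [-1, -2], [4, 3]])    # 3 x 2

# Transform each column of B by A, then place the results side by side.
columns = [A @ B[:, j] for j in range(B.shape[1])]
C_by_columns = np.column_stack(columns)

print(C_by_columns)
print(C_by_columns.shape)                    # (m, k)
print(np.array_equal(C_by_columns, A @ B))

# When the inner dimensions differ, the product is undefined.
mismatch_raised = False
try:
    B @ B                                    # (3 x 2)(3 x 2): 2 != 3
except ValueError as err:
    mismatch_raised = True
    print("cannot multiply:", err)
```

Each column of $\textbf{B}$ has $n$ entries, so $\textbf{A}$ needs exactly $n$ columns to transform it; that is the reason the inner dimensions must agree.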

**Example**:

$$\textbf{A}_{2x3} = 
                    \begin{bmatrix} 1 & 3 & 1  \\ 
                                    0 & 1 & -1 \\
                     \end{bmatrix} $$

$$\textbf{B}_{3x2} = 
                    \begin{bmatrix} 2 & 1 \\ 
                                    -1 & -2 \\
                                    4 & 3 \\
                     \end{bmatrix} $$
                     
                     
$$ \textbf{A}_{2x3} \textbf{B}_{3x2} $$

$$ \begin{bmatrix}
   \textbf{A}\begin{bmatrix} 2  \\ -1 \\ 4 \\ \end{bmatrix} &
   \textbf{A}\begin{bmatrix} 1 \\ -2 \\ 3 \\ \end{bmatrix} 
   \end{bmatrix} $$
   
   
$$ \begin{bmatrix}
   \begin{bmatrix} 1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 2  \\ -1 \\ 4 \\ \end{bmatrix} &
   \begin{bmatrix} 1 & 3 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \\ 3 \\ \end{bmatrix} \\
   \begin{bmatrix} 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} 2  \\ -1 \\ 4 \\ \end{bmatrix} &
   \begin{bmatrix} 0 & 1 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \\ 3 \\ \end{bmatrix} 
   \end{bmatrix} $$
   
$$ \begin{bmatrix}
    1(2) + 3(-1) + 1(4) & 1(1) + 3(-2) + 1(3)\\
    0(2) + 1(-1) + -1(4) & 0(1) + 1(-2) + -1(3)
   \end{bmatrix} $$
   
$$ \begin{bmatrix}
    3 & -2\\
    -5 & -5
   \end{bmatrix} $$

In [12]:
A = np.array([[1,3,1],[0,1,-1]])
B = np.array([[2,1],[-1,-2],[4,3]])
print(A)
print(B)

[[ 1  3  1]
 [ 0  1 -1]]
[[ 2  1]
 [-1 -2]
 [ 4  3]]


In [13]:
A.dot(B)

array([[ 3, -2],
       [-5, -5]])
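
To make the by-hand rule concrete, here is a minimal sketch that spells out every multiplication and addition with explicit loops. The helper name `matmul_by_hand` is made up for this illustration; in practice `A.dot(B)` or `A @ B` does the same job far faster.

```python
import numpy as np

def matmul_by_hand(A, B):
    """Multiply A (m x n) by B (n x k) using explicit loops."""
    m, n = A.shape
    n_b, k = B.shape
    if n != n_b:
        raise ValueError(f"inner dimensions differ: {n} vs {n_b}")
    C = np.zeros((m, k), dtype=np.result_type(A, B))
    for i in range(m):
        for j in range(k):
            # Entry (i, j) is the dot product of row i of A and column j of B.
            for p in range(n):
                C[i, j] += A[i, p] * B[p, j]
    return C

A = np.array([[1, 3, 1], [0, 1, -1]])
B = np.array([[2, 1], [-1, -2], [4, 3]])
print(matmul_by_hand(A, B))
```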

## Properties of Matrix Multiplication 

<font color = "grey">~~**COMMUTATIVE**~~</font>

<font color = "grey">$$ \textbf{A} \textbf{B} \ne \textbf{B} \textbf{A}  $$  </font>


**ASSOCIATIVE**

$$(\textbf{A} \textbf{B}) \textbf{C} = \textbf{A} (\textbf{B} \textbf{C}) = \textbf{A} \textbf{B} \textbf{C} $$


**DISTRIBUTIVE**

$$\textbf{A}(\textbf{B} + \textbf{C}) = \textbf{A}\textbf{B} + \textbf{A}\textbf{C}$$

But remember it's not commutative, so order matters!
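
These properties can be checked numerically. The sketch below uses three small integer matrices chosen only for illustration, so the comparisons are exact.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 3]])

# Commutative? Swapping the order changes the result for these matrices.
print(np.array_equal(A @ B, B @ A))

# Associative: the grouping of the products does not matter.
print(np.array_equal((A @ B) @ C, A @ (B @ C)))

# Distributive: multiplying a sum equals the sum of the products.
print(np.array_equal(A @ (B + C), A @ B + A @ C))
```

Some special pairs do commute (any square matrix with the identity, for example), so the first check shows that commutativity cannot be assumed, not that it never happens.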

-----

# Matrix Addition and Subtraction

Much like vectors, matrices are added and subtracted element-wise.

$$\textbf{B}_{3x2} = 
                    \begin{bmatrix} 2 & 1 \\ 
                                    -1 & -2 \\
                                    4 & 3 \\
                     \end{bmatrix} $$

$$\textbf{C}_{3x2} = 
                    \begin{bmatrix} 1 & 2 \\ 
                                    -2 & 1 \\
                                    2 & 1 \\
                     \end{bmatrix} $$

In [14]:
B = np.array([[2,1],[-1,-2],[4,3]])
B

array([[ 2,  1],
       [-1, -2],
       [ 4,  3]])

In [15]:
C = np.array([[1,2],[-2,1],[2,1]])
C

array([[ 1,  2],
       [-2,  1],
       [ 2,  1]])

### Addition 
$$ \textbf{B}_{3x2} + \textbf{C}_{3x2} $$


$$   \begin{bmatrix} 2 & 1 \\ 
                                    -1 & -2 \\
                                    4 & 3 \\
                     \end{bmatrix} +  \begin{bmatrix} 1 & 2 \\ 
                                    -2 & 1 \\
                                    2 & 1 \\
                     \end{bmatrix} $$
                     
                     
$$  \begin{bmatrix} 2 + 1 & 1 + 2 \\ 
                                   -1 + -2 & -2 + 1 \\
                                    4 + 2& 3 + 1\\
                     \end{bmatrix} $$  
                     
                     
$$  \begin{bmatrix} 3 & 3 \\ 
                   -3 & -1 \\
                    6 & 4\\
    \end{bmatrix} $$                       

In [16]:
B + C

array([[ 3,  3],
       [-3, -1],
       [ 6,  4]])

### Subtraction 
$$ \textbf{B}_{3x2} - \textbf{C}_{3x2} $$


$$   \begin{bmatrix} 2 & 1 \\ 
                                    -1 & -2 \\
                                    4 & 3 \\
                     \end{bmatrix} -  \begin{bmatrix} 1 & 2 \\ 
                                    -2 & 1 \\
                                    2 & 1 \\
                     \end{bmatrix} $$
                     
                     
$$  \begin{bmatrix} 2 - 1 & 1 - 2 \\ 
                                   -1 - -2 & -2 - 1 \\
                                    4 - 2& 3 - 1\\
                     \end{bmatrix} $$  
                     
                     
$$  \begin{bmatrix} 1 & -1 \\ 
                    1 & -3 \\
                    2 & 2\\
    \end{bmatrix} $$                       

In [17]:
B - C

array([[ 1, -1],
       [ 1, -3],
       [ 2,  2]])

#### Matrices must have the same dimensions

In [18]:
D = np.array([[1,2],[2,4]])
D

array([[1, 2],
       [2, 4]])

In [19]:
B - D

ValueError: operands could not be broadcast together with shapes (3,2) (2,2) 
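
One caveat worth sketching: NumPy only raises this error when the shapes cannot be broadcast. Some unequal shapes are accepted silently, which is not matrix subtraction in the linear algebra sense. The helper `subtract_strict` below is a hypothetical guard written for this example, not part of NumPy.

```python
import numpy as np

B = np.array([[2, 1], [-1, -2], [4, 3]])   # 3 x 2
D = np.array([[1, 2], [2, 4]])             # 2 x 2

def subtract_strict(X, Y):
    """Subtract Y from X only when both have exactly the same shape."""
    if X.shape != Y.shape:
        raise ValueError(f"shape mismatch: {X.shape} vs {Y.shape}")
    return X - Y

strict_error = None
try:
    subtract_strict(B, D)
except ValueError as err:
    strict_error = str(err)
    print(strict_error)

# Caution: broadcasting accepts this pair of unequal shapes without error.
row = np.array([1, 2])      # shape (2,)
print(B - row)              # row is subtracted from every row of B
```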

# Transposing a Matrix


$$\textbf{A}_{2x3} = \begin{bmatrix} a_{11} & a_{12} & a_{13}  \\ 
                                     a_{21}  & a_{22} & a_{23} \\
                     \end{bmatrix} $$
                     
                     

$$\textbf{A}^T_{3x2} = \begin{bmatrix} a_{11} & a_{21} \\ 
                                       a_{12}  & a_{22}\\
                                       a_{13}  & a_{23}\\
                     \end{bmatrix} $$

In [None]:
A = np.array([[1,2,3],
             [4,5,6]])
A

In [None]:
A.T

### Properties

$$ (\textbf{A}^T)^T = \textbf{A} $$

$$ (\textbf{A} + \textbf{B})^T = \textbf{A}^T + \textbf{B}^T $$

$$ (c\textbf{A})^T = c\textbf{A}^T $$

$$ (\textbf{A}\textbf{B})^T = \textbf{B}^T \textbf{A}^T $$
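
A quick numerical check of the transpose properties, using small integer matrices picked for illustration. Note that the transpose of a product reverses the order of the factors; `M` is given a shape that makes `A @ M` defined.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # 2 x 3
B = np.array([[0, 1, -1], [2, 0, 3]])       # 2 x 3, same shape as A for the sum
M = np.array([[2, 1], [-1, -2], [4, 3]])    # 3 x 2, so that A @ M is defined
c = 3

print(np.array_equal(A.T.T, A))                  # transposing twice
print(np.array_equal((A + B).T, A.T + B.T))      # transpose of a sum
print(np.array_equal((c * A).T, c * A.T))        # transpose of a scalar multiple
print(np.array_equal((A @ M).T, M.T @ A.T))      # product: order is reversed

# Keeping the original order does not even give the right shape here.
print((A @ M).T.shape, (A.T @ M.T).shape)
```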

## "Squaring" a matrix: Sum of Squares

Recall that to multiply two matrices, the number of columns of the first must equal the number of rows of the second. We can always manufacture this condition by multiplying a matrix by its own transpose.

In [None]:
A

In [None]:
At = A.T
At

In [None]:
A.dot(At)

In [None]:
At.dot(A)

What is going on here?

$$ \textbf{A}_{2x3} \textbf{A}^T_{3x2} $$

$$\begin{bmatrix} a_{11} & a_{12} & a_{13}  \\ 
                                     a_{21}  & a_{22} & a_{23} \\
                     \end{bmatrix} 
                     \begin{bmatrix} a_{11} & a_{21} \\ 
                                       a_{12}  & a_{22}\\
                                       a_{13}  & a_{23}\\
                     \end{bmatrix} $$
                     
$$  \begin{bmatrix} a_{11}a_{11} + a_{12}a_{12}  + a_{13}a_{13} &  a_{11}a_{21}  + a_{12}a_{22} + a_{13}a_{23}\\ 
                    a_{21}a_{11}  + a_{22}a_{12} + a_{23}a_{13} &  a_{21}a_{21}  + a_{22}a_{22} + a_{23}a_{23}\\
    \end{bmatrix}  $$  
    
With numbers this time ...    
$$\begin{bmatrix} 1 & 2 & 3  \\ 
                   4  & 5 & 6 \\
                     \end{bmatrix} 
                     \begin{bmatrix} 1 & 4\\ 
                                       2  & 5\\
                                       3  & 6\\
                     \end{bmatrix} $$
                     
$$  \begin{bmatrix} 1(1) + 2(2) + 3(3) &  1(4) + 2(5) + 3(6)\\ 
                    4(1) + 5(2) + 6(3) &  4(4) + 5(5) + 6(6)\\
    \end{bmatrix}  $$     
    
$$  \begin{bmatrix}14 &  32\\ 
                    32&  77\\
    \end{bmatrix}  $$     
    
Given what we know about vector dot products...

$$  \begin{bmatrix}
    \text{length}^2 &  \text{projection}\\ 
    \text{projection} &  \text{length}^2\\
    \end{bmatrix}  $$  

In other words, multiplying a matrix by its transpose generates sums of squares on the diagonal and sums of cross-products off the diagonal. We can leverage this property to calculate a whole range of vital statistics! (See below.)
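
The structure of this product can be verified directly. The sketch below also shows one statistic that can be built this way, the sample covariance matrix, as an assumed example of the kind of calculation meant here; the data in `X` are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])
G = A @ A.T

# Diagonal entries: each row's sum of squares (its squared length).
row_sums_of_squares = (A ** 2).sum(axis=1)
print(np.array_equal(np.diag(G), row_sums_of_squares))

# Off-diagonal entry: the dot product between row 0 and row 1.
print(G[0, 1] == A[0] @ A[1])

# Sample covariance: center each column, then divide the
# cross-product matrix by n - 1.
X = np.array([[1., 2.], [2., 4.5], [3., 5.5], [4., 9.]])   # made-up data
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (X.shape[0] - 1)
print(np.allclose(cov, np.cov(X, rowvar=False)))
```

Here `np.cov(X, rowvar=False)` treats each column as a variable, which matches the column-centering used in the manual calculation.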

# Different Types of Matrices

In [35]:
X = np.random.randn(25).reshape(5,5).round(2)
X

array([[-1.06, -0.33, -0.34, -0.17, -0.77],
       [ 0.92,  0.92,  0.26,  0.72,  1.22],
       [-1.13, -0.18,  1.98, -0.3 , -0.78],
       [-2.24, -1.45, -0.46, -1.83,  1.01],
       [ 1.89,  0.9 , -0.88,  0.43,  1.25]])

### Symmetric Matrices

In [36]:
X.dot(X.T)

array([[ 1.9699, -2.429 ,  1.2356,  2.5427, -3.0368],
       [-2.429 ,  3.7672, -1.858 , -3.5998,  4.1726],
       [ 1.2356, -1.858 ,  5.9281,  1.6426, -5.1441],
       [ 2.5427, -3.5998,  1.6426, 11.7007, -4.6582],
       [-3.0368,  4.1726, -5.1441, -4.6582,  6.9039]])
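
A matrix is symmetric when it equals its own transpose. The sketch below checks that property for a product of the form `X @ X.T`; the seed is arbitrary and only there so the example is reproducible.

```python
import numpy as np

rng = np.random.default_rng(42)            # arbitrary seed
X = rng.standard_normal((5, 5)).round(2)

S = X @ X.T
print(np.allclose(S, S.T))     # S equals its own transpose
print(np.allclose(X, X.T))     # X itself need not be symmetric
```

The result holds for any `X`, since $(\textbf{X}\textbf{X}^T)^T = (\textbf{X}^T)^T\textbf{X}^T = \textbf{X}\textbf{X}^T$.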

### Upper Triangular Matrices

In [22]:
np.triu(X)

array([[-0.41,  0.3 , -0.86, -2.29, -0.14],
       [ 0.  , -0.84, -0.74,  1.68, -0.07],
       [ 0.  ,  0.  ,  0.85, -2.12, -0.38],
       [ 0.  ,  0.  ,  0.  ,  0.75,  0.83],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.52]])

### Lower Triangular Matrices

In [23]:
# Lower Triangle
np.tril(X)

array([[-0.41,  0.  ,  0.  ,  0.  ,  0.  ],
       [-1.56, -0.84,  0.  ,  0.  ,  0.  ],
       [-1.58,  0.14,  0.85,  0.  ,  0.  ],
       [ 0.04, -0.31,  0.42,  0.75,  0.  ],
       [ 0.41, -1.39,  0.08, -1.22,  0.52]])
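
As a small sketch of how the two triangles relate, the example below rebuilds a matrix from its upper and lower parts. A simple 3 x 3 integer matrix stands in for `X` so the comparison is exact.

```python
import numpy as np

X = np.arange(1, 10).reshape(3, 3)

upper = np.triu(X)
lower = np.tril(X)

# The two triangles overlap only on the main diagonal, so adding them
# counts the diagonal twice; subtracting it once recovers X.
rebuilt = upper + lower - np.diag(np.diag(X))
print(np.array_equal(rebuilt, X))

# The transpose of an upper triangular matrix is lower triangular.
print(np.array_equal(np.tril(upper.T), upper.T))
```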

### Diagonal Matrices

In [24]:
np.diag(np.array([4,2,10,-1]))

array([[ 4,  0,  0,  0],
       [ 0,  2,  0,  0],
       [ 0,  0, 10,  0],
       [ 0,  0,  0, -1]])
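
Multiplying by a diagonal matrix simply rescales rows or columns. The sketch below shows this with a matrix of ones, chosen only to make the scaling easy to see.

```python
import numpy as np

D = np.diag(np.array([4, 2, 10, -1]))
M = np.ones((4, 4), dtype=int)

print(D @ M)   # row i of M is scaled by D[i, i]
print(M @ D)   # column j of M is scaled by D[j, j]

# np.diag also works in reverse: given a 2-D array it extracts the diagonal.
print(np.diag(D))
```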

### Zero Matrices

In [25]:
np.zeros((5,5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

### Idempotent Matrices

In [26]:
P = np.array([[2,-2,-4],[-1,3,4],[1,-2,-3]])
P

array([[ 2, -2, -4],
       [-1,  3,  4],
       [ 1, -2, -3]])

In [27]:
P.dot(P)

array([[ 2, -2, -4],
       [-1,  3,  4],
       [ 1, -2, -3]])
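
A matrix is idempotent when multiplying it by itself returns the same matrix, so every further power is unchanged as well. The helper `is_idempotent` below is a hypothetical convenience function written for this sketch.

```python
import numpy as np

P = np.array([[2, -2, -4], [-1, 3, 4], [1, -2, -3]])

def is_idempotent(M):
    """Return True when M multiplied by itself gives M back."""
    return np.allclose(M @ M, M)

print(is_idempotent(P))

# Any positive integer power of an idempotent matrix equals the matrix.
print(np.array_equal(np.linalg.matrix_power(P, 5), P))

# Scaling usually breaks the property: (2P)(2P) = 4P, not 2P.
print(is_idempotent(2 * P))
```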

### Identity Matrix

In [28]:
I = np.eye(5)
I

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

Note that an identity matrix is also a diagonal and idempotent matrix.

In [29]:
I.dot(I)

array([[1., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0.],
       [0., 0., 1., 0., 0.],
       [0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 1.]])

In [30]:
I.dot(X)

array([[-0.41,  0.3 , -0.86, -2.29, -0.14],
       [-1.56, -0.84, -0.74,  1.68, -0.07],
       [-1.58,  0.14,  0.85, -2.12, -0.38],
       [ 0.04, -0.31,  0.42,  0.75,  0.83],
       [ 0.41, -1.39,  0.08, -1.22,  0.52]])
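
The identity matrix leaves a matrix unchanged from either side, provided its size matches. The sketch below uses a deliberately non-square matrix (values chosen for illustration) to show that the left and right identities then differ in size.

```python
import numpy as np

X = np.array([[1., 2.], [3., 4.], [5., 6.]])   # 3 x 2, not square

# Left-multiplying needs a 3 x 3 identity; right-multiplying needs 2 x 2.
left = np.eye(3) @ X
right = X @ np.eye(2)

print(np.array_equal(left, X))
print(np.array_equal(right, X))
```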

### Sparse Matrices

In [31]:
X = np.zeros((10,10))
X[[1,4,6,5,2],[1,5,3,5,1]] = 1
X

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [32]:
np.nonzero(X)

(array([1, 2, 4, 5, 6]), array([1, 1, 5, 5, 3]))

In [33]:
from scipy import sparse
print(sparse.csc_matrix(X))

  (1, 1)	1.0
  (2, 1)	1.0
  (6, 3)	1.0
  (4, 5)	1.0
  (5, 5)	1.0
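
A short sketch of why sparse storage is useful: only the non-zero entries are kept, yet the matrix can still be used in products. This example assumes SciPy's compressed sparse row format (`csr_matrix`) rather than the `csc_matrix` used above; either would work here.

```python
import numpy as np
from scipy import sparse

X = np.zeros((10, 10))
X[[1, 4, 6, 5, 2], [1, 5, 3, 5, 1]] = 1

S = sparse.csr_matrix(X)                  # compressed sparse row format

print(S.nnz)                              # number of stored non-zero entries
density = S.nnz / (S.shape[0] * S.shape[1])
print(density)                            # share of entries that are non-zero
print(np.array_equal(S.toarray(), X))     # converting back recovers X

# Products work without building the dense form.
v = np.ones(10)
print(S @ v)                              # row sums of X
```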
