Reading: Chapter 5 pages 72-76, Chapter 6 pages 81-88.


### I. Matrix Transpose

a) $[a_{ij}]^T = [a_{ji}]$ (Compact definition of $A^T$)

b) $(LIVE)^T = E^TV^TI^TL^T$ (How the transpose relates to the matrix product)

c) Symmetric Matrices: $A = A^T$ (A special case)
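
Properties (b) and (c) can be checked numerically. The sketch below is only an illustration: the matrices `A`, `C`, and `S` are example values chosen here, and any matrices with compatible shapes would do.

```python
import numpy as np

# Example matrices chosen for illustration (shapes 3x3 and 3x2).
A = np.array([[2, 2, 3], [1, 4, 5], [9, 3, 2]])
C = np.array([[7, 6], [8, -1], [9, 2]])

# Property (b): transposing a product reverses the order of the factors.
lhs = (A @ C).T
rhs = C.T @ A.T
print(np.array_equal(lhs, rhs))

# Property (c): a matrix is symmetric when it equals its own transpose.
S = np.array([[1, 7, 4], [7, 2, 5], [4, 5, 3]])
is_symmetric = np.array_equal(S, S.T)
print(is_symmetric)
```

Note that `A.T @ C.T` would not even be defined here (3x3 times 2x3), which is one way to remember that the order must flip.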

In [None]:
import numpy as np

A = np.array([[2,2,3],[1,4,5],[9,3,2]])
B = np.array([[-1,-3,9],[2,-6,2],[9,8,7]])
C = np.array([[7,6],[8,-1],[9,2]])
D = np.array([[-2, 3, 1], [9, 2, 3]])
I3 = np.eye(len(A))
Ones = np.ones((3,3))
Zeros = np.zeros((3,3))
v = np.array([[1],[-2],[3]])
u = np.array([[1,-2,3]])
l = 2


**Exercise**. Suppose $D_1$ is 3 x 4, $D_2$ is 4 x 7, $D_3$ is 2 x 7, and $\vec{v} \in \mathbb{R}^2$. Is $D_1D_2D_3^T\vec{v}$ defined? If so, what are its dimensions?  

**Exercise**. For any matrix $A$, what are the properties of $A^TA$? Illustrate these properties with examples in Python.

In [None]:
import numpy as np
print(A.T)

ata = A.T@A


print(ata)
print()

print(ata.T)
# ata equals its own transpose, so A^T A is symmetric

[[2 1 9]
 [2 4 3]
 [3 5 2]]
[[86 35 29]
 [35 29 32]
 [29 32 38]]

[[86 35 29]
 [35 29 32]
 [29 32 38]]


In [None]:
C = np.array([[7,6],[8,-1],[9,2]])
cct = C@C.T  # C C^T is 3 x 3 and symmetric, even though C is not square
print(cct)

[[85 50 75]
 [50 65 70]
 [75 70 85]]


In [None]:
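# D1 D2:        (3 x 4)(4 x 7) = 3 x 7
# D3^T:         7 x 2, so D1 D2 D3^T is (3 x 7)(7 x 2) = 3 x 2
# v:            2 x 1
# D1 D2 D3^T v: (3 x 2)(2 x 1) = 3 x 1, so the product is defined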
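
The dimension bookkeeping for the first exercise can be confirmed by letting NumPy do the multiplication. This is a minimal sketch: the all-ones matrices are placeholders, since only the shapes matter.

```python
import numpy as np

# Placeholder matrices: only the shapes matter, so every entry is 1.
D1 = np.ones((3, 4))
D2 = np.ones((4, 7))
D3 = np.ones((2, 7))
v = np.ones((2, 1))

# Inner dimensions must agree at every step: (3x4)(4x7)(7x2)(2x1).
result = D1 @ D2 @ D3.T @ v
print(result.shape)  # (3, 1)
```

Dropping the transpose (`D1 @ D2 @ D3 @ v`) raises a `ValueError`, because a 3x7 matrix cannot multiply a 2x7 matrix.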


#### Transpose Application: Multivariate Data Covariance Matrices.

Recall that the Pearson correlation coefficient measures how closely two data vectors $\vec{x}$ and $\vec{y}$ are linearly related:

$r = \frac{\Sigma_{i=1}^n(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\Sigma_{i=1}^n(x_i-\bar{x})^2}\sqrt{\Sigma_{i=1}^n(y_i-\bar{y})^2}} = \frac{\tilde{x}\cdot\tilde{y}}{||\tilde{x}|| \, ||\tilde{y}||}$,

where $\tilde{x}$ and $\tilde{y}$ are the mean-centered vectors, with entries $x_i - \bar{x}$ and $y_i - \bar{y}$.

The <u>covariance</u> of two vectors $\vec{x}$ and $\vec{y}$ is defined as

$c_{x,y} = \frac{\Sigma_{i=1}^n(x_i-\bar{x})(y_i-\bar{y})}{n-1} = \frac{\tilde{x}\cdot\tilde{y}}{n-1}$.

Note that $c_{x,y}$ is just a variation of $r$: it has the same numerator but is "scaled" by a different denominator.
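
The shared numerator is easy to see in code. The sketch below uses two small made-up data vectors (`x` and `y` are illustrative values, not course data) and compares the hand-built results against `np.corrcoef` and `np.cov`.

```python
import numpy as np

# Made-up data vectors for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
n = len(x)

# Mean-center both vectors.
x_tilde = x - x.mean()
y_tilde = y - y.mean()

# Same numerator (a dot product), different denominators.
numerator = x_tilde @ y_tilde
r = numerator / (np.linalg.norm(x_tilde) * np.linalg.norm(y_tilde))
c = numerator / (n - 1)

print(r, c)
print(np.isclose(r, np.corrcoef(x, y)[0, 1]))
print(np.isclose(c, np.cov(x, y)[0, 1]))
```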

**Exercise**. Suppose you have a (presumably tall) $n \times m$ data matrix $X$. Write an expression for $C$, the data's $m \times m$ covariance matrix. (Hint: see previous exercise.)
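
One candidate answer can be tested numerically: mean-center each column of $X$ to get $\tilde{X}$, then form $\tilde{X}^T\tilde{X}/(n-1)$. The sketch below assumes a random data matrix (the sizes and the seed are arbitrary choices) and compares the result with `np.cov`, where `rowvar=False` tells NumPy that the columns are the variables.

```python
import numpy as np

# Random "tall" data matrix: n = 20 observations, m = 3 features (arbitrary sizes).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
n = X.shape[0]

# Subtract each column's mean, then apply the X^T X pattern.
X_tilde = X - X.mean(axis=0)
C = X_tilde.T @ X_tilde / (n - 1)

print(C.shape)
print(np.allclose(C, np.cov(X, rowvar=False)))
print(np.allclose(C, C.T))  # symmetric, like any matrix of the form A^T A
```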

### II. Matrix Norms

Recall that the *Euclidean Norm* (2-Norm) of a vector $\vec{v} \in \mathbb{R}^n$ is $||\vec{v}|| = \sqrt{\sum_{i=1}^nv_i^2}$. This is a measure of vector magnitude.

The *Frobenius Norm* ($\ell_2$ norm) of an $N \times M$ matrix $A$ is defined as $||A||_F = \sqrt{\sum_{i=1}^N \sum_{j=1}^Ma_{ij}^2}$.

The *trace* of a square $N \times N$ matrix is the sum of its diagonal elements: $tr(A) = \sum_{i=1}^N a_{ii}$.

Interesting property: $||A||_F = \sqrt{tr(A^TA)}$.
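
The property holds because the $j$-th diagonal entry of $A^TA$ is the sum of squares of column $j$ of $A$, so the trace adds up every squared entry. A quick numerical check, using an example matrix chosen here for illustration:

```python
import numpy as np

# Example matrix for illustration.
A = np.array([[2, 2, 3], [1, 4, 5], [9, 3, 2]])

sum_of_squares = np.sum(A**2)        # every entry squared, then summed
trace_value = np.trace(A.T @ A)      # sum of the diagonal of A^T A

via_trace = np.sqrt(trace_value)
via_norm = np.linalg.norm(A, 'fro')  # NumPy's Frobenius norm

print(sum_of_squares, trace_value)
print(np.isclose(via_trace, via_norm))
```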

**Exercise**. Calculate the Frobenius norm for matrix $A$ by using a double for-loop. Verify that your function works correctly using ```norm``` from the ```numpy.linalg``` library.   

In [None]:
import numpy as np
import math

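B = np.array([[1,3,4],[2,-1,2],[4,-1,0]])
print(B[0])  # first row of B

def F_norm(M):
  # Accumulate the squared entries, row by row.
  total = 0  # named total to avoid shadowing the built-in sum()
  for row in M:
    for element in row:
      total += element**2
  return math.sqrt(total)

print(F_norm(B))
print(np.linalg.norm(B))  # norm() defaults to the Frobenius norm for a matrix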

[1 3 4]
7.211102550927978
7.211102550927978


### III. Matrix Spaces

A <u>matrix equation</u> $A\vec{x} = \vec{b}$ has three possible outcomes for its solution vector $\vec{x}$:

1. There is a unique solution.
2. There are infinitely many solutions.
3. There are no solutions.

**Question**. What does it mean for the matrix equation $Ax=b$ to be "well-defined?"

**Question**. What does it mean to "find a solution" to $Ax=b$? (Think in terms of linear combinations of the columns of $A$.)

To do advanced analysis (such as finding eigenvectors or fitting supervised learning models) we need to describe these three cases in terms of vector subspaces (the textbook calls them matrix spaces).

Recall that we can build new vectors from old vectors by the vector equation:

\begin{align} \vec{w} = \lambda_1\vec{v_1} + \lambda_2\vec{v_2} + ... + \lambda_p\vec{v_p}. \end{align}
A linear combination of the vectors $\vec{v_1}, ..., \vec{v_p}$ gives us a (possibly) new vector $\vec{w}$.

Take $\vec{w} = \vec{0}$. The weights $\lambda_1 = ... = \lambda_p = 0$ always satisfy the above equation. If this trivial choice is the only set of weights that does, then we say that the set of vectors $\{\vec{v_1}, ..., \vec{v_p}\}$ is a *linearly independent* set.
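
As a concrete illustration, a set is linearly dependent when some weights that are not all zero combine its vectors into $\vec{0}$. The vectors below are made up for this sketch. The rank check at the end is an assumption beyond the text above: `np.linalg.matrix_rank` counts how many columns are independent, so a rank smaller than the number of vectors signals dependence.

```python
import numpy as np

# Made-up vectors: v3 = v1 + v2, so {v1, v2, v3} is a dependent set.
v1 = np.array([1, 0, 1])
v2 = np.array([0, 1, 1])
v3 = np.array([1, 1, 2])

# The nontrivial weights (1, 1, -1) produce the zero vector.
combo = 1 * v1 + 1 * v2 - 1 * v3
print(combo)

# Numerical check: compare the rank with the number of vectors in each set.
rank_dependent = np.linalg.matrix_rank(np.column_stack([v1, v2, v3]))
rank_independent = np.linalg.matrix_rank(np.column_stack([v1, v2]))
print(rank_dependent, rank_independent)
```

Three vectors with rank 2 are dependent, while two vectors with rank 2 are independent.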

Also, recall that a *subspace* of $\mathbb{R}^n$ is a subset $V$ such that:
1. $\vec{0} \in V$
2. $\vec{u}, \vec{v} \in V \Rightarrow \vec{u} + \vec{v} \in V$
3. $\vec{v} \in V \Rightarrow k\vec{v} \in V$ for every scalar $k \in \mathbb{R}$.

"Matrix Spaces" are special vector subspaces that are generated by matrices (in the context of matrix equations $A\vec{x} = \vec{b}$).

#### Column Space.

If we consider the columns of a matrix to be individual vectors $\vec{a_1}$, $\vec{a_2}$, ..., $\vec{a_m}$, then we can define the *column space* of a matrix $A$ as

\begin{align}
C(A) & = \{\lambda_1\vec{a_1} + \lambda_2\vec{a_2} + ... + \lambda_m\vec{a_m} : \lambda_i \in \mathbb{R}\}.
\end{align}

"For a given vector $\vec{v}$, do there exist weights $\lambda_1, ..., \lambda_m$ such that I can get $\vec{v}$ via linear combinations of the columns of matrix $A$?" If yes, then $\vec{v}$ is in the column space of $A$.

**Question**. If $A$ is $n \times m$ then the vectors in $C(A)$ are in $\mathbb{R}^?$.
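
The membership test quoted above ("do there exist weights...") can be tried numerically. The sketch below is one possible approach, not the textbook's method: the helper `in_column_space` is a hypothetical name introduced here, and it uses `np.linalg.lstsq` to find the best available weights and then checks whether they reproduce $\vec{v}$ exactly (up to a small tolerance).

```python
import numpy as np

def in_column_space(A, v, tol=1e-10):
    """Hypothetical helper: True if some weights x give A @ x == v (up to tol)."""
    # lstsq returns the weights that bring A @ x as close to v as possible.
    x, *_ = np.linalg.lstsq(A, v, rcond=None)
    # v is in C(A) exactly when that closest point is v itself.
    return bool(np.linalg.norm(A @ x - v) < tol)

# Both columns of A are multiples of (1, 2).
A = np.array([[1.0, 2.0], [2.0, 4.0]])

inside = in_column_space(A, np.array([3.0, 6.0]))   # 3 times the first column
outside = in_column_space(A, np.array([1.0, 0.0]))  # not a multiple of (1, 2)
print(inside, outside)
```

The tolerance `tol` is an arbitrary choice that absorbs floating-point rounding.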

**Examples**. Describe the column space for each of the following matrices.


1.   [[1, 2], [2, 4]]
2.   [[1, 2], [3, -2]]
3.   [[1, 0, 2], [0, 1, 5], [1, 1, 7]]
4.   [[1, 0, 2], [0, 1, 5], [0, 0, 1]]

**Question**. How can we think of the existence of a solution to the matrix equation $Ax=b$ in terms of $C(A)$? If a solution exists then $b \in$ ____.

**Question**. How can we think of the uniqueness of a solution to $Ax=b$ in terms of $C(A)$?