# Linear Algebra

Sources: [Deep Learning](https://www.deeplearningbook.org)

In [21]:
# Library imports
import numpy as np
from scipy import linalg

Definitions and notation:

- **Scalar**: a single number, such as $s \in \mathbb{R}$ or $n \in \mathbb{N}$.
- **Vector**: an ordered array of numbers. If each element $a_i \in \mathbb{R}$ for a vector $\mathbf{a}$ with $n$ elements, then $\mathbf{a}$ lies in the set $\mathbb{R}^n$. Vectors in machine learning are typically column vectors (shape $n \times 1$). You can think of vectors as identifying points in space, with each element giving the coordinate along a different axis.

\begin{align}
\mathbf{a} = \begin{bmatrix}
a_1 \\
a_2 \\
\vdots \\
a_n
\end{bmatrix}
\end{align}

- **Matrix**: a 2D array of numbers in which each element has two indices. If a matrix $\mathbf{A}$ has $m$ rows and $n$ columns, then $\mathbf{A} \in \mathbb{R}^{m \times n}$. Elements of a matrix are identified as $A_{i, j}$, where the subscripts identify the $i$-th row and $j$-th column of the element.

\begin{align}
\mathbf{A} = \begin{bmatrix}
A_{1, 1} & A_{1, 2} \\
A_{2, 1} & A_{2, 2} 
\end{bmatrix}
\end{align}

- **Tensor**: an array $\mathsf{A}$ with more than two axes. Elements are identified by $\mathsf{A}_{i, j, k}$.
- **Transpose**: the transpose of a matrix is the mirror image of the matrix across the main diagonal (running down and to the right):

\begin{align}
\mathbf{A} = \begin{bmatrix}
A_{1, 1} & A_{1, 2} \\
A_{2, 1} & A_{2, 2} \\
A_{3, 1} & A_{3, 2}
\end{bmatrix} \Rightarrow
\mathbf{A}^{\operatorname{T}} = \begin{bmatrix}
A_{1, 1} & A_{2, 1} & A_{3, 1} \\
A_{1, 2} & A_{2, 2} & A_{3, 2} 
\end{bmatrix}
\end{align}

- A **diagonal** matrix can have non-zero entries only along the main diagonal and is zero everywhere else ($\mathbf{D}$ is diagonal if and only if $D_{i, j} = 0$ for all $i \ne j$). A square diagonal matrix whose diagonal entries are given by the vector $\mathbf{v}$ is written diag($\mathbf{v}$). Diagonal matrices are cheap to compute with, since diag($\mathbf{v}$)$\mathbf{x}$ is just the element-wise product of $\mathbf{v}$ and $\mathbf{x}$ (the Hadamard product, defined below). Additionally, if all diagonal entries are non-zero, the inverse is diag($\mathbf{v}$)$^{-1} =$ diag($[1/v_1, \ldots , 1/v_n]^{\operatorname{T}}$).
- A **symmetric matrix** is one that's equal to its own transpose: $\mathbf{A} = \mathbf{A}^{\operatorname{T}}$.
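
The diagonal-matrix shortcuts described above can be checked numerically. The following is a minimal sketch; the vectors `v` and `x` are arbitrary values chosen for illustration and do not come from the source.

```python
import numpy as np

# Hypothetical example values, chosen only for illustration
v = np.array([2.0, 4.0, 5.0])
x = np.array([1.0, 3.0, -1.0])

D = np.diag(v)               # square diagonal matrix diag(v)
via_matmul = D @ x           # full matrix-vector product
via_elementwise = v * x      # element-wise product of v and x

# Inverse of a diagonal matrix with non-zero entries: diag([1/v_1, ..., 1/v_n])
D_inv = np.diag(1.0 / v)

print('diag(v) x:       ', via_matmul)
print('v * x:           ', via_elementwise)
print('diag(v) diag(1/v):')
print(D @ D_inv)
print('D is symmetric:  ', np.array_equal(D, D.T))
```

The element-wise form avoids building the $n \times n$ matrix at all, which is why diagonal matrices are considered computationally cheap.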

Computations with vectors and matrices:

- The **dot product** of two vectors $\mathbf{a}$ and $\mathbf{b}$ (which have the same dimensionality) is defined as the sum of the element-wise products:

\begin{align}
\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^n a_i b_i
\end{align}

- **Matrix multiplication** of $\mathbf{A}$ and $\mathbf{B}$ is only defined if $\mathbf{A}$ has the same number of columns as $\mathbf{B}$ has rows. So if $\mathbf{A}$ is $m \times n$ and $\mathbf{B}$ is $n \times p$, the result $\mathbf{C} = \mathbf{AB}$ has shape $m \times p$, with elements:

$$
C_{i, j} = \sum_k A_{i, k} B_{k, j}
$$

- There is also an element-wise product defined for two matrices of the same shape, called the **Hadamard product** and denoted $\mathbf{A} \odot \mathbf{B}$
- A system of equations can be written as $\mathbf{Ax} = \mathbf{b}$, where $\mathbf{A} \in \mathbb{R}^{m \times n}$ is a known matrix, $\mathbf{b} \in \mathbb{R}^{m}$ is a known vector, and $\mathbf{x} \in \mathbb{R}^{n}$ is a vector of unknown variables to solve for.
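
To make the summation formula for matrix multiplication concrete, and to contrast it with the Hadamard product, here is a small sketch. The helper `matmul_by_definition` is a hypothetical name introduced for illustration, and the matrices `P` and `Q` are arbitrary example values.

```python
import numpy as np


def matmul_by_definition(A, B):
    """Compute C[i, j] = sum_k A[i, k] * B[k, j] with explicit loops."""
    m, n = A.shape
    n_rows_B, p = B.shape
    if n != n_rows_B:
        raise ValueError('A must have as many columns as B has rows')
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C


A = np.array([[5, 2], [10, 1], [0, 7]])   # shape 3 x 2
B = np.array([[1, 3], [0, 1]])            # shape 2 x 2
print('Loop-based AB:')
print(matmul_by_definition(A, B))         # same values as A @ B

# Hadamard product: element-wise, so both operands need the same shape
P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])
hadamard = P * Q
print('Hadamard product of P and Q:')
print(hadamard)
print('Matrix product PQ:')
print(P @ Q)
```

In NumPy, `*` on arrays is the element-wise product, while `@` (or `np.matmul`) is the matrix product, so the two are easy to confuse.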

In [8]:
# Vectors and dot products
a = np.array([1, 2, 3, 4]).reshape(4,)
print('Vector a:', a)

b = np.array([1, 0, 2, 1]).reshape(4,)
print('Vector b:', b)

print('Dot product of a and b:', np.dot(a, b))

Vector a: [1 2 3 4]
Vector b: [1 0 2 1]
Dot product of a and b: 11


In [13]:
# Matrix multiplication
A = np.array([5, 2, 10, 1, 0, 7]).reshape(3, 2)
print('Matrix A:')
print(A)

B = np.array([1, 3, 0, 1]).reshape(2, 2)
print('Matrix B:')
print(B)

print('AB =')
print(np.matmul(A, B))

Matrix A:
[[ 5  2]
 [10  1]
 [ 0  7]]
Matrix B:
[[1 3]
 [0 1]]
AB =
[[ 5 17]
 [10 31]
 [ 0  7]]


## Matrix Multiplication Properties

Matrix multiplication is both distributive $\mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{A}\mathbf{B} + \mathbf{A}\mathbf{C}$ as well as associative $\mathbf{A}(\mathbf{B} \mathbf{C}) = (\mathbf{A}\mathbf{B}) \mathbf{C}$.

However, matrix multiplication is NOT commutative: in general, $\mathbf{A} \mathbf{B} \ne \mathbf{B} \mathbf{A}$. That said, the dot product between two vectors is commutative: $\mathbf{x}^{\operatorname{T}} \mathbf{y} = \mathbf{y}^{\operatorname{T}} \mathbf{x}$.

The transpose of a matrix product can be written as $(\mathbf{AB})^{\operatorname{T}} = \mathbf{B}^{\operatorname{T}} \mathbf{A}^{\operatorname{T}}$.

In [14]:
# Transpose examples
print('Matrix A:')
print(A)
print('Transpose of A:')
print(A.T)

Matrix A:
[[ 5  2]
 [10  1]
 [ 0  7]]
Transpose of A:
[[ 5 10  0]
 [ 2  1  7]]


In [15]:
# Dot product examples
print('Vector a:')
print(a)
print('Vector b:')
print(b)
print('Transpose of a dot b:')
print(np.dot(a.T, b))
print('Transpose of b dot a:')
print(np.dot(b.T, a))

Vector a:
[1 2 3 4]
Vector b:
[1 0 2 1]
Transpose of a dot b:
11
Transpose of b dot a:
11


In [16]:
# Distributive property example
C = np.array([4, 4, 0, 1]).reshape(2, 2)
print('A(B + C):')
print(np.matmul(A, B+C))

print('AB + AC:')
print(np.matmul(A, B) + np.matmul(A, C))

A(B + C):
[[25 39]
 [50 72]
 [ 0 14]]
AB + AC:
[[25 39]
 [50 72]
 [ 0 14]]


In [17]:
# Associative property example
print('A(BC):')
print(np.matmul(A, np.matmul(B, C)))

print('(AB)C:')
print(np.matmul(np.matmul(A, B), C))

A(BC):
[[20 37]
 [40 71]
 [ 0  7]]
(AB)C:
[[20 37]
 [40 71]
 [ 0  7]]
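
Two properties stated at the start of this section are not covered by the cells above: the transpose of a product and the lack of commutativity. A minimal sketch, re-declaring the same example matrices so that it runs on its own:

```python
import numpy as np

A = np.array([[5, 2], [10, 1], [0, 7]])
B = np.array([[1, 3], [0, 1]])
C = np.array([[4, 4], [0, 1]])

# Transpose of a product: (AB)^T equals B^T A^T
lhs = (A @ B).T
rhs = B.T @ A.T
print('(AB)^T == B^T A^T:', np.array_equal(lhs, rhs))

# Commutativity generally fails, even for square matrices of the same shape
BC = B @ C
CB = C @ B
print('BC =')
print(BC)
print('CB =')
print(CB)
print('BC == CB:', np.array_equal(BC, CB))
```

Note that $\mathbf{BA}$ is not even defined for these shapes ($2 \times 2$ times $3 \times 2$), which is another way commutativity can fail.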


## Identity and Inverse Matrices

The **identity matrix** is a matrix that does not change any vector when you multiply the vector by that matrix. The identity matrix that preserves $n$-dimensional vectors is denoted $\mathbf{I}_n$:

\begin{align}
\mathbf{I}_n \in \mathbb{R}^{n \times n} \text{ and } \forall \mathbf{x} \in \mathbb{R}^n, \, \mathbf{I}_n \mathbf{x} = \mathbf{x}
\end{align}

The structure of an identity matrix has $1$'s along the main diagonal and zeroes for all other entries. For example:

\begin{align}
\mathbf{I}_3 = \begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
\end{bmatrix}
\end{align}

A **matrix inverse** of $\mathbf{A}$ is written as $\mathbf{A}^{-1}$ and is defined so $\mathbf{A}^{-1} \mathbf{A} = \mathbf{I}_n$. It's also possible to define an inverse that's multiplied on the right, such that $\mathbf{A} \mathbf{A}^{-1} = \mathbf{I}_n$. For square matrices ($m = n$), the left and right inverses are the same.

This is useful in theory for solving a system of linear equations $\mathbf{Ax} = \mathbf{b}$, where the solution is $\mathbf{x} = \mathbf{A}^{-1} \mathbf{b}$. This assumes that the inverse exists. For that to happen, the equation $\mathbf{Ax} = \mathbf{b}$ must have exactly one solution for every value of $\mathbf{b}$ (versus no solutions or an infinite number of solutions).

In [60]:
# Identity and inverse examples
I4 = np.eye(4)
print('Identity matrix (4x4):')
print(I4)
print()

print('Vector a:')
print(a)
print()

print('I_4 * a:')
print(np.matmul(I4, a))

Identity matrix (4x4):
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]

Vector a:
[1 2 3 4]

I_4 * a:
[1. 2. 3. 4.]
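
The inverse can be illustrated in the same way. This is a sketch using an arbitrary invertible $2 \times 2$ matrix `M` (not from the source) and NumPy's `np.linalg.inv`:

```python
import numpy as np

# Hypothetical invertible matrix, chosen only for illustration (det = 5)
M = np.array([[2.0, 1.0],
              [1.0, 3.0]])

M_inv = np.linalg.inv(M)
print('Inverse of M:')
print(M_inv)

# For a square matrix, the left and right inverses coincide
left = M_inv @ M
right = M @ M_inv
print('M^-1 M is the identity:', np.allclose(left, np.eye(2)))
print('M M^-1 is the identity:', np.allclose(right, np.eye(2)))
```

The comparison uses `np.allclose` rather than exact equality because the inverse is computed in floating point.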


## Linear Combinations and Span

- A **linear combination** of a set of vectors $\{\mathbf{v}^{(1)}, \ldots , \mathbf{v}^{(n)} \}$ is given by multiplying each vector $\mathbf{v}^{(i)}$ by a corresponding scalar coefficient $c_i$ and adding the results:

\begin{align}
\displaystyle \sum_i c_i \mathbf{v}^{(i)}
\end{align}

- The **span** of a set of vectors is the set of all points obtainable by linear combination of the original vectors.

Whether $\mathbf{Ax} = \mathbf{b}$ has a solution boils down to the following:

- For a particular $\mathbf{b}$, a solution exists if and only if $\mathbf{b}$ is in the span of the columns of $\mathbf{A}$ (also known as the **column space**, or **range**, of $\mathbf{A}$)
- For a solution to exist for every $\mathbf{b} \in \mathbb{R}^m$, the column space of $\mathbf{A}$ must be all of $\mathbb{R}^m$. (If not, some value of $\mathbf{b}$ has no solution)
    - This implies that $\mathbf{A}$ must have at least $m$ columns, i.e. $n \ge m$ (the matrix is at least as wide as it is tall); otherwise the dimensionality of the column space would be less than $m$. For example, if $\mathbf{A}$ has shape $3 \times 2$ (so $\mathbf{b} \in \mathbb{R}^3$), the system has $3$ equations with only $2$ unknown variables. At best, the columns trace out a plane in $\mathbb{R}^3$, so the system only has a solution if $\mathbf{b}$ lies on that plane. Note: the condition $n \ge m$ is necessary but not sufficient, because it doesn't guarantee that the columns are independent (not redundant)
    - The columns must be **linearly independent** (a set of vectors is linearly independent if no vector is a linear combination of other vectors in the set; adding a vector that is a linear combination of the others would not add points to the set's span)
    - Therefore, for the column space to encompass all of $\mathbb{R}^m$, the matrix must contain at least one set of $m$ linearly independent columns
- To guarantee that the matrix $\mathbf{A}$ has an inverse, there must also be *at most* one solution for each value of $\mathbf{b}$. For that, $\mathbf{A}$ must have at most $m$ columns (otherwise, there can be more than one way of parameterizing the solution); combined with the requirement above, it must have exactly $m$ columns

To summarize, in order to solve the system of linear equations using the inverse, $\mathbf{A}$ must be a **square matrix** ($m = n$) and all of its columns must be linearly independent. If $\mathbf{A}$ isn't square, or is square but has linearly dependent columns (known as a **singular** matrix), it may still be possible to solve the system, just not by matrix inversion.
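
The conditions above can be explored numerically. The sketch below uses `np.linalg.matrix_rank` to count linearly independent columns and `np.linalg.lstsq` as one possible way to handle a non-square system. The matrices and the right-hand side `b` are illustrative choices, and `b` is deliberately constructed to lie in the column space.

```python
import numpy as np

# Illustrative matrices (assumed values, not from the source text)
candidates = {
    'square, independent columns': np.array([[1.0, 2.0], [7.0, 1.0]]),
    'square, dependent columns': np.array([[1.0, 2.0], [2.0, 4.0]]),
    'tall (3 x 2)': np.array([[5.0, 2.0], [10.0, 1.0], [0.0, 7.0]]),
}

invertible = {}
for name, M in candidates.items():
    m, n = M.shape
    rank = np.linalg.matrix_rank(M)   # number of linearly independent columns
    invertible[name] = (m == n) and (rank == n)
    print(f'{name}: shape={M.shape}, rank={rank}, invertible={invertible[name]}')

# A tall system can still be solvable when b lies in the column space.
tall = candidates['tall (3 x 2)']
x_true = np.array([1.0, 2.0])
b = tall @ x_true                     # guaranteed to be in the span of the columns
x_hat, residuals, rank, sing_vals = np.linalg.lstsq(tall, b, rcond=None)
print('Recovered x:', x_hat)
```

If `b` were moved off the plane spanned by the two columns, `lstsq` would instead return the closest fit in the least-squares sense rather than an exact solution.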

In [33]:
# Using scipy to solve a system of linear equations
solve_A = np.array([1., 2., 7., 1.]).reshape(2, 2)
solve_b = np.array([1., 20.])
print('A:')
print(solve_A)
print('b:')
print(solve_b)
print()

# Get inverse of A
solve_A_inv = linalg.inv(solve_A)

solution = np.matmul(solve_A_inv, solve_b)
print('Inverse solution:')
print(solution)
print()

# Use scipy's solver
alt_sol = linalg.solve(solve_A, solve_b)
print('Scipy solver solution:')
print(alt_sol)
print()

# Check
print('Check')
print(solve_A[0, 0] * solution[0] + solve_A[0, 1] * solution[1])
print(solve_A[1, 0] * solution[0] + solve_A[1, 1] * solution[1])

A:
[[1. 2.]
 [7. 1.]]
b:
[ 1. 20.]

Inverse solution:
[ 3. -1.]

Scipy solver solution:
[ 3. -1.]

Check
1.0
19.999999999999996


## Norms

- A function for the size of a vector is called a **norm**, or more formally, norms are functions that map vectors to non-negative values. The $L^p$ norm for $p \in \mathbb{R}$, $p \ge 1$ is:

\begin{align}
\Vert \mathbf{x} \Vert_p = \left( \displaystyle \sum_i \vert x_i \vert^p \right)^{\frac{1}{p}}
\end{align}

- The $L^2$ norm ($p = 2$) is the **Euclidean norm**, and is so common that it is often written simply as $\Vert \mathbf{x} \Vert$. The squared $L^2$ norm is also commonly used to measure vector size, and can be computed simply as $\mathbf{x}^{\operatorname{T}} \mathbf{x}$
- A **unit vector** is one with unit norm: $\Vert \mathbf{x} \Vert_2 = 1$
- The $L^1$ norm (which is the sum of the absolute values of vector entries) is important in machine learning when the difference between zero and non-zero elements is important.

\begin{align}
\Vert \mathbf{x} \Vert_1 = \displaystyle \sum_i \vert x_i \vert
\end{align}

- Occasionally the $L^{\infty}$ norm (also known as the max norm) is used, which is defined as:

\begin{align}
\Vert \mathbf{x} \Vert_{\infty} = \max_i \vert x_i \vert
\end{align}

- In deep learning, the **Frobenius norm** is used to calculate the size of a matrix

\begin{align}
\Vert \mathbf{A} \Vert_F = \sqrt{\displaystyle \sum_{i, j} A_{i, j}^2}
\end{align}

- The dot product in terms of norms, where $\theta$ is the angle between the vectors, is:

\begin{align}
\mathbf{x}^{\operatorname{T}} \mathbf{y} = \Vert \mathbf{x} \Vert_2 \Vert \mathbf{y} \Vert_2 \cos \theta
\end{align}

- A vector $\mathbf{x}$ and a vector $\mathbf{y}$ are **orthogonal** to each other if $\mathbf{x}^{\operatorname{T}} \mathbf{y} = 0$ (assuming both vectors have non-zero norms), which means they are at a $90^{\circ}$ angle to each other. In $\mathbb{R}^n$, at most $n$ vectors with non-zero norms can be mutually orthogonal.
- **Orthonormal** vectors are both orthogonal and have unit norms.
- An **orthogonal matrix** is a square matrix where the rows are mutually orthonormal and the columns are mutually orthonormal:

\begin{align}
\mathbf{A}^{\operatorname{T}} \mathbf{A} = \mathbf{A} \mathbf{A}^{\operatorname{T}} = \mathbf{I} \\
\text{and } \mathbf{A}^{-1} = \mathbf{A}^{\operatorname{T}}
\end{align}

In [36]:
# Norm examples
a_norm_l2 = linalg.norm(a)  # default is L2 for vector; Frobenius for matrix
a_norm_max = linalg.norm(a, ord=np.inf)
A_norm_frob = linalg.norm(A)

print('Vector a:')
print(a)
print()

print('L2 norm for a:')
print(a_norm_l2)
print()

print('L-inf norm (max of abs values) for a:')
print(a_norm_max)
print()

print('Matrix A:')
print(A)
print()

print('Frobenius norm for A:')
print(A_norm_frob)

Vector a:
[1 2 3 4]

L2 norm for a:
5.477225575051661

L-inf norm (max of abs values) for a:
4.0

Matrix A:
[[ 5  2]
 [10  1]
 [ 0  7]]

Frobenius norm for A:
13.379088160259652
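
The remaining definitions from this section ($L^1$ norm, unit vectors, the angle form of the dot product, and orthogonal matrices) can be sketched as follows. The vectors `x` and `y` and the rotation angle are arbitrary illustrative choices; a 2D rotation matrix is used here simply as a convenient example of an orthogonal matrix.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# L1 norm: sum of absolute values
l1 = np.linalg.norm(a, ord=1)
print('L1 norm of a:', l1)

# Unit vector: rescale a so that its L2 norm is 1
unit_a = a / np.linalg.norm(a)
print('L2 norm of a / ||a||:', np.linalg.norm(unit_a))

# Angle between two vectors from x^T y = ||x|| ||y|| cos(theta)
x = np.array([1.0, 0.0])
y = np.array([0.0, 2.0])
cos_theta = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
print('cos(theta) for x and y:', cos_theta)   # zero means orthogonal

# A 2D rotation matrix is orthogonal: R^T R = R R^T = I, so R^-1 = R^T
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
print('R^T R is the identity:', np.allclose(R.T @ R, np.eye(2)))
print('inv(R) equals R^T:', np.allclose(np.linalg.inv(R), R.T))
```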


## Eigenvalues, Eigenvectors, and Eigendecomposition

- An **eigenvector** of a square matrix $\mathbf{A}$ is a nonzero vector $\mathbf{v}$ such that multiplication by $\mathbf{A}$ alters only the scale of $\mathbf{v}$:

\begin{align}
\mathbf{Av} = \lambda \mathbf{v}
\end{align}

- The scalar $\lambda$ is known as the **eigenvalue** that corresponds to the above eigenvector
- Generally, you focus on unit eigenvectors. If $\mathbf{v}$ is an eigenvector of $\mathbf{A}$, then any rescaled vector $s\mathbf{v}$ for $s \in \mathbb{R}, \, s \ne 0$ is also an eigenvector with the same $\lambda$

Another way to look at eigenvectors and eigenvalues is through the lens that multiplying by the matrix $\mathbf{A}$ applies a linear transformation to space. That transformation may rotate, flip, or stretch space in some way, but the eigenvectors of $\mathbf{A}$ are special, in that they'll remain on their original span, just scaled by $\lambda$ after the transformation. To find the eigenvectors, first re-write the right side of the above equation from scalar multiplication to matrix multiplication, and rearrange:

\begin{align}
\mathbf{Av} = (\lambda \mathbf{I}) \mathbf{v} \\
\mathbf{Av} - (\lambda \mathbf{I}) \mathbf{v} = \mathbf{0} \\
(\mathbf{A} - \lambda \mathbf{I}) \mathbf{v} = \mathbf{0}
\end{align}

The resulting equation says that $\mathbf{A}$ with $\lambda$ subtracted from each entry on the main diagonal, multiplied by the vector $\mathbf{v}$, gives the zero vector. When a square matrix maps a nonzero vector to the zero vector, the matrix is singular, which means its determinant must be zero. To find the eigenvalues of $\mathbf{A}$, solve $\det(\mathbf{A} - \lambda \mathbf{I}) = 0$ for $\lambda$; then substitute each eigenvalue back into $(\mathbf{A} - \lambda \mathbf{I}) \mathbf{v} = \mathbf{0}$ to find the corresponding eigenvectors.
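
As a small worked example (the matrix is chosen only for illustration), take the upper-triangular matrix $\mathbf{A} = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix}$. Setting the determinant of $\mathbf{A} - \lambda \mathbf{I}$ to zero gives the eigenvalues:

\begin{align}
\det \begin{bmatrix}
3 - \lambda & 1 \\
0 & 2 - \lambda
\end{bmatrix} = (3 - \lambda)(2 - \lambda) = 0
\Rightarrow \lambda_1 = 3, \; \lambda_2 = 2
\end{align}

Substituting each eigenvalue back into $(\mathbf{A} - \lambda \mathbf{I}) \mathbf{v} = \mathbf{0}$ and normalizing to unit length gives the eigenvectors:

\begin{align}
\lambda_1 = 3: \quad \begin{bmatrix} 0 & 1 \\ 0 & -1 \end{bmatrix} \mathbf{v} = \mathbf{0}
\Rightarrow v_2 = 0
\Rightarrow \mathbf{v}^{(1)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\
\lambda_2 = 2: \quad \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \mathbf{v} = \mathbf{0}
\Rightarrow v_1 = -v_2
\Rightarrow \mathbf{v}^{(2)} = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix}
\end{align}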

- **Eigendecomposition** decomposes a matrix into a set of eigenvectors and eigenvalues. If $\mathbf{A}$ has $n$ linearly independent eigenvectors $\mathbf{v}^{(i)}$ and you concatenate them as the columns of a matrix $\mathbf{V} = [\mathbf{v}^{(1)}, \ldots , \mathbf{v}^{(n)}]$, and collect the corresponding eigenvalues into a vector $\boldsymbol{\lambda} = [\lambda_1, \ldots , \lambda_n]^{\operatorname{T}}$, then the decomposition of $\mathbf{A}$ is:

\begin{align}
\mathbf{A} = \mathbf{V} \operatorname{diag}(\boldsymbol{\lambda}) \mathbf{V}^{-1}
\end{align}

Constructing a matrix with specific eigenvectors and eigenvalues allows you to stretch space in a specific way. In $\mathbb{R}^2$, picture the unit circle (the set of all unit vectors $\mathbf{u}$) together with the eigenvectors of a matrix $\mathbf{A}$. After multiplying $\mathbf{Au}$, eigenvector $\mathbf{v}^{(1)}$ will be scaled by $\lambda_1$ and eigenvector $\mathbf{v}^{(2)}$ will be scaled by $\lambda_2$, with the unit circle distorting to encompass those new vector locations.

Not all matrices have real-valued eigendecompositions. However, every real symmetric matrix can be decomposed into an expression using only real-valued eigenvectors and eigenvalues:

\begin{align}
\mathbf{A} = \mathbf{Q} \mathbf{\Lambda} \mathbf{Q}^{\operatorname{T}}
\end{align}

- $\mathbf{Q}$ is an orthogonal matrix composed of eigenvectors of $\mathbf{A}$
- $\mathbf{\Lambda}$ (capital lambda) is a diagonal matrix where the eigenvalue $\Lambda_{i, i}$ is associated with the eigenvector in column $i$ of $\mathbf{Q}$, denoted as $\mathbf{Q}_{:, i}$
- Since $\mathbf{Q}$ is an orthogonal matrix, $\mathbf{A}$ scales space by $\lambda_i$ in the direction $\mathbf{v}^{(i)}$ (akin to the unit circle description above)
- While an eigendecomposition will exist for a real symmetric matrix, it isn't guaranteed to be unique

What an eigendecomposition of a matrix can tell you:

- The matrix is singular if and only if any of the eigenvalues are zero
- The decomposition (of a real symmetric matrix) can be used to optimize quadratic expressions of the form $f(\mathbf{x}) = \mathbf{x}^{\operatorname{T}} \mathbf{Ax}$ (where $\Vert \mathbf{x} \Vert_2 = 1$)
- A matrix whose eigenvalues are all positive is called **positive definite**. This guarantees that $\mathbf{x}^{\operatorname{T}} \mathbf{Ax} = 0 \Rightarrow \mathbf{x} = 0$
- A matrix whose eigenvalues are all positive or zero is called **positive semidefinite**, which guarantees that $\forall \mathbf{x} \text{, } \mathbf{x}^{\operatorname{T}} \mathbf{Ax} \ge 0$ 

In [40]:
# Eigenvector and eigenvalue example
tmp = np.array([3., 1., 0, 2.]).reshape(2, 2)
# linalg.eig returns a pair: (eigenvalues, matrix with eigenvectors as columns)
eig_vals, eig_vecs = linalg.eig(tmp)

print('Matrix:')
print(tmp)
print()

print('Eigenvalues:')
print(eig_vals)
print()

print('Eigenvectors:')
print(eig_vecs)

Matrix:
[[3. 1.]
 [0. 2.]]

Eigenvalues:
[3.+0.j 2.+0.j]

Eigenvectors:
[[ 1.         -0.70710678]
 [ 0.          0.70710678]]


## Singular Value Decomposition

- **Singular value decomposition (SVD)** is a way to factor a matrix into singular vectors and singular values. It sheds similar light on a matrix as eigendecomposition, but is more universally applicable: every real matrix has an SVD, even non-square ones (whereas eigendecomposition requires a square matrix)


\begin{align}
\mathbf{A} = \mathbf{U}\mathbf{D}\mathbf{V}^{\operatorname{T}}
\end{align}

- $\mathbf{A}$ is shape $m \times n$
- $\mathbf{U}$ is shape $m \times m$ and is an orthogonal matrix. Its columns are the **left-singular vectors**
- $\mathbf{D}$ is shape $m \times n$ (not necessarily square) and is a diagonal matrix. The elements along the diagonal of $\mathbf{D}$ are the **singular values** of $\mathbf{A}$
- $\mathbf{V}$ is shape $n \times n$ and is an orthogonal matrix. Its columns are the **right-singular vectors**
- The left-singular vectors of $\mathbf{A}$ are the eigenvectors of $\mathbf{A} \mathbf{A}^{\operatorname{T}}$
- The right-singular vectors of $\mathbf{A}$ are the eigenvectors of $\mathbf{A}^{\operatorname{T}} \mathbf{A}$
- The nonzero singular values of $\mathbf{A}$ are the square roots of the nonzero eigenvalues of both $\mathbf{A}^{\operatorname{T}} \mathbf{A}$ and $\mathbf{A} \mathbf{A}^{\operatorname{T}}$

SVD is useful to partially generalize matrix inversion (which isn't defined for non-square matrices).

In [43]:
# SVD example
U, s, Vh = linalg.svd(A)

print('Shape of matrix A:', A.shape)
print('Matrix A:')
print(A)
print()

print('Shape of U:', U.shape)
print('U:')
print(U)
print()

print('Shape of s:', s.shape)
print('s:')
print(s)
print()

print('Shape of Vh:', Vh.shape)
print('Vh:')
print(Vh)

Shape of matrix A: (3, 2)
Matrix A:
[[ 5  2]
 [10  1]
 [ 0  7]]

Shape of U: (3, 3)
U:
[[-0.46824193  0.0953727  -0.87843813]
 [-0.86978761 -0.2248469   0.43921906]
 [-0.15562458  0.96971538  0.18823674]]

Shape of s: (2,)
s:
[11.41254422  6.98239461]

Shape of Vh: (2, 2)
Vh:
[[-0.96727649 -0.25372463]
 [-0.25372463  0.96727649]]


In [58]:
# Checking SVD output against the eigenvalue claims (they match up to sign and ordering)
A_AT = np.matmul(A, A.T)
eig_A_AT = linalg.eig(A_AT)

AT_A = np.matmul(A.T, A)
eig_AT_A = linalg.eig(AT_A)

print('Eigenvectors of A*A^T (should match cols of U):')
print(eig_A_AT[1])
print()
print('Left-singular vectors of A (cols of U):')
print(U)
print()

print('Eigenvectors of A^T*A (should match rows of Vh):')
print(eig_AT_A[1])
print()
print('Right-singular vectors of A (rows of Vh):')
print(Vh)
print()

print('Eigenvalues of A*A^T:')
print(eig_A_AT[0])
print('Eigenvalues of A^T*A:')
print(eig_AT_A[0])
print('Squares of singular values of A:')
print(s**2)

Eigenvectors of A*A^T (should match cols of U):
[[ 0.46824193  0.87843813  0.0953727 ]
 [ 0.86978761 -0.43921906 -0.2248469 ]
 [ 0.15562458 -0.18823674  0.96971538]]

Left-singular vectors of A (cols of U):
[[-0.46824193  0.0953727  -0.87843813]
 [-0.86978761 -0.2248469   0.43921906]
 [-0.15562458  0.96971538  0.18823674]]

Eigenvectors of A^T*A (should match rows of Vh):
[[ 0.96727649 -0.25372463]
 [ 0.25372463  0.96727649]]

Right-singular vectors of A (rows of Vh):
[[-0.96727649 -0.25372463]
 [-0.25372463  0.96727649]]

Eigenvalues of A*A^T:
[ 1.30246165e+02+0.j -1.86517468e-14+0.j  4.87538345e+01+0.j]
Eigenvalues of A^T*A:
[130.24616546+0.j  48.75383454+0.j]
Squares of singular values of A:
[130.24616546  48.75383454]
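
One common way the SVD is used to generalize matrix inversion to non-square matrices is the Moore-Penrose pseudoinverse, $\mathbf{A}^{+} = \mathbf{V}\mathbf{D}^{+}\mathbf{U}^{\operatorname{T}}$, where $\mathbf{D}^{+}$ takes the reciprocal of each nonzero singular value. The sketch below assumes all singular values are nonzero (true for this example matrix) and compares the result against NumPy's built-in `np.linalg.pinv`. The right-hand side `b` is an illustrative value.

```python
import numpy as np

A = np.array([[5.0, 2.0],
              [10.0, 1.0],
              [0.0, 7.0]])

# Reduced SVD: U is m x k, s has length k, Vh is k x n, with k = min(m, n)
U, s, Vh = np.linalg.svd(A, full_matrices=False)

# Pseudoinverse from the SVD; assumes every singular value is nonzero
A_pinv = Vh.T @ np.diag(1.0 / s) @ U.T
print('Shape of pseudoinverse:', A_pinv.shape)
print('Matches np.linalg.pinv:', np.allclose(A_pinv, np.linalg.pinv(A)))

# For a tall matrix with independent columns, A^+ acts as a left inverse
print('A^+ A is the identity:', np.allclose(A_pinv @ A, np.eye(2)))

# Solving A x = b for a b that lies in the column space of A
b = A @ np.array([1.0, 2.0])
x = A_pinv @ b
print('Recovered x:', x)
```

When singular values are zero or very small, a robust implementation would skip or threshold them instead of dividing, which is what library routines such as `np.linalg.pinv` do.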


## The Determinant

- The **determinant** of a square matrix is the factor by which the linear transformation described by the matrix stretches or squishes space. In $\mathbb{R}^2$ you can think of it as how the area of a given region changes; in $\mathbb{R}^3$, it's how the volume changes.
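
A short sketch of this interpretation in $\mathbb{R}^2$, using `np.linalg.det`. The matrices are illustrative choices: a scaling that multiplies areas by 6, a shear that leaves areas unchanged, and a singular matrix that collapses the plane onto a line. The last check relies on the standard fact that the determinant equals the product of the eigenvalues.

```python
import numpy as np

scale = np.array([[2.0, 0.0],
                  [0.0, 3.0]])      # stretches x by 2 and y by 3
shear = np.array([[1.0, 1.0],
                  [0.0, 1.0]])      # slants the unit square, area unchanged
collapse = np.array([[1.0, 2.0],
                     [2.0, 4.0]])   # dependent columns: plane squashed to a line

det_scale = np.linalg.det(scale)
det_shear = np.linalg.det(shear)
det_collapse = np.linalg.det(collapse)
print('det(scale):   ', det_scale)
print('det(shear):   ', det_shear)
print('det(collapse):', det_collapse)

# The determinant equals the product of the eigenvalues
M = np.array([[3.0, 1.0],
              [0.0, 2.0]])
eig_product = np.prod(np.linalg.eigvals(M))
print('det(M) equals product of eigenvalues:',
      np.isclose(np.linalg.det(M), eig_product))
```

A determinant of zero means the transformation squashes space into a lower dimension, which matches the earlier statement that a matrix is singular if and only if one of its eigenvalues is zero.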