## Matrices  -----  Notation and Operations

## Matrix notation
<a href="https://en.wikipedia.org/wiki/Matrix_(mathematics)">Matrix Notation</a> is a notation system that allows succinct representation of complex operations, such as a change of basis. 
<img style="float: right;width:500px;height:500px" src="images/Matrix.svg.png">


* **Matlab** is based on Matrix Notation.

* **Python**: similar functionality by using **numpy**

Recall that a **vector** can be represented as a one dimensional array of numbers. A **matrix** is a two dimensional rectangle of numbers. A matrix consists of rows, indexed from top to bottom, and of columns, indexed from left to right, as shown in the figure.

A matrix with $m$ rows and $n$ columns is said to be an "$m$ by $n$ matrix".
In numpy we will say that the **shape** of the matrix is $(m,n)$. We will also use the LaTeX notation $M_{m \times n}$ to indicate that $M$ is an $m \times n$ matrix.
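
As a small illustration of the shape convention, here is a sketch with an arbitrary $2 \times 3$ matrix (the name `M` and its entries are made up for this example). Note that the math notation indexes rows and columns from 1, while numpy indexes from 0.

```python
import numpy as np

# An arbitrary matrix with 2 rows and 3 columns
M = np.array([[1, 2, 3],
              [4, 5, 6]])

n_rows, n_cols = M.shape        # shape is (number of rows, number of columns)
print("shape:", M.shape)
print("first row:", M[0, :])    # numpy index 0 corresponds to row 1 in math notation
print("third column:", M[:, 2]) # numpy index 2 corresponds to column 3
```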

### Transposing a Matrix

At times it is useful to switch the rows and column dimensions of matrices.  Consider the matrix
$$
\begin{equation}
	A=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}
\end{equation}
$$
The transpose of A  is
$$
\begin{equation}
   A^{\mathsf{T}}=\begin{bmatrix}
	  a_{11} & a_{21} & a_{31} \\
	  a_{12} & a_{22} & a_{32} \\
	\end{bmatrix}
\end{equation}
$$

In [1]:
import numpy as np
# The .reshape method reorganizes the elements of an array into a new shape
A = np.array(list(range(6)))
print('A=',A)
B=A.reshape(2,3)
print("B is a 2X3 matrix:\n",B)
print("the shape of B is:",B.shape)
print("The transpose of B is\n",B.T)
print("the shape of B.T is:",B.T.shape)

A= [0 1 2 3 4 5]
B is a 2X3 matrix:
 [[0 1 2]
 [3 4 5]]
the shape of B is: (2, 3)
The transpose of B is
 [[0 3]
 [1 4]
 [2 5]]
the shape of B.T is: (3, 2)


### Vectors as matrices.
When using matrix notation, vectors can be represented as either [row or column vectors](https://en.wikipedia.org/wiki/Row_and_column_vectors). In a matrix context, a vector $\vec{v}$ is denoted by a bold-face letter: ${\bf v}$ for a column vector and ${\bf v}^\top$ for a row vector:

* By default a vector is represented as a **column vector** which is a matrix consisting of a single column:
$$
\begin{equation}
{\bf v}=
	\begin{bmatrix}
	  v_1 \\
      v_2 \\
      \vdots \\
	  v_d
	\end{bmatrix}
\end{equation}
$$

* If $\vec{v}$ is a column vector then the **transpose** of $\vec{v}$, denoted by $\vec{v}^\top$, is a **row vector**, which is a matrix consisting of a single row:
$$
\begin{equation}
{\bf v}^{\top}=
	\begin{bmatrix}
	  v_1 & v_2 & \cdots & v_d
	\end{bmatrix}
\end{equation}
$$

#### A vector as a matrix
Row and Column vectors can be thought of as matrices.
* The column vector ${\bf v}$ is a $d \times 1$ matrix.
* The row vector ${\bf v}^{\top}$ is a $1 \times d$ matrix.
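
In numpy the distinction between a plain one dimensional array and a $d \times 1$ or $1 \times d$ matrix is visible in the shape. The following is a minimal sketch (the names `v`, `col` and `row` are chosen for this example only). One thing to keep in mind: transposing a one dimensional array leaves it unchanged, so an explicit `reshape` is needed to get a column.

```python
import numpy as np

v = np.array([1, 2, 3])     # plain 1-D array, shape (3,)
col = v.reshape(-1, 1)      # column vector: a 3 x 1 matrix
row = col.T                 # row vector: a 1 x 3 matrix

print(v.shape, col.shape, row.shape)
print(v.T.shape)            # transposing a 1-D array does not change its shape
```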

#### A matrix as a collection of vectors

Matrices can be represented as a collection of vectors. For example, consider the $2\times 3$ matrix ${\bf A}=\begin{bmatrix}
	  a_{11} & a_{12} & a_{13}\\
	  a_{21} & a_{22} & a_{23}	
	\end{bmatrix}$

We can represent ${\bf A}=\begin{bmatrix}
	  a_{11} & a_{12} & a_{13}\\
	  a_{21} & a_{22} & a_{23}	
	\end{bmatrix}$ as vectors in one of two ways:
* As a row of column vectors:
$$ {\bf A} = \begin{bmatrix} {\bf c}_1 , {\bf c}_2 , {\bf c}_3 \end{bmatrix}$$
where
$$
   {\bf c}_1=\begin{bmatrix} a_{11}\\ a_{21} \end{bmatrix},
   {\bf c}_2=\begin{bmatrix} a_{12}\\ a_{22} \end{bmatrix}, 
   {\bf c}_3=\begin{bmatrix} a_{13}\\ a_{23} \end{bmatrix}$$

* Or as a column of row vectors: $
{\bf A} = \begin{bmatrix} {\bf r}_1 \\ {\bf r}_2 \end{bmatrix}$  
where $
   {\bf r}_1=\begin{bmatrix} a_{11}, a_{12}, a_{13} \end{bmatrix},
   {\bf r}_2=\begin{bmatrix} a_{21}, a_{22}, a_{23} \end{bmatrix}, 
$

In [2]:
A=np.array(list(range(6))).reshape(2,3)
print('A=\n',A)

A=
 [[0 1 2]
 [3 4 5]]


In [3]:
print("Splitting A into columns:")
Columns=np.split(A,3,axis=1)
for i in range(len(Columns)):
    print('column %d'%i)
    print(Columns[i])

Splitting A into columns:
column 0
[[0]
 [3]]
column 1
[[1]
 [4]]
column 2
[[2]
 [5]]


In [4]:
A_recon=np.concatenate(Columns,axis=1)
print('reconstructing the matrix from the columns:')
print(A_recon)
print('Checking that the reconstruction is equal to the original')
print(A_recon==A)

reconstructing the matrix from the columns:
[[0 1 2]
 [3 4 5]]
Checking that the reconstruction is equal to the original
[[ True  True  True]
 [ True  True  True]]


In [5]:
print("Splitting A into rows:")
Rows=np.split(A,2,axis=0)

for i in range(len(Rows)):
    print('row %d'%i)
    print(Rows[i])

Splitting A into rows:
row 0
[[0 1 2]]
row 1
[[3 4 5]]


In [6]:
A_recon=np.concatenate(Rows,axis=0)

print('reconstructing the matrix from the rows:')
print(A_recon)
print('Checking that the reconstruction is equal to the original')
print(A_recon==A)

reconstructing the matrix from the rows:
[[0 1 2]
 [3 4 5]]
Checking that the reconstruction is equal to the original
[[ True  True  True]
 [ True  True  True]]


#### Numpy functions
Beyond the commands `reshape`, `split` and `concatenate`, numpy has a rich set of functions to manipulate arrays. For a complete list see [Numpy Array Manipulation routines](https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html).

### Matrix - scalar operations

You can add a scalar to a matrix, subtract a scalar from it, or multiply or divide it by a scalar. Each operation is applied to every element of the matrix.

#### Adding a scalar value to a matrix

Let $A$=$\bigl[ \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr]$.  Here is how we would add the scalar $3$ to $A$:
$$
\begin{equation}
	A+3=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}+3
	=\begin{bmatrix}
	  a_{11}+3 & a_{12}+3 \\
	  a_{21}+3 & a_{22}+3 	
	\end{bmatrix}
\end{equation}
$$

#### Subtracting a scalar value from a matrix
Subtraction is similar:
$$
\begin{equation}
	A-3=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}-3
	=\begin{bmatrix}
	  a_{11}-3 & a_{12}-3 \\
	  a_{21}-3 & a_{22}-3 	
	\end{bmatrix}
\end{equation}
$$

#### Product of a scalar and a matrix

Multiplication is also similar
$$
\begin{equation}
	3 \times A = 3 \times \begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}
	=
	\begin{bmatrix}
	  3a_{11} & 3a_{12} \\
	  3a_{21} & 3a_{22} 	
	\end{bmatrix}
\end{equation}
$$

#### Dividing a matrix by a scalar
Division by $a$ is the same as multiplying by $1/a$. Note that you can divide a matrix by a scalar, but dividing a scalar by a matrix is not defined.
$$
\begin{equation}
	A/5= A \times \frac{1}{5}= \begin{bmatrix}
	  a_{11}/5 & a_{12}/5 \\
	  a_{21}/5 & a_{22}/5 	
	\end{bmatrix}
\end{equation}
$$

In [13]:
# Some examples of matrix-scalar operations using numpy
print('A=\n',A)
print('A+3=3+A=\n',A+3)  # addition

print('A*3=\n',A*3)  # product

print('A/2=\n',A/2)  # true division: in Python 3 the result is floating point even for integer inputs
print('A/2.=\n',A/2.)  # same result with a floating point divisor

A=
 [[0 1 2]
 [3 4 5]]
A+3=3+A=
 [[3 4 5]
 [6 7 8]]
A*3=
 [[ 0  3  6]
 [ 9 12 15]]
A/2=
 [[0.  0.5 1. ]
 [1.5 2.  2.5]]
A/2.=
 [[0.  0.5 1. ]
 [1.5 2.  2.5]]


### Adding and subtracting two matrices
Let $A$=$\bigl[ \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr]$ and $B$=$\bigl[ \begin{smallmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{smallmatrix} \bigr]$.  To compute $A-B$, subtract each element of B from the corresponding element of A:

$
	A -B =
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix} -
	\begin{bmatrix} b_{11} & b_{12} \\
	  b_{21} & b_{22}
	\end{bmatrix} $

$	=
	\begin{bmatrix}
	  a_{11}-b_{11} & a_{12}-b_{12} \\
	  a_{21}-b_{21} & a_{22}-b_{22} 	
	\end{bmatrix}
$

Addition works exactly the same way:

$	A + B =
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix} +
	\begin{bmatrix} b_{11} & b_{12} \\
	  b_{21} & b_{22}
	\end{bmatrix} $
    
$	=
	\begin{bmatrix}
	  a_{11}+b_{11} & a_{12}+b_{12} \\
	  a_{21}+b_{21} & a_{22}+b_{22} 	
	\end{bmatrix}
$

An important point to know about matrix addition and subtraction is that it is only defined when $A$ and $B$ are of the same size. Here, both are $2 \times 2$. Since operations are performed element by element, the two matrices must be conformable, and for addition and subtraction that means they must have the same number of rows and columns. I like to be explicit about the dimensions of matrices for checking conformability as I write the equations, so write

$$
A_{2 \times 2} + B_{2 \times 2}= \begin{bmatrix}
  a_{11}+b_{11} & a_{12}+b_{12} \\
  a_{21}+b_{21} & a_{22}+b_{22} 	
\end{bmatrix}_{2 \times 2}
$$

Notice that the result of a matrix addition or subtraction operation is always of the same dimension as the two operands.

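Let's define another matrix, B, that is $2 \times 2$, and try to add it to A. Note that A, as defined in the code above, is $2 \times 3$, so the two matrices do not conform and numpy raises an error: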

In [12]:
B = np.random.randn(2,2)
print(B)

[[-1.26221046  0.57475848]
 [-1.47212606  0.36916418]]


In [15]:
print(A.shape,B.shape)
result = A + B

(2, 3) (2, 2)


ValueError: operands could not be broadcast together with shapes (2,3) (2,2) 
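
For contrast with the error above, here is a minimal sketch of addition and subtraction with two matrices that do conform. The names `P`, `Q`, `S` and `D` and the entries are arbitrary choices for this example.

```python
import numpy as np

P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[10, 20],
              [30, 40]])

S = P + Q   # element-by-element sum
D = P - Q   # element-by-element difference

print(P.shape, Q.shape, S.shape)   # the result has the same shape as the operands
print("P+Q=\n", S)
print("P-Q=\n", D)
```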

### Matrix-Matrix products

#### The dot product of two vectors
* Recall that a vector is just a skinny matrix.
* Consider the dot product $(1,2,3) \cdot (1,1,0) = 1 \times 1 + 2 \times 1 +3 \times 0= 3$.

Conventions of dot product in matrix notation:
  * The first vector is a row vector and the second vector is a column vector.
  * In matrix notation there is no dot ($\cdot$) between the two vectors
  
$$
   \begin{bmatrix} 1,2,3 \end{bmatrix}
   \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 1 \times 1 + 2 \times 1 +3 \times 0= 3
$$

#### The Outer Product
* Dot Product = Inner Product
$$
   \begin{bmatrix} 1,2,3 \end{bmatrix}
   \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 1 \times 1 + 2 \times 1 +3 \times 0= 3
$$
* Outer Product:
$$
   \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
   \begin{bmatrix} 1,2,3 \end{bmatrix}
    = \begin{bmatrix}
    1,2,3\\
    1,2,3\\
    0,0,0
    \end{bmatrix}
$$
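
The inner and outer products above can be reproduced in numpy. This sketch uses `np.dot` and `np.outer` on one dimensional arrays, and then repeats the computation with explicit $1 \times 3$ and $3 \times 1$ matrices; the variable names are chosen for this example only.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([1, 1, 0])

inner = np.dot(x, y)        # 1*1 + 2*1 + 3*0
outer = np.outer(y, x)      # entry (i, j) is y[i] * x[j]

# The same products written as matrix products of a row and a column
row_x = x.reshape(1, 3)
col_y = y.reshape(3, 1)
inner_as_matrix = row_x.dot(col_y)   # a 1 x 1 matrix
outer_as_matrix = col_y.dot(row_x)   # a 3 x 3 matrix

print("inner product:", inner)
print("outer product:\n", outer)
```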

#### The dot product of a matrix and a vector

To multiply the matrix ${\bf A}=\begin{bmatrix}
	  a_{11} & a_{12} & a_{13}\\
	  a_{21} & a_{22} & a_{23}	
	\end{bmatrix}$
by the column vector ${\bf c}=\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix}$  

We think of ${\bf A}$ as consisting of two row vectors:
${\bf A} = \begin{bmatrix} {\bf r}_1 \\ {\bf r}_2 \end{bmatrix}$  
where $
   {\bf r}_1=\begin{bmatrix} a_{11}, a_{12}, a_{13} \end{bmatrix},
   {\bf r}_2=\begin{bmatrix} a_{21}, a_{22}, a_{23} \end{bmatrix}, 
$

and take the dot products of ${\bf r}_1,{\bf r}_2$ with ${\bf c}$ to create a column vector of dimension 2:

${\bf A} {\bf c} = \begin{bmatrix} {\bf r}_1 {\bf c} \\ {\bf r}_2 {\bf c} \end{bmatrix}
 = \begin{bmatrix}
	  a_{11}c_1 + a_{12}c_2 + a_{13} c_3 \\
	  a_{21}c_1 + a_{22}c_2 + a_{23} c_3	
	\end{bmatrix}$
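
The row-by-row view can be checked numerically. The sketch below, with an arbitrary $2 \times 3$ matrix `M` and vector `c` made up for this example, computes the two dot products separately and compares them with numpy's built-in product.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])     # 2 x 3
c = np.array([1, 0, -1])      # vector of length 3

# Dot product of each row of M with c
by_rows = np.array([np.dot(M[0, :], c),
                    np.dot(M[1, :], c)])
direct = M.dot(c)             # numpy's matrix-vector product

print(by_rows)
print(direct)
```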

#### Dot product of two matrices

Multiplying a matrix and a column vector can be generalized to multiplying two matrices.

Consider a matrix ${\bf A}$ of size $3 \times 2$ and a matrix ${\bf C}$ of size $2 \times 3$:

$$
\begin{equation}
	{\bf A}=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}
	,
	{\bf C} = 
	\begin{bmatrix}
		  c_{11} & c_{12} & c_{13} \\
		  c_{21} & c_{22} & c_{23} 
	\end{bmatrix}
	\end{equation}
$$

To compute ${\bf AC}$ we think of ${\bf A}$ as a column of row vectors:
${\bf A} =\begin{bmatrix}
	  {\bf a}_1 \\
	  {\bf a}_2 \\
	  {\bf a}_3	
	\end{bmatrix}
    $
    
and of ${\bf C}$ as a row of column vectors: ${\bf C} =\begin{bmatrix}
	  {\bf c}_1,
	  {\bf c}_2,
	  {\bf c}_3	
	\end{bmatrix}
    $

${\bf AC}$ is the matrix generated from taking the dot product of each row vector in ${\bf A}$ with each column vector in ${\bf C}$

${\bf AC}=
	\begin{bmatrix}
	  {\bf a}_1 \\
	  {\bf a}_2 \\
	  {\bf a}_3		
	\end{bmatrix}
	\begin{bmatrix}
	  {\bf c}_1,
	  {\bf c}_2,
	  {\bf c}_3
	\end{bmatrix}
= \begin{bmatrix}
    {\bf a}_1 \cdot {\bf c}_1 & {\bf a}_1 \cdot {\bf c}_2 & {\bf a}_1 \cdot {\bf c}_3 \\
    {\bf a}_2 \cdot {\bf c}_1 & {\bf a}_2 \cdot {\bf c}_2 & {\bf a}_2 \cdot {\bf c}_3 \\
    {\bf a}_3 \cdot {\bf c}_1 & {\bf a}_3 \cdot {\bf c}_2 & {\bf a}_3 \cdot {\bf c}_3
    \end{bmatrix}
    $

$= \begin{bmatrix}
	  a_{11} c_{11}+a_{12} c_{21} & a_{11} c_{12}+a_{12} c_{22} & a_{11} c_{13}+a_{12} c_{23} \\
	  a_{21} c_{11}+a_{22} c_{21} & a_{21} c_{12}+a_{22} c_{22} & a_{21} c_{13}+a_{22} c_{23} \\
	  a_{31} c_{11}+a_{32} c_{21} & a_{31} c_{12}+a_{32} c_{22} & a_{31} c_{13}+a_{32} c_{23}
	\end{bmatrix}
$
    
For more information on the topic of matrix multiplication, see 
http://en.wikipedia.org/wiki/Matrix_multiplication.
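
To make the "row of the first matrix times column of the second" rule concrete, here is a sketch that fills in the product entry by entry with explicit loops and compares it with `dot`. The matrices `A3x2` and `C2x3` are arbitrary examples with the same shapes as ${\bf A}$ and ${\bf C}$ above; in practice one would simply call `dot`.

```python
import numpy as np

A3x2 = np.arange(6).reshape(3, 2)   # [[0, 1], [2, 3], [4, 5]]
C2x3 = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

product = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        # entry (i, j) is the dot product of row i of A with column j of C
        product[i, j] = np.dot(A3x2[i, :], C2x3[:, j])

print(product)
print(np.array_equal(product, A3x2.dot(C2x3)))
```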

In [16]:
# Matrix - Vector product
A = np.arange(6).reshape((3,2))
C = np.array([-1,1])

print(A.shape)
print(C.shape)
print(np.dot(A,C.T))

(3, 2)
(2,)
[1 1 1]


In [17]:
# Matrix - Matrix product

# Define the matrices A and C
A = np.arange(6).reshape((3,2))
C = np.random.randn(2,2)

print('A=\n',A)
print('C=\n',C)

A=
 [[0 1]
 [2 3]
 [4 5]]
C=
 [[ 1.34468303 -0.42918745]
 [ 0.92444446 -0.27802146]]


We will use the numpy dot operator to perform these multiplications. You can use it in two ways that yield the same result:

In [18]:
print('A.dot(C)=\n',A.dot(C))
print('np.dot(A,C)=\n',np.dot(A,C))

A.dot(C)=
 [[ 0.92444446 -0.27802146]
 [ 5.46269944 -1.69243928]
 [10.00095442 -3.10685711]]
np.dot(A,C)=
 [[ 0.92444446 -0.27802146]
 [ 5.46269944 -1.69243928]
 [10.00095442 -3.10685711]]


#### Conformity
Note that the number of columns in the first matrix has to be equal to the number of rows in the second matrix. Otherwise, the matrix product is not defined. When this condition holds we say that the two matrices **conform**.

Taking the product of two matrices that don't conform results in an exception:

In [19]:
np.dot(C,A)

ValueError: shapes (2,2) and (3,2) not aligned: 2 (dim 1) != 3 (dim 0)

## Orthonormal matrices and change of Basis
**As was explained in the notebook: "Linear Algebra Review"**

We say that the vectors $\vec{u}_1,\vec{u}_2,\ldots,\vec{u}_d \in R^d$ form an **orthonormal basis** of $R^d$ if:
* **Normality:** $\vec{u}_1,\vec{u}_2,\ldots,\vec{u}_d$ are unit vectors: $\forall 1 \leq i \leq d: \vec{u}_i \cdot \vec{u}_i =1 $
* **Orthogonality:** Every pair of vectors are orthogonal: 
$\forall 1 \leq i\neq j \leq d: \vec{u}_i \cdot \vec{u}_j =0 $

**An orthonormal basis can be used to rotate the coordinate system:**
* $\vec{v}$ is **represented** as a list of $d$ dot products: $$[\vec{v}\cdot\vec{u_1},\vec{v}\cdot\vec{u_2},\ldots,\vec{v}\cdot\vec{u_d}]$$
* $\vec{v}$ is **reconstructed** by summing its projections on the basis vectors:
$$\vec{v} = (\vec{v}\cdot\vec{u_1})\vec{u_1} + (\vec{v}\cdot\vec{u_2})\vec{u_2} + \cdots + (\vec{v}\cdot\vec{u_d})\vec{u_d}$$

### Orthonormal basis in matrix notation
To use matrix notation, we think of $\vec{u}_i$ as a row vector:
$$
   {\bf u}_i=\begin{bmatrix} u_{i1}, u_{i2},\ldots, u_{id} \end{bmatrix},
$$

We can stack the orthonormal vectors, one row per vector,  to create an **orthonormal matrix**

$$ {\bf U} = \begin{bmatrix} {\bf u}_1 \\ {\bf u}_2 \\ \vdots \\ {\bf u}_d \end{bmatrix}
= \begin{bmatrix} 
u_{11}, u_{12},\ldots, u_{1d} \\ 
u_{21}, u_{22},\ldots, u_{2d} \\ 
\vdots\\
u_{d1}, u_{d2},\ldots, u_{dd} 
\end{bmatrix}
$$

### Checking that the basis is orthonormal:
* **Orthogonality:** $\vec{u}_i\vec{u}_j^T=0$ if $i \neq j$
* **Normality:** $\vec{u}_i \vec{u}_i^T = 1$
* In matrix notation: $$\bf UU^{\top} = 
\begin{bmatrix} 
1, 0,\ldots, 0 \\ 
0, 1,\ldots, 0 \\ 
\ddots \\
0,0,\ldots, 1 
\end{bmatrix}
={\bf I}$$
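
A quick numerical check of this condition, as a sketch: the rows of the $2 \times 2$ matrix below are the unit vectors $(\cos\theta, \sin\theta)$ and $(-\sin\theta, \cos\theta)$, chosen here only as an example of an orthonormal basis of $R^2$. Because of floating point rounding the comparison uses `np.allclose` rather than exact equality.

```python
import numpy as np

theta = np.pi / 6
# Rows are the basis vectors u_1 and u_2
U = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

gram = U.dot(U.T)   # entry (i, j) is the dot product of u_i and u_j
print(gram)
print(np.allclose(gram, np.eye(2)))
```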

### The Identity  Matrix
The identity matrix behaves like the number $1$: 

The dot product of any matrix ${\bf A}$ by the identity matrix ${\bf I}$ yields ${\bf A}$.

$$ {\bf A I = I A = A} $$

### Changing coordinate system
Suppose we have two orthonormal bases:
* The standard basis $\vec{e}_1 = [1,0,0,\ldots,0], \vec{e}_2 = [0,1,0,\ldots,0],\ldots,\vec{e}_d = [0,0,0,\ldots,1]$
* Some other orthonormal basis : ${\bf U}$

* We have two ways of representing a vector $\vec{v}$ as an array of length $d$:
1. Using the standard coordinate system:  
$\vec{v}_S = [\vec{v}\cdot\vec{e}_1,\vec{v}\cdot\vec{e}_2,\ldots,\vec{v}\cdot\vec{e}_d]$ 
2. Using ${\bf U}$:  
$\vec{v}_{{\bf U}}=[\vec{v}\cdot\vec{u_1},\vec{v}\cdot\vec{u_2},\ldots,\vec{v}\cdot\vec{u_d}]$

#### Transformations in matrix notation
* Suppose $\vec{v}_S$, $\vec{v}_{{\bf U}}$ are represented as column vectors
* Transforming $\vec{v}_S$ to $\vec{v}_{{\bf U}}$:
$$ \vec{v}_{{\bf U}} = {\bf U} \vec{v}_S = \begin{bmatrix} \vec{u}_1 \cdot \vec{v} \\ \vec{u}_2 \cdot \vec{v} \\ \vdots \\ \vec{u}_d \cdot \vec{v} \end{bmatrix}$$
* Transforming $\vec{v}_{{\bf U}}$ to $\vec{v}_S$:
$$ \vec{v}_S = {\bf U}^{-1}\vec{v}_{{\bf U}} = {\bf U}^T\vec{v}_{{\bf U}}$$

* $\vec{v}$ is **reconstructed exactly** by summing its projections on all of the basis vectors:
$$\vec{v} = (\vec{v}\cdot\vec{u_1})\vec{u_1} + (\vec{v}\cdot\vec{u_2})\vec{u_2} + \cdots + (\vec{v}\cdot\vec{u_d})\vec{u_d}$$

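* In matrix notation: ${\bf v} = {\bf U}^{\mathsf{T}} {\bf U} {\bf v}$ (first ${\bf U}{\bf v}$ computes the coefficients, then ${\bf U}^{\mathsf{T}}$ sums the basis vectors weighted by those coefficients)
* Equivalently, ${\bf U}^{\mathsf{T}} {\bf U} = {\bf I}$, which for a square matrix holds exactly when ${\bf U} {\bf U}^{\mathsf{T}} = {\bf I}$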

* $\vec{v}$ is **reconstructed APPROXIMATELY** by summing its projections on only the first $k < d$ basis vectors:
$$\vec{v}_k = (\vec{v}\cdot\vec{u_1})\vec{u_1} + (\vec{v}\cdot\vec{u_2})\vec{u_2} + \cdots + (\vec{v}\cdot\vec{u_k})\vec{u_k}$$
* The **residual** for the $k$-approximation is $\vec{r}_k = \vec{v} - \vec{v}_k$. The **residual error** is the squared norm of $\vec{r}_k$, which is $\|\vec{r}_k\|_2^2 = \sum_{i=1}^d r_{k,i}^2$, where $r_{k,i}$ is the $i$-th coordinate of $\vec{r}_k$.
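
The change of coordinates and the $k$-term approximation can be sketched in numpy as follows. The orthonormal basis here is an assumption made for illustration: it is obtained from a QR decomposition of an arbitrary invertible $3 \times 3$ matrix, and the vector `v_S` is likewise arbitrary.

```python
import numpy as np

# Build an example orthonormal basis of R^3: the rows of U are u_1, u_2, u_3
Q, _ = np.linalg.qr(np.array([[1.0, 1.0, 0.0],
                              [1.0, 0.0, 1.0],
                              [0.0, 1.0, 1.0]]))
U = Q.T

v_S = np.array([1.0, 2.0, 3.0])   # v in standard coordinates
v_U = U.dot(v_S)                  # v in the coordinates of U
v_back = U.T.dot(v_U)             # exact reconstruction using all d terms

errors = []
for k in range(1, 4):
    v_k = U[:k].T.dot(v_U[:k])    # sum of the projections on u_1, ..., u_k
    r_k = v_S - v_k               # residual of the k-approximation
    errors.append(np.sum(r_k ** 2))   # residual error: squared norm of r_k

print("exact reconstruction:", np.allclose(v_back, v_S))
print("residual errors for k=1,2,3:", errors)
```

In this sketch the residual error can only shrink as $k$ grows, and it reaches zero (up to rounding) when $k = d$.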

## Inverting a Matrix

Recall that the multiplicative inverse of the number $a$ is $a^{-1}=1/a$

The property of $a^{-1}$ is that $a a^{-1}=1$.

Recall also that $0$ does not have a multiplicative inverse.

**Some** square matrices ${\bf A}$ have a multiplicative inverse ${\bf A^{-1}}$  
such that ${\bf A A^{-1} = A^{-1} A =I}$

Finding the inverse of a matrix is called *inverting* the matrix.  

An $n\times n$ matrix $\bf A$ represents a linear transformation from $R^n$ to $R^n$. If the matrix is [**invertible**](https://en.wikipedia.org/wiki/Invertible_matrix) then there is another transformation ${\bf A}^{-1}$ that represents the inverse transformation, such that for any column vector ${\bf v} \in R^n$:
$${\bf A}^{-1}{\bf A}{\bf v} = {\bf A}{\bf A}^{-1}{\bf v} = {\bf v} $$

### Inverting a 2X2 matrix
Consider the square $2 \times 2$ matrix ${\bf A} = \bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{smallmatrix} \bigr)$.  The inverse of ${\bf A}$, which exists only when $a_{11}a_{22}-a_{12}a_{21} \neq 0$, is

$$
\begin{equation}
	{\bf A}^{-1}=\begin{bmatrix}
             a_{11} & a_{12} \\
		     a_{21} & a_{22} 
           \end{bmatrix}^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}}	\begin{bmatrix}
		             a_{22} & -a_{12} \\
				     -a_{21} & a_{11} 
		           \end{bmatrix}
\end{equation}
$$

**Exercise:** Check that $ {\bf A A^{-1}=A^{-1} A=I }$

For more information on inverting matrices, see
http://en.wikipedia.org/wiki/Matrix_inversion.
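
The $2 \times 2$ formula can be written directly as code. The function `inverse_2x2` below is a hypothetical helper written for this illustration, not a numpy routine; the example matrix `M` is arbitrary, with determinant $2 \cdot 3 - 1 \cdot 5 = 1$.

```python
import numpy as np

def inverse_2x2(M):
    """Invert a 2 x 2 matrix using the closed-form formula (illustrative helper)."""
    det = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    if det == 0:
        raise ValueError("matrix is singular: determinant is zero")
    return np.array([[M[1, 1], -M[0, 1]],
                     [-M[1, 0], M[0, 0]]]) / det

M = np.array([[2.0, 1.0],
              [5.0, 3.0]])
M_inv = inverse_2x2(M)

print("M_inv=\n", M_inv)
print("agrees with np.linalg.inv:", np.allclose(M_inv, np.linalg.inv(M)))
```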

In [None]:
# An example of computing the inverse using numpy.linalg.inv
# note, we need a square matrix (# rows = # cols), use C:
C = np.random.randn(2,2)
print("C=\n",C)
C_inverse = np.linalg.inv(C)
print("C_inverse=\n",C_inverse)

Checking that $C\times C^{-1} = I$:

In [None]:
I = np.eye(2)
print("identity matrix=\n",I)
print("C.dot(C_inverse)-I=\n",C.dot(C_inverse)-I)
print("C_inverse.dot(C)-I=\n",C_inverse.dot(C)-I)

### Singular matrices
Not all matrices have an inverse. Square matrices that do not are called **singular**.

In [None]:
C=np.array([[1,0],[1,0]])
print("C=\n",C)
try:
    C_inverse = np.linalg.inv(C)
except np.linalg.LinAlgError:
    print('C cannot be inverted: it is a singular matrix')
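
One way to see why this particular matrix is singular is to look at the quantity $a_{11}a_{22}-a_{12}a_{21}$ from the $2 \times 2$ inverse formula, which is the determinant. The sketch below computes it with `np.linalg.det` and also reports the rank with `np.linalg.matrix_rank`; the name `singular` is chosen for this example so that it does not overwrite `C`.

```python
import numpy as np

singular = np.array([[1.0, 0.0],
                     [1.0, 0.0]])

det = np.linalg.det(singular)             # 1*0 - 0*1
rank = np.linalg.matrix_rank(singular)    # number of linearly independent rows

print("determinant:", det)
print("rank:", rank)
```

Because floating point determinants of nearly singular matrices can be tiny but nonzero, comparing against a small tolerance is usually safer than testing for exact zero.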

## Next video: solving a set of linear equations

## Exercises

1. Write a function that takes as input an array of real numbers $a_1,a_2,\ldots,a_d$ and outputs a $d \times d$ "scaling" matrix ${\bf S}$ such that for any row vector ${\bf v} = [v_1,v_2,\ldots,v_d]$:
$$
{\bf v S} = [a_1 \times v_1,a_2 \times v_2,\ldots,a_d \times v_d]
$$
2. Write a function that takes as input an array of real numbers $a_1,a_2,\ldots,a_d$ and outputs the
**inverse** of ${\bf S}$, denoted ${\bf S}^{-1}$
3. Orthonormal transformations with determinant $+1$ are also called **rotations**. In [2D](https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions), a rotation is determined by a single rotation angle $\theta$:
$$
R(\theta) = \begin{bmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta \\
\end{bmatrix}
$$
   * Write a function that takes $\theta$ as its input and returns the rotation matrix $R(\theta)$
   * Write a function that takes $\theta$ as its input and returns the rotation matrix $R^{-1}(\theta)$. (hint: all you need to do is call the previous function with a different parameter).
   * Write a function that takes as input an angle $\theta$ and a list of points $(x_1,y_1),(x_2,y_2),\ldots,(x_n,y_n)$, rotates the points using the matrix $R(\theta)$, plots the line connecting the rotated points, and returns the rotated points themselves.