<img style="float: right;" src="images/Matrix.svg.png">
## Matrix notation
<a href="https://en.wikipedia.org/wiki/Matrix_(mathematics)">Matrix Notation</a> is a notation system that allows for a succinct representation of repetitive operations, such as a change of basis.

Recall that a **vector** can be represented as a one dimensional array of numbers. A **matrix** is a two dimensional rectangle of numbers. A matrix consists of rows, indexed from top to bottom, and of columns, indexed from left to right, as shown in the figure.

A matrix with $m$ rows and $n$ columns is said to be an "$m$ by $n$ matrix".
In numpy we will say that the **shape** of the matrix is $(m,n)$. We will also use the LaTeX notation $M_{m \times n}$ to indicate that $M$ is an $m \times n$ matrix.

In [1]:
import numpy as np
# The .reshape method reorganizes the elements of an array into a new shape
A = np.array(range(6))
print('A=', A)
B = A.reshape(2, 3)
print("B is a 2X3 matrix:")
print(B)
print("the shape of B is:", B.shape)
print("The transpose of B is")
print(B.T)
print("the shape of B.T is:", B.T.shape)

A= [0 1 2 3 4 5]
B is a 2X3 matrix:
[[0 1 2]
 [3 4 5]]
the shape of B is: (2, 3)
The transpose of B is
[[0 3]
 [1 4]
 [2 5]]
the shape of B.T is: (3, 2)


### Vectors as matrices.
When using matrix notation, vectors can be represented as either [row or column vectors](https://en.wikipedia.org/wiki/Row_and_column_vectors). In a matrix context, a vector $\vec{v}$ is denoted by a bold-face letter: ${\bf v}$ for a column vector and ${\bf v}^\top$ for a row vector:
* By default a vector is represented as a **column vector**, which is a matrix consisting of a single column:
$$
\begin{equation}
{\bf v}=
	\begin{bmatrix}
	  v_1 \\
      v_2 \\
      \vdots \\
	  v_d
	\end{bmatrix}
\end{equation}
$$

* If $\vec{v}$ is a column vector then the **transpose** of $\vec{v}$, denoted by $\vec{v}^\top$, is a **row vector**, which is a matrix consisting of a single row:
$$
\begin{equation}
{\bf v}^{\top}=
	\begin{bmatrix}
	  v_1 & v_2 & \cdots & v_d
	\end{bmatrix}
\end{equation}
$$
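
As a rough illustration of the two conventions (a sketch, not part of the original cells), a column vector can be stored in numpy as a 2-D array of shape $(d,1)$, and its transpose then has shape $(1,d)$. The names `v_col` and `v_row` and the values used are chosen here only for illustration.

```python
import numpy as np

# A column vector with d = 3 entries, stored as a 3 x 1 matrix
v_col = np.array([[1], [2], [3]])
# Its transpose is a row vector, stored as a 1 x 3 matrix
v_row = v_col.T

print(v_col.shape)
print(v_row.shape)
```

Note that a 1-D numpy array of shape $(d,)$ is neither a row nor a column in this sense, and transposing it leaves it unchanged.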


#### A matrix as a collection of vectors

Matrices can be represented as a collection of vectors. For example, consider the $2\times 3$ matrix ${\bf A}=\begin{bmatrix}
	  a_{11} & a_{12} & a_{13}\\
	  a_{21} & a_{22} & a_{23}	
	\end{bmatrix}$

We can represent ${\bf A}$ in one of two ways:
* As a row of column vectors:
$$ {\bf A} = \begin{bmatrix} {\bf c}_1 , {\bf c}_2 , {\bf c}_3 \end{bmatrix}\;\;
\mbox{where}\;\;
   {\bf c}_1=\begin{bmatrix} a_{11}\\ a_{21} \end{bmatrix},
   {\bf c}_2=\begin{bmatrix} a_{12}\\ a_{22} \end{bmatrix}, 
   {\bf c}_3=\begin{bmatrix} a_{13}\\ a_{23} \end{bmatrix}$$
* As a column of row vectors:
$$
{\bf A} = \begin{bmatrix} {\bf r}_1 \\ {\bf r}_2 \end{bmatrix}\;\;
\mbox{where}\;\;
   {\bf r}_1=\begin{bmatrix} a_{11}, a_{12}, a_{13} \end{bmatrix},
   {\bf r}_2=\begin{bmatrix} a_{21}, a_{22}, a_{23} \end{bmatrix}
$$

In [2]:
A=np.array(range(6)).reshape(2,3)

In [3]:
print("Splitting A into columns:")
Columns = np.split(A, 3, axis=1)
for i in range(len(Columns)):
    print('column %d' % i)
    print(Columns[i])

A_recon = np.concatenate(Columns, axis=1)
print('reconstructing the matrix from the columns:')
print(A_recon)
print('Checking that the reconstruction is equal to the original')
print(A_recon == A)

Splitting A into columns:
column 0
[[0]
 [3]]
column 1
[[1]
 [4]]
column 2
[[2]
 [5]]
reconstructing the matrix from the columns:
[[0 1 2]
 [3 4 5]]
Checking that the reconstruction is equal to the original
[[ True  True  True]
 [ True  True  True]]


In [4]:
print("Splitting A into rows:")
Rows = np.split(A, 2, axis=0)

for i in range(len(Rows)):
    print('row %d' % i)
    print(Rows[i])

A_recon = np.concatenate(Rows, axis=0)

print('reconstructing the matrix from the rows:')
print(A_recon)
print('Checking that the reconstruction is equal to the original')
print(A_recon == A)

Splitting A into rows:
row 0
[[0 1 2]]
row 1
[[3 4 5]]
reconstructing the matrix from the rows:
[[0 1 2]
 [3 4 5]]
Checking that the reconstruction is equal to the original
[[ True  True  True]
 [ True  True  True]]


#### Numpy functions
Beyond the commands `reshape`, `split` and `concatenate`, numpy has a rich set of functions to manipulate arrays. For a complete list see [Numpy Array Manipulation routines](https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html)
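
As one small example of these routines, the sketch below uses `np.hstack` and `np.vstack` to join arrays side by side or on top of each other. The variable names `wide` and `tall` are chosen here only for illustration.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# Stack two copies of A side by side: the result has shape (2, 6)
wide = np.hstack([A, A])
# Stack two copies of A on top of each other: the result has shape (4, 3)
tall = np.vstack([A, A])

print(wide.shape)
print(tall.shape)
```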

### Matrix Addition and Subtraction

#### Adding or subtracting a scalar value to a matrix

To learn the basics, consider a small matrix of dimension $2 \times 2$, where $2 \times 2$ denotes the number of rows $\times$ the number of columns.  Let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$.  Consider adding a scalar value (e.g. 3) to $A$.
$$
\begin{equation}
	A+3=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}+3
	=\begin{bmatrix}
	  a_{11}+3 & a_{12}+3 \\
	  a_{21}+3 & a_{22}+3 	
	\end{bmatrix}
\end{equation}
$$
The same basic principle holds true for A-3:
$$
\begin{equation}
	A-3=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}-3
	=\begin{bmatrix}
	  a_{11}-3 & a_{12}-3 \\
	  a_{21}-3 & a_{22}-3 	
	\end{bmatrix}
\end{equation}
$$
Notice that we add (or subtract) the scalar value to each element in the matrix A.  A can be of any dimension.

This is trivial to implement, using the $2 \times 3$ matrix A that we defined in the code above:

In [5]:
result = A + 3
# or, equivalently
result = 3 + A
print(result)

[[3 4 5]
 [6 7 8]]


#### Adding or subtracting two matrices
Consider two small $2 \times 2$ matrices, where $2 \times 2$ denotes the \# of rows $\times$ the \# of columns.  Let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$ and $B$=$\bigl( \begin{smallmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{smallmatrix} \bigr)$.  To find the result of $A-B$, simply subtract each element of B from the corresponding element of A:

$$
\begin{equation}
	A -B =
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix} -
	\begin{bmatrix} b_{11} & b_{12} \\
	  b_{21} & b_{22}
	\end{bmatrix}
	=
	\begin{bmatrix}
	  a_{11}-b_{11} & a_{12}-b_{12} \\
	  a_{21}-b_{21} & a_{22}-b_{22} 	
	\end{bmatrix}
\end{equation}
$$

Addition works exactly the same way:

$$
\begin{equation}
	A + B =
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix} +
	\begin{bmatrix} b_{11} & b_{12} \\
	  b_{21} & b_{22}
	\end{bmatrix}
	=
	\begin{bmatrix}
	  a_{11}+b_{11} & a_{12}+b_{12} \\
	  a_{21}+b_{21} & a_{22}+b_{22} 	
	\end{bmatrix}
\end{equation}
$$

An important point about matrix addition and subtraction is that they are only defined when $A$ and $B$ are of the same size.  Here, both are $2 \times 2$.  Since the operations are performed element by element, the two matrices must be conformable, and for addition and subtraction that means they must have the same number of rows and columns.  I like to be explicit about the dimensions of matrices for checking conformability as I write the equations, so I write

$$
A_{2 \times 2} + B_{2 \times 2}= \begin{bmatrix}
  a_{11}+b_{11} & a_{12}+b_{12} \\
  a_{21}+b_{21} & a_{22}+b_{22} 	
\end{bmatrix}_{2 \times 2}
$$

Notice that the result of a matrix addition or subtraction operation is always of the same dimension as the two operands.

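Let's define another matrix, B, that is $2 \times 2$ and try to add it to A. Note that A, as currently defined in the code above, is $2 \times 3$, so the two matrices are not conformable and numpy raises an error: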

In [6]:
B = np.random.randn(2,2)
print(B)

[[-1.26466398  0.87550732]
 [-0.7586141  -0.3989516 ]]


In [7]:
result = A + B
result

ValueError: operands could not be broadcast together with shapes (2,3) (2,2) 
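
For contrast with the error above, here is a minimal sketch in which both operands are $2 \times 2$, so addition and subtraction are defined. The matrices `P` and `Q` and their values are hypothetical, chosen only for illustration.

```python
import numpy as np

# Two conformable 2 x 2 matrices (hypothetical example values)
P = np.array([[1, 2], [3, 4]])
Q = np.array([[10, 20], [30, 40]])

total = P + Q        # element-by-element sum
difference = P - Q   # element-by-element difference

print(total)
print(difference)
print(total.shape)   # same shape as the two operands
```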

### Matrix Multiplication

#### Multiplying a scalar value times a matrix

As before, let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$.  Suppose we want to multiply A by a scalar value (e.g. $3 \times A$):

$$
\begin{equation}
	3 \times A = 3 \times \begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}
	=
	\begin{bmatrix}
	  3a_{11} & 3a_{12} \\
	  3a_{21} & 3a_{22} 	
	\end{bmatrix}
\end{equation}
$$

The result is of dimension $2 \times 2$.  Scalar multiplication is commutative, so that $3 \times A$=$A \times 3$.  Notice that the product is defined for a matrix A of any dimension.

Similar to scalar addition and subtraction, the code is simple:

In [None]:
A * 3
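
A small sketch that checks the commutativity claim numerically, assuming A is the $2 \times 3$ matrix used in the earlier cells (it is redefined here so the snippet runs on its own):

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# Scalar multiplication scales every element, in either order
left = 3 * A
right = A * 3

print(left)
print(np.array_equal(left, right))
```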

#### Multiplying a matrix and a vector

Now, consider the $2 \times 1$ vector $C=\bigl( \begin{smallmatrix} c_{11} \\
  c_{21}
\end{smallmatrix} \bigr)$  

Consider multiplying matrix $A_{2 \times 2}$ and the vector $C_{2 \times 1}$.  Unlike in the addition and subtraction case, this product is defined even though the operands have different shapes.  Here, conformability does not require that both the row **and** column dimensions match, but only that the number of columns of the first operand equals the number of rows of the second operand.  We can write this operation as follows

$$
\begin{equation}
	A_{2 \times 2} \times C_{2 \times 1} = 
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}_{2 \times 2}
    \times
    \begin{bmatrix}
	c_{11} \\
	c_{21}
	\end{bmatrix}_{2 \times 1}
	=
	\begin{bmatrix}
	  a_{11}c_{11} + a_{12}c_{21} \\
	  a_{21}c_{11} + a_{22}c_{21} 	
	\end{bmatrix}_{2 \times 1}
\end{equation}
$$

In [8]:
# Let's redefine A and C to demonstrate matrix-vector multiplication:
A = np.arange(6).reshape((3,2))
C = np.array([-1,1])

print(A.shape)
print(C.shape)
# C is 1-D, so C.T is identical to C; the result is a 1-D array of length 3
print(np.dot(A, C.T))

(3, 2)
(2,)
[1 1 1]
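
To connect this result with the formula above, the following sketch computes the same product one entry at a time: entry $i$ of the result is the dot product of row $i$ of A with C. The name `by_rows` is introduced here only for illustration.

```python
import numpy as np

A = np.arange(6).reshape((3, 2))
C = np.array([-1, 1])

# Entry i of the product is the dot product of row i of A with C
by_rows = np.array([np.dot(A[i, :], C) for i in range(A.shape[0])])

print(by_rows)
print(np.array_equal(by_rows, np.dot(A, C)))
```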


#### Multiplying two matrices

Alternatively, consider a matrix C of dimension $2 \times 3$ and a matrix A of dimension $3 \times 2$

$$
\begin{equation}
	A_{3 \times 2}=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}_{3 \times 2}
	,
	C_{2 \times 3} = 
	\begin{bmatrix}
		  c_{11} & c_{12} & c_{13} \\
		  c_{21} & c_{22} & c_{23} \\
	\end{bmatrix}_{2 \times 3}
	\end{equation}
$$

Here, A $\times$ C is

$$
\begin{align}
	A_{3 \times 2} \times C_{2 \times 3}=&
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}_{3 \times 2}
	\times
	\begin{bmatrix}
	  c_{11} & c_{12} & c_{13} \\
	  c_{21} & c_{22} & c_{23} 
	\end{bmatrix}_{2 \times 3} \\
	=&
	\begin{bmatrix}
	  a_{11} c_{11}+a_{12} c_{21} & a_{11} c_{12}+a_{12} c_{22} & a_{11} c_{13}+a_{12} c_{23} \\
	  a_{21} c_{11}+a_{22} c_{21} & a_{21} c_{12}+a_{22} c_{22} & a_{21} c_{13}+a_{22} c_{23} \\
	  a_{31} c_{11}+a_{32} c_{21} & a_{31} c_{12}+a_{32} c_{22} & a_{31} c_{13}+a_{32} c_{23}
	\end{bmatrix}_{3 \times 3}	
\end{align}
$$

So in general, for a product $X_{r_x \times c_x} \times Y_{r_y \times c_y}$ there are two important things to remember: 

* For conformability in matrix multiplication, $c_x=r_y$: the number of columns of the first operand must equal the number of rows of the second operand.
* The result will be of dimension $r_x \times c_y$: it has as many rows as the first operand and as many columns as the second operand.

Given these facts, you should convince yourself that matrix multiplication is not generally commutative: the relationship $X \times Y = Y \times X$ does **not** hold in all cases.
For this reason, we will always be very explicit about whether we are pre-multiplying $Y$ by $X$ ($X \times Y$) or post-multiplying $Y$ by $X$ ($Y \times X$).

For more information on this topic, see
http://en.wikipedia.org/wiki/Matrix_multiplication.
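
The shape rule and the lack of commutativity can be checked numerically. The following is a minimal sketch; the matrices `X`, `Y`, `S` and `R` and their values are chosen here purely for illustration.

```python
import numpy as np

X = np.arange(6).reshape(3, 2)   # r_x = 3, c_x = 2
Y = np.arange(6).reshape(2, 3)   # r_y = 2, c_y = 3

XY = X.dot(Y)   # conformable since c_x == r_y; the result is 3 x 3
YX = Y.dot(X)   # also conformable here, but the result is 2 x 2

print(XY.shape)
print(YX.shape)

# Even for square matrices of the same size, the two orders usually differ
S = np.array([[1, 2], [3, 4]])
R = np.array([[0, 1], [1, 0]])
print(np.array_equal(S.dot(R), R.dot(S)))
```

Here `S.dot(R)` swaps the columns of `S`, while `R.dot(S)` swaps its rows, so the two products are different matrices.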

In [9]:
# Let's redefine A and C to demonstrate matrix multiplication:
A = np.arange(6).reshape((3,2))
C = np.random.randn(2,2)

print(A.shape)
print(C.shape)

(3, 2)
(2, 2)


We will use the numpy `dot` function to perform these multiplications.  You can use it in two ways that yield the same result:

In [10]:
print(A.dot(C))
print(np.dot(A, C))

[[-0.03695345 -0.2265092 ]
 [ 0.71719312  4.12375061]
 [ 1.47133969  8.47401042]]
[[-0.03695345 -0.2265092 ]
 [ 0.71719312  4.12375061]
 [ 1.47133969  8.47401042]]


## Orthonormal matrices and change of Basis
**As was explained in the notebook: "Linear Algebra Review"**

We say that the vectors $\vec{u}_1,\vec{u}_2,\ldots,\vec{u}_d \in R^d$ form an **orthonormal basis** of $R^d$ if:
* **Normality:** $\vec{u}_1,\vec{u}_2,\ldots,\vec{u}_d$ are unit vectors: $\forall 1 \leq i \leq d: \vec{u}_i \cdot \vec{u}_i =1 $
* **Orthogonality:** Every pair of distinct vectors is orthogonal: 
$\forall 1 \leq i\neq j \leq d: \vec{u}_i \cdot \vec{u}_j =0 $

**An orthonormal basis can be used to rotate the vector space:**
* $\vec{v}$ is **represented** as a list of $d$ dot products: $$[\vec{v}\cdot\vec{u_1},\vec{v}\cdot\vec{u_2},\ldots,\vec{v}\cdot\vec{u_d}]$$
* $\vec{v}$ is **reconstructed** by summing its projections on the basis vectors:
$$\vec{v} = (\vec{v}\cdot\vec{u_1})\vec{u_1} + (\vec{v}\cdot\vec{u_2})\vec{u_2} + \cdots + (\vec{v}\cdot\vec{u_d})\vec{u_d}$$
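
The representation and reconstruction steps above can be sketched in numpy as follows. This is only an illustration: the basis (the standard basis of $R^2$ rotated by 45 degrees) and the vector `v` are assumed example values.

```python
import numpy as np

# An orthonormal basis of R^2: the standard basis rotated by 45 degrees
s = 1 / np.sqrt(2)
u1 = np.array([s, s])
u2 = np.array([-s, s])

v = np.array([3.0, 1.0])

# Representation: the list of dot products of v with the basis vectors
coeffs = [np.dot(v, u1), np.dot(v, u2)]

# Reconstruction: the sum of the projections of v on the basis vectors
v_recon = coeffs[0] * u1 + coeffs[1] * u2

print(coeffs)
print(np.allclose(v_recon, v))
```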

### Transposing a Matrix

At times it is useful to pivot a matrix for conformability. That is, in order to multiply two matrices, we sometimes need to switch the row and column dimensions of one of them.  Consider the matrix
$$
\begin{equation}
	A_{3 \times 2}=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}_{3 \times 2}	
\end{equation}
$$
The transpose of A (denoted as $A^{\mathsf{T}}$) is
$$
\begin{equation}
   A^{\mathsf{T}}=\begin{bmatrix}
	  a_{11} & a_{21} & a_{31} \\
	  a_{12} & a_{22} & a_{32} \\
	\end{bmatrix}_{2 \times 3}
\end{equation}
$$

### Change of Basis using matrix notation
To use matrix notation, we think of $\vec{u}_i$ as a row vector:
$$
   {\bf u}_i=\begin{bmatrix} u_{i1}, u_{i2},\ldots, u_{id} \end{bmatrix}
$$

We can combine the orthonormal vectors to create an *orthonormal matrix*

$$ {\bf U} = \begin{bmatrix} {\bf u}_1 \\ {\bf u}_2 \\ \vdots \\ {\bf u}_d \end{bmatrix}
= \begin{bmatrix} 
u_{11} & u_{12} & \cdots & u_{1d} \\ 
u_{21} & u_{22} & \cdots & u_{2d} \\ 
\vdots & \vdots & \ddots & \vdots \\
u_{d1} & u_{d2} & \cdots & u_{dd} 
\end{bmatrix}
$$

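Using this notation, the representation of a column vector $\bf v$ in the orthonormal basis corresponding to the rows of ${\bf U}$ is equal to ${\bf Uv}$  

And the reconstruction of $\bf v$ from this representation is equal to ${\bf U}^{\mathsf{T}}{\bf U v}$: multiplying by ${\bf U}^{\mathsf{T}}$ sums the basis vectors, each weighted by its coefficient. Because ${\bf U}$ is orthonormal, ${\bf U}^{\mathsf{T}}{\bf U v} = {\bf v}$.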
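
The same change of basis can be sketched with matrix products. In this illustrative example (the angle `theta` and the vector `v` are assumed values), the rows of `U` are the orthonormal basis vectors, `U.dot(v)` gives the coefficients, and multiplying the coefficients by `U.T` sums the basis vectors weighted by those coefficients. Since `U` is square and orthonormal, both `U.dot(U.T)` and `U.T.dot(U)` are the identity.

```python
import numpy as np

theta = np.pi / 6
# Rows of U are the orthonormal basis vectors u_1 and u_2
U = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])

v = np.array([[2.0], [1.0]])   # column vector, shape (2, 1)

rep = U.dot(v)           # coordinates of v in the basis given by the rows of U
v_recon = U.T.dot(rep)   # weighted sum of the basis vectors

print(np.allclose(U.dot(U.T), np.eye(2)))
print(np.allclose(v_recon, v))
```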


## Inverting a Matrix

An $n\times n$ matrix $\bf A$ represents a linear transformation from $R^n$ to $R^n$. If the matrix is [**invertible**](https://en.wikipedia.org/wiki/Invertible_matrix) then there is another transformation ${\bf A}^{-1}$ that represents the inverse transformation, such that for any column vector ${\bf v} \in R^n$:
$${\bf A}^{-1}{\bf A}{\bf v} = {\bf A}{\bf A}^{-1}{\bf v} = {\bf v} $$

### The Unit Matrix
The transformation ${\bf A}^{-1}{\bf A}$ does not change any vector. It is the **identity operator**.
The matrix corresponding to the identity operator is called the **Unit matrix** and is equal to:
$$
{\bf I} = \begin{bmatrix} 
1 & 0 & \cdots & 0 \\ 
0 & 1 & \cdots & 0 \\ 
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1 
\end{bmatrix}
$$

**Exercise:** Check that ${\bf A I = I A = A}$.

### Inverting a 2X2 matrix
Consider the square $2 \times 2$ matrix ${\bf A} = \bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{smallmatrix} \bigr)$.  Provided that $a_{11}a_{22}-a_{12}a_{21} \neq 0$, the inverse of matrix ${\bf A}$ is

$$
\begin{equation}
	{\bf A}^{-1}=\begin{bmatrix}
             a_{11} & a_{12} \\
		     a_{21} & a_{22} 
           \end{bmatrix}^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}}	\begin{bmatrix}
		             a_{22} & -a_{12} \\
				     -a_{21} & a_{11} 
		           \end{bmatrix}
\end{equation}
$$

**Exercise:** Check that ${\bf A A^{-1}=A^{-1} A=I }$
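
The closed-form expression above can be turned into a few lines of code. The sketch below is only an illustration: `inverse_2x2` is a hypothetical helper name, the matrix `M` holds assumed example values, and the result is compared against `np.linalg.inv`.

```python
import numpy as np

def inverse_2x2(M):
    """Hypothetical helper: invert a 2 x 2 matrix with the closed-form formula."""
    (a11, a12), (a21, a22) = M
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is zero)")
    return np.array([[a22, -a12], [-a21, a11]]) / det

M = np.array([[4.0, 7.0], [2.0, 6.0]])
M_inv = inverse_2x2(M)

print(M_inv)
print(np.allclose(M_inv, np.linalg.inv(M)))
print(np.allclose(M.dot(M_inv), np.eye(2)))
```

The explicit determinant check mirrors the condition under which the formula is defined; for matrices that are nearly singular, comparing the determinant with exactly zero is not a reliable test in floating point.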

For more information on inverting matrices, see
http://en.wikipedia.org/wiki/Matrix_inversion.

In [15]:
# note, we need a square matrix (# rows = # cols), use C:
print("C=")
print(C)
C_inverse = np.linalg.inv(C)
print("C_inverse=")
print(C_inverse)

C=
[[ 0.41402674  2.40163911]
 [-0.03695345 -0.2265092 ]]
C_inverse=
[[  45.01364475  477.27213257]
 [  -7.34367309  -82.27856691]]


Check that $C\times C^{-1} = I$:

In [21]:
I = np.eye(2)
print("identity matrix=")
print(I)
print("C.dot(C_inverse)-I=")
print(C.dot(C_inverse) - I)
print("C_inverse.dot(C)-I=")
print(C_inverse.dot(C) - I)

identity matrix=
[[ 1.  0.]
 [ 0.  1.]]
C.dot(C_inverse)-I=
[[  0.00000000e+00   0.00000000e+00]
 [  2.22044605e-16   0.00000000e+00]]
C_inverse.dot(C)-I=
[[  0.00000000e+00  -1.42108547e-14]
 [  0.00000000e+00   0.00000000e+00]]


## Exercises

1. Write a function that takes as input an array of real numbers $a_1,a_2,\ldots,a_d$ and outputs a $d \times d$ "scaling" matrix ${\bf S}$ such that for any row vector ${\bf v} = [v_1,v_2,\ldots,v_d]$:
$$
{\bf v S} = [a_1 \times v_1,a_2 \times v_2,\ldots,a_d \times v_d]
$$
2. Write a function that takes as input an array of nonzero real numbers $a_1,a_2,\ldots,a_d$ and outputs the
**inverse** of ${\bf S}$, denoted ${\bf S}^{-1}$
3. Orthonormal transformations with determinant $+1$ are also called **rotations**. In [2D](https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions), the rotation has a single rotation angle $\theta$:
$$
R(\theta) = \begin{bmatrix}
\cos \theta & -\sin \theta \\
\sin \theta & \cos \theta \\
\end{bmatrix}
$$
   * Write a function that takes $\theta$ as its input and returns the rotation matrix $R(\theta)$
   * Write a function that takes $\theta$ as its input and returns the rotation matrix $R^{-1}(\theta)$. (hint: all you need to do is call the previous function with a different parameter).
   * Write a function that takes as input an angle $\theta$ and a list of points $(x_1,y_1),(x_2,y_2),\ldots,(x_n,y_n)$, rotates the points using the matrix $R(\theta)$, plots the line connecting the rotated points, and returns the rotated points themselves.