<img style="float: right;" src="images/Matrix.svg.png">
## Matrix notation
<a href="https://en.wikipedia.org/wiki/Matrix_(mathematics)">Matrix notation</a> is a notation system that allows for a succinct representation of repetitive operations, such as a change of basis.

Recall that a **vector** can be represented as a one-dimensional array of numbers. A **matrix** is a two-dimensional rectangular array of numbers. A matrix consists of rows, indexed from top to bottom, and columns, indexed from left to right, as shown in the figure.

A matrix with $m$ rows and $n$ columns is said to be an "$m$ by $n$ matrix".
In numpy we say that the **shape** of the matrix is $(m,n)$. We will also use the notation $M_{m \times n}$ to indicate that $M$ is an $m \times n$ matrix.

In [36]:
# The .reshape command reorganizes the elements of an array into a new shape
import numpy as np

A = np.array(range(6))
print('A=', A)
B = A.reshape(2, 3)
print("B is a 2X3 matrix:")
print(B)
print("the shape of B is:", B.shape)
print("The transpose of B is")
print(B.T)
print("the shape of B.T is:", B.T.shape)

A= [0 1 2 3 4 5]
B is a 2X3 matrix:
[[0 1 2]
 [3 4 5]]
the shape of B is: (2, 3)
The transpose of B is
[[0 3]
 [1 4]
 [2 5]]
the shape of B.T is: (3, 2)


### Vectors as matrices.
When using matrix notation, vectors can be represented as either [row or column vectors](https://en.wikipedia.org/wiki/Row_and_column_vectors). In a matrix context, a vector $\vec{v}$ is denoted by a bold-face letter: ${\bf v}$ for a column vector and ${\bf v}^\top$ for a row vector.
* By default a vector is represented as a **column vector**, which is a matrix consisting of a single column:
$$
\begin{equation}
{\bf v}=
	\begin{bmatrix}
	  v_1 \\
      v_2 \\
      \vdots \\
	  v_d
	\end{bmatrix}
\end{equation}
$$

* If $\vec{v}$ is a column vector then the **transpose** of $\vec{v}$, denoted by $\vec{v}^\top$, is a **row vector**, which is a matrix consisting of a single row:
$$
\begin{equation}
{\bf v}^{\top}=
	\begin{bmatrix}
	  v_1 & v_2 & \cdots & v_d
	\end{bmatrix}
\end{equation}
$$
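
The distinction between row and column vectors can be sketched in numpy as follows. The names `v`, `v_col` and `v_row` are illustrative and not part of the original notebook. Note that a plain one-dimensional numpy array is neither a row nor a column until it is reshaped into a two-dimensional array.

```python
import numpy as np

# A plain 1-D array has shape (d,); it is neither a row nor a column.
v = np.array([1, 2, 3])

# Reshape into a column vector: a matrix with d rows and 1 column.
v_col = v.reshape(-1, 1)

# Its transpose is a row vector: a matrix with 1 row and d columns.
v_row = v_col.T

print(v.shape)      # (3,)
print(v_col.shape)  # (3, 1)
print(v_row.shape)  # (1, 3)
```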


#### A matrix as a collection of vectors

Matrices can be represented as a collection of vectors. For example, consider the $2\times 3$ matrix ${\bf A}=\begin{bmatrix}
	  a_{11} & a_{12} & a_{13}\\
	  a_{21} & a_{22} & a_{23}	
	\end{bmatrix}$

We can represent ${\bf A}$ in one of two ways:
* As a row of column vectors:
$$ {\bf A} = \begin{bmatrix} {\bf c}_1 , {\bf c}_2 , {\bf c}_3 \end{bmatrix}\;\;
\mbox{where}\;\;
   {\bf c}_1=\begin{bmatrix} a_{11}\\ a_{21} \end{bmatrix},
   {\bf c}_2=\begin{bmatrix} a_{12}\\ a_{22} \end{bmatrix}, 
   {\bf c}_3=\begin{bmatrix} a_{13}\\ a_{23} \end{bmatrix}$$
* As a column of row vectors:
$$
{\bf A} = \begin{bmatrix} {\bf r}_1 \\ {\bf r}_2 \end{bmatrix}\;\;
\mbox{where}\;\;
   {\bf r}_1=\begin{bmatrix} a_{11}, a_{12}, a_{13} \end{bmatrix},
   {\bf r}_2=\begin{bmatrix} a_{21}, a_{22}, a_{23} \end{bmatrix}
$$

In [48]:
A=np.array(range(6)).reshape(2,3)

In [49]:
print("Splitting A into columns:")
Columns = np.split(A, 3, axis=1)
for i in range(len(Columns)):
    print('column %d' % i)
    print(Columns[i])

A_recon = np.concatenate(Columns, axis=1)
print('reconstructing the matrix from the columns:')
print(A_recon)
print('Checking that the reconstruction is equal to the original')
print(A_recon == A)

Splitting A into columns:
column 0
[[0]
 [3]]
column 1
[[1]
 [4]]
column 2
[[2]
 [5]]
reconstructing the matrix from the columns:
[[0 1 2]
 [3 4 5]]
Checking that the reconstruction is equal to the original
[[ True  True  True]
 [ True  True  True]]


In [50]:
print("Splitting A into rows:")
Rows = np.split(A, 2, axis=0)

for i in range(len(Rows)):
    print('row %d' % i)
    print(Rows[i])

A_recon = np.concatenate(Rows, axis=0)

print('reconstructing the matrix from the rows:')
print(A_recon)
print('Checking that the reconstruction is equal to the original')
print(A_recon == A)

Splitting A into rows:
row 0
[[0 1 2]]
row 1
[[3 4 5]]
reconstructing the matrix from the rows:
[[0 1 2]
 [3 4 5]]
Checking that the reconstruction is equal to the original
[[ True  True  True]
 [ True  True  True]]


#### Numpy functions
Beyond the commands `reshape`, `split` and `concatenate`, numpy has a rich set of functions for manipulating arrays. For a complete list see [Numpy Array Manipulation routines](https://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html).
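
As a small, non-exhaustive sketch of a few such routines, the example below uses `np.vstack`, `np.hstack` and `ravel`. The choice of these three is ours, made only for illustration.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)

# Stack two copies of A on top of each other (more rows).
tall = np.vstack([A, A])

# Place two copies of A side by side (more columns).
wide = np.hstack([A, A])

# Flatten A back into a one-dimensional array.
flat = A.ravel()

print(tall.shape)  # (4, 3)
print(wide.shape)  # (2, 6)
print(flat)        # [0 1 2 3 4 5]
```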

### Matrix Addition and Subtraction

#### Adding or subtracting a scalar value to a matrix

To learn the basics, consider a small matrix of dimension $2 \times 2$, where $2 \times 2$ denotes the number of rows $\times$ the number of columns.  Let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$.  Consider adding a scalar value (e.g. 3) to $A$.
$$
\begin{equation}
	A+3=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}+3
	=\begin{bmatrix}
	  a_{11}+3 & a_{12}+3 \\
	  a_{21}+3 & a_{22}+3 	
	\end{bmatrix}
\end{equation}
$$
The same basic principle holds true for A-3:
$$
\begin{equation}
	A-3=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}-3
	=\begin{bmatrix}
	  a_{11}-3 & a_{12}-3 \\
	  a_{21}-3 & a_{22}-3 	
	\end{bmatrix}
\end{equation}
$$
Notice that we add (or subtract) the scalar value to each element in the matrix A.  A can be of any dimension.

This is trivial to implement, using the $2 \times 3$ matrix `A` defined in the code above:

In [38]:
result = A + 3
# or, equivalently
result = 3 + A
print(result)

[[3 4 5]
 [6 7 8]]


#### Adding or subtracting two matrices
Consider two small $2 \times 2$ matrices, where $2 \times 2$ denotes the number of rows $\times$ the number of columns.  Let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$ and $B$=$\bigl( \begin{smallmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{smallmatrix} \bigr)$.  To find the result of $A-B$, simply subtract from each element of A the corresponding element of B:

$$
\begin{equation}
	A -B =
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix} -
	\begin{bmatrix} b_{11} & b_{12} \\
	  b_{21} & b_{22}
	\end{bmatrix}
	=
	\begin{bmatrix}
	  a_{11}-b_{11} & a_{12}-b_{12} \\
	  a_{21}-b_{21} & a_{22}-b_{22} 	
	\end{bmatrix}
\end{equation}
$$

Addition works exactly the same way:

$$
\begin{equation}
	A + B =
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix} +
	\begin{bmatrix} b_{11} & b_{12} \\
	  b_{21} & b_{22}
	\end{bmatrix}
	=
	\begin{bmatrix}
	  a_{11}+b_{11} & a_{12}+b_{12} \\
	  a_{21}+b_{21} & a_{22}+b_{22} 	
	\end{bmatrix}
\end{equation}
$$

An important point about matrix addition and subtraction is that they are only defined when $A$ and $B$ are of the same size.  Here, both are $2 \times 2$.  Since the operations are performed element by element, the two matrices must be conformable, and for addition and subtraction that means they must have the same number of rows and the same number of columns.  I like to be explicit about the dimensions of matrices for checking conformability as I write the equations, so I write

$$
A_{2 \times 2} + B_{2 \times 2}= \begin{bmatrix}
  a_{11}+b_{11} & a_{12}+b_{12} \\
  a_{21}+b_{21} & a_{22}+b_{22} 	
\end{bmatrix}_{2 \times 2}
$$

Notice that the result of a matrix addition or subtraction operation is always of the same dimension as the two operands.

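Let's define another matrix, B, that is $2 \times 2$, and try to add it to A. Recall that A, as defined in the code above, is $2 \times 3$, so the two matrices are not conformable and numpy raises an error: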

In [39]:
B = np.random.randn(2,2)
print(B)

[[-0.48811893  0.33068505]
 [-0.52175699 -1.1424373 ]]


In [40]:
result = A + B
result

ValueError: operands could not be broadcast together with shapes (2,3) (2,2) 
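
For contrast with the error above, here is a minimal sketch of the conformable case. The matrices `A2` and `B2` are illustrative values chosen by us, not part of the original notebook. Both have shape `(2, 2)`, so element-by-element addition and subtraction are defined.

```python
import numpy as np

# Two matrices with the same shape, so addition and subtraction are defined.
A2 = np.array([[1, 2],
               [3, 4]])
B2 = np.array([[10, 20],
               [30, 40]])

total = A2 + B2       # element-by-element sum
difference = A2 - B2  # element-by-element difference

print(total)
print(difference)
# The result has the same shape as both operands.
print(total.shape == A2.shape == B2.shape)
```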

### Matrix Multiplication

#### Multiplying a scalar value times a matrix

As before, let $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{smallmatrix} \bigr)$.  Suppose we want to multiply A times a scalar value (e.g. $3 \times A$)

$$
\begin{equation}
	3 \times A = 3 \times \begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}
	=
	\begin{bmatrix}
	  3a_{11} & 3a_{12} \\
	  3a_{21} & 3a_{22} 	
	\end{bmatrix}
\end{equation}
$$

The result is again of dimension $2 \times 2$.  Scalar multiplication is commutative, so that $3 \times A$=$A \times 3$.  Notice that the product is defined for a matrix A of any dimension.

Similar to scalar addition and subtraction, the code is simple:

In [41]:
A * 3

array([[ 0,  3,  6],
       [ 9, 12, 15]])

#### Multiplying a matrix and a vector

Now, consider the $2 \times 1$ vector $C=\bigl( \begin{smallmatrix} c_{11} \\
  c_{21}
\end{smallmatrix} \bigr)$  

Consider multiplying the matrix $A_{2 \times 2}$ and the vector $C_{2 \times 1}$.  Unlike addition and subtraction, this product is defined even though the two operands have different shapes.  Here, conformability does not require both the row **and** column dimensions to match; it requires only that the number of columns of the first operand equal the number of rows of the second operand.  We can write this operation as follows:

$$
\begin{equation}
	A_{2 \times 2} \times C_{2 \times 1} = 
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} 	
	\end{bmatrix}_{2 \times 2}
    \times
    \begin{bmatrix}
	c_{11} \\
	c_{21}
	\end{bmatrix}_{2 \times 1}
	=
	\begin{bmatrix}
	  a_{11}c_{11} + a_{12}c_{21} \\
	  a_{21}c_{11} + a_{22}c_{21} 	
	\end{bmatrix}_{2 \times 1}
\end{equation}
$$

In [42]:
# Let's redefine A and C to demonstrate matrix-vector multiplication.
# Note: here A is 3 x 2 and C is a 1-D array of shape (2,), so C.T equals C.
A = np.arange(6).reshape((3,2))
C = np.array([-1,1])

print(A.shape)
print(C.shape)
print(np.dot(A, C.T))

(3, 2)
(2,)
[1 1 1]
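
The cell above uses a one-dimensional `C` of shape `(2,)`. To mirror the $2 \times 2$ by $2 \times 1$ formula more literally, the following sketch uses an explicit column vector and checks the result against the term-by-term formula. The names `M` and `c` and their values are arbitrary choices made for illustration.

```python
import numpy as np

M = np.array([[1, 2],
              [3, 4]])       # 2 x 2 matrix
c = np.array([[5],
              [6]])          # 2 x 1 column vector

product = M.dot(c)           # result is 2 x 1

# The same result written out term by term, following the formula above.
by_hand = np.array([[M[0, 0] * c[0, 0] + M[0, 1] * c[1, 0]],
                    [M[1, 0] * c[0, 0] + M[1, 1] * c[1, 0]]])

print(product.shape)
print(np.array_equal(product, by_hand))
```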


#### Multiplying two matrices

Next, consider a matrix A of dimension $3 \times 2$ and a matrix C of dimension $2 \times 3$:

$$
\begin{equation}
	A_{3 \times 2}=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}_{3 \times 2}
	,
	C_{2 \times 3} = 
	\begin{bmatrix}
		  c_{11} & c_{12} & c_{13} \\
		  c_{21} & c_{22} & c_{23} \\
	\end{bmatrix}_{2 \times 3}
	\end{equation}
$$

Here, A $\times$ C is

$$
\begin{align}
	A_{3 \times 2} \times C_{2 \times 3}=&
	\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}_{3 \times 2}
	\times
	\begin{bmatrix}
	  c_{11} & c_{12} & c_{13} \\
	  c_{21} & c_{22} & c_{23} 
	\end{bmatrix}_{2 \times 3} \\
	=&
	\begin{bmatrix}
	  a_{11} c_{11}+a_{12} c_{21} & a_{11} c_{12}+a_{12} c_{22} & a_{11} c_{13}+a_{12} c_{23} \\
	  a_{21} c_{11}+a_{22} c_{21} & a_{21} c_{12}+a_{22} c_{22} & a_{21} c_{13}+a_{22} c_{23} \\
	  a_{31} c_{11}+a_{32} c_{21} & a_{31} c_{12}+a_{32} c_{22} & a_{31} c_{13}+a_{32} c_{23}
	\end{bmatrix}_{3 \times 3}	
\end{align}
$$

So in general, for the product $X_{r_x \times c_x} \times Y_{r_y \times c_y}$, there are two important things to remember:

* For conformability in matrix multiplication, $c_x=r_y$: the number of columns in the first operand must equal the number of rows in the second operand.
* The result will be of dimension $r_x \times c_y$: it has as many rows as the first operand and as many columns as the second operand.
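
The two rules above can be checked on shapes alone. In this sketch the matrices `X` and `Y` are filled with ones purely for illustration; only their shapes matter.

```python
import numpy as np

X = np.ones((3, 2))   # r_x = 3, c_x = 2
Y = np.ones((2, 4))   # r_y = 2, c_y = 4

# c_x == r_y, so the product is defined and has shape (r_x, c_y).
print(X.dot(Y).shape)

# Reversing the order pairs 4 columns with 3 rows, which is not conformable.
try:
    Y.dot(X)
except ValueError as err:
    print("not conformable:", err)
```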

Given these facts, you should convince yourself that matrix multiplication is not generally commutative: the relationship $X \times Y = Y \times X$ does **not** hold in all cases.
For this reason, we will always be very explicit about whether we are pre-multiplying $Y$ by $X$ ($X \times Y$) or post-multiplying $Y$ by $X$ ($Y \times X$).
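
Even when both products are defined and have the same shape, the order of the operands matters. A minimal sketch with two arbitrary $2 \times 2$ matrices (values chosen by us for illustration):

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])
Y = np.array([[0, 1],
              [1, 0]])

XY = X.dot(Y)   # multiplying by Y on the right swaps the columns of X
YX = Y.dot(X)   # multiplying by Y on the left swaps the rows of X

print(XY)
print(YX)
print(np.array_equal(XY, YX))
```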

For more information on this topic, see the Wikipedia article on matrix multiplication:
http://en.wikipedia.org/wiki/Matrix_multiplication.

In [43]:
# Let's redefine A and C to demonstrate matrix multiplication.
# Here C is a random 2 x 2 matrix, so the product A.dot(C) is 3 x 2.
A = np.arange(6).reshape((3,2))
C = np.random.randn(2,2)

print(A.shape)
print(C.shape)

(3, 2)
(2, 2)


We will use the numpy dot operator to perform these multiplications.  You can use it in two ways that yield the same result:

In [44]:
print(A.dot(C))
print(np.dot(A, C))

[[ 1.13518556  0.98445237]
 [ 4.76316301  5.16737804]
 [ 8.39114046  9.35030372]]
[[ 1.13518556  0.98445237]
 [ 4.76316301  5.16737804]
 [ 8.39114046  9.35030372]]


## Orthonormal matrices and change of Basis

## Matrix Division
The term matrix division is actually a misnomer.  To divide in a matrix algebra world we first need to invert the matrix.  It is useful to consider the analogous case in a scalar world.  Suppose we want to divide $f$ by $g$.  We could do this in two different ways:
$$
\begin{equation}
	\frac{f}{g}=f \times g^{-1}.
\end{equation}
$$
In a scalar setting, these are equivalent ways of solving the division problem.  The second one requires two steps: first we invert $g$ and then we multiply $f$ times $g^{-1}$.  In a matrix world, we need to think about this second approach.  First we have to invert the matrix, and then we need to pre- or post-multiply by the inverse, depending on the exact situation we encounter (this is intended to be vague for now).
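
As one hedged illustration of what "dividing" by a matrix can look like, suppose we want the $x$ that satisfies $G x = f$. Pre-multiplying both sides by $G^{-1}$ gives $x = G^{-1} f$. The sketch below uses `np.linalg.inv` for the inverse and compares with `np.linalg.solve`, which solves the same system without forming the inverse explicitly. The matrix `G` and vector `f` are arbitrary values chosen for illustration.

```python
import numpy as np

G = np.array([[2.0, 1.0],
              [1.0, 3.0]])
f = np.array([[3.0],
              [5.0]])

# "Divide" f by G: pre-multiply f by the inverse of G.
x = np.linalg.inv(G).dot(f)

# np.linalg.solve solves G x = f directly, without forming the inverse.
x_solve = np.linalg.solve(G, f)

print(x)
print(np.allclose(x, x_solve))
print(np.allclose(G.dot(x), f))
```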

### Inverting a Matrix

As before, consider the square $2 \times 2$ matrix $A$=$\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{smallmatrix} \bigr)$.  Let the inverse of matrix A (denoted as $A^{-1}$) be 

$$
\begin{equation}
	A^{-1}=\begin{bmatrix}
             a_{11} & a_{12} \\
		     a_{21} & a_{22} 
           \end{bmatrix}^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}}	\begin{bmatrix}
		             a_{22} & -a_{12} \\
				     -a_{21} & a_{11} 
		           \end{bmatrix}
\end{equation}
$$

The inverted matrix $A^{-1}$ has a useful property:
$$
\begin{equation}
	A \times A^{-1}=A^{-1} \times A=I
\end{equation}
$$
where I, the identity matrix (the matrix equivalent of the scalar value 1), is
$$
\begin{equation}
	I_{2 \times 2}=\begin{bmatrix}
             1 & 0 \\
		     0 & 1 
           \end{bmatrix}
\end{equation}
$$
furthermore, $A \times I = A$ and $I \times A = A$.
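
The closed-form $2 \times 2$ inverse and the identity property can be checked numerically. In the sketch below, `inverse_2x2` is a hypothetical helper written for this illustration (it is not a numpy function), and the matrix `M` is an arbitrary example.

```python
import numpy as np

def inverse_2x2(M):
    """Invert a 2 x 2 matrix with the closed-form formula above."""
    a11, a12 = M[0, 0], M[0, 1]
    a21, a22 = M[1, 0], M[1, 1]
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular; the inverse does not exist")
    return np.array([[a22, -a12],
                     [-a21, a11]]) / det

M = np.array([[4.0, 7.0],
              [2.0, 6.0]])
M_inv = inverse_2x2(M)

print(M_inv)
# Agrees with numpy's general-purpose inverse.
print(np.allclose(M_inv, np.linalg.inv(M)))
# M times its inverse gives the identity, in either order.
print(np.allclose(M.dot(M_inv), np.eye(2)))
print(np.allclose(M_inv.dot(M), np.eye(2)))
```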

An important feature of matrix inversion is that it is undefined if (in the $2 \times 2$ case) $a_{11}a_{22}-a_{12}a_{21}=0$: if this quantity is equal to zero, the inverse of A does not exist.  If it is very close to zero, an inverse may exist, but $A$ is then poorly conditioned, meaning that the computed $A^{-1}$ is prone to rounding error and is likely not well identified computationally.  The term $a_{11}a_{22}-a_{12}a_{21}$ is the determinant of matrix A.  For square matrices of any size, a determinant equal to zero indicates that you have a problem with your data matrix (some columns are linearly dependent on other columns).  The inverse of matrix A exists if A is square and of full rank (i.e. no column of A is a linear combination of the other columns of A).
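
A short sketch of these failure modes, using `np.linalg.det`, `np.linalg.matrix_rank`, `np.linalg.inv` and `np.linalg.cond`. The matrices `S` (exactly singular) and `N` (nearly singular) are constructed by us for illustration.

```python
import numpy as np

# The second column is twice the first, so the columns are linearly dependent.
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])

print(np.linalg.det(S))          # determinant is zero (up to rounding)
print(np.linalg.matrix_rank(S))  # rank is 1, not the full rank of 2

try:
    np.linalg.inv(S)
except np.linalg.LinAlgError as err:
    print("cannot invert:", err)

# A nearly singular matrix can be inverted, but its condition number is huge.
N = np.array([[1.0, 2.0],
              [2.0, 4.0 + 1e-10]])
print(np.linalg.cond(N))
```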

For more information on inverting matrices, see, for example,
http://en.wikipedia.org/wiki/Matrix_inversion.

In [45]:
# note, we need a square matrix (# rows = # cols), use C:
C_inverse = np.linalg.inv(C)
print(C_inverse)

[[-1.67306383  1.88134971]
 [ 1.92923291 -1.15361704]]


Check that $C\times C^{-1} = I$:

In [46]:
print(C.dot(C_inverse))
print("Is identical to:")
print(C_inverse.dot(C))

[[ 1.  0.]
 [ 0.  1.]]
Is identical to:
[[  1.00000000e+00  -4.44089210e-16]
 [  2.22044605e-16   1.00000000e+00]]
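
The tiny off-diagonal values in the second product are floating-point rounding error, so the two products are identical only up to that error. A tolerance-based comparison such as `np.allclose` is a more robust check than exact equality. The matrix `P` below is an arbitrary invertible example, not the random `C` from the cells above.

```python
import numpy as np

P = np.array([[2.0, 1.0],
              [1.0, 3.0]])   # an arbitrary invertible matrix for illustration
P_inv = np.linalg.inv(P)

identity = np.eye(2)

# Exact equality can fail because of rounding; allclose uses a tolerance.
print(np.allclose(P.dot(P_inv), identity))
print(np.allclose(P_inv.dot(P), identity))
```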


## Transposing a Matrix

At times it is useful to transpose a matrix for conformability; that is, in order to multiply (or "divide") matrices, we may need to switch their row and column dimensions.  Consider the matrix
$$
\begin{equation}
	A_{3 \times 2}=\begin{bmatrix}
	  a_{11} & a_{12} \\
	  a_{21} & a_{22} \\
	  a_{31} & a_{32} 	
	\end{bmatrix}_{3 \times 2}	
\end{equation}
$$
The transpose of A (denoted as $A^{\prime}$, or equivalently $A^{\top}$ as used earlier) is
$$
\begin{equation}
   A^{\prime}=\begin{bmatrix}
	  a_{11} & a_{21} & a_{31} \\
	  a_{12} & a_{22} & a_{32} \\
	\end{bmatrix}_{2 \times 3}
\end{equation}
$$
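
In numpy the transpose is available as the `.T` attribute. The following sketch, with an arbitrary $3 \times 2$ matrix, shows how transposing restores conformability: $A$ cannot be multiplied by itself, but both $A^{\prime} A$ and $A A^{\prime}$ are defined.

```python
import numpy as np

A = np.arange(6).reshape(3, 2)   # 3 x 2
A_t = A.T                        # 2 x 3

print(A_t.shape)
print(np.array_equal(A_t.T, A))  # transposing twice returns the original

# A (3 x 2) cannot be multiplied by itself, but A' A and A A' are both defined.
print(A_t.dot(A).shape)   # (2, 3) x (3, 2) -> (2, 2)
print(A.dot(A_t).shape)   # (3, 2) x (2, 3) -> (3, 3)
```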