# Linear Algebra and Python Basics

In this chapter, I cover the linear algebra basics you will need to program effectively in Python for our purposes.  We will be doing very basic linear algebra that by no means covers the full breadth of the topic.  Why linear algebra?  It allows us to express relatively complex linear expressions in a very compact way.

Being comfortable with the rules for scalar and matrix addition, subtraction, multiplication, and division (which, for matrices, means inversion) is important for our class.

Before we can implement any of these ideas in code, we need to talk a bit about Python and how data is stored.

## Python Primer

There are numerous ways to run Python code.  I will show you two, both easily accessible after installing Anaconda:

1. The Spyder integrated development environment.  The major advantage of Spyder is that it provides a graphical way to view matrices, vectors, and other objects you want to check as you work on a problem.  It also has the most intuitive way of debugging code.

    Spyder looks like this:
    ![](http://rlhick.people.wm.edu/site_pics/Spyder_1.png)
    Code can be run by clicking the green arrow (runs the entire file) or by selecting a subset of lines and running just that selection.
    In Windows or Mac, you can launch Spyder by looking for its icon in the newly installed Anaconda program folder.
    
2. The IPython Notebook (now called Jupyter).  The major advantage of this approach is that you use your web browser for all of your Python work and you can mix code, videos, notes, graphics from the web, and mathematical notation to tell the whole story of your Python project. In fact, I am using the IPython Notebook to write these notes.

    The IPython Notebook looks like this:
    ![](http://rlhick.people.wm.edu/site_pics/Jupyter_1.png)
    In Windows or Mac, you can launch the IPython Notebook by looking in the newly installed Anaconda program folder.

In my workflow, I usually use only the IPython Notebook, but I switch to Spyder for coding problems where I need its easy debugging capabilities.  We will mostly be using the IPython Notebook interface (web browser) in this class.

### Loading libraries

The Python universe has a huge number of libraries that extend the capabilities of Python. Nearly all of these are open source, unlike packages such as Stata or MATLAB, where some key libraries are proprietary (and can cost a lot of money).  In much of my code, you will see this at the top:

In [1]:
%matplotlib inline
import sympy
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sbn
from scipy import *

This code sets up the IPython Notebook environment (lines beginning with `%`) and loads several libraries and functions.  The core scientific stack in Python consists of a number of free libraries.  The ones I have loaded above include:

1. sympy: provides for symbolic computation (solving algebra problems)
2. numpy: provides for linear algebra computations
3. matplotlib.pyplot: provides for the ability to graph functions and draw figures
4. scipy: scientific python provides a plethora of capabilities
5. seaborn: makes matplotlib figures even prettier (another library like this is called bokeh).  This is entirely optional and is purely for eye candy.

## Matrix Division

The term matrix division is actually a misnomer.  To divide in a matrix algebra world, we first need to invert the matrix.  It is useful to consider the analogous case in a scalar world.  Suppose we want to divide $f$ by $g$.  We could do this in two different ways:
$$
\begin{equation}
	\frac{f}{g}=f \times g^{-1}.
\end{equation}
$$
In a scalar setting, these are equivalent ways of solving the division problem.  The second one requires two steps: first we invert $g$ and then we multiply $f$ by $g^{-1}$.  In a matrix world, we need to think about this second approach.  First we have to invert the matrix $g$ and then we will need to pre- or post-multiply depending on the exact situation we encounter (this is intended to be vague for now).
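The distinction can be seen in a minimal sketch. The numbers and the matrices `A` and `B` below are illustrative assumptions, not values from these notes. In the scalar case both routes give the same answer, while for matrices pre-multiplying and post-multiplying by the inverse generally give different results.

```python
import numpy as np

# Scalar case: dividing by g equals multiplying by g^{-1}
f, g = 6.0, 4.0
scalar_div = f / g
scalar_inv = f * g**-1

# Matrix case: illustrative 2x2 matrices (assumed values)
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])

A_inv = np.linalg.inv(A)
pre = A_inv @ B    # pre-multiply B by the inverse of A
post = B @ A_inv   # post-multiply B by the inverse of A

print(scalar_div, scalar_inv)    # the two scalar routes agree
print(np.allclose(pre, post))    # the two matrix routes do not
```

Which of the two matrix products is the right one depends on the equation being solved, which is why the order has to be chosen case by case.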

### Inverting a Matrix

As before, consider the square $2 \times 2$ matrix $A=\bigl( \begin{smallmatrix} a_{11} & a_{12} \\ a_{21} & a_{22}\end{smallmatrix} \bigr)$.  The inverse of matrix $A$ (denoted as $A^{-1}$) is

$$
\begin{equation}
	A^{-1}=\begin{bmatrix}
             a_{11} & a_{12} \\
		     a_{21} & a_{22} 
           \end{bmatrix}^{-1}=\frac{1}{a_{11}a_{22}-a_{12}a_{21}}	\begin{bmatrix}
		             a_{22} & -a_{12} \\
				     -a_{21} & a_{11} 
		           \end{bmatrix}
\end{equation}
$$
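As a rough check of this formula, the sketch below codes it directly and compares the result with `np.linalg.inv`. The helper name `inverse_2x2` and the example matrix are hypothetical, chosen only for illustration.

```python
import numpy as np

def inverse_2x2(A):
    """Invert a 2x2 matrix with the closed-form formula (hypothetical helper)."""
    a11, a12 = A[0]
    a21, a22 = A[1]
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular; no inverse exists")
    # Swap the diagonal, negate the off-diagonal, divide by the determinant
    return np.array([[a22, -a12],
                     [-a21, a11]]) / det

# Illustrative matrix (assumed values); its determinant is 4*6 - 7*2 = 10
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = inverse_2x2(A)

print(A_inv)
print(np.allclose(A_inv, np.linalg.inv(A)))
```

In practice you would call `np.linalg.inv` directly; the hand-written version is only meant to mirror the formula.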

The inverted matrix $A^{-1}$ has a useful property:
$$
\begin{equation}
	A \times A^{-1}=A^{-1} \times A=I
\end{equation}
$$
where $I$, the identity matrix (the matrix equivalent of the scalar value 1), is
$$
\begin{equation}
	I_{2 \times 2}=\begin{bmatrix}
             1 & 0 \\
		     0 & 1 
           \end{bmatrix}
\end{equation}
$$
Furthermore, $A \times I = A$ and $I \times A = A$.
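These properties are easy to confirm numerically. The following is a small sketch with an assumed example matrix. It uses `np.allclose` for the products involving the inverse, since floating-point arithmetic may leave tiny rounding errors.

```python
import numpy as np

# Illustrative invertible matrix (assumed values)
A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
I = np.eye(2)
A_inv = np.linalg.inv(A)

# A times its inverse (in either order) gives the identity, up to rounding
print(np.allclose(A @ A_inv, I))
print(np.allclose(A_inv @ A, I))

# Multiplying by the identity leaves A unchanged
print(np.array_equal(A @ I, A), np.array_equal(I @ A, A))
```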

An important feature of matrix inversion is that it is undefined if (in the $2 \times 2$ case) $a_{11}a_{22}-a_{12}a_{21}=0$.  If this term is equal to zero, the inverse of $A$ does not exist.  If this term is very close to zero, an inverse may exist but $A$ may be poorly conditioned, meaning the computed $A^{-1}$ is prone to rounding error and is likely not well identified computationally.  The term $a_{11}a_{22}-a_{12}a_{21}$ is the determinant of matrix $A$.  For square matrices larger than $2 \times 2$, a determinant equal to zero likewise indicates that you have a problem with your data matrix (some columns are linearly dependent on other columns).  The inverse of matrix $A$ exists if $A$ is square and of full rank (i.e., no column of $A$ is a linear combination of the other columns).
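A short sketch of what this looks like in numpy follows. The matrices `S` and `N` are made-up examples, and the variable names are illustrative. A matrix with linearly dependent columns has a zero determinant and less than full rank, and inverting it fails, while a nearly singular matrix can be inverted but has a very large condition number.

```python
import numpy as np

# Illustrative singular matrix: the second column is 2x the first
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
det_S = np.linalg.det(S)            # a11*a22 - a12*a21 = 1*4 - 2*2
rank_S = np.linalg.matrix_rank(S)   # fewer than 2 means not full rank

try:
    np.linalg.inv(S)
    inverse_exists = True
except np.linalg.LinAlgError:
    inverse_exists = False

# Nearly singular matrix: an inverse exists, but the condition number is huge
N = np.array([[1.0, 2.0],
              [2.0, 4.000001]])
cond_N = np.linalg.cond(N)

print(det_S, rank_S, inverse_exists)
print(cond_N)
```

The condition number from `np.linalg.cond` is usually a more reliable warning sign than the size of the determinant, because the determinant also changes with the scale of the data.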

For more information on inverting matrices, see, for example, http://en.wikipedia.org/wiki/Matrix_inversion.

In [2]:
cd /Users/marie/Desktop

/Users/marie/Desktop


In [3]:
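import csv
import numpy as np

# Read the CSV into a 2-D float array; the with-block closes the file afterwards
with open("Workbook1.csv", "r") as f:
    result = np.array(list(csv.reader(f, delimiter=','))).astype('float')

In [4]:
# Identity matrix with the same dimensions as the 2952 x 2952 data matrix
identity = np.matrix(np.identity(2952), copy=False)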

In [5]:
iresult=identity-result

In [6]:
# note: np.linalg.inv needs a square matrix (# rows = # cols); iresult is square:
result_inverse = np.linalg.inv(iresult)
print(result_inverse)

[[  1.00030722e+00   3.96082112e-05   6.46519049e-05 ...,   2.54732755e-03
    6.88271215e-03   0.00000000e+00]
 [  3.17604228e-03   1.00048015e+00   1.15475094e-03 ...,   5.94464010e-02
    1.44597937e-01   0.00000000e+00]
 [  1.67850476e-03   2.61982680e-04   1.00069422e+00 ...,   3.75290714e-02
    8.89688100e-02   0.00000000e+00]
 ..., 
 [  1.00444182e-07   5.46776781e-09   3.68252201e-08 ...,   9.98996608e-01
   -1.71584915e-03  -0.00000000e+00]
 [ -6.06712556e-09  -1.65323615e-09  -2.79102587e-09 ...,   4.77725876e-05
    1.00011645e+00  -0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00 ...,   0.00000000e+00
    0.00000000e+00   1.00000000e+00]]


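Check that the matrix times its inverse gives the identity, $C\times C^{-1} = I$ (here $C$ is `iresult` and $C^{-1}$ is `result_inverse`):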

In [7]:
print(iresult.dot(result_inverse))
print("Is identical to:")
print(result_inverse.dot(iresult))

[[  1.00000000e+00  -2.26041068e-21   2.88479706e-21 ...,   5.05634899e-20
    6.33606383e-20   0.00000000e+00]
 [  6.94429239e-20   1.00000000e+00   2.30770557e-19 ...,   1.23309998e-17
    1.54056235e-17   0.00000000e+00]
 [  6.20368916e-20   5.94380969e-21   1.00000000e+00 ...,  -2.16840434e-19
    1.86482774e-17   0.00000000e+00]
 ..., 
 [  6.61744490e-24  -8.89219158e-24  -3.30872245e-24 ...,   1.00000000e+00
   -5.65411433e-17   0.00000000e+00]
 [ -3.30872245e-24  -5.99705944e-24   1.24077092e-24 ...,  -1.21125711e-19
    1.00000000e+00   0.00000000e+00]
 [  0.00000000e+00   0.00000000e+00   0.00000000e+00 ...,   0.00000000e+00
    0.00000000e+00   1.00000000e+00]]
Is identical to:
[[  1.00000000e+00  -3.90304940e-17  -2.67622457e-17 ...,   5.20982996e-13
   -3.96345110e-12   0.00000000e+00]
 [ -3.53943252e-16   1.00000000e+00  -1.55533283e-16 ...,   8.34239622e-12
   -3.24411331e-11   0.00000000e+00]
 [ -1.98093955e-16  -1.58607394e-16   1.00000000e+00 ...,   3.80962276e-12
   -
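
The two printed products agree with the identity matrix only up to floating-point rounding, so they are not identical entry by entry. The sketch below compares them with a tolerance instead of by eye. It uses a small made-up matrix `M` in place of the data loaded from `Workbook1.csv`, and the names `M`, `C`, `left`, and `right` are illustrative.

```python
import numpy as np

# Small stand-in for `result` (assumed values; the real data comes from Workbook1.csv)
M = np.array([[0.1, 0.2, 0.0],
              [0.3, 0.1, 0.2],
              [0.0, 0.1, 0.1]])
C = np.eye(3) - M
C_inv = np.linalg.inv(C)

left = C @ C_inv
right = C_inv @ C

# The products may differ in the last few digits, so compare with a tolerance
print(np.allclose(left, np.eye(3)))
print(np.allclose(right, np.eye(3)))
print(np.abs(left - right).max())   # largest entrywise gap between the two products
```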

In [None]:
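import csv

# Python 3: open in text mode with newline='' so the csv module handles line endings
with open('BLCI.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    # mylist is assumed to be defined earlier (one row of values to write)
    wr.writerow(mylist)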