# Lists vs. Arrays

In [2]:
import numpy as np

In [3]:
L = [1,2,3]                 # creating python list
A = np.array([1,2,3])       # creating numpy array

In [4]:
for e in L:
    print(e)

1
2
3


In [5]:
for e in A:
    print(e)

1
2
3


In [6]:
L.append(4)        # adding element to list
L = L + [5]        # another way to add element to list
L

[1, 2, 3, 4, 5]

In [7]:
A + A              # element-wise addition of two arrays (vector addition)

array([2, 4, 6])

In [8]:
A * 2              # multiplying vector times a scalar

array([2, 4, 6])

In [9]:
A**2               # squaring a vector

array([1, 4, 9])

Something to keep in mind about numpy is that most functions act element-wise: the function is applied separately to each element of the vector or matrix.

In [10]:
np.sqrt(A)         # taking square root of all elements in vector 

array([1.        , 1.41421356, 1.73205081])

In [11]:
np.log(A)          # element wise log

array([0.        , 0.69314718, 1.09861229])

In [12]:
np.exp(A)          # element wise exponential 

array([ 2.71828183,  7.3890561 , 20.08553692])

With numpy, an array behaves like a vector, a mathematical object, rather than like a generic list.
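To make the contrast concrete, here is a minimal sketch that applies the same operators to a plain list and to an array. The variable names (`list_sum`, `array_sum`, and so on) are illustrative only and do not appear elsewhere in this notebook.

```python
import numpy as np

L = [1, 2, 3]
A = np.array([1, 2, 3])

list_sum = L + L        # list concatenation, not addition
list_twice = L * 2      # list repetition, not scaling
array_sum = A + A       # element-wise addition
array_twice = A * 2     # element-wise scaling

print(list_sum)         # [1, 2, 3, 1, 2, 3]
print(list_twice)       # [1, 2, 3, 1, 2, 3]
print(array_sum)        # [2 4 6]
print(array_twice)      # [2 4 6]
```

The same two operators give very different results, which is why the distinction between lists and arrays matters.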

---
<br></br>
# Dot Products

Recall that there are two equivalent definitions of the dot product.

1: The first is the summation of the element wise multiplication of the two vectors:

$$a \cdot b = a^Tb = \sum_{d=1}^Da_db_d$$

Here $d$ is being used to index each component. Notice that the convention $a^Tb$ implies that the vectors are column vectors, which means that the result is a (1 x 1), aka a scalar. 

2: The second is the magnitude of $a$, times the magnitude of $b$, times the cosine of the angle between $a$ and $b$:

$$a \cdot b = |a||b|\cos\theta_{ab}$$

This method is not very convenient unless we know each of the things on the right hand side to begin with. It would generally be used to find the angle itself. 

### Definition 1
Let's look at this in code.

In [13]:
a = np.array([1,2])
a

array([1, 2])

In [14]:
b = np.array([2,1])
b

array([2, 1])

If we wanted to use the direct definition of the dot product, we would want to loop through both arrays simultaneously, multiply each corresponding element together, and add it to the final sum. 

In [15]:
dot = 0
for e, f in zip(a,b):
    dot += e*f
dot                      # result is 4 as expected

4

Another interesting operation that you can do with numpy arrays is multiply two arrays together. We have already seen how to multiply a vector by a scalar. 

In [16]:
a * b             # element wise multiplication of a and b 

array([2, 2])

However, element-wise multiplication requires the two arrays to have compatible shapes; for two vectors like these, that means the same length. Now, if we summed the result of `a * b` we would end up with the dot product.

In [17]:
np.sum(a * b)           # this is the element wise multiplication of a and b, summed

4

An interesting thing about numpy is that the sum function is an instance method of the numpy array itself. So we could also write the above as:

In [18]:
(a * b).sum()

4

Now, while both of the above methods yield the correct answer, there is a more convenient way to calculate the dot product. Numpy comes packaged with a dot product function.

In [19]:
np.dot(a,b)

4

Like the `sum` function, the `dot` function is also an instance method of the numpy array, so we can call it on the object itself. 

In [20]:
a.dot(b)

4

This is also equivalent to:

In [21]:
b.dot(a)

4

### Definition 2
Let's now look at the alternative definition of the dot product, and use it to calculate the angle between $a$ and $b$. For this we need to figure out how to calculate the length of a vector. We can do this by taking the square root of the sum of each element squared. In other words, we use the Pythagorean theorem.

In [22]:
amag = np.sqrt( (a * a).sum())
amag

2.23606797749979

Numpy actually has a function to do all of this work for us, since it is such a common operation. It is part of the linalg module in numpy, which also contains many other linear algebra functions. 

In [23]:
amag = np.linalg.norm(a)
amag

2.23606797749979

Now with this in hand, we are ready to calculate the angle. For clarity, the angle is defined as:

$$\cos\theta_{ab} = \frac{a \cdot b}{|a||b|}$$

In [24]:
cosangle = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
cosangle

0.7999999999999998

So the cosine of the angle is 0.8, and the actual angle is the arc cosine of 0.8:

In [25]:
angle = np.arccos(cosangle)
angle

0.6435011087932847

Note that `np.arccos` returns the angle in radians.
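If degrees are more convenient, the result can be converted with `np.degrees`. The sketch below repeats the calculation for the same `a` and `b` and also checks that the two definitions of the dot product agree; the names `angle_rad`, `angle_deg`, and `from_definition_2` are illustrative.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([2, 1])

cosangle = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
angle_rad = np.arccos(cosangle)       # angle in radians
angle_deg = np.degrees(angle_rad)     # same angle in degrees, roughly 36.87

# Definition 2 should reproduce the value from definition 1 (a.dot(b) == 4)
from_definition_2 = np.linalg.norm(a) * np.linalg.norm(b) * np.cos(angle_rad)

print(angle_deg)
print(np.isclose(from_definition_2, a.dot(b)))   # True
```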

---
<br></br>
# Vectors and Matrices
A numpy array has already been shown to be like a vector: we can add them, multiply them by a scalar, and perform element wise operations like `log` or `sqrt`. So what is a matrix then? Think of it as a two dimensional array.

In [26]:
M = np.array([ [1,2], [3,4] ])        # creating a matrix. 1st index is row, 2nd index is col
M

array([[1, 2],
       [3, 4]])

In [27]:
M[0][0]                      # one way of accessing values from matrix

1

In [28]:
M[0,0]                       # another shorthand way of accessing value in matrix

1
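Beyond single elements, the same indexing syntax can pull out whole rows and columns. The following is a small sketch using the matrix `M` from above; `first_row` and `first_col` are illustrative names.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])

first_row = M[0]        # the whole first row
first_col = M[:, 0]     # the whole first column (every row, column 0)

print(first_row)        # [1 2]
print(first_col)        # [1 3]

# M[0][0] first selects row 0 and then element 0 of that row,
# while M[0, 0] selects the element in one step; the value is the same.
print(M[0][0] == M[0, 0])   # True
```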

There is an actual data type in numpy called matrix as well.

In [29]:
M2 = np.matrix([ [1,2], [3,4] ])
M2

matrix([[1, 2],
        [3, 4]])

This works somewhat similarly to a numpy array, but it is not exactly the same. Most of the time we just use numpy arrays, and in fact the official documentation actually recommends not using numpy matrix. If you see a matrix, it is a good idea to convert it into an array:

In [30]:
M3 = np.array(M2)
M3

array([[1, 2],
       [3, 4]])

Note that even though this is now an array, we still have convenient matrix operations. For example if we wanted to find the transpose of M:

In [31]:
M

array([[1, 2],
       [3, 4]])

In [32]:
M.T

array([[1, 3],
       [2, 4]])
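One concrete way in which `np.matrix` differs from an array is the meaning of the `*` operator. The sketch below is an illustration rather than an exhaustive comparison; the names `array_product` and `matrix_product` are made up for this example.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])
M2 = np.matrix([[1, 2], [3, 4]])

array_product = M * M        # element-wise product for arrays
matrix_product = M2 * M2     # true matrix product for np.matrix

print(array_product)
# [[ 1  4]
#  [ 9 16]]
print(matrix_product)
# [[ 7 10]
#  [15 22]]

# For arrays, the matrix product is obtained with dot instead
print(np.array_equal(M.dot(M), np.array(matrix_product)))   # True
```

This difference is one reason to convert matrices to arrays and stick to a single convention.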

To summarize, we have shown that a matrix is really just a 2-dimensional numpy array, and a vector is a 1-dimensional numpy array. So a matrix is really like a 2 dimensional vector. The more general way to think about this is that a matrix is a 2-dimensional mathematical object that contains numbers, and a vector is a 1-dimensional mathematical object that contains numbers. 

Sometimes you may see vectors represented as a 2-d object. For example, in a math textbook a column vector may be described as (3 x 1), and a row vector (1 x 3). Sometimes we may represent them like this in numpy. 
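As a brief sketch of what that looks like, a 1-d array can be reshaped into an explicit column or row. The names `v`, `col`, and `row` are illustrative.

```python
import numpy as np

v = np.array([1, 2, 3])      # 1-d vector, shape (3,)
col = v.reshape(3, 1)        # column vector, shape (3, 1)
row = v.reshape(1, 3)        # row vector, shape (1, 3)

print(v.shape)      # (3,)
print(col.shape)    # (3, 1)
print(row.shape)    # (1, 3)
```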

---
<br></br>
# Generating Matrices to Work With
Sometimes we just need arrays to try stuff on, like in this course. One way to do this is to use `np.array` and pass in a list:

In [33]:
np.array([1,2,3])

array([1, 2, 3])

However, this is inconvenient since each element needs to be typed in manually. What if we wanted arrays of different sizes?

Let's start by creating a vector of zeros.

In [34]:
Z = np.zeros(10)
Z

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

We can also create a 10 x 10 matrix of all zeros.

In [35]:
Z = np.zeros((10, 10))
Z

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

Notice that the function still takes in 1 input, a tuple containing each dimension. 

There is an equivalent function that creates an array of all ones.

In [36]:
O = np.ones((10, 10))
O

array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.]])
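The shape tuple is read as (rows, columns), which is easier to see with a non-square example. This is a small sketch using a hypothetical 2 x 3 shape rather than the 10 x 10 one above.

```python
import numpy as np

Z = np.zeros((2, 3))     # 2 rows, 3 columns of zeros
O = np.ones((2, 3))      # 2 rows, 3 columns of ones

print(Z.shape)           # (2, 3)
print(O.sum())           # 6.0, one for each of the 2 * 3 entries
```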

What if we wanted random numbers? We could use `np.random.random`.

In [37]:
R = np.random.random((10,10))
R

array([[0.96343737, 0.99854925, 0.69084484, 0.91755684, 0.48291783,
        0.02555759, 0.17516831, 0.3634088 , 0.90089222, 0.07023349],
       [0.6799274 , 0.63480312, 0.73337487, 0.86592241, 0.20406414,
        0.02725213, 0.77207388, 0.92917478, 0.62360092, 0.98288039],
       [0.75986937, 0.34825806, 0.49157663, 0.60938734, 0.6726383 ,
        0.61759291, 0.80473233, 0.83612947, 0.40102904, 0.35455528],
       [0.57768437, 0.24486975, 0.68563089, 0.31976352, 0.39188633,
        0.51674485, 0.02166198, 0.04836483, 0.82104172, 0.2798675 ],
       [0.02509366, 0.49916432, 0.84900719, 0.35300002, 0.79603557,
        0.98003246, 0.30070069, 0.31738701, 0.58189558, 0.41558848],
       [0.90817507, 0.25390452, 0.5802353 , 0.33638681, 0.74581657,
        0.74122223, 0.42611729, 0.70217425, 0.79854367, 0.64464519],
       [0.60290679, 0.80074689, 0.77560713, 0.48259351, 0.99744191,
        0.54100562, 0.61654673, 0.67541464, 0.6338071 , 0.93510352],
       [0.36316745, 0.23639003, 0.6218475

One thing that we can quickly see is that all of these values are greater than 0 and less than 1. Whenever we talk about random numbers, we should be interested in the probability distribution that the random numbers came from. This particular function gives us uniformly distributed numbers in the half-open interval [0, 1). What if we wanted Gaussian distributed numbers? Numpy has a function for that as well.

In [38]:
# G = np.random.randn((10, 10))          this will not work, since randn does not take a tuple

G = np.random.randn(10,10)
G

array([[-6.84528690e-01,  3.74425281e-01,  3.64887135e-01,
         4.65387038e-01,  1.22091445e-01,  1.11746439e+00,
        -1.28299114e+00,  3.56852937e-01, -1.01826519e-01,
        -1.99356953e-01],
       [ 5.96693450e-01,  4.52340082e-01,  9.24243609e-01,
         9.05454642e-01,  8.99825701e-01, -1.40152801e+00,
        -9.77843735e-01,  5.63217497e-01,  9.20192612e-01,
        -1.39819350e+00],
       [-5.27452306e-01,  1.34362200e+00, -2.58970516e+00,
         2.32503800e+00, -3.13726319e-01, -2.00401280e+00,
         2.22468552e+00,  4.84622765e-01, -5.37019815e-01,
         8.16361677e-01],
       [-1.11922312e+00, -6.17240801e-02, -1.18420771e+00,
        -2.25725488e-02,  1.36392258e+00, -8.40137167e-01,
         2.14541699e+00, -1.27132735e-01,  1.70068469e+00,
        -1.24107222e-01],
       [-1.27065763e+00,  1.04731278e+00, -1.12430009e+00,
        -1.49047587e+00,  7.56935650e-01, -1.50156030e-01,
         9.08003130e-01,  1.09982158e+00, -4.65886919e-01,
         7.

Numpy arrays also have convenient ways for us to calculate statistics of matrices.

In [39]:
G.mean()      # gives us the mean

0.039233961362970755

In [40]:
G.var()       # gives us the variance

1.1988098737697799
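The sketch below pulls the random-number functions and statistics together. It assumes a fixed seed (`np.random.seed(0)`) and a larger sample of shape (1000, 10), neither of which is used above; they are chosen here only so that the run is repeatable and the sample statistics sit close to the theoretical values.

```python
import numpy as np

np.random.seed(0)                     # fixed seed so the run is repeatable

R = np.random.random((1000, 10))      # uniform on [0, 1), shape passed as a tuple
G = np.random.randn(1000, 10)         # standard normal, dimensions passed separately

print(R.min() >= 0, R.max() < 1)      # True True
print(G.mean(), G.var())              # close to 0 and 1 for a sample this large

col_means = G.mean(axis=0)            # one mean per column
print(col_means.shape)                # (10,)
```

With only 100 samples, as in the 10 x 10 case, the mean and variance can wander noticeably from 0 and 1, which is consistent with the values shown above.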

---
<br></br>
# Matrix Products
When you learn about matrix products in linear algebra, you generally learn about matrix multiplication. Matrix multiplication has a special requirement, and that is that the inner dimensions of the matrices you are multiplying must match. 

For example, say we have a matrix `A` that is **(2, 3)** and a matrix `B` that is **(3, 3)**. We can compute the product $AB$, since the inner dimensions are both 3. However, we cannot compute $BA$, since the inner dimensions are 3 and 2, which do not match.

Why do we have this requirement when we multiply matrices? Well, let's look at the definition of matrix multiplication:

$$C(i,j) = \sum_{k=1}^KA(i,k)B(k,j)$$

Here $K$ is the shared inner dimension. So the (i,j)th entry of $C$ is the sum of the products of all the corresponding elements of the ith row of A and the jth column of B. In other words, C(i,j) is the dot product of the ith row of A and the jth column of B. Because of this, numpy uses the `dot` function for what we recognize as matrix multiplication.
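The definition can be checked directly in code. The following sketch builds $C$ entry by entry and compares it with `dot`; the particular values in `A` and `B` are made up for illustration, with shapes (2, 3) and (3, 3) as in the example above.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])          # shape (2, 3)
B = np.array([[1, 0, 2],
              [0, 1, 0],
              [3, 0, 1]])          # shape (3, 3)

# C[i, j] is the dot product of row i of A and column j of B
C = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):
    for j in range(B.shape[1]):
        C[i, j] = A[i, :].dot(B[:, j])

print(np.array_equal(C, A.dot(B)))   # True

# B.dot(A) fails because the inner dimensions (3 and 2) do not match
try:
    B.dot(A)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)               # True
```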

A very natural thing to want to do, both in math and in computing, is element by element multiplication! 

$$C(i,j) = A(i,j) * B(i,j)$$

For vectors, we already saw that the asterisk `*` operation does this. As you may have guessed, for 2-d arrays the asterisk also does element-wise multiplication. That means that when you use `*` on multidimensional arrays, they generally need to have the same shape (or shapes that numpy is able to broadcast together). This may seem odd, since in some other languages the asterisk means true matrix multiplication. So we just need to remember that in numpy, the asterisk `*` means element-by-element multiplication, and `dot` means matrix multiplication.
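A short sketch makes the difference visible on small matrices. The values and the names `elementwise`, `matmul`, and `C` are illustrative only.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

elementwise = A * B      # multiplies matching entries
matmul = A.dot(B)        # rows of A dotted with columns of B

print(elementwise)
# [[ 5 12]
#  [21 32]]
print(matmul)
# [[19 22]
#  [43 50]]

# Element-wise multiplication with an incompatible shape raises an error
C = np.ones((3, 2))
try:
    A * C
    shape_error = False
except ValueError:
    shape_error = True
print(shape_error)       # True
```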

Another thing that is odd is that when we are writing down mathematical equations, there isn't even a well defined symbol for element wise multiplication. Sometimes researchers use a circle with a dot inside of it, sometimes they use a circle with an x inside of it. But there does not seem to be a standard way to do that in math. 

---
<br></br>
# More Matrix Operations

The dot product is often referred to as the **inner product**. But we can also look at the **outer product**. An outer product is a **column vector** times a **row vector**. An inner product is a **row vector** times a **column vector**. For more information on this, check out my linear algebra walkthrough in the math appendix.

In [41]:
a = np.array([1,2])
a

array([1, 2])

In [42]:
b = np.array([3,4])
b

array([3, 4])

Let's first look at the dot product:

In [43]:
np.dot(a,b)

11

Now the inner product:

In [44]:
np.inner(a,b)

11

We see that it is the same as the dot product. Now let's look at the outer product:

In [45]:
np.outer(a,b)

array([[3, 4],
       [6, 8]])

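We can get the same kind of results if we make our `a` and `b` proper 2-d matrices and then use the dot product: a row vector times a column vector gives the inner product, and a column vector times a row vector gives an outer product. Note that `b` is the column vector here, so `np.dot(b, a)` is the outer product of `b` with `a`, which is the transpose of `np.outer(a, b)` above.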

In [46]:
a = np.array([[1,2]])
a

array([[1, 2]])

In [47]:
b = np.array([[3,4]])
b

array([[3, 4]])

In [48]:
b = b.T
b

array([[3],
       [4]])

In [49]:
np.dot(a,b)

array([[11]])

In [50]:
np.dot(b,a)

array([[3, 6],
       [4, 8]])
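To see how these 2-d products line up with `np.inner` and `np.outer`, the sketch below builds explicit row and column versions of both vectors. The names `row_a`, `col_a`, `row_b`, and `col_b` are illustrative and not used elsewhere in this notebook.

```python
import numpy as np

a = np.array([1, 2])
b = np.array([3, 4])

row_a, col_a = a.reshape(1, 2), a.reshape(2, 1)
row_b, col_b = b.reshape(1, 2), b.reshape(2, 1)

inner = row_a.dot(col_b)       # (1, 2) times (2, 1) gives a (1, 1) result
outer_ab = col_a.dot(row_b)    # (2, 1) times (1, 2) gives a (2, 2) result
outer_ba = col_b.dot(row_a)    # the same product with the roles of a and b swapped

print(inner[0, 0] == np.inner(a, b))                 # True
print(np.array_equal(outer_ab, np.outer(a, b)))      # True
print(np.array_equal(outer_ba, np.outer(a, b).T))    # True
```

The order matters for the outer product: swapping which vector is the column transposes the result.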

# Summing Along an Axis

If we want the sum of each row of a matrix, we can sum along `axis=1`; for the sum of each column, we sum along `axis=0`:

In [52]:
A = np.array([[1,1,1], [2,2,2], [3,3,3]])       
A.sum(axis=1)                                    # sum along each row, axis = 1 

array([3, 6, 9])

In [53]:
A.sum(axis=0)                                    # sum along each column, axis = 0

array([6, 6, 6])
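Because the matrix above is square, it is easy to mix up the two axes. A non-square example, sketched below with made-up values, makes the direction of each sum clearer from the shape of the result.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])     # shape (2, 3)

row_sums = A.sum(axis=1)      # one sum per row, shape (2,)
col_sums = A.sum(axis=0)      # one sum per column, shape (3,)
total = A.sum()               # sum of every element

print(row_sums)   # [ 6 15]
print(col_sums)   # [5 7 9]
print(total)      # 21
```

The axis passed to `sum` is the one that is collapsed: `axis=1` collapses the columns and leaves one value per row.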