#  Arithmetic with vectors and matrices

In this notebook we review some elementary arithmetic done with vectors and matrices (or arrays).   

------
This isn't a notebook you just read - you'll need to complete several coding portions of it (either individually or in groups) and think about the questions posed herein in order to build up your intuitive understanding of these algorithms, as well as your practical ability to use them via scikit-learn.  Whenever you see 'TODO' please do perform the requested task.

In other words, this is a 'learning by discovery' notebook where you (either individually or in small groups) will start to build up your understanding of machine learning by doing real work and discussing it with your peers and instructors.  This is the best way to learn anything, far more effective than a book or lecture series.

#  1.  Arithmetic of vectors

## Basic concepts

Vectors are just generalized scalars - they are one-dimensional arrays of numbers - and come in two forms: rows or columns.

A row vector is a long horizontal array of numbers like the one below.


$\mathbf{x}=\left[\begin{array}{ccc}
3.7 & 1 & 0.4\end{array}\right]$

The dimensions of this vector are 1 by 3 (or written 1 x 3 for short) - indicating it is a row vector with 3 entries.

The other type of vector is a column vector - a long vertical array of numbers like the one below.

$\mathbf{x}=\left[\begin{array}{c}
1\\
1\\
7\\
0.4
\end{array}\right]$

The dimensions of this vector are 4 by 1 (4 x 1 for short) - indicating it is a column vector with 4 entries.

We can flip or 'transpose' a column vector to make it a row vector, and vice versa.  This is denoted with a superscript 'T' - as in $\mathbf{x}^{T}$.

So for example if 

$\mathbf{x}=\left[\begin{array}{ccc}
3.7 & 1 & 0.4\end{array}\right]$  

is a 1x3 row vector then

$\mathbf{x}^{T}=\left[\begin{array}{c}
3.7\\
1\\
0.4
\end{array}\right]$

is a 3x1 column vector.  Whenever you transpose a vector you always switch its dimensions - so a 1xN vector becomes an Nx1 vector after transposing (and vice versa).

Note: a general 1xN row vector is denoted by 

$\mathbf{x}=\left[\begin{array}{cccc}
x_{1} & x_{2} & \cdots & x_{N}\end{array}\right]$

and its transpose - an Nx1 column vector - can be written using transpose notation as 


$\mathbf{x}^{T}=\left[\begin{array}{cccc}
x_{1} & x_{2} & \cdots & x_{N}\end{array}\right]^{T}$

Let's look at this functionality in Python's NumPy library - a great library for performing computations with vectors and matrices.

In [3]:
import numpy as np       # import statement for numpy
x = np.asarray([3.7,1,0.4])

Note that by default an array initialized in this way is *one-dimensional* - it is neither a row vector nor a column vector - which you can see by printing its shape as follows.

In [4]:
print(np.shape(x))

(3,)


We can re-shape this array as a row vector by performing the following action.

In [7]:
x.shape = (1,3)     # this reshapes the array as a row vector
print(np.shape(x))

(1, 3)


To shape this array as a column we use a similar line as follows.

In [8]:
x.shape = (3,1)
print(np.shape(x))

(3, 1)


We can also transpose a numpy vector by writing `x.T`.  Printing this out, we can see that the vector is indeed transposed.

In [11]:
print('the original vector - is a column vector')
print(x)
print('the transpose is a row vector')
print(x.T)

the original vector - is a column vector
[[ 3.7]
 [ 1. ]
 [ 0.4]]
the transpose is a row vector
[[ 3.7  1.   0.4]]
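
One subtlety worth checking, sketched here with a hypothetical array `v`: transposing only swaps dimensions when the array actually has two of them.  A one-dimensional array of shape `(3,)` is left unchanged by `.T`, which is why the re-shaping step matters.  The `reshape` method with `-1` (which asks numpy to infer that dimension) is an alternative to assigning to `.shape`.

```python
import numpy as np

v = np.asarray([3.7, 1, 0.4])    # one-dimensional array, shape (3,)
print(np.shape(v.T))             # transposing a 1-D array leaves its shape unchanged

row = v.reshape(1, -1)           # explicit 1 x 3 row vector; -1 lets numpy infer the length
col = row.T                      # transposing the 2-D row vector gives a 3 x 1 column
print(np.shape(row))
print(np.shape(col))
```

Unlike assigning to `.shape`, `reshape` returns a new array object and leaves the shape of `v` itself alone.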


## Adding and subtracting vectors

Adding and subtracting vectors is just like adding and subtracting scalars - with one caveat: you can only add vectors that have the same dimensions.  So you *can't* add two row vectors of different lengths, or a row vector and column vector.  So for example to add two Nx1 column vectors 

$\mathbf{x}=\left[\begin{array}{c}
x_{1}\\
x_{2}\\
\vdots\\
x_{N}
\end{array}\right],\,\,\,\,\,\,\,\mathbf{y}=\left[\begin{array}{c}
y_{1}\\
y_{2}\\
\vdots\\
y_{N}
\end{array}\right]$

we add them *entry-wise* as 

$\mathbf{x}+\mathbf{y}={\left[\begin{array}{c}
x_{1}+y_{1}\\
x_{2}+y_{2}\\
\vdots\\
x_{N}+y_{N}
\end{array}\right]}$

The same holds for subtraction - it's done entry-wise.

Let's try some addition experiments in numpy.

In [12]:
x = np.array([1,4,2])
x.shape = (3,1)
y = np.array([0,3,8])
y.shape = (3,1)
print (x+y)

[[ 1]
 [ 7]
 [10]]


But if we try to add two vectors that aren't the same shape - we'll get into trouble - and numpy will throw an error reflecting this.

In [15]:
x = np.array([1,4])
x.shape = (2,1)
y = np.array([0,3,8])
y.shape = (3,1)
print (x+y)

ValueError: operands could not be broadcast together with shapes (2,1) (3,1) 
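
A caveat to the rule above, as far as numpy's behaviour goes: adding a column vector to a row vector does *not* raise an error.  Numpy *broadcasts* the two shapes against each other and returns a matrix, which is not vector addition in the mathematical sense.  A small sketch (the names `col` and `row` are just for illustration):

```python
import numpy as np

col = np.array([1, 4, 2]).reshape(3, 1)   # 3 x 1 column vector
row = np.array([0, 3, 8]).reshape(1, 3)   # 1 x 3 row vector

# no error here: numpy broadcasts (3,1) + (1,3) into a 3 x 3 array
result = col + row
print(np.shape(result))
print(result)
```

So when you intend ordinary vector addition it is worth checking that both shapes agree first.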

## Multiplying a vector by a scalar

We can multiply vectors by a scalar in a natural way - to multiply a vector by a scalar multiply each entry of the vector by that scalar.  

For example, to multiply the vector 


$\mathbf{x}=\left[\begin{array}{c}
2\\
4\\
1
\end{array}\right]$


by 3 we have


$3\cdot\mathbf{\mathbf{x}}=\left[\begin{array}{c}
3\cdot2\\
3\cdot4\\
3\cdot1
\end{array}\right]=\left[\begin{array}{c}
6\\
12\\
3
\end{array}\right]$

In Python we can compute this simply as 

In [51]:
x = np.array([2,4,1])
x.shape = (3,1)
print(3*x)

[[ 6]
 [12]
 [ 3]]


This holds for a general Nx1 vector $\bf{x}$ - to multiply it by any scalar $\alpha$ we compute

$\alpha\cdot\mathbf{x}=\left[\begin{array}{c}
\alpha\cdot x_{1}\\
\alpha\cdot x_{2}\\
\vdots\\
\alpha\cdot x_{N}
\end{array}\right]$

## Multiplying two vectors - the inner product

Multiplying vectors generalizes the concept of multiplication of scalar values.  To multiply two Nx1 vectors we multiply them entry-wise and add up the result - **giving a scalar value** (just like scalar multiplication).  That is 

$\mathbf{x}^{T}\mathbf{y}=\left[\begin{array}{c}
x_{1}\\
x_{2}\\
\vdots\\
x_{N}
\end{array}\right]^{T}\left[\begin{array}{c}
y_{1}\\
y_{2}\\
\vdots\\
y_{N}
\end{array}\right]=x_{1}y_{1}+x_{2}y_{2}+\cdots+x_{N}y_{N}$


Using summation notation we can write this more compactly.  Summation notation zips up - notationally speaking - sums.  For example, to sum up $N$ numbers $a_1, a_2,...,a_N$ we can write 

$a_1 + a_2 + \cdots + a_N = \sum_{n=1}^{N}a_n$

We can then write the multiplication of two vectors compactly as 

$\mathbf{x}^{T}\mathbf{y}=x_{1}y_{1}+x_{2}y_{2}+\cdots+x_{N}y_{N}=\sum_{n=1}^{N}{x_ny_n}$

This vector multiplication is known as the **inner product**.  

Let's try computing the inner product using numpy and a few example vectors.  Note that the `*` operator does not compute the inner product in numpy; one way to compute it is with the np.dot function.  

In [27]:
x = np.array([1,4,2])
x.shape = (3,1)
y = np.array([0,3,8])
y.shape = (3,1)
print(np.dot(x.T,y)[0][0])

28


Notice how writing x*y in Python does not give you a single number - it gives a vector whose entries are the products of the corresponding entries of the two vectors.  e.g., 

In [22]:
print(x*y)

[[ 0]
 [12]
 [16]]


So another way to get the inner product is to sum this entry-wise vector multiplication.

In [26]:
print(np.sum(x*y))
print(np.dot(x.T,y)[0][0])

28
28
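
To connect the summation formula for the inner product to code, here is a minimal sketch that accumulates the products $x_n y_n$ one at a time in a loop and compares the total against `np.dot`.  The helper name `inner_product_loop` is made up for this illustration.

```python
import numpy as np

def inner_product_loop(x, y):
    # accumulate x_n * y_n for n = 1, ..., N
    x = np.ravel(x)   # flatten so that row, column, and 1-D inputs all work
    y = np.ravel(y)
    total = 0
    for n in range(len(x)):
        total += x[n] * y[n]
    return total

x = np.array([1, 4, 2])
y = np.array([0, 3, 8])
print(inner_product_loop(x, y))   # 1*0 + 4*3 + 2*8 = 28
print(np.dot(x, y))               # same value from numpy
```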


### TODO

Verify whether the following relationship involving the inner product 

$\left(\mathbf{x}+\mathbf{y}\right)^{T}\mathbf{z}=\mathbf{x}^{T}\mathbf{z}+\mathbf{y}^{T}\mathbf{z}$

is true or false.  

Create a few test vectors using numpy to check whether the relationship holds for them numerically.

## Multiplying two vectors - the outer product

Another way to multiply vectors is called the **outer product**.  This is defined for two Nx1 column vectors $\bf{x}$ and $\bf{y}$

$\mathbf{x}=\left[\begin{array}{c}
x_{1}\\
x_{2}\\
\vdots\\
x_{N}
\end{array}\right],\,\,\,\,\,\,\,\mathbf{y}=\left[\begin{array}{c}
y_{1}\\
y_{2}\\
\vdots\\
y_{N}
\end{array}\right]$


as


$\mathbf{x}\mathbf{y}^{T}=\left[\begin{array}{c}
x_{1}\\
x_{2}\\
\vdots\\
x_{N}
\end{array}\right]\left[\begin{array}{c}
y_{1}\\
y_{2}\\
\vdots\\
y_{N}
\end{array}\right]^{T}=\left[\begin{array}{cccc}
x_{1}y_{1} & x_{1}y_{2} & \cdots & x_{1}y_{N}\\
x_{2}y_{1} & x_{2}y_{2} &  & \vdots\\
\vdots &  & \ddots & \vdots\\
x_{N}y_{1} & \cdots & \cdots & x_{N}y_{N}
\end{array}\right]$


So for example, for two vectors 

$\mathbf{x}=\left[\begin{array}{c}
2\\
4\\
1
\end{array}\right]\,\,\,\,\mathbf{y}=\left[\begin{array}{c}
3\\
0\\
5
\end{array}\right]$

the outer product is 

$\mathbf{x}\mathbf{y}^{T}=\left[\begin{array}{ccc}
2\cdot3 & 2\cdot0 & 2\cdot5\\
4\cdot3 & 4\cdot0 & 4\cdot5\\
1\cdot3 & 1\cdot0 & 1\cdot5
\end{array}\right]=\left[\begin{array}{ccc}
6 & 0 & 10\\
12 & 0 & 20\\
3 & 0 & 5
\end{array}\right]$

In Python we can compute this outer product as 

In [28]:
x = np.array([2,4,1])
x.shape = (3,1)
y = np.array([3,0,5])
y.shape = (3,1)
print(x*y.T)

[[ 6  0 10]
 [12  0 20]
 [ 3  0  5]]
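
Numpy also provides a dedicated function, `np.outer`, for this computation.  The sketch below checks that it agrees with the broadcasting expression `x*y.T` used above, and with the matrix product `np.dot(x, y.T)`.  `np.outer` flattens its inputs, so it also accepts one-dimensional arrays.

```python
import numpy as np

x = np.array([2, 4, 1]).reshape(3, 1)
y = np.array([3, 0, 5]).reshape(3, 1)

via_broadcast = x * y.T        # (3,1) times (1,3) broadcasts to a 3 x 3 array
via_outer = np.outer(x, y)     # dedicated outer product function
via_dot = np.dot(x, y.T)       # matrix product of a 3 x 1 and a 1 x 3 array

print(np.array_equal(via_broadcast, via_outer))
print(np.array_equal(via_broadcast, via_dot))
```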


### TODO

Verify whether the following relationship involving the outer product 

$\left(\mathbf{x}+\mathbf{y}\right)\mathbf{z}^{T}=\mathbf{x}\mathbf{z}^{T}+\mathbf{y}\mathbf{z}^{T}$

is true or false.

Create a few test vectors using numpy to check whether the relationship holds for them numerically.

# 2.  Arithmetic with matrices

This works analogously to arithmetic with vectors.  For example, you can only add matrices of the same size, multiplication is built out of inner products, etc.  Here we quickly review a few fundamental operations on matrices.

Suppose you have $P$ row vectors - each of length $N$

$\mathbf{x}_{1}=\left[\begin{array}{cccc}
x_{11} & x_{12} & \cdots & x_{1N}\end{array}\right]$

$\mathbf{x}_{2}=\left[\begin{array}{cccc}
x_{21} & x_{22} & \cdots & x_{2N}\end{array}\right]$

$\vdots$

$\mathbf{x}_{P}=\left[\begin{array}{cccc}
x_{P1} & x_{P2} & \cdots & x_{PN}\end{array}\right]$

Notice how the elements of each vector have 2 indices now - the first index tells us which vector the element belongs to.  The second indexes the element in the vector itself.  

If you stack these row vectors on top of each other you create a matrix with $P$ rows and $N$ columns - referred to as a $P \times N$ matrix

$\mathbf{X}=\left[\begin{array}{cccc}
x_{11} & x_{12} & \cdots & x_{1N}\\
x_{21} & x_{22} & \cdots & x_{2N}\\
\vdots & \vdots & \ddots & \vdots\\
x_{P1} & x_{P2} & \cdots & x_{PN}
\end{array}\right]$

Whereas each of the individual row vectors was *one-dimensional*, this matrix has *two dimensions*.
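
As a small illustration of this stacking, the sketch below builds a matrix from three row vectors of length 4 using `np.vstack`.  The particular numbers and variable names are arbitrary.

```python
import numpy as np

# P = 3 row vectors, each of length N = 4
x1 = np.array([1, 2, 3, 4])
x2 = np.array([5, 6, 7, 8])
x3 = np.array([9, 10, 11, 12])

X = np.vstack([x1, x2, x3])   # stack the rows on top of each other
print(np.shape(X))            # P x N
print(X[1, 2])                # second row, third column (numpy indexes from 0)
```

Keep in mind that numpy counts indices from 0, so the entry written $x_{23}$ in the notation above is `X[1, 2]` in code.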

### Adding matrices

If you have two matrices 

$\mathbf{X}=\left[\begin{array}{cccc}
x_{11} & x_{12} & \cdots & x_{1N}\\
x_{21} & x_{22} & \cdots & x_{2N}\\
\vdots & \vdots & \ddots & \vdots\\
x_{P1} & x_{P2} & \cdots & x_{PN}
\end{array}\right]$

and

$\mathbf{Y}=\left[\begin{array}{cccc}
y_{11} & y_{12} & \cdots & y_{1N}\\
y_{21} & y_{22} & \cdots & y_{2N}\\
\vdots & \vdots & \ddots & \vdots\\
y_{P1} & y_{P2} & \cdots & y_{PN}
\end{array}\right]$

then to add them - much like vectors - you just add element-wise.  i.e.,

$\mathbf{X}+\mathbf{Y}=\left[\begin{array}{cccc}
x_{11}+y_{11} & x_{12}+y_{12} & \cdots & x_{1N}+y_{1N}\\
x_{21}+y_{21} & x_{22}+y_{22} & \cdots & x_{2N}+y_{2N}\\
\vdots & \vdots & \ddots & \vdots\\
x_{P1}+y_{P1} & x_{P2}+y_{P2} & \cdots & x_{PN}+y_{PN}
\end{array}\right]$

Let's try a few computations in Python.

In [8]:
import numpy as np

# this is how you define a 3 x 3 matrix (or array) using numpy
X = np.array([[0,1,2],[1,4,2],[5,3,1]])
print(X)
print('the shape of X is ' + str(np.shape(X)))
print('X is of type ' + str(type(X)))

[[0 1 2]
 [1 4 2]
 [5 3 1]]
the shape of X is (3, 3)
X is of type <type 'numpy.ndarray'>


Note a few things.

First: in numpy we are using what is called an *array* - which is a slight generalization of a matrix.  There is a matrix subclass of arrays in numpy - which basically allows for [more natural writing of matrix multiplication in Python](http://stackoverflow.com/questions/4151128/what-are-the-differences-between-numpy-arrays-and-matrices-which-one-should-i-u) - but it's much more common to see people using array objects, so that's what we'll use here.

Second: the numpy array is a specific kind of data structure.  If you just write this in Python

In [9]:
X = [[0,1,2],[1,4,2],[5,3,1]]
print(X)
print(np.shape(X))
print('X is of type ' + str(type(X)))

[[0, 1, 2], [1, 4, 2], [5, 3, 1]]
(3, 3)
X is of type <type 'list'>


you get a nested *list* (a list of three lists, each of length 3), not a 3 x 3 array.  You can't perform all the matrix-like arithmetical operations (in particular multiplication) with lists.

You can however transform a list like this into an array by just doing

In [11]:
X2 = np.array(X)
print(X2)
print(np.shape(X2))
print('X2 is of type ' + str(type(X2)))

[[0 1 2]
 [1 4 2]
 [5 3 1]]
(3, 3)
X is of type <type 'numpy.ndarray'>
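
To see concretely why the distinction between lists and arrays matters, compare what `+` does to a nested list versus an array.  This is a quick sketch; the names `nested` and `A` are arbitrary.

```python
import numpy as np

nested = [[0, 1], [2, 3]]     # a plain nested Python list
A = np.array(nested)          # the same numbers as a numpy array

list_sum = nested + nested    # lists: + concatenates, giving a list with 4 rows
array_sum = A + A             # arrays: + adds entry-wise, shape stays 2 x 2

print(list_sum)
print(array_sum)
```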


In any case, how do you add two arrays in numpy?  Pretty straightforward.

In [22]:
# create two 3x3 matrices
X = np.array([[0,1,2],[1,4,2],[5,3,1]])
Y = np.array([[3,3,3],[2,8,1],[0,2,1]])
print('the matrix X')
print(X)
print('\n')
print('the matrix Y')
print(Y)
print('\n')

# add the two matrices
Z = X + Y
print('the matrix X + Y')
print(Z)

the matrix X
[[0 1 2]
 [1 4 2]
 [5 3 1]]


the matrix Y
[[3 3 3]
 [2 8 1]
 [0 2 1]]


the matrix X + Y
[[ 3  4  5]
 [ 3 12  3]
 [ 5  5  2]]


But as with vectors, **you can only add matrices of like size**.  Try adding the matrices below and see what happens.

In [23]:
X = np.array([[0,1,2],[1,4,2],[5,3,1]])
Y = np.array([[0,1],[2,3]])
Z = X+Y

ValueError: operands could not be broadcast together with shapes (3,3) (2,2) 

### Multiplying a matrix times a scalar

How do you think this works?  Say, for example, you want to multiply a matrix $\bf{X}$ by 2.23.  Use numpy to test it out!

In [None]:
# TODO: how does multiplying a matrix times a scalar work?  Use numpy to gain the intuition!




### Multiplying a matrix times a vector

A crucial concept for our purposes with regards to matrices is how to multiply a matrix times a vector - as many of the computations in machine learning algorithms require **matrix-vector products**.

How do they work?  

Suppose you want to multiply our $P \times N$ matrix $\bf{X}$ by an $N \times 1$ vector $\bf{w}$

$\mathbf{w}=\left[\begin{array}{c}
w_{1}\\
w_{2}\\
\vdots\\
w_{N}
\end{array}\right]$

Mathematically we write this multiplication as $\bf{X}\bf{w}$, and the result of this multiplication is a $P \times 1$ vector $\bf{y}$.  You can write this as 

$\bf{X}\bf{w} = \bf{y}$

What does $\bf{y}$ look like in terms of $\bf{X}$ and $\bf{w}$?  If we think of $\bf{X}$ as a concatenation of row vectors then we can write the product $\bf{y}$ simply as 

$\mathbf{X}\mathbf{w}=\left[\begin{array}{c}
\mathbf{x}_{1}\mathbf{w}\\
\mathbf{x}_{2}\mathbf{w}\\
\vdots\\
\mathbf{x}_{P}\mathbf{w}
\end{array}\right]=\left[\begin{array}{c}
\sum_{n=1}^{N}x_{1n}w_{n}\\
\sum_{n=1}^{N}x_{2n}w_{n}\\
\vdots\\
\sum_{n=1}^{N}x_{Pn}w_{n}
\end{array}\right]$

That is - each entry is an inner product of a row vector and the column vector $\bf{w}$.

NOTE: as with vector and matrix addition, the shapes of the two objects being multiplied must be compatible.  However, unlike addition, you only need the *inner* dimensions to match in order to multiply.  In the example above we multiplied a $(P \times N)$ matrix ($\bf{X}$) by an $(N \times 1)$ vector ($\bf{w}$) - the inner dimension, which here was $N$, has to match - to get a $P \times 1$ vector ($\bf{y}$).

How does this look in numpy?  Let's dig into an example.

In [34]:
# multiply an array by a vector?  No sweat.
X = np.array([[12,3,-1],[0,1,2],[1,4,2],[5,3,1]])
w = np.array([0,1,3])
w.shape = (len(w),1)
y = np.dot(X,w)
print("X has shape = " + str(np.shape(X)))
print("w has shape = " + str(np.shape(w)))
print("y has shape = " + str(np.shape(y)))
print('y = ' + str(y))

X has shape = (4, 3)
w has shape = (3, 1)
y has shape = (4, 1)
y = [[ 0]
 [ 7]
 [10]
 [ 6]]


Note that with numpy we don't actually need to re-shape the vector $\bf{w}$ to have a second dimension.  At creation it has only one dimension, but `np.dot` will still carry out the multiplication by taking the inner product of each row of $\bf{X}$ with $\bf{w}$ - even though, strictly speaking, a one-dimensional array is neither a row nor a column vector.  In this case the output will have only one dimension as well.

In [33]:
# same product as before, but w is left as a one-dimensional array
X = np.array([[12,3,-1],[0,1,2],[1,4,2],[5,3,1]])
w = np.array([0,1,3])
y = np.dot(X,w)
print("X has shape = " + str(np.shape(X)))
print("w has shape = " + str(np.shape(w)))
print("y has shape = " + str(np.shape(y)))
print('y = ' + str(y))

X has shape = (4, 3)
w has shape = (3,)
y has shape = (4,)
y = [ 0  7 10  6]
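
The description of $\mathbf{X}\mathbf{w}$ as a stack of inner products can also be checked directly.  The sketch below computes each entry of the product as the inner product of one row of $\mathbf{X}$ with $\mathbf{w}$ and compares the result against `np.dot`.  The helper name `matvec_by_rows` is hypothetical, chosen just for this illustration.

```python
import numpy as np

def matvec_by_rows(X, w):
    # entry p of the result is the inner product of row p of X with w
    return np.array([np.sum(X[p, :] * w) for p in range(X.shape[0])])

X = np.array([[12, 3, -1], [0, 1, 2], [1, 4, 2], [5, 3, 1]])
w = np.array([0, 1, 3])

print(matvec_by_rows(X, w))
print(np.dot(X, w))
```

Both lines should print the same four numbers as the examples above.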


In [36]:
# TODO: What error do you get if you try to multiply a
# matrix and a vector that don't have matching inner-shape?




For further info on matrix multiplication [see this appendix document](http://media.wix.com/ugd/f09e45_5e6ded23bdae4f84aeeedec53a909a35.pdf) from [1].