In [72]:
%matplotlib inline

# Vectors (Column Vectors)

A vector is an ordered column of numbers. A vector is closely related to a **point**, but it also carries geometric meaning.

Think of the vector $v$ as an arrow pointing from the origin to the
point $p$. Thus vectors have both magnitude and direction. Because each element corresponds to a specific coordinate, the
order of the elements of $v$ matters.

Below is a figure illustrating a vector pointing from the origin ($O$) to a point $A$ at $x=2$ and $y=3$.

![An example vector](https://goo.gl/qCxJp3)

The vector above would be written as

$\vec{v} = \begin{bmatrix}2\\3\end{bmatrix}$

## Elements of a Vector
The **elements** of the vector are known as **scalars.**

For our example above, the elements of the vector are $2$ and $3$.

## Dimension of a Vector
The **dimension** of a vector is the number of elements in
the vector.

If the elements of the vector come from the real numbers ($\mathbb{R}$), a
vector of dimension $n$ belongs to the **n-space**
$\mathbb{R}^n$.

For our example above, the dimension of the vector is 2.
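
As a small illustration, assume we store a vector as a plain Python tuple (just one possible representation, chosen here for the sketch). The dimension is then the number of elements, which `len` reports:

```python
# A minimal sketch: vectors stored as tuples (an assumed representation).
v = (2, 3)        # the example vector from the figure
w = (-3, 4, 5)    # a hypothetical three-dimensional vector

# The dimension is the number of elements in the vector.
dimension_v = len(v)
dimension_w = len(w)

print(dimension_v)  # 2
print(dimension_w)  # 3
```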

**Note:** We can only visualize two- or three-dimensional vectors directly. Most of the vectors we will look at have a much higher dimension, so we will limit our visualizations to 2D. A higher-dimensional vector can still be plotted through its projection onto a 2D or 3D space.

## Some Example Vectors

#### Example
$v_1 = \begin{bmatrix} 2\\-5\end{bmatrix}$
* Elements are $2$ and $-5$
* Dimension is 2

#### Example
$v_2=\begin{bmatrix}7\\9\end{bmatrix}$
* Elements are $7$ and $9$
* Dimension is 2

#### Example
$v_3=\begin{bmatrix}-3\\4\\5\end{bmatrix}$
* Elements are $-3$, $4$, and $5$
* Dimension is 3

#### Example
$v_4=\begin{bmatrix}3\\-7\\-2\end{bmatrix}$
* Elements are $3$, $-7$, and $-2$
* Dimension is 3

### Row vectors

We can also have [**row vectors**](https://en.wikipedia.org/wiki/Row_and_column_vectors). A row vector is the [**transpose**](https://en.wikipedia.org/wiki/Transpose) of a (column) vector. Similarly, a column vector is the transpose of a row vector.

For example, the **transpose** of the column vector $v_3$ above is the row vector $$v_3^T=\begin{bmatrix}-3&4&5\end{bmatrix}$$
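
To make the row/column distinction concrete, here is a minimal pure-Python sketch. It assumes a column vector is stored as a list of one-element rows and a row vector as a list holding a single row. The helper name `transpose` is hypothetical, not something defined elsewhere in this notebook.

```python
# Assumed representation: a column vector is a list of one-element rows,
# and a row vector is a list containing a single row.
v3_column = [[-3], [4], [5]]

def transpose(m):
    """Swap the rows and columns of a list-of-rows matrix."""
    return [list(row) for row in zip(*m)]

v3_row = transpose(v3_column)
print(v3_row)                          # [[-3, 4, 5]]

# Transposing twice returns the original column vector.
print(transpose(v3_row) == v3_column)  # True
```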

## Vector Equality
Two vectors are equal when they have the same dimension and every pair of corresponding elements is
equal. That is, the $i^{th}$ element of vector $a$ equals the $i^{th}$ element of vector $b$ for all $i$.

## Representing a Vector in Python

Python does not have a native vector type, so which native Python data structures might we use to represent a vector?

* List?
* Tuple?
* Dictionary?
* Set?
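
One way to explore the question is to build each candidate from the same elements and see which properties survive. The sketch below is only an illustration; the variable names are hypothetical. It shows that lists and tuples keep order and repeated values, a set does not, and a dictionary needs explicit index keys.

```python
elements = [3, -7, 3]   # note the repeated 3

as_list = list(elements)
as_tuple = tuple(elements)
as_set = set(elements)
as_dict = {i: e for i, e in enumerate(elements)}  # index -> element

# Lists and tuples keep every element in order; the set drops the repeated 3.
print(len(as_list), len(as_tuple), len(as_set))  # 3 3 2

# Positional access works for lists and tuples (and for the dict via its keys),
# but a set has no notion of "the element at position 1".
print(as_tuple[1])  # -7
print(as_dict[1])   # -7

# Tuples are immutable, so an element cannot be overwritten by accident.
try:
    as_tuple[0] = 10
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True
print(tuple_is_immutable)  # True
```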


In [203]:
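def vec_eq(v1, v2):
    """
    Check whether two vectors are equal.

    v1: first vector (an indexable sequence of numbers)
    v2: second vector (an indexable sequence of numbers)

    Raises ValueError if the vectors have different lengths.
    """
    if len(v1) != len(v2):
        raise ValueError("Vectors are of unequal length")
    for i in range(len(v1)):
        # compare corresponding elements of the two vectors
        if v1[i] != v2[i]:
            return False
    return True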

In [6]:
list(map(lambda a: a[0]==a[1], zip([1,2,5,8,7,2],[3,1,7,8,2,1])))

[False, False, False, True, False, False]
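
The element-wise comparison above yields one boolean per position. Wrapping the same idea in `all` collapses those booleans into a single answer, which gives a compact alternative for testing vector equality. This is a sketch only; the name `vec_eq_all` is hypothetical.

```python
def vec_eq_all(v1, v2):
    """Return True when two equal-length vectors match element by element."""
    if len(v1) != len(v2):
        raise ValueError("Vectors are of unequal length")
    # all() stops at the first pair that differs
    return all(a == b for a, b in zip(v1, v2))

print(vec_eq_all([1, 2, 5], [1, 2, 5]))  # True
print(vec_eq_all([1, 2, 5], [1, 2, 7]))  # False
```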

In [208]:
def alpha_x_v(alpha, v):
    """
    Multiply every element of the vector v by the scalar alpha.
    """
    newv = []
    for e in v:
        newv.append(alpha*e)
    return tuple(newv)

In [209]:
def alpha_x_v(alpha, v):
    """
    Same scalar multiplication, written with a generator expression.
    """
    # multiply each element e (not the whole vector v) by alpha
    return tuple(alpha*e for e in v)

In [210]:
alpha_x_v(3,[1,2,0.4])

(3, 6, 1.2000000000000002)

In [211]:
def v_plus_v(v1, v2):
    """
    Add two vectors of equal length element by element.
    """
    if len(v1) != len(v2):
        raise ValueError("vectors are not of equal length")
    newv = []
    for i in range(len(v1)):
        newv.append((v1[i]+v2[i]))
    return tuple(newv)

#### Exercise: Using only `alpha_x_v` and `v_plus_v`, how would you solve the following problem?

\begin{equation}
5 \begin{bmatrix}3\\-7\\-2\end{bmatrix} - \begin{bmatrix}2\\12\\5\end{bmatrix}
\end{equation}
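
One possible approach (a sketch, not the only answer) is to treat subtraction as adding the second vector scaled by $-1$. The block redefines compact versions of `alpha_x_v` and `v_plus_v` so that it runs on its own; the names `v4` and `w` are chosen here for illustration.

```python
def alpha_x_v(alpha, v):
    """Multiply every element of v by the scalar alpha."""
    return tuple(alpha * e for e in v)

def v_plus_v(v1, v2):
    """Add two vectors of equal length element by element."""
    if len(v1) != len(v2):
        raise ValueError("vectors are not of equal length")
    return tuple(a + b for a, b in zip(v1, v2))

v4 = (3, -7, -2)
w = (2, 12, 5)

# 5*v4 - w is the same as 5*v4 + (-1)*w
result = v_plus_v(alpha_x_v(5, v4), alpha_x_v(-1, w))
print(result)  # (13, -47, -15)
```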

In [35]:
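import math
import random
def l2_norm(v):
    """L2 (Euclidean) norm: square root of the sum of squared elements."""
    s = 0
    for e in v:
        s += e**2
    return math.sqrt(s)
def l2_norm_v2(vt):
    """Same norm, squaring with e*e instead of e**2."""
    s = 0
    for e in vt:  # iterate over the argument, not a global v
        s += e*e
    return math.sqrt(s)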
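
The functions above compute the L2 (Euclidean) norm, $\|v\|_2 = \sqrt{\sum_i v_i^2}$. A quick sanity check on a vector whose norm is known by hand can catch mistakes early. The sketch below uses a compact, self-contained version of `l2_norm` and the classic 3-4-5 right triangle.

```python
import math

def l2_norm(v):
    """L2 (Euclidean) norm of a sequence of numbers."""
    return math.sqrt(sum(e * e for e in v))

v = (3, 4)
print(l2_norm(v))  # 5.0

# Scaling a vector by alpha scales its norm by |alpha|.
scaled = tuple(-2 * e for e in v)
print(l2_norm(scaled))  # 10.0
```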

In [197]:
import numpy as np
import numpy.random as ra
v = ra.normal(100,20,50000)
vl = v.tolist()
vt = tuple(vl)

In [198]:
len(v)

50000

In [201]:
import math
import random
def l2_norm_v2(vt):
    s = 0
    for e in vt:  # iterate over the argument, not the global v
        s += e*e
    return math.sqrt(s)

In [44]:
import numpy.linalg as la

In [48]:
%%timeit
la.norm(v)

17.3 µs ± 205 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)


## Inner Product
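
As a reminder of what the `inner` functions later in this section compute, and assuming the standard real-valued definition, the inner (dot) product of two vectors $a$ and $b$ of the same dimension $n$ is the sum of the products of their corresponding elements:

\begin{equation}
\langle a, b \rangle = a^T b = \sum_{i=1}^{n} a_i b_i
\end{equation}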



In [49]:
a = [1,2,3,4]
b = a
c = a[:]

In [50]:
a == b

True

In [51]:
a==c

True

In [52]:
b[0]=-1

In [53]:
b==c

False

In [54]:
a==b

True
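
The cells above show that `b = a` creates a second name for the same list, while `a[:]` creates a separate copy. The `==` operator compares contents, whereas `is` compares identity. The following sketch repeats the experiment with `is` to make the difference explicit.

```python
a = [1, 2, 3, 4]
b = a        # a second name for the same list object
c = a[:]     # a shallow copy: a new list holding the same elements

print(a is b)  # True  (same object)
print(a is c)  # False (equal contents, different objects)

b[0] = -1      # mutating through b also changes a, but not the copy c
print(a)       # [-1, 2, 3, 4]
print(c)       # [1, 2, 3, 4]
```

This matters when vectors are stored as lists: a function that modifies its argument in place also changes the caller's vector, which is one argument for using tuples instead.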

In [58]:
def inner(v1,v2):
    if len(v1) != len(v2):
        raise ValueError('vectors are not the same length')
    s = 0
    for i in range(len(v1)):
        s += v1[i]*v2[i]
    return s
def inner_v2(v1,v2):
    if len(v1) != len(v2):
        raise ValueError('vectors are not the same length')
    return sum(map(lambda x:x[0]*x[1], zip(v1,v2)))
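
A short usage sketch may help here. It uses a compact, self-contained version of `inner` and checks three small cases by hand: a general product, two perpendicular vectors (inner product zero), and the fact that the L2 norm of a vector is the square root of its inner product with itself.

```python
import math

def inner(v1, v2):
    """Inner (dot) product of two equal-length vectors."""
    if len(v1) != len(v2):
        raise ValueError('vectors are not the same length')
    return sum(x * y for x, y in zip(v1, v2))

print(inner((1, 2, 3), (4, 5, 6)))       # 32  (1*4 + 2*5 + 3*6)
print(inner((1, 0), (0, 1)))             # 0   (perpendicular vectors)
print(math.sqrt(inner((3, 4), (3, 4))))  # 5.0 (the L2 norm of (3, 4))
```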

## Evaluating function performance with [`%timeit`](http://ipython.readthedocs.io/en/stable/interactive/magics.html#magic-timeit)


When writing our code we will have to make various design decisions. We need ways of evaluating the quality of our choices. There are a variety of metrics that might be considered, such as

* How easy is my code to understand?
* Does my choice require third-party dependencies (i.e., something not distributed with Python)? Can it run on any common operating system?
* Can I write the code in a reasonable amount of time?
* Is the code fast enough to be reasonable?

These are all, admittedly, relative questions. If I'm planning on sharing the code with others then maybe I want to follow Python style conventions. If the code is just for me, then perhaps it is fine to write in my own quirky Python style. If our code is only going to be used once, then maybe what we want to do is spend less time trying to optimize it for speed and just patiently wait for it to run.

There are tools that help us evaluate the style of our code as well as its performance. Here we introduce the IPython "magic" `%timeit`, which accepts the following options:

```
Options: -n<N>: execute the given statement <N> times in a loop. If this value is not given, a fitting value is chosen.

-r<R>: repeat the loop iteration <R> times and report the mean and standard deviation over the runs. Default: 7

-t: use time.time to measure the time, which is the default on Unix. This function measures wall time.

-c: use time.clock to measure the time, which is the default on Windows and measures wall time. On Unix, resource.getrusage is used instead and returns the CPU user time.

-p<P>: use a precision of <P> digits to display the timing result. Default: 3

-q: Quiet, do not print result.

-o: return a TimeitResult that can be stored in a variable to inspect
```
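
Outside a notebook, similar measurements can be taken with the standard-library `timeit` module. The sketch below is an illustration under stated assumptions: the `inner` function and the test vector `vl` are small stand-ins defined here, and the `number` and `repeat` arguments play roughly the roles of `-n` and `-r`.

```python
import timeit

def inner(v1, v2):
    """Inner product of two equal-length sequences (stand-in for the notebook's version)."""
    return sum(x * y for x, y in zip(v1, v2))

vl = [float(i) for i in range(1000)]  # a small hypothetical test vector

loops = 100    # statements per timing run, like -n
repeats = 5    # number of timing runs, like -r

# timeit.repeat returns the total time of each run, in seconds
times = timeit.repeat(lambda: inner(vl, vl), number=loops, repeat=repeats)
per_loop = [t / loops for t in times]

print(f"best of {repeats} runs: {min(per_loop) * 1e6:.1f} µs per loop")
```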

Within a notebook, we can apply `%%timeit` (the cell-magic form) to an entire cell

In [186]:
import time

In [187]:
%%timeit
time.sleep(0.01)
inner(vl,vl)

18.1 ms ± 465 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [188]:
%%timeit
#time.sleep(0.01)
inner(vl,vl)

4.42 ms ± 37.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


or `%timeit` (the line-magic form) to a single line

In [176]:
%timeit inner(vl,vl)

4.52 ms ± 51.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


We can save the results into a `TimeitResult` object by passing the `-o` option.

In [190]:
time_inner_list = %timeit -o -r 10 inner(vl,vl)

4.5 ms ± 91 µs per loop (mean ± std. dev. of 10 runs, 100 loops each)


In [191]:
time_inner_list.average


0.00450143289094558

In [189]:
%%timeit
inner_v2(vl,vl)

6.54 ms ± 134 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


## Using [numpy](http://www.numpy.org/)

The core Python language was not written with linear algebra or scientific computing in mind. The pure-Python functions we are writing run slowly; we write them only to illustrate programming ideas. In practice, code that relies on linear algebra would use `numpy`, a third-party package for Python that provides high-performance manipulation of vectors and matrices (and n-dimensional arrays in general).

For comparison, here is how to time the inner product with numpy:

In [192]:
# `v` must be a NumPy array here. In the original session `v` had been rebound
# to the tuple ((4, 11), 'g'), so np.inner(v,v) failed with
# "ValueError: setting an array element with a sequence".
v = np.array(vl)
time_inner_np = %timeit -o -r 10 np.inner(v,v)

In [194]:
%%timeit
np.inner(v,v)