# IMDS Computer Workshop 4
### *By Jeffrey Giansiracusa - Michaelmas 2023*


This worksheet covers the content of lectures:

    4.1 Vectors and vector spaces
    4.2 Dot product
    4.3 Projections

Key points for you to learn:

* Vectors, addition and scalar multiplication
* The dot product and how we use it to measure lengths and angles
* Projection of vectors onto a line


# Initialization code to run before you start your work

Click on the cell below and then type Shift-Return to execute it.

In [None]:
import numpy as np
import math
from mpl_toolkits import mplot3d
import matplotlib.pyplot as plt

from bokeh.io import output_notebook, show
from bokeh.plotting import figure
output_notebook()


# Input is a list of 2d vectors to be plotted.
def Plot2dVectors(list_of_endpoints):   
    p = figure(width=600, height=600, title="Vectors!")
    for vect in list_of_endpoints:
        xcoords = [0,vect[0]]
        ycoords = [0,vect[1]]
        p.line(xcoords, ycoords, line_width=2)
        p.circle([vect[0]], [vect[1]], color='red', size=6)
    show(p)

# Input is a list of 3d vectors
# This will draw the vectors with shadows and vertical rise indicators
def Plot3dVectors(list):       
    ax = plt.axes(projection = '3d')
    for vect in list:
        ax.plot([0,vect[0]], [0,vect[1]], [0,vect[2]],color='blue')
        ax.plot([0,vect[0]], [0,vect[1]], [0,0],color='grey')
        ax.plot([vect[0],vect[0]], [vect[1],vect[1]], [0,vect[2]], '--', color='grey')
    ax.plot([0,0], [-10,10], 'g--')
    ax.plot([-10,10], [0, 0], 'g--')
    plt.draw()
    plt.show()    

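# Input is a list of 3d points [x,y,z]
# This will draw the points as dots coloured by height, with grey shadows on the floor of the plot
def Plot3dDots(list):       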
    limit=20
    plt.figure(figsize=(10, 8), dpi=80)
    ax = plt.axes(projection = '3d')
    ax.set_xlim(-limit,limit)
    ax.set_ylim(-limit,limit)
    ax.set_zlim(-limit,limit)

    # Draw the shadow
    ax.scatter3D([item[0] for item in list], [item[1] for item in list], [-20 for item in list], color='grey')

    # Draw the coordinate axes
    ax.plot([0,0], [-limit,limit], [0,0], 'g--')
    ax.plot([-limit,limit], [0, 0], [0,0], 'g--')
    ax.plot([0,0], [0, 0], [-limit,limit], 'g--')

    # Now draw the points
    ax.scatter3D([item[0] for item in list], [item[1] for item in list], [item[2] for item in list], c=[item[2] for item in list])

    plt.draw()
    plt.show()   

### If you have Holoviews and Plotly installed and working, then you can get an interactive 3d view using the fancy version defined below.

In [None]:

import holoviews as hv
from holoviews import dim, opts
hv.extension('plotly')

def FancyPlot3dVectors(list):
    xcoords=[]
    ycoords=[]
    zcoords=[]
    colorlist=[]
    for vect in list:
        steps = np.mgrid[0:100]*0.01
        x = steps*vect[0]
        y = steps*vect[1]
        z = steps*vect[2]
        xcoords += [val for val in x]
        ycoords += [val for val in y]
        zcoords += [val for val in z]
        colorlist += [(1-val) for val in steps]
    xcoords = np.array(xcoords)
    ycoords = np.array(ycoords)
    zcoords = np.array(zcoords)
    return hv.Scatter3D((xcoords, ycoords, zcoords)).opts(cmap='fire', color=colorlist, size=5, width=800, height=800)


# -----------------------------------------------------
# Now some quick basics in Python


We can easily do basic mathematical operations:

* addition and subtraction: + and -
* multiplication:  *
* division: /
* powers x^y:  math.pow(x,y)
* useful functions:  math.exp(), math.sqrt(), math.log(), etc
* trig functions: math.sin(), math.cos(), math.tan(), math.asin(), math.acos(), math.atan()  (all in radians)
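
The operations in the list above can be tried out with a few numbers. This is only a small sketch; the variable names (`x`, `y`, `angle`, and so on) are made up for illustration.

```python
import math

x, y = 3.0, 2.0

total = x + y               # addition
quotient = x / y            # division always gives a float
power = math.pow(x, y)      # x to the power y, same value as x ** y
root = math.sqrt(16.0)      # square root

# Trig functions work in radians, so acos returns an angle in radians.
angle = math.acos(0.0)              # pi/2
angle_in_degrees = math.degrees(angle)  # convert to degrees if you prefer

print(total, quotient, power, root, angle, angle_in_degrees)
```

If you prefer to think in degrees, `math.radians()` and `math.degrees()` convert in each direction.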


### Vectors

Python lists are okay for storing data, but to do mathematics gracefully, it is helpful to convert them into vectors for use with NumPy. As we saw earlier, the command np.array(...) turns a list of numbers into a NumPy vector.

In [None]:
U = np.array( [1,2] )   
V = np.array( [2,10] )

A = [5.1, 6.8, 9, 10]
W = np.array(A)

print(U,V,W)

Now we can do operations with vectors, such as addition and multiplication by a scalar.

In [None]:
print('U =',U)
print('V =',V)
print('U+V =',U + V)
print('4U =',4*U)

A linear combination of vectors is thus very easy with NumPy:

In [None]:
U = np.array([1,2, 3])   
V = np.array([4,5,6])
W = np.array([7,8,9])

10*U + 2*V - 5*W 

A useful trick with NumPy is to apply some mathematical operation to every element in a vector.

For example, 
* U+1 
adds 1 to each element of U, and
*  U\**2
squares each element of U.

In contrast, if you wanted to square every element in a list *without NumPy*, then you would need a loop, for example a list comprehension like this:
* ListOfSquares = [x\**2 for x in ListOfNumbers]
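
Here is a small sketch comparing the two approaches side by side. The names `sample`, `shifted`, `squared` and `squares_from_list` are chosen only for this illustration.

```python
import numpy as np

sample = np.array([1, 2, 3])

shifted = sample + 1      # adds 1 to every element
squared = sample ** 2     # squares every element

# The same squaring on a plain Python list needs an explicit loop.
plain_list = [1, 2, 3]
squares_from_list = [x ** 2 for x in plain_list]

print(shifted, squared, squares_from_list)
```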


We can also apply certain mathematical functions to each element in a vector.  For example, if you want to take the cosine of every number in a vector, you can use

* np.cos(U)

Note that we have to use np.cos instead of math.cos because math.cos doesn't know how to work with vectors.
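
A short sketch of the difference, using a made-up vector of angles called `angles`. The `try`/`except` is only there so that the cell keeps running after `math.cos` fails.

```python
import math
import numpy as np

angles = np.array([0.0, math.pi / 2, math.pi])

# np.cos is applied to each element separately.
cosines = np.cos(angles)
print(np.round(cosines, 6))

# math.cos expects a single number, so a vector with several entries raises a TypeError.
try:
    math.cos(angles)
    math_cos_failed = False
except TypeError as err:
    math_cos_failed = True
    print("math.cos failed:", err)
```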

There is more.  If U+V means add each element of U to the corresponding element of V, can you guess what U\**V or U/V might do?  If you're not sure, go ahead and try it below.



In [None]:
# code here

### Use U @ V for the dot product of vectors
NumPy has an operator that automatically computes the dot product of two vectors (as long as they both have the same length).
Use U @ V to compute $\vec{U}\cdot \vec{V}$.

(Next week we'll see that the @ operator is also used for matrix multiplication.)
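
As a quick check of what the @ operator does, this sketch compares it with the sum of products written out by hand. The vectors `a_vec` and `b_vec` are arbitrary examples, not part of the exercises.

```python
import numpy as np

a_vec = np.array([1, 2, 3])
b_vec = np.array([4, 5, 6])

# Dot product with the @ operator: 1*4 + 2*5 + 3*6
dot_with_operator = a_vec @ b_vec

# The same sum of products written out as a loop
dot_by_hand = sum(a * b for a, b in zip(a_vec, b_vec))

# The dot product of a vector with itself is the sum of the squares of its entries
self_dot = a_vec @ a_vec

print(dot_with_operator, dot_by_hand, self_dot)
```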

### Visualisations

In the initialisation block at the top, there are some easy-to-use plotting functions defined for you.

**Plot2dVectors(...)** takes a list of 2d (x,y) vectors as input (the vectors can be np.arrays or simply Python lists of numbers) and draws them in the plane.

**Plot3dVectors(...)** does the same in 3d.  

Making sense of 3d plots is sometimes difficult, and it can be useful to have a plot that you can rotate around and look at from different directions.  If **Plotly** is working for you, you can use the 'fancy' version instead:

**FancyPlot3dVectors(...)**



In [None]:
# An example of 2d plotting

vect1 = [2, 5]
vect2 = [5,-4]
vect3 = [-4,0]

list_of_vectors = [vect1, vect2, vect3]
Plot2dVectors(list_of_vectors)



In [None]:
# An example of 3d plotting.
# The input is a list containing some number of 3d vectors [x,y,z] (as Python lists or np.array objects)

Plot3dVectors( [[7,7,3], [5,-5,2], [-8,1,5], [-5, -10, 5]] )

In [None]:
FancyPlot3dVectors( [[7,7,3], [5,-5,2], [-8,1,5], [-5, -10, 5]] )


---
#  Now it is your turn to work on some exercises 


---
# Exercise 1

1. What is the length of the vector $\vec{v}=(1,2)$?  Use the dot product.
2. What is the angle between the vectors $\vec{u}=(3,0)$ and $\vec{v}$? Use the dot product. 
3. What is the angle between the vectors $\vec{u}$ and $\vec{w}=(-1,2)$?
4. Now find the angle between the 3d vectors (5,1,0) and (-4,2,7).

In [None]:
# Write some code here

u = ...
v = ...
w = ...



5. Write a function **Proj(vect1, vect2)** that calculates and returns the projection of vect1 onto vect2.  Make sure it works in both 2d and 3d (and higher dimensions).
6. Use your function to calculate the projection of $\vec{u}$ onto $\vec{v}$, and the projection of $\vec{w}$ onto $\vec{v}$.
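
As a reminder, one standard way to write the projection of a vector $\vec{a}$ onto a nonzero vector $\vec{b}$ uses only dot products (the names $\vec{a}$ and $\vec{b}$ are placeholders here, standing in for vect1 and vect2):

$$ \mathrm{proj}_{\vec{b}}(\vec{a}) = \frac{\vec{a}\cdot\vec{b}}{\vec{b}\cdot\vec{b}}\,\vec{b}. $$

Nothing in this formula depends on the number of coordinates, which is why the same function can work in 2d, 3d, and higher dimensions.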

In [None]:
def Proj(vect1, vect2):
    # put some code here
    ...
    return ...


Here is some code to draw a plot and check that your code is working correctly in 2d.

In [None]:
def Plot2dProjection(vect1, vect2):   
    p = figure(width=800, height=600, title="Vectors!")
    vect3 = Proj(vect1, vect2)
    for vect in [vect1, vect2, vect3]:
        xcoords = [0,vect[0]]
        ycoords = [0,vect[1]]
        p.line(xcoords, ycoords, line_width=2, color='blue')
        p.circle([vect[0]], [vect[1]], color='red', size=6)
    p.line([vect3[0], vect1[0]], [vect3[1], vect1[1]], line_width=2, color='grey')
    show(p)

In [None]:
# Test your projection function here.
Plot2dProjection(..., ...)

---
# Exercise 2

The equation $L_1: 5x - 2y = 0$ defines a line in the plane $\mathbb{R}^2$.  

 1. Find a few points on $L_1$. 
 2. Use your Proj function to find the distance from the point (10,10) to $L_1$.
 

In [None]:
# code here


 Now consider the line $L_2: 4x + 3y = 8$. Note that this line doesn't pass through the origin.
 
 3. Find a couple of points on this line.

If $\vec{v}=(a,b)$ is a vector that lies on the line $L_2$, then translating by $-\vec{v}$ should give us a line that passes through the origin.  

4. Find the distance from the point (10,10) to the line $L_2$. To do this, you can translate everything by $-\vec{v}$ and then use the same method as you used in part 2. 

---
# Exercise 3

We have some hippos in the zoo and the vet comes to check on them.  She measures each hippo's weight W (in tonnes) and body circumference C (in meters).  In a healthy hippo we expect these numbers to be related:
$$ 3W = 2C. $$

The hippos give the following (W,C) vectors.

* Hippo 1: (5, 7.5)
* Hippo 2: (6, 7)
* Hippo 3: (4.5, 6.75)
* Hippo 4: (6.6, 9.9)

Tasks: 
1. Plot these data points.
2. The vet suspects that one hippo is unwell! Which one?  

*Note:  This is a little different from the usual linear regression setup where one variable is independent and the other is dependent.  Here W and C are related, but we shouldn't consider one dependent.  Instead, think of this as being about distance to a line.*

3. Use projection to find out what its weight and circumference should probably be.
4. Can you find a vector $\vec{v}$ such that a hippo is healthy if the dot product $\vec{v} \cdot (W,C)$ is approximately zero?

In [None]:
h1 = [5, 7.4]
h2 = [6.3, 7]
h3 = [4.5, 6.75]
h4 = [6.6, 9.7]

# code here


In [None]:
[100/78, 200/160]

---

# Exercise 4

In machine learning and data science, we often have vectors where the length doesn't matter much but the direction is important.  In a situation like this, if we want to measure how similar two vectors are, we might just look at the angle between the vectors.  In fact, we can use the cosine of the angle as a measure of the similarity.  A cosine of 1 means the vectors point in the same direction, 0 means they are orthogonal, and -1 means they point in opposite directions.  This is called **cosine similarity**.

One example where cosine similarity gets used frequently is in text analysis.  We might compare the relative frequencies of words or letters.  Does one document use consonants more than vowels when compared to another?  Suppose one text has 100 consonants and 78 vowels, while another has 200 consonants and 160 vowels.  The second text is twice as long, so the distance between the points (100,78) and (200,160) might be large, but the ratios 100/78 = 1.28 and 200/160 = 1.25 are very close.  Even though their lengths are very different, these two vectors point in nearly the same direction and so their cosine similarity is close to 1.  
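
The consonant and vowel example can be checked numerically. The sketch below is one possible way to do it; the helper name `cosine_similarity` and the variable names are chosen for this illustration only.

```python
import numpy as np

# (consonants, vowels) for the two texts in the example above
text_a = np.array([100, 78])
text_b = np.array([200, 160])

def cosine_similarity(p, q):
    # cosine of the angle between p and q: dot product divided by the product of the lengths
    return (p @ q) / (np.sqrt(p @ p) * np.sqrt(q @ q))

# Ordinary distance between the two points: large, because one text is twice as long
difference = text_b - text_a
distance = np.sqrt(difference @ difference)

# Cosine similarity: close to 1, because the two vectors point in nearly the same direction
similarity = cosine_similarity(text_a, text_b)

print("distance:", distance)
print("cosine similarity:", similarity)
```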


Here is an example.  Consider the following poems.

### 1 

*Some primal termite knocked on wood

And tasted it, and found it good!

And that is why your Cousin May

Fell through the parlor floor today.*

### 2

*When I was One,

I had just begun.

When I was Two,

I was nearly new.

When I was Three

I was hardly me.

When I was Four,

I was not much more.

When I was Five,

I was just alive.

But now I am Six,

I'm as clever as clever,

So I think I'll be six now for ever and ever. *

### 3

And death shall have no dominion.

Dead man naked they shall be one

With the man in the wind and the west moon;

When their bones are picked clean and the clean bones gone,

They shall have stars at elbow and foot;

Though they go mad they shall be sane,

Though they sink through the sea they shall rise again;

Though lovers be lost love shall not;

And death shall have no dominion.

And death shall have no dominion.

Under the windings of the sea

They lying long shall not die windily;

Twisting on racks when sinews give way,

Strapped to a wheel, yet they shall not break;

Faith in their hands shall snap in two,

And the unicorn evils run them through;

Split all ends up they shan’t crack;

And death shall have no dominion.

And death shall have no dominion.

No more may gulls cry at their ears

Or waves break loud on the seashores;

Where blew a flower may a flower no more

Lift its head to the blows of the rain;

Though they be mad and dead as nails,

Heads of the characters hammer through daisies;

Break in the sun till the sun breaks down,

And death shall have no dominion.

### ----------------

For each of these poems, count how many times the letters d, e, and m occur.  For example, there are two d's in the first line of the first poem.  Put these numbers into a vector for each poem.

P1 = (number of d's, number of e's, number of m's) of poem 1.
P2 = same for poem 2.
P3 = same for poem 3.

Note that one poem is significantly longer than the others, so we expect any letter count to be higher for it.

Calculate the cosine similarity between each pair of P1, P2, P3.


In [None]:
# code here
# It might be helpful to use the str.count method to count occurrences of each letter, as shown below.

a_string = "the quick brown fox"
print(a_string.count('q'))
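
One thing to keep in mind is that the count method is case sensitive, so 'D' and 'd' are counted separately. Whether capital letters should be included is a choice you need to make. The sketch below uses a made-up string called `sample` to show the difference.

```python
sample = "Daisy and David"

# Counts only the lowercase letter d
lowercase_only = sample.count('d')

# Converting to lowercase first counts both 'D' and 'd'
any_case = sample.lower().count('d')

print(lowercase_only, any_case)
```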
