# SAVED TEXT

## Unit-vector representation
Since we can add, subtract and scale vectors, we can represent any vector in a vector space as a sum of unit vectors, each multiplied by its own scalar. This gives us another means of representing a vector. In the 2-D case, the $x_{1}$ component is represented by a scalar multiple of $\hat{\mathbf{i}}$, and the $x_{2}$ component is represented by a scalar multiple of $\hat{\mathbf{j}}$. For example, the two representations below correspond to the same vector:

$$\begin{bmatrix} 2 \\ 3 \end{bmatrix} = 2\mathbf{\hat{i}} +  3\mathbf{\hat{j}}$$

These concepts extend to the N-D case, where each of the N dimensions has its own unit vector. In general, the unit-vector representation becomes inefficient for high-dimensional vectors, since every dimension needs its own term.
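
As a rough sketch of the idea, the snippet below rebuilds the vector from the example above as a weighted sum of unit vectors in NumPy. The names `i_hat`, `j_hat` and the helper `from_unit_vectors` are illustrative choices, not part of the original notebook.

```python
import numpy as np

# Standard unit vectors in 2-D (illustrative names)
i_hat = np.array([1, 0])
j_hat = np.array([0, 1])

# The same vector written in both representations
column_form = np.array([2, 3])
unit_vector_form = 2 * i_hat + 3 * j_hat
print(np.array_equal(column_form, unit_vector_form))


def from_unit_vectors(scalars):
    """Sum of scalar multiples of the N standard unit vectors."""
    basis = np.eye(len(scalars))  # row k is the k-th unit vector
    return sum(s * e for s, e in zip(scalars, basis))


# In N-D, one term per dimension is needed
print(from_unit_vectors([2, 3, 5]))
```

The N-D helper makes the cost visible: the sum has as many terms as the vector has dimensions.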


## Geometric Inner Product
This becomes even clearer when we use the geometric definition of the inner product, which also applies to vectors of any dimension:

$$\mathbf{\vec{x}}\cdot \mathbf{\vec{y}} = |\mathbf{\vec{x}}||\mathbf{\vec{y}}|\cos(\theta)$$

The algebraic and geometric definitions are equivalent. With $\theta$ being the angle between the two vectors, we can now define different cases:
- If $\theta = 0$, the vectors are perfectly aligned and have the same direction, meaning the vectors are __parallel__. For fixed magnitudes, this is the orientation that returns the __greatest__ inner product.
- If $\theta = \pi$, the vectors have exactly the opposite orientation, meaning the vectors are __anti-parallel__. For fixed magnitudes, this is the orientation that returns the __lowest__ inner product.
- If $\theta =  \frac{\pi}{2}$, the vectors are __orthogonal__, meaning the inner product is 0.

Sometimes, parallel and anti-parallel vectors are referred to within the same category: __collinear__.
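
The three cases can be checked numerically. The sketch below recovers $\theta$ by rearranging the geometric definition; the function name `angle_between` and the example vectors are assumptions made for illustration.

```python
import numpy as np


def angle_between(x, y):
    """Angle in radians between two non-zero vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cos_theta = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    # Clipping guards against floating-point values slightly outside [-1, 1]
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))


reference = np.array([1.0, 1.0])
examples = {
    "parallel": np.array([2.0, 2.0]),
    "anti-parallel": np.array([-2.0, -2.0]),
    "orthogonal": np.array([-1.0, 1.0]),
}
for label, other in examples.items():
    print(label, "inner product:", np.dot(reference, other),
          "angle:", angle_between(reference, other))
```

Because of floating-point rounding, the printed angles may differ from $0$, $\pi$ and $\frac{\pi}{2}$ by a tiny amount.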

Another important property to understand, which some of you may have already realised from the diagrams above, is that if two vectors are parallel, they are positive scalar multiples of each other, whereas if two vectors are anti-parallel, they are negative scalar multiples of each other. This also means collinear vectors share the same unit vector, up to its sign! We can generalize this notion as shown below, where if $\vec{\mathbf{x}}$ and $\vec{\mathbf{y}}$ are collinear:

$$ \vec{\mathbf{x}} = a\vec{\mathbf{y}} $$
for some non-zero scalar $a$, where:
- the vectors are parallel if $a>0$
- the vectors are anti-parallel if $a<0$

We now have a simple means of classifying a pair of vectors. If their inner product is 0, the two vectors are orthogonal. If one is a scalar multiple of the other, they are either parallel or anti-parallel, depending on the sign of the scalar. A further distinction applies to a set of vectors: if all vectors in the set are mutually orthogonal and have unit length, they are classified as __orthonormal vectors__.
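
The classification above can be sketched in code as follows. The function names `classify_pair` and `is_orthonormal`, and the tolerance `tol`, are hypothetical choices for this example. The collinearity test relies on the fact that $|\vec{\mathbf{x}}\cdot\vec{\mathbf{y}}|$ equals the product of the lengths only when one vector is a scalar multiple of the other.

```python
import numpy as np


def classify_pair(x, y, tol=1e-9):
    """Label two non-zero vectors: parallel, anti-parallel, orthogonal or neither."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dot = np.dot(x, y)
    lengths = np.linalg.norm(x) * np.linalg.norm(y)
    if abs(dot) <= tol * lengths:
        return "orthogonal"
    # Collinear vectors satisfy |x . y| == |x||y| (up to rounding)
    if abs(abs(dot) - lengths) <= tol * lengths:
        return "parallel" if dot > 0 else "anti-parallel"
    return "neither"


def is_orthonormal(vectors, tol=1e-9):
    """True if the vectors (one per row) are mutually orthogonal with unit length."""
    V = np.asarray(vectors, dtype=float)
    # Entry (i, j) of V @ V.T is the inner product of vector i with vector j
    return bool(np.allclose(V @ V.T, np.eye(len(V)), atol=tol))


print(classify_pair([1, 1], [2, 2]))
print(classify_pair([1, 1], [-3, -3]))
print(classify_pair([1, 0], [0, 5]))
print(is_orthonormal([[1, 0], [0, 1]]))
```

Exact comparisons with 0 are avoided on purpose, since floating-point inner products rarely come out exactly zero.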


## Cosine Similarity
We now understand that the inner product of two vectors represents how similar the two vectors are, scaled by their magnitudes. But what if we are uninterested in their magnitudes, and only interested in their alignment? Well, now we know that every non-zero vector has a respective unit vector that represents its direction with a magnitude of 1. To calculate the unit vector, we can simply divide the vector by its length. So, to calculate the similarity of two vectors, we can simply calculate the inner product of their unit vectors. A neat trick is that when we combine the geometric and algebraic definitions of the inner product, we get what is known as __cosine similarity__:

$$\text{similarity} = \cos(\theta) = \frac{\vec{\mathbf{x}}\cdot \vec{\mathbf{y}}} {||\vec{\mathbf{x}}||\,||\vec{\mathbf{y}}||}$$

This means that the cosine of the angle between two vectors is a useful measure of their alignment, irrespective of their lengths, where:
- $\cos(\theta) = 1 $: parallel vectors
- $\cos(\theta) = -1 $: anti-parallel vectors
- $\cos(\theta) = 0 $: orthogonal vectors
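
To illustrate why the lengths drop out, the sketch below computes the similarity as the inner product of the two unit vectors and compares it before and after rescaling one vector. The name `cosine_similarity_sketch` and the example vectors are assumptions for this illustration.

```python
import numpy as np


def cosine_similarity_sketch(x, y):
    """Inner product of the unit vectors of two non-zero vectors."""
    x_hat = x / np.linalg.norm(x)  # unit vector of x
    y_hat = y / np.linalg.norm(y)  # unit vector of y
    return np.dot(x_hat, y_hat)


x = np.array([3.0, 4.0])
y = np.array([4.0, 3.0])

# The inner product changes when x is scaled...
print(np.dot(x, y), np.dot(10 * x, y))
# ...but the cosine similarity does not
print(cosine_similarity_sketch(x, y), cosine_similarity_sketch(10 * x, y))
```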


## Vector Norms
Before, to calculate the length, or magnitude, of a vector, we calculated what is known as __Euclidean distance__, which is the most common metric for vector magnitude. However, there are infinitely many other ways to measure the length of a vector, known as __vector norms__. Generally, for an N-D vector, we define vector norms as:

$$||\vec{\mathbf{x}}||_{p} = \left[ \sum_{i=1}^{N}|x_{i}|^{p} \right] ^{\frac{1}{p}}$$

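where $p \geq 1$, for example $p=1,2,3,\dots$, and the case $p=\infty$ is understood as the limit of the expression as p grows.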

As you can probably tell from the definition, Euclidean distance is formally known as the 2-norm, denoted as $||\mathbf{\vec{x}}||_{2}$. However, since it is so common, we normally drop the '2'. Sometimes, the 1-norm is also used, although less frequently. So what are the differences between the different norms? By looking at the general equation for the p-norm, we see that we take the $p^{th}$ root of the sum of the absolute component values, each raised to the power of p. This means that the higher the value of p, the less weight the smaller components carry. In fact, when we reach the $\infty$-norm, we simply get the largest absolute component value of our vector.

Below, we compute some vector norms in Standard Python and in NumPy. As the general formula cannot be evaluated directly at $p=\infty$, we will approximate the $\infty$-norm with a large enough value for p.
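
A small NumPy sketch of how the p-norm behaves as p grows, using an illustrative vector with negative components (not the vector from the notebook's own cells). It also passes `np.inf` as the order, which `np.linalg.norm` accepts for 1-D arrays, as a reference value for the large-p approximation.

```python
import numpy as np

vector = np.array([1.0, -4.0, 9.0, -15.0])

# p-norms for increasing p; the values shrink towards the largest |component|
norms = {p: np.linalg.norm(vector, p) for p in (1, 2, 3, 10, 100)}
for p, value in norms.items():
    print(f"{p}-norm:", value)

# Reference value: the largest absolute component
inf_norm = np.linalg.norm(vector, np.inf)
print("inf-norm:", inf_norm)
print("largest absolute component:", np.max(np.abs(vector)))
```

The absolute values in the definition matter here: without them, the negative components would cancel against the positive ones in the 1-norm.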



## Gram-Schmidt Orthonormalization Process Intuition

### Make our own version of this diagram (this one has the right idea but isn't _exactly_ what we want)!

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/9/97/Gram%E2%80%93Schmidt_process.svg/1280px-Gram%E2%80%93Schmidt_process.svg.png" width="500px" height ="500px">

The goal is to create a vector orthogonal to the normalized one ($\mathbf{\hat{e_{1}}}$) based on another vector ($\mathbf{\vec{v_{2}}}$). This process works because the inner product of a vector with a unit vector gives the signed magnitude of its projection onto that unit vector, and multiplying this magnitude by the unit vector ($\mathbf{\hat{e_{1}}}$) gives the projection itself, with the correct direction. As shown above, by subtracting the projection, we make our vector orthogonal to the normalized vector ($\mathbf{\hat{e_{1}}}$). We then normalize the vector that is orthogonal to the first to get $\mathbf{\hat{e_{2}}}$. To make the 3rd vector orthonormal to the first two, we must subtract both of its projections onto them from the 3rd vector, $\mathbf{\vec{v_{3}}}$, then normalize it to get $\mathbf{\hat{e_{3}}}$. In general terms, to obtain a vector that is orthonormal to the already orthonormalized vectors in our set, we subtract its projections onto them from the vector at hand, then normalize it.
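
The procedure described above can be sketched as follows. This is a minimal version of the classical process, assuming the input vectors are linearly independent (otherwise a zero vector would be normalized); the function name `gram_schmidt` and the example vectors are illustrative.

```python
import numpy as np


def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (one per row)."""
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        for e in basis:
            # np.dot(v, e) is the signed length of the projection of v onto e;
            # multiplying by e gives the projection, which is then subtracted
            w = w - np.dot(v, e) * e
        # Normalize what is left to obtain the next orthonormal vector
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)


vectors = np.array([[3.0, 1.0], [2.0, 2.0]])
E = gram_schmidt(vectors)
print(E)
# Inner products between all pairs: approximately the identity matrix
print(E @ E.T)
```

The first row of the result is simply the first input vector divided by its length, matching the role of $\mathbf{\hat{e_{1}}}$ in the text.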


# Rotations and Scaling Content
Generally, two of the most common types of linear transformations are __rotation__ and __scaling__ transformations. Scaling is a transformation that returns a scalar multiple of the input vector, whereas rotation is a transformation that does not affect the length of the vector, but rotates it in space. Examples of these are shown below.

$$\text{Scaling: } \begin{bmatrix}2 & 0 \\ 0 & 2\end{bmatrix} \times \begin{bmatrix}1 \\ 1\end{bmatrix} = \begin{bmatrix} 2 \\ 2\end{bmatrix}$$

$$\text{Rotation: } \begin{bmatrix}0 & -1 \\ 1 & 0\end{bmatrix} \times \begin{bmatrix}2 \\ 0\end{bmatrix} = \begin{bmatrix} 0 \\ 2\end{bmatrix}$$
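
The two examples above can be reproduced with NumPy. The helper names `scaling_matrix` and `rotation_matrix` are hypothetical; the rotation helper uses the standard 2-D counter-clockwise rotation matrix, which reduces to the matrix in the example for an angle of $\frac{\pi}{2}$.

```python
import numpy as np


def scaling_matrix(factor, dim=2):
    """Matrix that scales every component by the same factor."""
    return factor * np.eye(dim)


def rotation_matrix(theta):
    """2-D counter-clockwise rotation by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


scaled = scaling_matrix(2) @ np.array([1, 1])
rotated = rotation_matrix(np.pi / 2) @ np.array([2, 0])

print("Scaled:", scaled)
# Entries that should be exactly 0 may show up as tiny floating-point values
print("Rotated:", rotated)
# Rotation leaves the length unchanged
print(np.linalg.norm(np.array([2, 0])), np.linalg.norm(rotated))
```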

# Saved Code

In [None]:
# Parallel Vectors Plotly
import plotly.graph_objects as go

# Vector components
x1 = [0,1,2]
x2 = [0,1,2]


fig = go.Figure(data=[go.Scatter(
    x=x1, y=x2,
    mode='markers',
    marker = dict(size=[10,40,40], 
                  color=["black","orange","orange"]),
    )
])

fig.update_layout(
    title=r"Parallel Vectors ($x \cdot y = 4$)",
    xaxis_title="$x_{1}$",
    yaxis_title="$x_{2}$",
)
fig.add_trace(go.Scatter(x=[0, 1], y=[0, 1], marker_color="black"))
fig.add_trace(go.Scatter(x=[0, 2], y=[0, 2], marker_color="black"))

fig.update_layout(showlegend=False)
fig.show()

In [None]:
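# Cosine similarity functions
import numpy as np

# Standard Python version; relies on inner_product and vector_length,
# which are assumed to be defined earlier in the notebook
def cosine_similarity(v1,v2):
    product = inner_product(v1,v2)
    similarity = product/(vector_length(v1)*vector_length(v2))
    return similarity

# NumPy version; it shares the name above, so after running this cell
# this definition is the active one
def cosine_similarity(v1,v2):
    similarity = np.dot(v1,v2)/(np.linalg.norm(v1)*np.linalg.norm(v2))
    return similarity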

In [None]:
# Vector Norms
import numpy as np

# Defining our vector
vector1 = [1,4,9,15]

# Standard Python
def p_norm(v,p):
    norm = 0
    for val in v:
        norm += abs(val)**p # absolute value, as in the p-norm definition
    norm = norm ** (1/p)
    return norm
print("Standard Python norms:")
print("1-norm:",p_norm(vector1,1))
print("2-norm:",p_norm(vector1,2))
print("3-norm:",p_norm(vector1,3))
print("inf-norm:",p_norm(vector1,100))
print()


# NumPy
print("NumPy norms:")
print("1-norm:",np.linalg.norm(vector1,1))
print("2-norm:",np.linalg.norm(vector1,2))
print("3-norm:",np.linalg.norm(vector1,3))
print("inf-norm:",np.linalg.norm(vector1,100))

In [None]:
# Vector Addition

# def vec_add(v1,v2):
#     resultant_vector = []
#     for v1_val,v2_val in zip(v1,v2):
#         resultant_vector.append(v1_val + v2_val)
#     return resultant_vector

In [None]:
# Vector Animation

import ipywidgets as widgets
from IPython.display import display
import matplotlib.pyplot as plt
import numpy as np

%matplotlib nbagg

fig, ax = plt.subplots(1, figsize=(10,4))
plt.suptitle("Vector Example")

def update_plot(x1,x2):
    """
    This function updates our plot when we use the interactive widgets
    """
    ax.clear()
    # Adding points
    x = np.array([0, x1])
    y = np.array([0, x2])
    
    ax.plot(x,y,'-o',color='orange')#,label=units.format(x1,x2))
    ax.set_xlim([0,5])
    ax.set_ylim([0,5])
    plt.xlabel("$x_{1}$")
    plt.ylabel("$x_{2}$")
    plt.show()
    
x1 = widgets.FloatSlider(min=0, max=5, value=1, description="$x_{1}$")
x2 = widgets.FloatSlider(min=0, max=5, value=1, description="$x_{2}$")
    
widgets.interactive(update_plot, x1=x1, x2=x2)

In [None]:
# Matrix Python Addition and Subtraction

A = [[1,2],[3,2]]
B = [[3,1],[1,1]]
C = [[4,2],[2,2]]

# Standard Python
def mat_add(matrix1,matrix2):
    result = []
    for row1,row2 in zip(matrix1,matrix2): # iterating for row
        new_row = []
        for val1,val2 in zip(row1,row2): # iterating for each column
            new_row.append(val1+val2)
        result.append(new_row) # adding row after operation is done
    return result

def mat_subtract(matrix1,matrix2):
    result = []
    for row1,row2 in zip(matrix1,matrix2): # iterating for row
        new_row = []
        for val1,val2 in zip(row1,row2): # iterating for each column
            new_row.append(val1-val2)
        result.append(new_row) # adding row after operation is done
    return result

print("Standard Python:")
print("Addition:",mat_add(A,B))
print("Subtraction:",mat_subtract(C,B))
print()

In [None]:
# Matrix Python Transpose

A = [[1,2],[2,3],[3,4]]
B = [[2,2],[1,1]]

# Standard Python
def transpose(mat):
    mat_transpose = [] # initialise transposed matrix
    for idx2 in range(len(mat[0])): # iterate through columns
        new_row = []
        for idx1 in range(len(mat)): # iterate through rows
            new_row.append(mat[idx1][idx2])
        mat_transpose.append(new_row)
    return mat_transpose

print("Standard Python:")
print("A' =",transpose(A))
print("B' =",transpose(B))
print()



In [None]:
# Identity Matrix

A = [[1,2],[2,1]]

# Standard Python
def identity(dim):
    identity_matrix = []
    for i in range(dim):
        new_row = []
        for j in range(dim):
            if i==j:
                new_row.append(1)
            else:
                new_row.append(0)
        identity_matrix.append(new_row)
    return identity_matrix
I = identity(2)
print("Standard Python:")
print("2x2 identity matrix:",I)
# matrix_product is assumed to be defined earlier in the notebook
print("Matrix product:",matrix_product(A,I))
print()

In [None]:
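# Scaling
import numpy as np
import plotly.figure_factory as ff

# Arrows start at (x, y) and have components (u, v): here (1,1) and its scaled version (2,2)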
x,y = np.array([0,0]),np.array([0,0])
u = np.array([1,2])
v = np.array([1,2])

fig = ff.create_quiver(x, y, u, v,
                       scale=1,
                       arrow_scale=.2,
                       name='quiver',
                       line_width=3
                      )

# # updating layout for a larger range of values and displaying axes
fig.update_layout(yaxis=dict(range=[0,5]),
                  xaxis=dict(range=[0,5]),
                  title="Scaling",
                  xaxis_title="x1",
                  yaxis_title="x2"
                 )
# fig.update_layout(xaxis=dict(range=[0,5]))

fig.update_yaxes(nticks=20)
fig.update_xaxes(nticks=20)


fig.show()

In [None]:
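# Rotation
import numpy as np
import plotly.figure_factory as ff

# Arrows start at (x, y) and have components (u, v): here (2,0) and its rotated version (0,2)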
x,y = np.array([0,0]),np.array([0,0])
u = np.array([2,0])
v = np.array([0,2])

fig = ff.create_quiver(x, y, u, v,
                       scale=1,
                       arrow_scale=.2,
                       name='quiver',
                       line_width=3
                      )

# # updating layout for a larger range of values and displaying axes
fig.update_layout(yaxis=dict(range=[0,5]),
                  xaxis=dict(range=[0,5]),
                  title="Rotation",
                  xaxis_title="x1",
                  yaxis_title="x2"
                 )
# fig.update_layout(xaxis=dict(range=[0,5]))

fig.update_yaxes(nticks=20)
fig.update_xaxes(nticks=20)


fig.show()

In [None]:
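# Orthogonal Matrix Checker
import numpy as np

# Defining our matrices
A = np.array([[0.5**0.5,0.5**0.5],[-0.5**0.5,0.5**0.5]]) # same matrix as in the example above
B = np.array([[1,1],[1,2]])

def is_orthogonal(matrix):
    # A matrix is orthogonal when it is square and its columns are orthonormal,
    # i.e. matrix^T @ matrix equals the identity (compared with a float tolerance)
    matrix = np.asarray(matrix, dtype=float)
    if matrix.ndim != 2 or matrix.shape[0] != matrix.shape[1]:
        return False
    return bool(np.allclose(matrix.T @ matrix, np.eye(matrix.shape[0])))
print(is_orthogonal(A))
print(is_orthogonal(B))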