# Linear Algebra

# Vectors

From a computer science point of view, a vector is essentially a list of numbers, for example a position given by three coordinates in space and one in time.

![image.png](attachment:image.png)

The model is shown in pink, the measured data in orange, and the region where they overlap in green. The heights of the remaining pink and orange bars are the residuals. Finding suitable values of $\sigma$ and $\mu$ for that graph helps us fit the best model for the machine learning problem.

Vectors can also be used to solve a system of linear equations, for example:

$-2x + 2y = 20$

$5x + 3y = 6$

In [1]:
import numpy  as np

In [2]:
x = np.linspace(-20, 20, 1000)

In [3]:
a = np.array([[-2, 2],[5, 3]])
b = np.array([20, 6])

In [4]:
np.dot(np.linalg.inv(a), b)

array([-3.,  7.])

### Operations with vectors

For example, a house can be described by a vector of area, rooms, bathrooms, and price.

Example: [120, 2, 1, 150000], in that order.

Vector operations are usually addition and multiplication by a scalar.

### Addition

Let S and R be our vectors.

In [5]:
S = np.array([[1], [2]])
R = np.array([[2], [3]])

In [6]:
S + R

array([[3],
       [5]])

In [7]:
R + S

array([[3],
       [5]])

Vector addition is commutative: S + R equals R + S, regardless of the order in which we add.

![image.png](attachment:image.png)

Since we are just adding the components, vector addition is also associative, i.e. (r + s) + t = r + (s + t).
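As a quick, hedged check of the associativity claim, the sketch below adds three vectors in both groupings. The third vector `t` is a made-up example and does not appear in the notebook.

```python
import numpy as np

s = np.array([1, 2])
r = np.array([2, 3])
t = np.array([-4, 5])  # hypothetical third vector, chosen only for illustration

left = (r + s) + t    # add r and s first
right = r + (s + t)   # add s and t first
same = np.array_equal(left, right)
```

Both groupings add the same components, so `same` is expected to be `True`.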

### Multiplication by a Scalar

This scales a vector along its own direction. If the scalar is positive, the vector is scaled in the same direction; if the scalar is negative, the vector is scaled in the opposite direction.

In [8]:
3* S

array([[3],
       [6]])

In [9]:
-1 * S

array([[-1],
       [-2]])

S points in the opposite direction after we multiply it by a negative scalar. This is what makes vector subtraction possible: subtracting a vector is the same as adding its negative.
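A minimal sketch of that idea, reusing the values of S and R from the addition example: subtracting R is the same as adding -1 * R. The names `difference` and `via_negation` are illustrative only.

```python
import numpy as np

S = np.array([[1], [2]])
R = np.array([[2], [3]])

difference = S - R           # built-in subtraction
via_negation = S + (-1 * R)  # add the vector scaled by -1
```

Both expressions give the column vector with components -1 and -1.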

### Modulus of a vector

Let r = a * i + b * j, where i and j are unit vectors along the x and y axes. The size of r is $\sqrt{a^2 + b^2}$; this is called the modulus of r.

![image-2.png](attachment:image-2.png)

This can be applied to a vector with any number of components.

For r = [3, 2], the modulus of r is:

In [10]:
r = np.array([3, 2])

In [11]:
np.linalg.norm(r)

3.605551275463989
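To illustrate that the same formula works beyond two components, here is a small sketch that computes the modulus by hand and compares it with `np.linalg.norm`. The three-component vector `v` is a hypothetical example, chosen so that the result is a whole number.

```python
import numpy as np

v = np.array([2, 3, 6])  # hypothetical vector: 2^2 + 3^2 + 6^2 = 49

modulus_by_hand = np.sqrt(np.sum(v ** 2))  # square root of the sum of squares
modulus_numpy = np.linalg.norm(v)          # NumPy's Euclidean norm
```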

### Dot Product

$r \cdot s$ is found by multiplying the corresponding components of r and s and adding up the products, i.e. $r_i s_i + r_j s_j + \dots$

In [12]:
r = np.array([[3],[2]])
s = np.array([[-1], [2]])

In [13]:
np.dot(r.T, s)

array([[1]])

**Properties of dot product**

* The dot product is commutative: $r \cdot s = s \cdot r$
* The dot product is distributive over addition: $r \cdot (s + t) = r \cdot s + r \cdot t$
* The dot product is associative over scalar multiplication: $r \cdot (as) = a(r \cdot s)$, where $a$ is a scalar
* $r \cdot r$ is the square of the modulus of $r$, i.e. $r \cdot r = |r|^2$

$r \cdot s = |r| |s| \cos(\theta)$

θ is the angle between r and s


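The dot product measures how well two vectors are aligned: it is positive when they point in broadly the same direction, zero when they are perpendicular, and negative when they point in broadly opposite directions.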
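As a hedged illustration, the cosine formula can be rearranged to recover the angle between two vectors. The sketch below reuses the components of r and s from the dot product example as flat arrays; `cos_theta` and `theta_degrees` are names introduced here.

```python
import numpy as np

r = np.array([3, 2])
s = np.array([-1, 2])

# Rearranged from r.s = |r||s|cos(theta)
cos_theta = np.dot(r, s) / (np.linalg.norm(r) * np.linalg.norm(s))
theta_degrees = np.degrees(np.arccos(cos_theta))
```

Here the dot product is small and positive, so the angle comes out a little under 90 degrees.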

### Projection

![image.png](attachment:image.png)

Here $|s| \cos(\theta)$ is called the projection of s onto r.

This can be used to measure how similar two texts are: once the texts have been converted into vectors, projecting one vector onto the other shows how closely they align.
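One common way to turn this into a similarity score is cosine similarity. The sketch below assumes a toy word-count representation over a made-up vocabulary; the documents, counts, and the helper `cosine_similarity` are all hypothetical and only meant to show the mechanics.

```python
import numpy as np

# Hypothetical word counts over the vocabulary ["matrix", "vector", "cat"]
doc_a = np.array([3, 2, 0])
doc_b = np.array([6, 4, 0])  # same proportions as doc_a, twice as long
doc_c = np.array([0, 0, 5])  # shares no words with doc_a

def cosine_similarity(u, v):
    """Return cos(theta) between u and v using the dot product formula."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

sim_ab = cosine_similarity(doc_a, doc_b)  # parallel vectors
sim_ac = cosine_similarity(doc_a, doc_c)  # perpendicular vectors
```

Dividing by both moduli removes the effect of document length, so `doc_a` and `doc_b` count as fully aligned even though one is twice as long.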

### Dot product examples

The dot product can be used to calculate the size of a vector with any number of dimensions, for example s = [1, 2, 3, 4].

In [14]:
s = np.array([[1], [2], [3], [4]])

The size of s will be sqrt(30), because s · s = 1 + 4 + 9 + 16 = 30.

In [15]:
size_square = np.dot(s.T, s)

Beyond computing sizes, the dot product can be applied to any pair of n-component vectors:

$$
a \cdot b = a_1 b_1 + a_2 b_2 + a_3 b_3 + \dots + a_n b_n
$$


In [16]:
r = np.array([[-5], [3], [2], [8]])
s = np.array([[1], [2], [-1], [0]])

In [17]:
rs = np.dot(r.T, s)
rs

array([[-1]])

### Changing basis

![image.png](attachment:image.png)

Here e1 and e2 are called basis vectors. We define r in terms of these two vectors, so r referred to e1 and e2 is 3e1 + 4e2.

Let's take b1 = [2, 1] and b2 = [-2, 4]. r will have different coordinates when referred to b1 and b2. In some cases this makes computations faster.

In [18]:
r = np.array([[3], [4]])
b1 = np.array([[2], [1]])
b2 = np.array([[-2], [4]])

In [19]:
r

array([[3],
       [4]])

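Because b1 and b2 are orthogonal (b1 · b2 = 0), the coordinates of r in the new basis can be found by projecting r onto each basis vector:

vector_projection_b1 = (r · b1) / (b1 · b1)

vector_projection_b2 = (r · b2) / (b2 · b2)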

In [20]:
vector_projection_b1 = np.dot(r.T, b1)/np.dot(b1.T, b1)
vector_projection_b2 = np.dot(r.T, b2)/np.dot(b2.T, b2)

In [21]:
vector_projection_b2,vector_projection_b1

(array([[0.5]]), array([[2.]]))

In [22]:
rb = b1* (vector_projection_b1) + b2*(vector_projection_b2)
rb

array([[3.],
       [4.]])

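In terms of b1 and b2 as basis vectors, r is [2, 0.5]. Multiplying these coefficients by b1 and b2 and adding the results recovers the original r = [3, 4], as shown above.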
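The projection shortcut relies on the basis vectors being orthogonal. As a hedged sketch of a more general route, the coordinates can also be found by stacking the basis vectors as columns of a matrix and solving a linear system. The names `B`, `coords`, and `is_orthogonal` are introduced here for illustration.

```python
import numpy as np

r = np.array([3.0, 4.0])
b1 = np.array([2.0, 1.0])
b2 = np.array([-2.0, 4.0])

# The projection method is only valid when b1 . b2 == 0.
is_orthogonal = np.isclose(np.dot(b1, b2), 0.0)

# General method: solve B @ coords = r, where the columns of B are b1 and b2.
B = np.column_stack([b1, b2])
coords = np.linalg.solve(B, r)
```

For this basis both methods agree; for a non-orthogonal basis only the linear solve gives the correct coordinates.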

### Applications of changing basis

* Dimensionality Reduction techniques like PCA
* Matrix Factorization: changing basis allows large matrices to be decomposed into smaller matrices
* Solving Linear Equations
* Efficient Computing

# Matrices

Consider the equations 2a + 3b = 8 and 10a + 1b = 13.

These can be solved using matrices.

In [23]:
A = np.array([[2, 3], [10, 1]])
B = np.array([8, 13])


In [24]:
A_inv = np.linalg.inv(A)

# Solve for X (which contains a and b)
X = np.dot(A_inv, B)

In [25]:
X

array([1.10714286, 1.92857143])

In this way, matrices can be used to solve systems of equations.

### How matrices transform space

When we apply a matrix to a vector, the vector usually changes in magnitude and direction. Depending on the result, the transformation can be described as, for example, a shear or a scaling.

### Types of Matrix Transformations

Let's take a vector $b = [[x], [y]]$ and see how matrix transformations act on it.

In [26]:
b = np.array([[2], [1]])

In [27]:
# let x = [[1, 0], [0, 1]]
x = np.array([[1, 0], [0, 1]])

x is the identity matrix, so applying it to b leaves the vector b unchanged.

In [28]:
y = np.dot(x, b)
y

array([[2],
       [1]])

In [29]:
# let x = [[3, 0], [0, 2]]
x = np.array([[3, 0], [0, 2]])

In [30]:
y = np.dot(x, b)
y

array([[6],
       [2]])

After applying x to b, component 1 of b is scaled by 3 while component 2 of b is scaled by 2.

In [31]:
# let x = [[-1, 0], [0, 2]]
x = np.array([[-1, 0], [0, 2]])

In [32]:
y = np.dot(x, b)
y

array([[-2],
       [ 2]])

After applying the matrix x to b, component 1 of b is flipped to the opposite direction but keeps its size, while component 2 of b is doubled.

In [33]:
# let x = [[0, 1], [1, 0]]
x = np.array([[0, 1], [1, 0]])

In [34]:
y = np.dot(x, b)
y

array([[1],
       [2]])

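Applying x = [[0, 1], [1, 0]] to b swaps its two components, which reflects the vector across the line y = x. This is a reflection rather than a rotation: a 90-degree counter-clockwise rotation would use [[0, -1], [1, 0]] instead.

These transformations are usually rotations, scalings, shears, and reflections. They can be used, for example, to create variations of images.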
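For comparison with the examples above, here is a small sketch of a rotation and a shear applied to the same vector b. The shear matrix `shear_x` is a hypothetical choice made for illustration.

```python
import numpy as np

b = np.array([[2], [1]])

rotate_90 = np.array([[0, -1], [1, 0]])  # 90-degree counter-clockwise rotation
shear_x = np.array([[1, 1], [0, 1]])     # hypothetical shear: x picks up a share of y

rotated = rotate_90 @ b  # moves (2, 1) to (-1, 2)
sheared = shear_x @ b    # moves (2, 1) to (3, 1)
```

The rotation keeps the length of b, while the shear changes it.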

### Inverse of a matrix

Let $Ar = S$ be a system of equations. If we can find a matrix $A^{-1}$ such that $A^{-1}A = I$, then multiplying both sides by $A^{-1}$ on the left gives

* $r = A^{-1}S$

In practice, for larger systems, one never solves a linear system by hand, as there are software packages that can do this for you, such as NumPy in Python.

In [35]:
A = np.array([[1, 1, 3], [1, 2, 4], [1, 1, 2]])
Ainv = np.linalg.inv(A)
Ainv

array([[ 0., -1.,  2.],
       [-2.,  1.,  1.],
       [ 1., -0., -1.]])

We can also calculate r directly with the `np.linalg.solve` function, without forming the inverse explicitly.

In [36]:
s = np.array([[9], [7], [2]])


In [37]:
A = np.array([[4, 6, 2], [3, 4, 1],[2, 8, 13]])

r = np.linalg.solve(A, s)
r

array([[ 3.00000000e+00],
       [-5.00000000e-01],
       [-1.98254112e-17]])

### Determinants and inverses

The determinant of a matrix describes how the matrix transforms space: its absolute value is the factor by which areas (or volumes, in higher dimensions) are scaled.

For a matrix $A = [[a, b], [c, d]]$ the determinant is $|A| = ad - bc$. If the determinant of a matrix is 0, the matrix has no inverse. Determinants can be calculated for every square matrix.

In [38]:
A

array([[ 4,  6,  2],
       [ 3,  4,  1],
       [ 2,  8, 13]])

In [39]:
np.linalg.det(A)

-13.999999999999996
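A minimal sketch of the $ad - bc$ formula, compared with `np.linalg.det`. The helper `det_2x2` and both example matrices are introduced here for illustration; `singular` is built so that its second row is twice its first.

```python
import numpy as np

def det_2x2(m):
    """Determinant of a 2x2 matrix via ad - bc."""
    (a, b), (c, d) = m
    return a * d - b * c

scaling = np.array([[3.0, 0.0], [0.0, 2.0]])   # scales areas by a factor of 6
singular = np.array([[1.0, 2.0], [2.0, 4.0]])  # rows are linearly dependent

det_scaling = det_2x2(scaling)
det_singular = det_2x2(singular)
det_scaling_numpy = np.linalg.det(scaling)
```

A determinant of 0 means the matrix squashes the plane onto a line, which is why it cannot be inverted.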

### Einstein summation convention

 ![image.png](attachment:image.png)

Using the method above we can calculate every element of $AB$. This is useful when coding, because three nested loops over i, j and k are enough to compute the elements of the matrix.

This is represented as $C_{ik} = A_{ij} B_{jk}$, where the repeated index $j$ is summed over.
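The three-loop idea can be sketched as follows, reading the convention as $C_{ik} = \sum_j A_{ij} B_{jk}$. The function `matmul_loops` is a hypothetical helper written for illustration, and the result is compared against `np.einsum`, which accepts the same index notation as a string.

```python
import numpy as np

def matmul_loops(A, B):
    """Multiply A (rows x inner) by B (inner x cols) with explicit loops."""
    rows, inner = A.shape
    inner_b, cols = B.shape
    if inner != inner_b:
        raise ValueError("inner dimensions must match")
    C = np.zeros((rows, cols))
    for i in range(rows):
        for k in range(cols):
            for j in range(inner):  # j is the repeated (summed) index
                C[i, k] += A[i, j] * B[j, k]
    return C

A = np.array([[1, 2, 3], [4, 0, 1]])
B = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]])

C_loops = matmul_loops(A, B)
C_einsum = np.einsum("ij,jk->ik", A, B)
```

The explicit loops are slow in pure Python and are only meant to make the index bookkeeping visible; `np.dot` or `np.einsum` would normally be used instead.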

In [40]:
n = np.array([[1, 2, 3], [4, 5, 6]])
m = np.eye(2)

In [41]:
x = np.dot(m, n)
x

array([[1., 2., 3.],
       [4., 5., 6.]])

In [42]:
n = np.array([[2, -1], [0, 3], [1, 0]])
m = np.array([[0, 1, 4, -1],[-2, 0, 0, 2]])

In [43]:
x = np.dot(n,m)
x

array([[ 2,  2,  8, -4],
       [-6,  0,  0,  6],
       [ 0,  1,  4, -1]])

In [44]:
n = np.array([[1, 2, 3], [4, 0, 1]])
m = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
x = np.dot(n,m)
x

array([[4, 3, 5],
       [5, 4, 1]])

### Transpose

The transpose of a matrix is a flipped version of the original matrix, where the rows and columns are switched

i.e. $A^{T}_{ij} = A_{ji}$

In [45]:
m

array([[1, 1, 0],
       [0, 1, 1],
       [1, 0, 1]])

The transpose of a NumPy array is available through its `.T` attribute.

In [46]:
m.T

array([[1, 0, 1],
       [1, 1, 0],
       [0, 1, 1]])

### Orthogonal Matrices

An orthogonal matrix, or orthonormal matrix, is a real square matrix whose columns and rows are orthonormal vectors,

i.e. $a_i \cdot a_j = 0$ for $i \neq j$ and $a_i \cdot a_i = 1$, where $a_i$ is the $i$-th column of $A$.

* For orthogonal matrices $A^{T} = A^{-1}$
* For an orthogonal matrix $AA^T = I$ and $A^{T}A = I$
* $|A| = \pm 1$
* Wherever possible it is better to use orthonormal basis vectors, as they make computing determinants and inverses easy

An example of an orthogonal matrix is A = [[1, 0, 0], [0, 1, 0], [0, 0, 1]].

In [47]:
A = np.array([[1.,0, 0], [0, 1, 0], [0, 0, 1]])

In [48]:
A.T

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [49]:
np.linalg.inv(A)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [50]:
np.linalg.inv(A) == A.T

array([[ True,  True,  True],
       [ True,  True,  True],
       [ True,  True,  True]])
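The identity matrix is a rather trivial case. As a hedged, less trivial sketch, a 2D rotation matrix is also orthogonal, and the properties listed above can be checked numerically. The angle `theta` is an arbitrary choice, and `np.allclose` is used instead of `==` because floating-point results are rarely exactly equal.

```python
import numpy as np

theta = np.pi / 6  # hypothetical rotation angle of 30 degrees
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

columns_orthonormal = np.allclose(Q.T @ Q, np.eye(2))       # Q^T Q = I
inverse_is_transpose = np.allclose(np.linalg.inv(Q), Q.T)   # Q^-1 = Q^T
det_Q = np.linalg.det(Q)                                    # expected close to +1
```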

### The Gram Schmidt Process

![image.png](attachment:image.png)
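The Gram-Schmidt process builds an orthonormal basis from a set of linearly independent vectors: each new vector has its components along the already accepted basis vectors subtracted, and the remainder is normalised. The sketch below is one possible NumPy version under that description, not necessarily the formulation shown in the figure; the function `gram_schmidt`, the tolerance `tol`, and the matrix `V` are assumptions made for illustration.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Return a matrix whose columns form an orthonormal basis
    for the space spanned by the columns of `vectors`."""
    basis = []
    for v in vectors.T.astype(float):  # iterate over the columns
        for e in basis:
            v = v - np.dot(v, e) * e   # remove the component along e
        norm = np.linalg.norm(v)
        if norm > tol:                 # skip linearly dependent vectors
            basis.append(v / norm)
    return np.column_stack(basis)

V = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1]])  # hypothetical input; columns are independent
Q = gram_schmidt(V)
```

The columns of `Q` are mutually perpendicular unit vectors, so `Q` is an orthogonal matrix in the sense of the previous section.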

### Eigen Vectors and Eigen Values

* **Eigen Vectors**: Eigenvectors are the vectors that stay on the same line through the origin during a transformation, even if they get longer, shorter, or flipped
* **Eigen Values**: Eigenvalues are the numbers that indicate how much an eigenvector is stretched or shrunk by that transformation

![image.png](attachment:image.png)

The green, pink and orange vectors are examples of eigenvectors with an eigenvalue of 2 after uniform scaling is applied.

### Calculating Eigen Vectors and Eigen Values

![image.png](attachment:image.png)

Step 1: Find the eigenvalues of the matrix A using the equation det(A − λI) = 0, where I is the identity matrix of the same order as A.


Step 2: The values obtained in Step 1 are named λ1, λ2, λ3, ….


Step 3: Find the eigenvector X associated with the eigenvalue λ1 using the equation (A − λ1I) X = 0.


Step 4: Repeat Step 3 to find the eigenvectors associated with the remaining eigenvalues λ2, λ3, ….

These can be calculated using NumPy for a given matrix

In [51]:
A = np.array([[1, 0], [0, 2]])

In [52]:
values, vectors = np.linalg.eig(A)

In [53]:
values

array([1., 2.])

So for matrix $A$, 1 and 2 are the eigenvalues.

In [54]:
vectors

array([[1., 0.],
       [0., 1.]])

So for matrix $A$, the vectors [1, 0] and [0, 1] (the columns of the output) are the eigenvectors.
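The manual steps can also be traced in code for a 2x2 matrix, where det(M − λI) = 0 expands to λ² − trace(M)·λ + det(M) = 0. This is a sketch under that assumption; the matrix `M` is a hypothetical example, and the eigenvector shortcut used here only works when the top-right entry of `M` is non-zero.

```python
import numpy as np

M = np.array([[2.0, 1.0], [1.0, 2.0]])  # hypothetical symmetric example

# Steps 1-2: solve the characteristic polynomial for the eigenvalues.
trace, det = np.trace(M), np.linalg.det(M)
eigenvalues = np.roots([1.0, -trace, det])

# Steps 3-4: for each eigenvalue, x = [M[0, 1], lam - M[0, 0]] satisfies
# the first row of (M - lam I) x = 0 (this needs M[0, 1] != 0).
eigenvectors = [np.array([M[0, 1], lam - M[0, 0]]) for lam in eigenvalues]

# Check the defining property M x = lam x for every pair.
checks = [np.allclose(M @ x, lam * x)
          for lam, x in zip(eigenvalues, eigenvectors)]
```

`np.linalg.eig` returns eigenvectors scaled to unit length, so its output may differ from these vectors by a constant factor.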

In [55]:
A = np.array([[0, -1], [1, 0]])

In [56]:
values, vectors = np.linalg.eig(A)

In [57]:
values

array([0.+1.j, 0.-1.j])

In [58]:
vectors

array([[0.70710678+0.j        , 0.70710678-0.j        ],
       [0.        -0.70710678j, 0.        +0.70710678j]])

The matrix $A = [[0, -1], [1, 0]]$ above is a 90-degree rotation, so no real vector keeps its direction: there are no real eigenvalues or eigenvectors, and NumPy returns complex ones.

In [59]:
A = np.array([[5, 4], [-4, -3]])
values, vectors = np.linalg.eig(A)
values

array([1.+2.98023224e-08j, 1.-2.98023224e-08j])

For this matrix the characteristic equation is $(\lambda - 1)^2 = 0$, so the exact eigenvalue is 1, repeated twice. The tiny imaginary parts of order $10^{-8}$ in the output are floating-point round-off, not genuinely complex eigenvalues.

### Page Rank Algorithm


PageRank (developed by Larry Page and Sergey Brin) revolutionized web search by generating a
ranked list of web pages based on the underlying connectivity of the web. The PageRank algorithm is
based on an ideal random web surfer who, when reaching a page, goes to the next page by clicking on a
link. The surfer has equal probability of clicking any link on the page and, when reaching a page with no
links, has equal probability of moving to any other page by typing in its URL. In addition, the surfer may
occasionally choose to type in a random URL instead of following the links on a page. The PageRank is
the ranked order of the pages from the most to the least probable page the surfer will be viewing.

![image.png](attachment:image.png)

1. **Link Matrix Creation**: 
   - The first step is to construct the Link Matrix (also known as the stochastic matrix) where each element represents the probability of a user clicking from one page to another. Each column of this matrix corresponds to a page, and each row corresponds to the pages that it links to. The values in each column should sum to 1, representing a probability distribution.

2. **Initialization of Rank Vector**: 
   - Next, we initialize the rank vector, which represents the importance of each page. Initially, all pages are assigned equal rank, reflecting the assumption that each page is equally likely to be visited.

3. **Rank Update Process**: 
   - The ranks are updated iteratively using the formula:
     $$
     r^{(i+1)} = L \cdot r^{(i)}
     $$
     where $L$ is the Link Matrix and $r^{(i)}$ is the rank vector at iteration $i$. This equation calculates the new rank vector based on the current ranks and the link structure.


4. **Iteration Until Convergence**: 
   - This process is repeated for a set number of iterations or until convergence is achieved, meaning that changes in the rank vector become negligible between iterations.

5. **Final Rank Vector**: 
   - The final rank vector is the PageRank values for each page, indicating their relative importance within the network.

In [60]:
L = np.array([[0, 0.5, 0, 0], 
              [1/3, 0, 0, 0.5], 
              [1/3, 0, 0, 0.5],  
              [1/3, 0.5, 1, 0]]) 
R = np.array([[0.25], [0.25], [0.25], [0.25]])
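n = L.shape[0]  # number of pages (taken from L, not from the earlier matrix A)

for i in range(100):
    r = np.matmul(L, R)  # one update step: r = L . R
    R = r / np.sum(r)    # renormalise so the ranks sum to 1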



In [61]:
R

array([[0.11999863],
       [0.24000249],
       [0.24000249],
       [0.39999639]])
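The fixed count of 100 iterations leaves the result slightly short of its limit. A hedged sketch of the "iterate until convergence" variant from step 4 is given below; the function name `pagerank_power_iteration`, the tolerance `tol`, and the cap `max_iter` are assumptions introduced here, not part of the original notebook.

```python
import numpy as np

def pagerank_power_iteration(link_matrix, tol=1e-10, max_iter=10_000):
    """Repeat r <- L r until the rank vector stops changing."""
    n = link_matrix.shape[0]
    rank = np.full((n, 1), 1.0 / n)          # start with equal ranks
    for iteration in range(1, max_iter + 1):
        new_rank = link_matrix @ rank
        new_rank = new_rank / new_rank.sum()  # keep the ranks summing to 1
        if np.abs(new_rank - rank).sum() < tol:
            return new_rank, iteration
        rank = new_rank
    return rank, max_iter

link_matrix = np.array([[0, 0.5, 0, 0],
                        [1/3, 0, 0, 0.5],
                        [1/3, 0, 0, 0.5],
                        [1/3, 0.5, 1, 0]])
rank, iterations = pagerank_power_iteration(link_matrix)
```

With a tighter stopping rule the ranks settle at approximately 0.12, 0.24, 0.24 and 0.40.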

* **Damping Factor**: denoted by $d$ and usually set to around 0.85.
  * **Avoiding infinite loops**: The damping factor helps prevent the ranking process from getting trapped in loops by introducing a probability that the user jumps to any page at random. This simulates real user behavior more accurately, as users do not always follow links but may choose to navigate away from a page entirely.
  * **Convergence of the algorithm**: A lower damping factor leads to quicker convergence of PageRank values, as it dampens the influence of links and allows rank scores to stabilize in fewer iterations.

In [62]:
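L = np.array([[0, 0.5, 0, 0], 
              [1/3, 0, 0, 0.5], 
              [1/3, 0, 0, 0.5],  
              [1/3, 0.5, 1, 0]]) 
R = np.array([[0.25], [0.25], [0.25], [0.25]])
n = L.shape[0]  # number of pages (taken from L, not from the earlier matrix A)
d = 0.85        # damping factor

for i in range(100):
    r = d * np.matmul(L, R) + (1 - d) / n  # follow a link, or jump to a random page
    R = r / np.sum(r)                      # renormalise so the ranks sum to 1
np.round(R, 4)

array([[0.1392],
       [0.2393],
       [0.2393],
       [0.3821]])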
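As a cross-check, and assuming the damped update is $r = d L r + (1 - d)/n$ with $n$ the number of pages, the stationary rank vector can also be obtained directly by solving the linear system $(I - dL) r = (1 - d)/n$. The names `teleport`, `rank_direct`, and `rank_iter` are introduced here for illustration.

```python
import numpy as np

L = np.array([[0, 0.5, 0, 0],
              [1/3, 0, 0, 0.5],
              [1/3, 0, 0, 0.5],
              [1/3, 0.5, 1, 0]])
n = L.shape[0]  # number of pages
d = 0.85        # damping factor

# Direct solution of (I - d L) r = (1 - d) / n
teleport = np.full((n, 1), (1 - d) / n)
rank_direct = np.linalg.solve(np.eye(n) - d * L, teleport)

# Power iteration with the same update, for comparison
rank_iter = np.full((n, 1), 1.0 / n)
for _ in range(200):
    rank_iter = d * (L @ rank_iter) + (1 - d) / n

methods_agree = np.allclose(rank_iter, rank_direct)
```

The direct solve is practical only for small link matrices; for web-scale graphs the iterative update is the usual choice.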