<a href="https://colab.research.google.com/github/mattheweisenberg6/MAT421/blob/main/ModuleD.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**1.1 Introduction**

Linear algebra is a branch of mathematics used across many disciplines. It is used heavily in data science and machine learning, where many algorithms are built on it. Tools such as vector spaces, orthogonality, eigenvalues, matrix decomposition, and linear regression play a central role in solving data science problems.

**1.2 Elements of Linear Algebra**

Linear Combinations

A linear combination is a new vector formed by multiplying each vector in a set by a constant and then adding the results. A linear subspace of V is a subset U ⊆ V that is closed under vector addition and scalar multiplication. The span of a set of vectors, meaning the set of all their linear combinations, is itself a linear subspace.
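
As a small illustration of these definitions, the sketch below forms a linear combination with NumPy and checks whether a vector lies in the span of a set of vectors. The vectors and the helper name `in_span` are made up for this example, and the span test is only one possible approach (a least-squares fit whose residual should vanish).

```python
import numpy as np

# Two example vectors in R^3 (values chosen only for illustration)
v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, -1.0])

# Linear combination: scale each vector, then add the results
w = 3.0 * v1 + 2.0 * v2
print(w)  # [3. 2. 4.]

def in_span(vectors, target, tol=1e-10):
    """Hypothetical helper: True if target is a linear combination of vectors."""
    V = np.column_stack(vectors)
    # Best-fitting coefficients; target is in the span when the residual vanishes
    coeffs, *_ = np.linalg.lstsq(V, target, rcond=None)
    return bool(np.linalg.norm(V @ coeffs - target) < tol)

print(in_span([v1, v2], w))                          # True
print(in_span([v1, v2], np.array([0.0, 0.0, 1.0])))  # False
```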

Orthogonality

Orthonormal bases simplify mathematical representations and help reveal the structure of the problem we want to solve. The inner product of two vectors $x, y \in \mathbb{R}^n$ is $\langle x, y \rangle = x \cdot y = \sum_{i=1}^{n} x_i y_i$, and the norm is $\|x\| = \sqrt{\sum_{i=1}^{n} x_i^2}$. A list of vectors $(u_1, \dots, u_k)$ is orthonormal if the $u_i$ are pairwise orthogonal and each has norm 1. If $U \subseteq V$ is a linear subspace with orthonormal basis $(q_1, \dots, q_k)$, the orthogonal projection of $v \in V$ onto $U$ is $\mathcal{P}_U v = \sum_{j=1}^{k} \langle v, q_j \rangle q_j$. This projection is the best approximation of $v$ within $U$: no other vector in $U$ is closer to $v$.
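
The following sketch computes an inner product, a norm, and an orthogonal projection onto a subspace with NumPy. The vectors and the orthonormal basis are illustrative choices, not values from the course material.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -5.0, 6.0])

inner = np.dot(x, y)             # <x, y> = sum_i x_i * y_i
norm_x = np.sqrt(np.dot(x, x))   # ||x|| = sqrt(<x, x>)
print(inner)  # 12.0

# An orthonormal basis (q1, q2) of a plane U in R^3, stored as columns of Q
q1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
q2 = np.array([0.0, 0.0, 1.0])
Q = np.column_stack([q1, q2])

# Orthonormal columns satisfy Q^T Q = I
print(np.allclose(Q.T @ Q, np.eye(2)))  # True

# Orthogonal projection of v onto U: sum_j <v, q_j> q_j
v = np.array([2.0, 0.0, 5.0])
p = Q @ (Q.T @ v)

# The residual v - p is orthogonal to every basis vector of U
print(np.allclose(Q.T @ (v - p), 0.0))  # True
```

Because the residual is orthogonal to U, any other vector in U is at least as far from `v` as `p` is, which is the best-approximation property described above.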

Gram-Schmidt Process

The Gram-Schmidt algorithm is used to find an orthonormal basis. Each vector $x_i$ is added in turn after its orthogonal projection onto the previously constructed vectors has been removed, and the remainder is normalized. The underlying result states that if $(x_1, \dots, x_k)$ in $\mathbb{R}^n$ is linearly independent, then there exists an orthonormal basis $(q_1, \dots, q_k)$ of $\mathrm{span}(x_1, \dots, x_k)$.
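
A minimal sketch of the process is shown below, assuming the input vectors are the columns of a matrix. The function name `gram_schmidt` and the tolerance are choices made for this example. Projections are subtracted one at a time from the running remainder (the "modified" ordering), which matches the classical formula in exact arithmetic.

```python
import numpy as np

def gram_schmidt(X, tol=1e-12):
    """Orthonormalize the columns of X (assumed linearly independent)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    Q = np.zeros((n, k))
    for i in range(k):
        v = X[:, i].copy()
        # Remove the components along the previously built q_0, ..., q_{i-1}
        for j in range(i):
            v -= np.dot(Q[:, j], v) * Q[:, j]
        norm = np.linalg.norm(v)
        if norm < tol:
            raise ValueError("columns are linearly dependent")
        # Normalize what is left to get the next basis vector
        Q[:, i] = v / norm
    return Q

X = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
Q = gram_schmidt(X)

print(np.allclose(Q.T @ Q, np.eye(2)))  # True: columns are orthonormal
print(np.allclose(Q @ (Q.T @ X), X))    # True: the span is unchanged
```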

Eigenvectors and Eigenvalues

Consider the equation Ax = λx, where A is an n×n matrix, x is a nonzero n×1 column vector, and λ is a scalar. Any value of λ that satisfies this equation is called an eigenvalue of the matrix A, and the corresponding vector x is an eigenvector associated with λ. An eigenvalue is the factor by which its eigenvector is scaled under the transformation defined by A.


In [2]:
import numpy as np
from numpy.linalg import eig

# Use NumPy to compute the eigenvalues and eigenvectors of a matrix

a = np.array([[2, 2, 4],
              [7, 5, 20],
              [3, 6, 9]])

# eig returns the eigenvalues in w and the eigenvectors as the columns of v
w, v = eig(a)
print('Eigenvalue:', w)
print('Eigenvector', v)

Eigenvalue: [19.59961085  0.58521612 -4.18482697]
Eigenvector [[ 0.21315557  0.94464188  0.03108826]
 [ 0.82315793 -0.01250217 -0.91239496]
 [ 0.52628482 -0.32786494  0.4081286 ]]
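
One way to sanity-check a result like this is to confirm that each eigenpair satisfies Ax = λx. The sketch below does that for the same matrix, using the fact that `numpy.linalg.eig` returns the eigenvectors as the columns of its second output. The trace check at the end is an extra consistency test added for this example.

```python
import numpy as np

a = np.array([[2, 2, 4],
              [7, 5, 20],
              [3, 6, 9]])
w, v = np.linalg.eig(a)

# Column i of v is the eigenvector paired with eigenvalue w[i]
for i in range(len(w)):
    lhs = a @ v[:, i]        # A x
    rhs = w[i] * v[:, i]     # lambda x
    print(np.allclose(lhs, rhs))  # True

# The eigenvalues of a square matrix sum to its trace
print(np.isclose(w.sum(), np.trace(a)))  # True
```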


**1.3 Linear Regression**

Linear regression models are linear in their unknown parameters, which generally makes them easier to fit than non-linear models. The linear least squares problem can be solved efficiently using QR decomposition, which can be derived by applying the Gram-Schmidt process to the columns of A to form an orthonormal basis. The decomposition is written A = QR, where Q has orthonormal columns and R is upper triangular. NumPy provides a convenient function for computing Q and R from A.

In [6]:
import pprint
import numpy

A = numpy.array([[7, 42, 8],
                 [-12, 29, -78],
                 [25, 121, -40]])
Q, R = numpy.linalg.qr(A)  # factor A into orthonormal Q and upper-triangular R

print("A")
pprint.pprint(A)

print("Q")
pprint.pprint(Q)

print("R:")
pprint.pprint(R)

A
array([[  7,  42,   8],
       [-12,  29, -78],
       [ 25, 121, -40]])
Q
array([[-0.24474926, -0.20630898, -0.94738292],
       [ 0.41957016, -0.90341392,  0.08834117],
       [-0.8741045 , -0.37587217,  0.30767098]])
R:
array([[ -28.60069929, -103.87857897,    0.27971344],
       [   0.        ,  -80.34451339,   83.85070095],
       [   0.        ,    0.        ,  -26.77651415]])
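
The defining properties of the factors can be checked numerically. The sketch below recomputes the decomposition for the same matrix and tests that Q has orthonormal columns, that R is upper triangular, and that the product QR reproduces A.

```python
import numpy as np

A = np.array([[7, 42, 8],
              [-12, 29, -78],
              [25, 121, -40]])
Q, R = np.linalg.qr(A)

print(np.allclose(Q.T @ Q, np.eye(3)))  # True: columns of Q are orthonormal
print(np.allclose(R, np.triu(R)))       # True: R is upper triangular
print(np.allclose(Q @ R, A))            # True: the factors reproduce A
```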


The least squares problem can also be solved directly with the least squares regression formula β = (AᵀA)⁻¹AᵀY, which assumes the columns of A are linearly independent so that AᵀA is invertible. This gives a best-fit solution when the system Aβ = Y is overdetermined and has no exact solution, so it cannot be solved by simply inverting A.
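
For reference, the formula follows from minimizing the squared residual and setting its gradient to zero. The short derivation below assumes the columns of A are linearly independent, so that $A^T A$ is invertible.

```latex
\begin{aligned}
\min_{\beta}\ \|A\beta - Y\|^2
  &= \min_{\beta}\ (A\beta - Y)^T (A\beta - Y) \\
\nabla_{\beta}\, \|A\beta - Y\|^2
  &= 2A^T (A\beta - Y) = 0 \\
A^T A\, \beta &= A^T Y \\
\beta &= (A^T A)^{-1} A^T Y
\end{aligned}
```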

In [7]:
import numpy as np

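x = np.linspace(0, 5, 50)  # 50 evenly spaced x values on [0, 5]
y = 2 + 3 * x + 2 * np.random.random(len(x))  # line 2 + 3x plus uniform noise in [0, 2)

A = np.vstack([x, np.ones(len(x))]).T  # design matrix: columns are x and a constant 1
y = y[:, np.newaxis]  # reshape y into a column vector

beta = np.dot(np.linalg.inv(A.T @ A) @ A.T, y)  # beta = (A^T A)^-1 A^T y -> [slope, intercept]
print(beta)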


[[2.95973954]
 [3.22154439]]
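
The same fit can be obtained without forming an explicit inverse. The sketch below compares three routes on data generated the same way: solving the normal equations, using the QR factors (since A = QR turns the normal equations into Rβ = Qᵀy), and calling `numpy.linalg.lstsq`. The fixed seed is an assumption added here for reproducibility and is not part of the original cell.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
x = np.linspace(0, 5, 50)
y = 2 + 3 * x + 2 * rng.random(len(x))  # line 2 + 3x plus uniform noise in [0, 2)

A = np.column_stack([x, np.ones_like(x)])  # columns: x and a constant 1

# Route 1: normal equations A^T A beta = A^T y, solved without an explicit inverse
beta_normal = np.linalg.solve(A.T @ A, A.T @ y)

# Route 2: reduced QR, A = QR, so R beta = Q^T y
Q, R = np.linalg.qr(A)
beta_qr = np.linalg.solve(R, Q.T @ y)

# Route 3: NumPy's built-in least squares solver
beta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(np.allclose(beta_normal, beta_qr))  # True
print(np.allclose(beta_qr, beta_lstsq))   # True
```

All three routes agree on well-conditioned data like this. The QR route avoids forming AᵀA, which tends to matter more when the columns of A are nearly dependent.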
