
# **Chapter 1: Linear Algebra**

# *1.1 Introduction*

Linear algebra is the study of vectors, matrices, and the linear maps between vector spaces. It underpins many data science and machine learning algorithms: its concepts, from the elementary to the more advanced, simplify the solution of many data science problems and make it possible to tackle problems that would otherwise be out of reach. The following sections introduce the basics of linear algebra (section 1.2) and some more involved applications (sections 1.3 and 1.4).

# *1.2 Elements of Linear Algebra*

Let's start with a few basic definitions involving vectors. A linear combination is a new vector formed by multiplying each vector in a set by a constant and adding up the results. A linear subspace is a subset of a vector space that is closed under linear combinations. The span of a set of vectors is the linear subspace consisting of all linear combinations of those vectors, and the column space of a matrix is the span of its columns.
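
As a small illustration of these definitions, the sketch below builds a linear combination of two vectors and checks whether a given vector lies in their span. The vectors `w1` and `w2`, the helper `in_column_space`, and its tolerance are assumptions made for this example only; the test relies on the fact that a vector is in the column space exactly when the least squares residual is zero.

```python
import numpy as np

# Two example vectors in R^3 (values chosen only for illustration)
w1 = np.array([1.0, 0.0, 2.0])
w2 = np.array([0.0, 1.0, -1.0])

# A linear combination: scale each vector by a constant and add the results
combo = 3.0 * w1 + 2.0 * w2

# Stack the vectors as columns; the column space of W is span(w1, w2)
W = np.column_stack([w1, w2])

def in_column_space(W, b, tol=1e-10):
    """Hypothetical helper: True if b lies in the column space of W.

    b is in the column space exactly when the best least squares
    approximation W @ coeffs reproduces b (residual close to zero).
    """
    coeffs, *_ = np.linalg.lstsq(W, b, rcond=None)
    return bool(np.linalg.norm(W @ coeffs - b) < tol)

print("combo in span(w1, w2):", in_column_space(W, combo))
print("[0, 0, 1] in span(w1, w2):", in_column_space(W, np.array([0.0, 0.0, 1.0])))
```

The first check succeeds because `combo` was built from `w1` and `w2`; the second fails because no choice of coefficients reproduces `[0, 0, 1]`.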

Vectors are linearly independent if none of them is a linear combination of the others. Linear independence is used to define a basis of a vector space: a basis is a set of linearly independent vectors that span the space. A vector space can have multiple bases, but they all have the same number of elements, and this number is the dimension of the space.
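
One way to test linear independence numerically is to compare the rank of the matrix whose columns are the vectors with the number of vectors. The sketch below assumes this rank-based test; the helper name `is_linearly_independent` and the example vectors are illustrative and not part of the original notebook.

```python
import numpy as np

def is_linearly_independent(vectors):
    """Hypothetical helper: True if the given vectors are linearly independent.

    The vectors are independent exactly when the matrix having them as
    columns has rank equal to the number of vectors.
    """
    M = np.column_stack(vectors)
    return bool(np.linalg.matrix_rank(M) == len(vectors))

a = np.array([1.0, 0.0, 1.0])
b = np.array([0.0, 1.0, 1.0])
c = a + 2.0 * b  # a linear combination of a and b, so {a, b, c} is dependent

print("a, b independent:", is_linearly_independent([a, b]))
print("a, b, c independent:", is_linearly_independent([a, b, c]))

# The dimension of span(a, b, c) is the rank of the stacked matrix
print("dimension of the span:", np.linalg.matrix_rank(np.column_stack([a, b, c])))
```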

Orthogonality and normalization are also important in linear algebra. Two vectors are orthogonal if their dot product is zero, and a vector is normalized (a unit vector) if its dot product with itself is one. A set of vectors with both properties is orthonormal, and a basis made of such vectors is an orthonormal basis. Let's look at a simple example, the standard basis.

In [29]:
import numpy as np

# standard basis
v1 = np.array([1,0,0])
v2 = np.array([0,1,0])
v3 = np.array([0,0,1])

dot_orth1 = np.dot(v1,v2)
dot_orth2 = np.dot(v1,v3)
dot_orth3 = np.dot(v2,v3)

print("The dot product of v1 and v2 is", dot_orth1)
print("The dot product of v1 and v3 is", dot_orth2)
print("The dot product of v2 and v3 is", dot_orth3)

dot_norm1 = np.dot(v1,v1)
dot_norm2 = np.dot(v2,v2)
dot_norm3 = np.dot(v3,v3)

print("\nThe squared norm of v1 is", dot_norm1)
print("The squared norm of v2 is", dot_norm2)
print("The squared norm of v3 is", dot_norm3)

The dot product of v1 and v2 is 0
The dot product of v1 and v3 is 0
The dot product of v2 and v3 is 0

The squared norm of v1 is 1
The squared norm of v2 is 1
The squared norm of v3 is 1


Using orthogonality, we can formulate the Gram-Schmidt process, a means of constructing an orthonormal basis. Given a set of linearly independent vectors, the process produces an orthonormal set of vectors with the same span, so every such span has an orthonormal basis.
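
The Gram-Schmidt process can be sketched in a few lines. The version below is the "modified" variant, which subtracts each projection from the running vector and is usually more stable in floating point. The function name `gram_schmidt`, the tolerance, and the input vectors are assumptions made for illustration.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Illustrative sketch: orthonormalize linearly independent vectors.

    Returns a matrix whose columns are orthonormal and span the same
    space as the inputs.
    """
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)  # work on a float copy of the input
        for q in basis:
            # Remove the component of w along each vector already in the basis
            w -= np.dot(q, w) * q
        norm = np.linalg.norm(w)
        if norm < tol:
            raise ValueError("input vectors are not linearly independent")
        basis.append(w / norm)  # normalize to unit length
    return np.column_stack(basis)

a1 = np.array([1.0, 1.0, 0.0])
a2 = np.array([1.0, 0.0, 1.0])
Q = gram_schmidt([a1, a2])

# Q^T Q should be the 2 x 2 identity if the columns are orthonormal
print(np.round(Q.T @ Q, 10))
```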

Eigenvalues and eigenvectors build on these ideas. They are the solutions of the equation Ax = Lx, where A is a square matrix, x is a nonzero vector (the eigenvector), and L is a scalar (the eigenvalue). If A is an m by m matrix, it has at most m distinct eigenvalues. When A is also symmetric, it can be diagonalized by an orthogonal matrix: there is an orthonormal basis of eigenvectors of A, and the standard proof builds it by starting from one unit eigenvector and extending it to an orthonormal basis with the Gram-Schmidt process. Placing these eigenvectors in the columns of a matrix U, the product U^T(A)U is diagonal, with the eigenvalues of A along the diagonal.
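
For a symmetric matrix, the diagonalization U^T(A)U can be checked numerically. The sketch below assumes a small symmetric example matrix `S` (chosen for illustration) and uses `numpy.linalg.eigh`, which is intended for symmetric matrices and returns the eigenvalues in ascending order together with orthonormal eigenvectors as the columns of `U`.

```python
import numpy as np

# A small symmetric matrix, chosen only for illustration
S = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh handles symmetric (Hermitian) matrices: eigenvalues come back in
# ascending order and the columns of U are orthonormal eigenvectors
eigenvalues, U = np.linalg.eigh(S)

# U^T S U should be diagonal, with the eigenvalues on the diagonal
D = U.T @ S @ U

print("eigenvalues:", eigenvalues)
print("U^T S U is \n", np.round(D, 10))

# Each column of U satisfies S x = L x for its eigenvalue L
x0 = U[:, 0]
print("S x = L x holds:", np.allclose(S @ x0, eigenvalues[0] * x0))
```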

# *1.3 Linear Regression*

Linear regression fits models that depend linearly on their unknown parameters. Let's first look at QR decomposition, a standard tool for solving the linear least squares problem. It factors a matrix as A = QR, where Q has orthonormal columns and R is an upper triangular matrix. SciPy provides the function qr(A) in scipy.linalg, which takes a NumPy array and performs QR decomposition. Let's look at an example:

In [26]:
import numpy as np
import scipy.linalg

A = np.array([[1, 9, -7], [-6, 5, 2], [-8, -4, 3]])  
Q,R = scipy.linalg.qr(A)

print("A is \n", A)

print("Q is \n", Q)

print("R is \n", R)

# checking A = QR
print("QR is \n", np.dot(Q,R))

A is 
 [[ 1  9 -7]
 [-6  5  2]
 [-8 -4  3]]
Q is 
 [[-0.09950372 -0.80894303  0.57940503]
 [ 0.59702231 -0.51437246 -0.61561784]
 [ 0.79602975  0.28466147  0.53413901]]
R is 
 [[-10.04987562  -1.09454091   4.27865992]
 [  0.         -10.99099541   5.48784067]
 [  0.           0.          -3.68465386]]
QR is 
 [[ 1.  9. -7.]
 [-6.  5.  2.]
 [-8. -4.  3.]]


More generally, in a least squares problem we are trying to solve Ax = b, where A is a matrix and x and b are vectors. Often no exact solution exists, so we instead look for the x that makes Ax the best approximation of b. If A is square and invertible, we can simply use the matrix inverse to solve the system. As we saw above, QR decomposition can be used when A is a square matrix, but it can also be used if A is not square.

In [27]:
import numpy as np
import scipy.linalg

A = np.array([[1, 9], [-6, 5], [-8, -4]])  
Q,R = scipy.linalg.qr(A)

print("A is \n", A)

print("Q is \n", Q)

print("R is \n", R)

# checking A = QR
print("QR is \n", np.dot(Q,R))

A is 
 [[ 1  9]
 [-6  5]
 [-8 -4]]
Q is 
 [[-0.09950372 -0.80894303  0.57940503]
 [ 0.59702231 -0.51437246 -0.61561784]
 [ 0.79602975  0.28466147  0.53413901]]
R is 
 [[-10.04987562  -1.09454091]
 [  0.         -10.99099541]
 [  0.           0.        ]]
QR is 
 [[ 1.  9.]
 [-6.  5.]
 [-8. -4.]]
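
The factorization of a non-square matrix can be turned into a least squares solution. Since Q has orthonormal columns, minimizing the norm of Ax - b reduces to solving the triangular system Rx = Q^T b. The sketch below assumes a hypothetical right-hand side `b` and uses the reduced factorization (`mode="economic"`) so that R is square; it then compares the result with `numpy.linalg.lstsq`.

```python
import numpy as np
import scipy.linalg

A = np.array([[1.0, 9.0], [-6.0, 5.0], [-8.0, -4.0]])
b = np.array([1.0, 2.0, 3.0])  # hypothetical right-hand side, for illustration

# Reduced QR: Q is 3 x 2 with orthonormal columns, R is 2 x 2 upper triangular
Q, R = scipy.linalg.qr(A, mode="economic")

# Solve R x = Q^T b by back substitution (R is upper triangular)
x = scipy.linalg.solve_triangular(R, Q.T @ b)

# Reference solution from NumPy's built-in least squares solver
x_ref, *_ = np.linalg.lstsq(A, b, rcond=None)

print("x from QR is", x)
print("matches lstsq:", np.allclose(x, x_ref))
```

At the least squares solution the residual Ax - b is orthogonal to the columns of A, which gives another way to check the result.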


This now brings us to linear regression. Given an input data set, we look for a linear function that fits the data. We write the fit as a least squares problem, minimizing the squared norm of the residual Ax - b, where each row of A holds the inputs for one data point and b holds the observed outputs. We can then use the methods discussed above to solve the problem and find the linear fit.
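
As a minimal sketch of such a fit, the example below fits a straight line y = intercept + slope * x to a handful of points. The data values are made up for illustration, and `numpy.linalg.lstsq` stands in for the least squares solver; the QR approach would give the same coefficients.

```python
import numpy as np

# Hypothetical data points, made up for illustration
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Design matrix: a column of ones for the intercept and a column of inputs
A = np.column_stack([np.ones_like(x), x])

# Minimize the squared norm of A @ beta - y
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slope = beta

print("intercept is", intercept)
print("slope is", slope)

# Predictions of the fitted line at the input points
y_fit = A @ beta
print("residual norm is", np.linalg.norm(y_fit - y))
```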