#Introduction

Linear algebra is a branch of mathematics used across many disciplines, and it
plays an important role in data science and machine learning. A solid grasp of
its concepts makes many data science and machine learning algorithms easier to
understand. This chapter introduces the basic concepts needed for data science:
vector spaces, orthogonality, eigenvalues, and matrix decomposition. It then
extends them to linear regression and principal component analysis, where linear
algebra plays a central role in solving data science problems. More advanced
concepts and applications of linear algebra can be found in the references $[1, 2, 3, 4]$.

#Elements of Linear Algebra


##Linear Spaces

###Linear Combinations

In [68]:
import numpy as np
v = np.array([[3.3],[11]])
w = np.array([[-2],[-40]])
a = 1.5
b = 4.9
u = a*v + b*w
u

array([[  -4.85],
       [-179.5 ]])

In [69]:
import numpy as np
x = np.array([[4, 8, 1],
              [2, 11, 0],
              [1, 7.4, 0.2]])
y = np.array([3.65, 1.55, 3.42])
# Solve x @ result = y: result holds the coefficients that combine the columns of x into y
result = np.linalg.solve(x, y)
result

array([-2.19148936,  0.5393617 ,  8.10106383])
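
The solution above can be read as the coefficients of a linear combination. The following sketch, which reuses the same matrix and right-hand side (the names `coeffs`, `combo`, and `matches` are introduced here for illustration), rebuilds y from the columns of x to check that reading.

```python
import numpy as np

x = np.array([[4, 8, 1],
              [2, 11, 0],
              [1, 7.4, 0.2]])
y = np.array([3.65, 1.55, 3.42])

# Coefficients of the linear combination of the columns of x that gives y
coeffs = np.linalg.solve(x, y)

# Rebuild y explicitly as c0 * column0 + c1 * column1 + c2 * column2
combo = coeffs[0] * x[:, 0] + coeffs[1] * x[:, 1] + coeffs[2] * x[:, 2]

# True when the combination reproduces y up to floating-point error
matches = np.allclose(combo, y)
print(matches)
```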

###Linear Independence and Dimension


In [70]:
import sympy
import numpy as np
matrix = np.array([[2,1,7],[2,9,11]]) 
_, inds = sympy.Matrix(matrix).T.rref()
print(inds)

(0, 1)


This means that vector 0 and vector 1 (rows 0 and 1 of the matrix) are linearly independent.

In [71]:
matrix = np.array([[0,1,0,0],[0,0,1,0],[0,1,1,0],[1,0,0,1]]) 
_, inds = sympy.Matrix(matrix).T.rref()
print(inds)

(0, 1, 3)


This means that vector 0, vector 1, and vector 3 are linearly independent, while vector 2 is linearly dependent on the others (it is the sum of vector 0 and vector 1).
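
As a cross-check on the pivot indices, the rank of the matrix counts how many of its rows are linearly independent. This is a minimal sketch using `np.linalg.matrix_rank` on the same four vectors. The names `vectors`, `rank`, and `has_dependent_vector` are chosen here for illustration.

```python
import numpy as np

# Each row is one vector, as in the cell above
vectors = np.array([[0, 1, 0, 0],
                    [0, 0, 1, 0],
                    [0, 1, 1, 0],
                    [1, 0, 0, 1]])

# The rank is the number of linearly independent rows
rank = np.linalg.matrix_rank(vectors)

# A rank below the number of rows means at least one vector is dependent
has_dependent_vector = rank < vectors.shape[0]
print(rank, has_dependent_vector)
```

The rank tells how many independent vectors there are, but not which ones. The pivot indices from `rref` provide that.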

##Orthogonality

Inner product

In [72]:
a = np.array([1.6,2.5,3.9])
b = np.array([4,1,11])
np.inner(a, b)

51.8

Norm

In [73]:
from numpy import linalg as LA
c = np.array([[ 1.3, -7.2, 12.1],
              [-1, 0, 4]])
LA.norm(c)

14.728883189162714

Orthogonality

In [74]:
v1 = np.array([1,-2, 4])
v2 = np.array([2, 5, 2])
dot_product = np.dot(v1,v2)
# np.isclose also handles floating-point vectors, where an exact == 0 test can fail
if np.isclose(dot_product, 0):
    print('v1 and v2 are orthogonal')
else:
    print('v1 and v2 are not orthogonal')

v1 and v2 are orthogonal


In [75]:
n1 = LA.norm(v1)
n2 = LA.norm(v2)
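# Orthonormal vectors are orthogonal and each have unit length
if np.isclose(dot_product, 0) and np.isclose(n1, 1) and np.isclose(n2, 1):
    print('v1 and v2 are orthonormal')
else:
    print('v1 and v2 are not orthonormal')


v1 and v2 are not orthonormal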
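
Orthogonal vectors become orthonormal once each is divided by its norm. The sketch below normalizes the same v1 and v2 and checks the result through the Gram matrix. The names `u1`, `u2`, `Q`, and `gram` are introduced here for illustration.

```python
import numpy as np

v1 = np.array([1, -2, 4])
v2 = np.array([2, 5, 2])

# Scale each vector to unit length
u1 = v1 / np.linalg.norm(v1)
u2 = v2 / np.linalg.norm(v2)

# Stack the unit vectors as columns; for an orthonormal set, Q^T Q is the identity
Q = np.column_stack([u1, u2])
gram = Q.T @ Q
print(np.allclose(gram, np.eye(2)))
```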


##Gram-Schmidt Process

In [76]:
import numpy as np
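def gs(X):
    # np.linalg.qr uses Householder reflections rather than the Gram-Schmidt
    # recurrence, but the columns of Q are an orthonormal basis for the column
    # space of X and match the Gram-Schmidt result up to sign
    Q, R = np.linalg.qr(X)
    return Q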

In [77]:
X = np.array([[3,-1,0],[1.8,11.3,-7.5], [4,13/4,-7/3]])
gs(X)

array([[-0.56453245,  0.40891852, -0.71699983],
       [-0.33871947, -0.90691722, -0.25053997],
       [-0.75270993,  0.10142386,  0.65049286]])
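
For comparison, here is a minimal sketch of the Gram-Schmidt recurrence itself, in its modified (numerically more stable) form. It assumes the columns of X are linearly independent. The function name `gram_schmidt` and the variables `Q_gs` and `Q_qr` are hypothetical and used only for this illustration.

```python
import numpy as np

def gram_schmidt(X):
    """Orthonormalize the columns of X with modified Gram-Schmidt.

    Assumes the columns of X are linearly independent.
    """
    X = np.asarray(X, dtype=float)
    Q = np.zeros_like(X)
    for j in range(X.shape[1]):
        v = X[:, j].copy()
        # Remove the components along the directions already found
        for i in range(j):
            v -= (Q[:, i] @ v) * Q[:, i]
        # Normalize what is left to unit length
        Q[:, j] = v / np.linalg.norm(v)
    return Q

X = np.array([[3, -1, 0], [1.8, 11.3, -7.5], [4, 13/4, -7/3]])
Q_gs = gram_schmidt(X)
Q_qr, _ = np.linalg.qr(X)

# The columns are orthonormal
print(np.allclose(Q_gs.T @ Q_gs, np.eye(3)))
# The two results agree up to the sign of each column
print(np.allclose(np.abs(Q_gs), np.abs(Q_qr)))
```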

##Eigenvalues and Eigenvectors

In [78]:
import numpy as np
from numpy.linalg import eig
a = np.array([[2.1, -5/2, 11.4], 
              [1, 3, 5],
              [2.4, 3.5, 7.4]])
# eig returns the eigenvalues and a matrix whose columns are the matching eigenvectors
u, v = eig(a)
print('E-value:', u)
print('E-vector', v)

E-value: [12.05944603 -1.69840975  2.13896373]
E-vector [[-0.63272391 -0.94998594  0.88313257]
 [-0.42654701 -0.10917553 -0.45885676]
 [-0.64631115  0.29258746 -0.09760805]]
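
Each eigenpair should satisfy $Av = \lambda v$. The following sketch checks this for the same matrix. The names `eigenvalues`, `eigenvectors`, `residuals`, and `max_residual` are chosen here for illustration.

```python
import numpy as np

a = np.array([[2.1, -5/2, 11.4],
              [1, 3, 5],
              [2.4, 3.5, 7.4]])
eigenvalues, eigenvectors = np.linalg.eig(a)

# Column j of eigenvectors pairs with eigenvalues[j]; broadcasting scales each column
residuals = a @ eigenvectors - eigenvectors * eigenvalues
max_residual = np.abs(residuals).max()
print(max_residual < 1e-9)

# The eigenvalues of a matrix sum to its trace
print(np.isclose(eigenvalues.sum(), np.trace(a)))
```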


#Linear Regression

##QR Decomposition

In [79]:
import numpy as np
from numpy.linalg import qr
m = np.array([[1/2, -2.8, 5/3], 
              [2.5, 3, 9],
              [8.3, 4, -5.2]])
q, r = qr(m)
print('Q:', q)
print('R:', r)

n = np.dot(q, r)
print('QR:', n)

Q: [[-0.0575855   0.87080492 -0.4882445 ]
 [-0.28792749 -0.48276155 -0.82706653]
 [-0.95591928  0.09295198  0.27852874]]
R: [[-8.6827415  -4.5262202   2.28345698]
 [ 0.         -3.51473053 -3.3768627 ]
 [ 0.          0.         -9.70568907]]
QR: [[ 0.5        -2.8         1.66666667]
 [ 2.5         3.          9.        ]
 [ 8.3         4.         -5.2       ]]
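
One common use of the factorization is solving $Mx = b$: since $Q$ is orthogonal, the system reduces to the triangular system $Rx = Q^T b$. The sketch below assumes a right-hand side `b` that is made up for illustration and is not part of the original example.

```python
import numpy as np
from scipy.linalg import solve_triangular

m = np.array([[1/2, -2.8, 5/3],
              [2.5, 3, 9],
              [8.3, 4, -5.2]])
b = np.array([1.0, 2.0, 3.0])  # hypothetical right-hand side

q, r = np.linalg.qr(m)

# r is upper triangular, so back substitution solves r @ x = q.T @ b
x_qr = solve_triangular(r, q.T @ b)

# Compare with a direct solve of the original system
print(np.allclose(x_qr, np.linalg.solve(m, b)))
```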


##Least-squares Problems


Use the direct inverse method

In [80]:
import numpy as np
from scipy import optimize
x = np.linspace(0, 10, 500)
# Noisy data; no seed is set, so the fitted values change from run to run
y = 1/2 + x * np.random.random(len(x))
# Design matrix: a column of x values and a column of ones for the intercept
A = np.vstack([x, np.ones(len(x))]).T
y = y[:, np.newaxis]
# Normal equations: (A^T A)^{-1} A^T y
lst_sqr = np.linalg.inv(A.T @ A) @ A.T @ y
print(lst_sqr)

[[0.5248105 ]
 [0.48146998]]


Use the pseudoinverse

In [81]:
pinv = np.linalg.pinv(A)
lst_sqr = pinv.dot(y)
print(lst_sqr)

[[0.5248105 ]
 [0.48146998]]


Use numpy.linalg.lstsq

In [82]:
lst_sqr = np.linalg.lstsq(A, y, rcond=None)[0]
print(lst_sqr)

[[0.5248105 ]
 [0.48146998]]
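
The three approaches above return the same coefficients because they solve the same least-squares problem. Here is a reproducible sketch of that comparison. The seed value and the names `rng`, `beta_normal`, `beta_pinv`, and `beta_lstsq` are assumptions introduced for this illustration, and the normal equations are solved with `np.linalg.solve` rather than an explicit inverse.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = np.linspace(0, 10, 500)
y = 1/2 + x * rng.random(len(x))

# Design matrix: x values and a column of ones for the intercept
A = np.column_stack([x, np.ones_like(x)])

# Normal equations, solved without forming the inverse explicitly
beta_normal = np.linalg.solve(A.T @ A, A.T @ y)
# Moore-Penrose pseudoinverse
beta_pinv = np.linalg.pinv(A) @ y
# Built-in least-squares solver
beta_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]

print(np.allclose(beta_normal, beta_pinv), np.allclose(beta_pinv, beta_lstsq))
```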


Use optimize.curve_fit from scipy

In [83]:
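# A fresh random sample is drawn here, so the fitted coefficients differ slightly from the results above
x = np.linspace(0, 10, 500)
y = 1/2 + x * np.random.random(len(x))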
def func(x, a, b):
    y = a*x + b
    return y
lst_sqr = optimize.curve_fit(func, xdata = x, ydata = y)[0]
print(lst_sqr)

[0.55132487 0.30482114]


## Linear Regression


In [84]:
import numpy as np
from sklearn.linear_model import LinearRegression
x = np.array([5.3, 15.2, 25.8, 35.4, 45.5, 54.9]).reshape((-1, 1))
y = np.array([4.7, 20.4, 31/2, 33.2, 22, 38.6])
model = LinearRegression().fit(x, y)
r_sq = model.score(x, y)
print('coefficient of determination:', r_sq)

coefficient of determination: 0.7007120636613271


In [85]:
print('intercept:', model.intercept_)

intercept: 5.763991287587402


In [86]:
print('slope:', model.coef_)

slope: [0.54813867]


In [87]:
y_pred = model.predict(x)
print('predicted response:', y_pred, sep='\n')

predicted response:
[ 8.66912625 14.09569911 19.90596904 25.1681003  30.70430089 35.85680441]
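
The same fit can be reproduced with plain NumPy, which connects the scikit-learn model back to the least-squares machinery above. This is a sketch on the same data. The names `A`, `y_hat`, `ss_res`, and `ss_tot` are introduced here for illustration.

```python
import numpy as np

x = np.array([5.3, 15.2, 25.8, 35.4, 45.5, 54.9])
y = np.array([4.7, 20.4, 31/2, 33.2, 22, 38.6])

# Design matrix with a column of ones so the model includes an intercept
A = np.column_stack([x, np.ones_like(x)])
slope, intercept = np.linalg.lstsq(A, y, rcond=None)[0]

# Coefficient of determination: 1 - (residual sum of squares) / (total sum of squares)
y_hat = slope * x + intercept
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_sq = 1 - ss_res / ss_tot

print('slope:', slope)
print('intercept:', intercept)
print('coefficient of determination:', r_sq)
```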


#Principal Component Analysis

##Singular Value Decomposition

In [88]:
from numpy import array
from scipy.linalg import svd

A = array([[3, -2, 5], 
           [1, 0, -3], 
           [4, 6, -1]])

print('Matrix A:')
print(A)
U, sigma, VT = svd(A)

print('The m × m orthogonal matrix:')
print(U)

print('The singular values (diagonal of the m × n matrix):')
print(sigma)

print('The n × n orthogonal matrix:')
print(VT)

Matrix A:
[[ 3 -2  5]
 [ 1  0 -3]
 [ 4  6 -1]]
The m × m orthogonal matrix:
[[-0.38248394  0.86430926  0.32661221]
 [ 0.23115377 -0.25273925  0.93951626]
 [ 0.89458033  0.43484753 -0.10311964]]
The singular values (diagonal of the m × n matrix):
[7.54629306 6.24447219 2.24945061]
The n × n orthogonal matrix:
[[ 0.35275906  0.81264401 -0.46386502]
 [ 0.6533104   0.14099937  0.74384454]
 [ 0.66988548 -0.56544574 -0.48116998]]
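
Because `svd` returns the singular values as a 1-D array, they have to be placed on a diagonal before the factors can be multiplied back together. The sketch below rebuilds A this way. The names `Sigma` and `A_rebuilt` are chosen here for illustration, and `np.diag` is enough only because A is square.

```python
import numpy as np
from scipy.linalg import svd

A = np.array([[3, -2, 5],
              [1, 0, -3],
              [4, 6, -1]])
U, sigma, VT = svd(A)

# Place the singular values on the diagonal (A is 3 x 3, so Sigma is square)
Sigma = np.diag(sigma)

# Multiply the three factors back together
A_rebuilt = U @ Sigma @ VT
print(np.allclose(A_rebuilt, A))
```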


##Principal Component Analysis

Covariance Matrix

In [89]:
A = array([[3, -2, 5], 
           [1, 0, -3], 
           [4, 6, -1]])
# Rows are treated as variables (rowvar=True by default); bias=True divides by N instead of N - 1
covMatrix = np.cov(A,bias=True)
print('Covariance matrix of A:')
print(covMatrix)

Covariance matrix of A:
[[ 8.66666667 -2.66666667 -7.66666667]
 [-2.66666667  2.88888889  4.33333333]
 [-7.66666667  4.33333333  8.66666667]]
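
The same matrix can be computed by hand, which makes the row-as-variable convention explicit. This is a minimal sketch. The names `centered` and `cov_manual` are introduced here for illustration.

```python
import numpy as np

A = np.array([[3, -2, 5],
              [1, 0, -3],
              [4, 6, -1]])

# Subtract each row's mean, since every row is one variable
centered = A - A.mean(axis=1, keepdims=True)

# Population covariance: divide by the number of observations N (the column count)
cov_manual = centered @ centered.T / A.shape[1]
print(np.allclose(cov_manual, np.cov(A, bias=True)))
```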


Principal Component Analysis

In [90]:
import numpy as np
from numpy.linalg import eig

X = np.random.randint(10,50,100).reshape(20,5) 
X_meaned = X - np.mean(X , axis = 0)
covMatrix = np.cov(X_meaned, rowvar = False)
val,vec = eig(covMatrix)

s_index = np.argsort(val)[::-1]
s_val = val[s_index]
s_vec = vec[:,s_index]
 
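# X has only 5 features, so at most 5 components exist; choose a smaller
# value here to actually reduce the dimension
n_components = 5
vec_sub = s_vec[:,0:n_components]

# Project the centered data onto the selected eigenvectors
X_reduced = np.dot(vec_sub.transpose(), X_meaned.transpose()).transpose()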
X_reduced

array([[-14.2097599 ,  11.1638481 ,   0.28399481,   6.72774374,
         -6.5223102 ],
       [ 25.06735619,   0.65710691,   6.85236627,  -0.17476694,
         -6.8806175 ],
       [-21.61411193, -13.0377964 , -12.16006542,   5.05948329,
          5.30216639],
       [-27.78196313,  -9.86254508,  -6.96332075,   2.28424895,
         -1.19146972],
       [ -0.22967199,  24.5661952 ,   2.66240043,   2.63708048,
          5.03380919],
       [ 11.15926987,  17.71804406,   2.51916061,   5.85025062,
         -5.6923201 ],
       [ 11.5482325 ,  -9.11311102,   9.98129819, -10.96725047,
         11.84124727],
       [  5.57296606,  -7.13463359, -25.32274739,   2.06609699,
         -0.24750938],
       [-17.22161772, -11.97678995,  13.20560437, -14.20500239,
        -11.55139844],
       [-13.62600723,  17.70093422,   8.90913338,   2.01548555,
          3.03422397],
       [-26.51289728,  -8.97206962,   4.14445894,   7.02884273,
         -2.02967354],
       [ -4.03180426,  25.35080447, -18.627

Total Variance

In [91]:
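B = covMatrix
# The total variance is the trace of the covariance matrix, which equals the sum of its eigenvalues
total_var_matr = B.trace()
print('Total variance of X:')
print(total_var_matr)

Total variance of X:
716.65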
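
The total variance ties the pieces of this section together: it equals the sum of the covariance eigenvalues, and those eigenvalues can also be obtained from the singular values of the centered data. The sketch below illustrates both facts on seeded random data. The seed and all variable names are assumptions made for this example, and `eigvalsh` is used instead of `eig` because the covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
X = rng.integers(10, 50, size=(20, 5)).astype(float)
X_meaned = X - X.mean(axis=0)
cov = np.cov(X_meaned, rowvar=False)

# eigvalsh returns ascending eigenvalues of a symmetric matrix; reverse to descending
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Share of the total variance carried by each principal component
explained_ratio = eigenvalues / eigenvalues.sum()

# Same eigenvalues via SVD of the centered data: s**2 / (n_samples - 1)
singular_values = np.linalg.svd(X_meaned, compute_uv=False)
eigenvalues_svd = singular_values ** 2 / (X.shape[0] - 1)

print(np.isclose(eigenvalues.sum(), np.trace(cov)))
print(np.allclose(eigenvalues, eigenvalues_svd))
```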
