   # <u><i>Linear Algebra</i></u>

## What is Linear Algebra?

#### Linear algebra is a field of mathematics that is widely regarded as a prerequisite for a deeper understanding of machine learning.

Although linear algebra is a large field with many complex theories and findings, the nuts-and-bolts tools and notation taken from the field are practical for machine learning practitioners.

## Role of Linear Algebra in Machine Learning

<b>1)</b> Learning linear algebra builds the intuition that plays such an important role in machine learning, and lets us look at a problem from more perspectives.<br>
<b>2)</b> Linear algebra helps in creating better machine learning algorithms. We can use it to build better supervised as well as unsupervised learning algorithms.<br>
<b>3)</b> Statistics is very important for organizing and integrating data in machine learning. To understand statistical concepts well, we first need to know how linear algebra works. Its methods, operations, and notation help bring advanced statistical topics such as multivariate analysis into a project.

## Matrix

A matrix is a rectangular arrangement of numbers into rows and columns.<br>
A matrix is a way of writing similar things together so that we can handle and manipulate them easily. In Data Science, it is generally used to store information while training various machine learning algorithms.

Technically, a matrix is a 2-D array of numbers (as far as Data Science is concerned). For example, look at the matrix A below.


1 &nbsp; 2 &nbsp; 3<br>
4 &nbsp; 5 &nbsp; 6<br>
7 &nbsp; 8 &nbsp; 9<br>
Generally, rows are denoted by ‘i’ and columns by ‘j’. Each element is indexed by its ‘i’th row and ‘j’th column. We denote the matrix by a letter, e.g. A, and its elements by A(ij).

In the above matrix

A12 = 2

To reach this result, go along the first row to the second column.
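The indexing rule above can be tried out in NumPy. The following is a minimal sketch; note that NumPy counts rows and columns from 0, so the mathematical A12 corresponds to `A[0, 1]`. The helper `element` is a hypothetical name introduced only for this illustration.

```python
import numpy as np

# The 3 x 3 matrix A from the text
A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# Mathematical A12 (row 1, column 2) is A[0, 1] with zero-based indices
a_12 = A[0, 1]
print("A12 =", a_12)  # A12 = 2


def element(matrix, i, j):
    """Hypothetical helper: return the element at 1-based row i, column j."""
    return matrix[i - 1, j - 1]


print("A23 =", element(A, 2, 3))  # A23 = 6
```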



Matrices which have a single row are called row vectors, and those which have a single column are called column vectors. A matrix which has the same number of rows and columns is called a square matrix. In some contexts, such as computer algebra programs, it is useful to consider a matrix with no rows or no columns, called an empty matrix.

### Matrix Creation :

Next, let's learn how to create a matrix using NumPy. There are three methods:

<b>Method 1:</b> Using NumPy array to form a matrix.

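<b>Method 2:</b> Using NumPy's built-in matrix function. (NumPy's documentation no longer recommends the matrix class and suggests regular arrays instead.)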

<b>Method 3:</b> Using miscellaneous functions such as zeros(), ones(), etc.

<b>* Method I:</b> Using array and reshape to convert array into matrix

In [33]:
import numpy as np

# Reshape a flat array of 6 values into 2 rows and 3 columns
print(np.array([5, 6, 8, 45, 12, 52]).reshape(2, 3))

[[ 5  6  8]
 [45 12 52]]


<b>* Method II:</b> Using matrix function

In [34]:
print(np.matrix([[1,2],[3,4]]))

[[1 2]
 [3 4]]


<b>* Method III:</b> Using misc. functions

In [35]:
print(np.eye(3)) # Identity matrix

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [36]:
print( np.zeros( (4,3) ) )

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]


In [37]:
print(np.ones( (3,3), dtype = np.float64 ))

[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]


### Terms related to Matrix:

<b>1) Order of matrix – </b> If a matrix has 3 rows and 4 columns, the order of the matrix is 3 × 4, i.e. rows × columns.

<b>2) Square matrix – </b>The matrix in which the number of rows is equal to the number of columns.

<b>3) Diagonal matrix –</b> A matrix with all the non-diagonal elements equal to 0 is called a diagonal matrix.

<b>4) Upper triangular matrix –</b> Square matrix with all the elements below diagonal equal to 0.

<b>5) Lower triangular matrix – </b>Square matrix with all the elements above the diagonal equal to 0.

<b>6) Scalar matrix – </b>Square matrix with all the diagonal elements equal to some constant k and all the non-diagonal elements equal to 0.

<b>7) Identity matrix – </b> Square matrix with all the diagonal elements equal to 1 and all the non-diagonal elements equal to 0.

<b>8) Column matrix – </b> The matrix which consists of only 1 column. Sometimes, it is used to represent a vector.

<b>9) Row matrix –</b>  A matrix consisting of only 1 row.

<b>10) Trace – </b>It is the sum of all the diagonal elements of a square matrix.
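Several of the terms above map directly onto NumPy functions. The snippet below is a small sketch using an arbitrary 3 × 3 matrix `M` and the constant k = 4, both chosen only for illustration.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

order = M.shape                   # (rows, columns)
diagonal = np.diag(np.diag(M))    # keep only the diagonal entries of M
upper = np.triu(M)                # zeros below the diagonal
lower = np.tril(M)                # zeros above the diagonal
scalar = 4 * np.eye(3)            # constant k = 4 on the diagonal, 0 elsewhere
identity = np.eye(3)              # ones on the diagonal, 0 elsewhere
trace = np.trace(M)               # 1 + 5 + 9

print("Order:", order)
print("Diagonal matrix:\n", diagonal)
print("Upper triangular:\n", upper)
print("Lower triangular:\n", lower)
print("Trace:", trace)
```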

### Rank of a Matrix

The rank of a matrix is defined as the maximum number of linearly independent column vectors in the matrix or the maximum number of linearly independent row vectors in the matrix. Both definitions are equivalent.

The rank tells us a lot about the matrix. It is useful in letting us know whether a system of linear equations can be solved uniquely: when the system is consistent and the rank of the coefficient matrix equals the number of variables, the solution is unique.

In [9]:
# Importing numpy as np
import numpy as np
 
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
 
# Rank of a matrix
print("Rank of A:", np.linalg.matrix_rank(A))
 

Rank of A: 3
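The connection between rank and solving a linear system can be sketched as follows. The right-hand side `b` and the rank-deficient matrix `S` are assumptions made up for this example; `b` was chosen so that the solution is x = (1, 1, 1).

```python
import numpy as np

A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
# Hypothetical right-hand side, equal to A @ [1, 1, 1]
b = np.array([8, 7, 17])

n_variables = A.shape[1]
rank = np.linalg.matrix_rank(A)

# Full rank: the square system A x = b has exactly one solution
if rank == n_variables:
    x = np.linalg.solve(A, b)
    print("Unique solution:", x)

# Hypothetical rank-deficient matrix: the second row is twice the first
S = np.array([[1, 2],
              [2, 4]])
rank_S = np.linalg.matrix_rank(S)
print("Rank of S:", rank_S)  # Rank of S: 1
```

Because `S` has rank 1 but two variables, a system with `S` as its coefficient matrix cannot have a unique solution.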


### Inverse of Matrix

The concept of inverse of a matrix is a multidimensional generalization of the concept of reciprocal of a number:

  <b>*</b> the product between a number and its reciprocal is equal to 1.

  <b>*</b> the product between a square matrix and its inverse is equal to the identity matrix.
  

In [24]:
# Importing numpy as np
import numpy as np
 
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])

# Inverse of matrix A
print("\nInverse of A:\n", np.linalg.inv(A))


Inverse of A:
 [[ 0.17647059 -0.00326797 -0.02287582]
 [ 0.05882353 -0.13071895  0.08496732]
 [-0.11764706  0.1503268   0.05228758]]


Let's check whether we are getting AA<sup>-1</sup>=I


In [30]:
import numpy as np

A.dot(np.linalg.inv(A))

array([[1.00000000e+00, 0.00000000e+00, 6.93889390e-18],
       [2.77555756e-17, 1.00000000e+00, 4.85722573e-17],
       [8.32667268e-17, 1.11022302e-16, 1.00000000e+00]])

And the result is as we expected. Ones in the diagonal and zeros (or very close to zero) elsewhere.

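<b>Note</b> also that only square matrices can have an inverse. The definition of an inverse matrix is based on the identity matrix I, and the products in both orders, AA<sup>-1</sup> and A<sup>-1</sup>A, can equal the same identity matrix only when A is square. Not every square matrix has an inverse: a square matrix is invertible only if its determinant is non-zero.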
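Two practical points about inverses can be sketched in code: comparing against the identity with a tolerance (because of floating-point round-off), and what happens for a square matrix that has no inverse. The singular matrix `S` is a hypothetical example, not taken from the text.

```python
import numpy as np

A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
A_inv = np.linalg.inv(A)

# Compare with a tolerance instead of expecting exact zeros and ones
is_identity = np.allclose(A @ A_inv, np.eye(3))
print("A @ A_inv is (numerically) the identity:", is_identity)

# Hypothetical singular matrix: second row is twice the first, determinant 0
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])
try:
    np.linalg.inv(S)
    singular_detected = False
except np.linalg.LinAlgError:
    # NumPy raises LinAlgError when the matrix cannot be inverted
    singular_detected = True
print("S is singular:", singular_detected)
```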

### Determinant of a Matrix

The <b>determinant of a matrix</b> is a number that is specially defined only for square matrices. Determinants are mathematical objects that are very useful in the analysis and solution of systems of linear equations.

In [22]:
import numpy as np
 
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
 
# Determinant of a matrix
print("\nDeterminant of A:", np.linalg.det(A))


Determinant of A: -306.0
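For a 2 × 2 matrix the determinant has the simple closed form ad − bc, which makes it easy to cross-check `np.linalg.det` by hand. This is a minimal sketch; the matrix `M` and the helper `det_2x2` are hypothetical and only serve the illustration.

```python
import numpy as np


def det_2x2(m):
    """Hypothetical helper: determinant of a 2 x 2 matrix, a*d - b*c."""
    (a, b), (c, d) = m
    return a * d - b * c


M = np.array([[3, 8],
              [4, 6]])

by_hand = det_2x2(M)           # 3*6 - 8*4 = -14
by_numpy = np.linalg.det(M)    # floating-point result, close to -14

print("By hand:", by_hand)
print("Agrees with NumPy:", np.isclose(by_numpy, by_hand))
```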


### Trace of a Matrix

The trace is the sum of all the diagonal elements of a square matrix.

In [23]:
# Importing numpy as np
import numpy as np
 
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])
 

# Trace of matrix A
print("\nTrace of A:", np.trace(A))
 


Trace of A: 11


## Basic Matrix Algebra

### 1) Addition of Matrices

<b>Matrix Addition</b> is the operation of adding two matrices by adding the corresponding entries together. <br><br>
<b>Note: </b>Two matrices must have an equal number of rows and columns to be added.

<h5> Properties of Matrix Addition </h5><br><br>


![image.png](attachment:image.png)


Adding matrices is very simple. Just add each element in the first matrix to the corresponding element in the second matrix.

![image.png](attachment:image.png)

In [32]:
import numpy as np

A = np.array([[2, 4], [5, -6]])
B = np.array([[9, -3], [3, 6]])
C = A + B      # element-wise addition
print(C)

[[11  1]
 [ 8  0]]
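The usual properties of matrix addition (commutativity, associativity, an additive identity and an additive inverse) can be checked numerically. This sketch reuses `A` and `B` from the example; the third matrix `C` is an arbitrary assumption added for the associativity check.

```python
import numpy as np

A = np.array([[2, 4], [5, -6]])
B = np.array([[9, -3], [3, 6]])
C = np.array([[1, 0], [7, 2]])        # hypothetical third matrix
Z = np.zeros((2, 2), dtype=int)       # zero matrix of the same order

commutative = np.array_equal(A + B, B + A)                # A + B = B + A
associative = np.array_equal((A + B) + C, A + (B + C))    # (A + B) + C = A + (B + C)
has_identity = np.array_equal(A + Z, A)                   # A + 0 = A
has_inverse = np.array_equal(A + (-A), Z)                 # A + (-A) = 0

print(commutative, associative, has_identity, has_inverse)
```

A numerical check on one set of matrices is an illustration, not a proof, but it is a quick way to catch a misremembered rule.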


### 2) Subtraction of Matrices

<b>Matrix Subtraction</b> is the operation of subtracting one matrix from another by subtracting the corresponding entries.

<b>Note: </b>Two matrices must have an equal number of rows and columns to be subtracted.

![image.png](attachment:image.png)

In [12]:
import numpy as np

A = np.array([[2, 4], [5, -6]])
B = np.array([[9, -3], [3, 6]])
C = A - B      # element-wise subtraction
print(C)

[[ -7   7]
 [  2 -12]]


### 3) Multiplication of Matrices

The main condition of <u>matrix multiplication</u> is that the number of columns of the 1st matrix must be <b>equal</b> to the number of rows of the 2nd one.<br><br>
The result of the multiplication is a new matrix that has the same number of rows as the 1st matrix and the same number of columns as the 2nd one.

![image.png](attachment:image.png)

#### Properties of Matrix Multiplication :

![image.png](attachment:image.png)

In [13]:
import numpy as np

A = np.array([[3, 6, 7], [5, -3, 0]])
B = np.array([[1, 1], [2, 1], [3, -3]])
C = A.dot(B)
print(C)

[[ 36 -12]
 [ -1   2]]
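The shape rule for multiplication, and the fact that the order of the factors matters, can be seen with the same `A` and `B`. This is a small sketch; it uses the `@` operator, which for 2-D arrays gives the same result as `A.dot(B)`.

```python
import numpy as np

A = np.array([[3, 6, 7], [5, -3, 0]])      # 2 x 3
B = np.array([[1, 1], [2, 1], [3, -3]])    # 3 x 2

AB = A @ B    # (2 x 3)(3 x 2) -> 2 x 2
BA = B @ A    # (3 x 2)(2 x 3) -> 3 x 3, so AB and BA differ

print("AB shape:", AB.shape)  # AB shape: (2, 2)
print("BA shape:", BA.shape)  # BA shape: (3, 3)

# Multiplying A by itself violates the rule: 3 columns vs. 2 rows
try:
    A @ A
    shapes_ok = True
except ValueError:
    shapes_ok = False
print("A @ A is defined:", shapes_ok)  # A @ A is defined: False
```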


### 4) Transpose of Matrix

In linear algebra, the <b>transpose of a matrix</b> is an operator which flips a matrix over its diagonal; that is, it switches the row and column indices of the matrix A by producing another matrix, often denoted by A<sup>T</sup>.

In [14]:
import numpy as np

A = np.array([[1, 1], [2, 1], [3, -3]])
print(A.transpose())

[[ 1  2  3]
 [ 1  1 -3]]
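Two standard identities of the transpose, (A<sup>T</sup>)<sup>T</sup> = A and (AB)<sup>T</sup> = B<sup>T</sup>A<sup>T</sup>, can be verified with a short sketch. The matrix `B` below is a hypothetical 2 × 3 matrix chosen only so that the product AB is defined; `A.T` is shorthand for `A.transpose()`.

```python
import numpy as np

A = np.array([[1, 1], [2, 1], [3, -3]])     # 3 x 2, as in the example
B = np.array([[2, 0, 1], [1, 4, -1]])       # hypothetical 2 x 3 matrix

# Transposing twice gives back the original matrix
double_transpose = np.array_equal(A.T.T, A)

# The transpose of a product reverses the order of the factors
product_rule = np.array_equal((A @ B).T, B.T @ A.T)

print(double_transpose, product_rule)
```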


## Access matrix elements, rows and columns

In [15]:
import numpy as np

A = np.array([[1, 4, 5, 12],
    [-5, 8, 9, 0],
    [-6, 7, 11, 19]])

#  First element of first row
print("A[0][0] =", A[0][0])  

# Third element of second row
print("A[1][2] =", A[1][2])

# Last element of last row
print("A[-1][-1] =", A[-1][-1])     

A[0][0] = 1
A[1][2] = 9
A[-1][-1] = 19


In [16]:
import numpy as np

A = np.array([[1, 4, 5, 12], 
    [-5, 8, 9, 0],
    [-6, 7, 11, 19]])

print("A[:,0] =",A[:,0]) # First Column
print("A[:,3] =", A[:,3]) # Fourth Column
print("A[:,-1] =", A[:,-1]) # Last Column (4th column in this case)


A[:,0] = [ 1 -5 -6]
A[:,3] = [12  0 19]
A[:,-1] = [12  0 19]


In [17]:
import numpy as np

A = np.array([[1, 4, 5, 12], 
    [-5, 8, 9, 0],
    [-6, 7, 11, 19]])

print("A[0] =", A[0]) # First Row
print("A[2] =", A[2]) # Third Row
print("A[-1] =", A[-1]) # Last Row (3rd row in this case)

A[0] = [ 1  4  5 12]
A[2] = [-6  7 11 19]
A[-1] = [-6  7 11 19]
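Besides chained indexing such as `A[1][2]`, NumPy also accepts a single pair of indices and slices on both axes at once. The following sketch shows these forms on the same matrix; the variable names are chosen only for this illustration.

```python
import numpy as np

A = np.array([[1, 4, 5, 12],
              [-5, 8, 9, 0],
              [-6, 7, 11, 19]])

# Row and column index in one step, equivalent to A[1][2]
element = A[1, 2]

# Sub-matrix: rows 0 and 1, columns 1 and 2
sub = A[0:2, 1:3]

# Indexing with a list keeps the result 2-D (a 3 x 1 column matrix)
column_as_matrix = A[:, [0]]

print("A[1, 2] =", element)  # A[1, 2] = 9
print("A[0:2, 1:3] =\n", sub)
print("Shape of A[:, [0]]:", column_as_matrix.shape)  # Shape of A[:, [0]]: (3, 1)
```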


## Linear Transformation

A <b>linear transformation</b> is a function from one vector space to another that respects the linear structure of each vector space. A linear transformation is also known as a linear operator or map. The codomain of the transformation may be the same as the domain, and when that happens, the transformation is known as an <b>endomorphism</b> or, if invertible, an <b>automorphism</b>. The two vector spaces must have the same underlying field.

The defining characteristic of a linear transformation <b>T : V → W</b> is that, for any vectors v<sub>1</sub> and v<sub>2</sub> in V and scalars a and b of the underlying field,

T(av<sub>1</sub> + bv<sub>2</sub>) = aT(v<sub>1</sub>) + bT(v<sub>2</sub>).<br><br>
Linear transformations are useful because they preserve the structure of a vector space.
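Multiplication by a matrix is the standard example of a linear transformation, so the defining property can be checked numerically. In this sketch the matrix `M`, the vectors and the scalars are all arbitrary assumptions; `shift` is a hypothetical counter-example that is not linear.

```python
import numpy as np

# Hypothetical 3 x 2 matrix defining T : R^2 -> R^3
M = np.array([[2.0, 1.0],
              [0.0, 3.0],
              [1.0, -1.0]])


def T(v):
    """Linear transformation given by multiplication with M."""
    return M @ v


def shift(v):
    """Not linear: adds a constant, so shift(0) is not the zero vector."""
    return v + 1.0


v1 = np.array([1.0, 2.0])
v2 = np.array([-3.0, 0.5])
a, b = 2.0, -4.0

# T(a*v1 + b*v2) == a*T(v1) + b*T(v2)
T_is_linear = np.allclose(T(a * v1 + b * v2), a * T(v1) + b * T(v2))

# The same test fails for the shift map
shift_is_linear = np.allclose(shift(a * v1 + b * v2),
                              a * shift(v1) + b * shift(v2))

print(T_is_linear, shift_is_linear)  # True False
```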


![image.png](attachment:image.png)

## Matrix Eigenvalue Functions

<b>Eigenvalues</b> are a special set of scalars associated with a linear system of equations (i.e., a matrix equation) that are sometimes also known as characteristic roots, characteristic values, proper values, or latent roots.<br><br>
<b>Eigenvectors</b> are a special set of vectors associated with a linear system of equations (i.e., a matrix equation) that are sometimes also known as characteristic vectors, proper vectors, or latent vectors.



### <u>Geometrical Interpretation</u>

Let <b>v</b> be a vector (shown as a point) and <b>A</b> be a matrix with columns <b>a<sub>1</sub></b> and <b>a<sub>2</sub></b> (shown as arrows). If we multiply <b>v</b> by <b>A</b>, then <b>A</b> sends <b>v</b> to a new vector <b>Av</b>.

![image.png](attachment:image.png)

If you can draw a line through the three points <b>(0,0)</b>, <b>v</b> and <b>Av</b>, then <b>Av</b> is just <b>v</b> multiplied by a number <b>lambda</b>; that is, Av = lambda v. In this case, we call <b>lambda</b> an eigenvalue and <b>v</b> an eigenvector. For example, here <b>(1,2)</b> is an eigenvector and <b>5</b> an eigenvalue.



Note three facts:<br> <b>First,</b> every nonzero point on the same line through the origin as an eigenvector is also an eigenvector. Those lines are <b>eigenspaces</b>, and each has an associated eigenvalue.<br> <b>Second,</b> if v lies on an eigenspace (either s<sub>1</sub> or s<sub>2</sub>) with associated eigenvalue |lambda| &lt; 1, then Av is closer to (0,0) than v; but when |lambda| &gt; 1, it's farther.<br> <b>Third,</b> both eigenspaces depend on both columns of A: it is not as though a<sub>1</sub> only affects s<sub>1</sub>.

![image.png](attachment:image.png)

<br><br>
 <b>Geometrically,</b> an eigenvector, corresponding to a real nonzero eigenvalue, points in a direction in which it is stretched by the transformation and the eigenvalue is the factor by which it is stretched. If the eigenvalue is <b>negative</b>, the direction is <b>reversed</b>.
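The relation Av = lambda v can be checked directly. The figures are not reproduced here, so the matrix `A` below is an assumption: it was chosen so that (1, 2) is an eigenvector with eigenvalue 5, matching the numbers mentioned in the text.

```python
import numpy as np

# Assumed matrix, chosen so that (1, 2) is an eigenvector with eigenvalue 5
A = np.array([[1, 2],
              [4, 3]])
v = np.array([1, 2])
lam = 5

Av = A @ v                                   # [1*1 + 2*2, 4*1 + 3*2] = [5, 10]
is_eigenvector = np.array_equal(Av, lam * v)

# Any nonzero multiple of v lies on the same eigenspace
w = -3 * v
same_eigenspace = np.array_equal(A @ w, lam * w)

# A vector off the eigenspaces is not simply scaled: (1, 0) goes to (1, 4)
u = np.array([1, 0])
Au = A @ u

print("Av =", Av)        # Av = [ 5 10]
print(is_eigenvector, same_eigenspace)  # True True
print("Au =", Au)        # Au = [1 4]
```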

#### <b><u> numpy.linalg.eigh(a, UPLO='L') :</u></b> 
This function returns the eigenvalues and eigenvectors of a complex Hermitian (conjugate symmetric) or a real symmetric matrix. It returns two objects: a 1-D array containing the eigenvalues of a in ascending order, and a 2-D square array or matrix (depending on the input type) of the corresponding eigenvectors (in columns).

In [5]:

# eigh() function
 
import numpy as np
from numpy import linalg as la
 
# Creating an array using array 
a = np.array([[1, -2j], [2j, 5]])
 
print("Array is :",a)
 
# calculating the eigenvalues and eigenvectors
# using eigh() function
c, d = la.eigh(a)
 
print("Eigen value is :", c)
print("Eigen vector is :", d)

Array is : [[ 1.+0.j -0.-2.j]
 [ 0.+2.j  5.+0.j]]
Eigen value is : [0.17157288 5.82842712]
Eigen vector is : [[-0.92387953-0.j         -0.38268343+0.j        ]
 [ 0.        +0.38268343j  0.        -0.92387953j]]


#### <b><u>numpy.linalg.eig(a) :</u></b> 
This function is used to compute the eigenvalues and right eigenvectors of a square array.

In [4]:

# eig() function
 
import numpy as np
from numpy import linalg as la
 
# Creating an array using diag 

a = np.diag((1, 2, 3))
 
print("Array is :",a)
 
# calculating the eigenvalues and eigenvectors
# using eig() function
c, d = la.eig(a)
 
print("Eigen value is :",c)
print("Eigen vector is :",d)

Array is : [[1 0 0]
 [0 2 0]
 [0 0 3]]
Eigen value is : [1. 2. 3.]
Eigen vector is : [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]


In [38]:
A = np.array([[1,0],[0,-2]])
print(A)


[[ 1  0]
 [ 0 -2]]


The function <b>la.eig</b> returns a tuple (eigvals,eigvecs) where eigvals is a 1D NumPy array giving the eigenvalues of A (of complex type when any eigenvalue has a non-zero imaginary part), and eigvecs is a 2D NumPy array with the corresponding eigenvectors in the columns:

In [39]:
results = la.eig(A)

The <b>eigenvalues</b> of A are:

In [40]:
print(results[0])

[ 1. -2.]


The corresponding <b>eigenvectors</b> are:

In [41]:
print(results[1])

[[1. 0.]
 [0. 1.]]
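Because the eigenvectors are stored in the columns of the second return value, column k belongs to eigenvalue k. The sketch below checks this pairing on a small symmetric matrix, which is a hypothetical example with eigenvalues 1 and 3, and compares `eig` with `eigh`.

```python
import numpy as np
from numpy import linalg as la

# Hypothetical symmetric matrix with eigenvalues 1 and 3
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = la.eig(A)

# Column k of eigvecs pairs with eigvals[k]: A v = lambda v
checks = [np.allclose(A @ eigvecs[:, k], eigvals[k] * eigvecs[:, k])
          for k in range(len(eigvals))]

# The same statement for all columns at once: A V = V diag(lambda)
all_at_once = np.allclose(A @ eigvecs, eigvecs * eigvals)

# For a symmetric matrix, eigh returns the eigenvalues in ascending order
eigvals_h, eigvecs_h = la.eigh(A)

print(checks, all_at_once)
print("eigh eigenvalues:", eigvals_h)
```

Taking rows instead of columns of `eigvecs` is a common mistake; the column-wise check above fails in that case for most matrices.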


# <u>Singular-Value Decomposition</u>

<b>The Singular-Value Decomposition</b>, or <b>SVD</b> for short, is a matrix decomposition method for reducing a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler.<br><br>

For the sake of simplicity we will focus on the SVD for real-valued matrices and ignore the case for complex numbers.<br>

A = U . Sigma . V<sup>T</sup> <br>

Where A is the real m x n matrix that we wish to decompose, U is an m x m matrix, Sigma (often represented by the uppercase Greek letter Σ) is an m x n diagonal matrix, and V<sup>T</sup> is the transpose of an n x n matrix V.<br>
The diagonal values in the Sigma matrix are known as the singular values of the original matrix A. The columns of the U matrix are called the left-singular vectors of A, and the columns of V are called the right-singular vectors of A.<br><br>
The SVD is widely used both in the calculation of other matrix operations, such as the matrix inverse, and as a data reduction method in machine learning. SVD can also be used in least squares linear regression, image compression, and denoising data.






## Calculation of Singular-value Decomposition

The SVD can be calculated by calling the svd() function.<br><br>

The function takes a matrix and returns the U, Sigma and V<sup>T</sup> elements. The Sigma diagonal matrix is returned as a vector of singular values. The V matrix is returned in transposed form, i.e. as V<sup>T</sup>.


In [19]:

# Singular-value decomposition
from numpy import array
from scipy.linalg import svd
# define a matrix
A = array([[1, 2], [3, 4], [5, 6]])
print(A)
# SVD
U, s, VT = svd(A)
print(U)
print(s)
print(VT)


[[1 2]
 [3 4]
 [5 6]]
[[-0.2298477   0.88346102  0.40824829]
 [-0.52474482  0.24078249 -0.81649658]
 [-0.81964194 -0.40189603  0.40824829]]
[9.52551809 0.51430058]
[[-0.61962948 -0.78489445]
 [-0.78489445  0.61962948]]


<b><i>Running the above example first prints the defined 3×2 matrix, then the 3×3 U matrix, the 2-element Sigma vector, and the 2×2 V<sup>T</sup> matrix calculated from the decomposition.</i></b>
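Since Sigma comes back as a vector, it has to be placed into an m x n matrix before the factors can be multiplied back together. The following sketch rebuilds A from its decomposition; it uses `numpy.linalg.svd`, which is assumed here as a stand-in for the SciPy function and returns the same three pieces.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])

# Default full_matrices=True: U is 3 x 3, s has 2 entries, VT is 2 x 2
U, s, VT = np.linalg.svd(A)

# Build the m x n Sigma matrix and put the singular values on its diagonal
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# A = U . Sigma . V^T
B = U @ Sigma @ VT
reconstructed = np.allclose(A, B)

print("Sigma shape:", Sigma.shape)       # Sigma shape: (3, 2)
print("Reconstruction matches A:", reconstructed)
```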

### SVD for Dimensionality Reduction

A popular application of SVD is for dimensionality reduction.<br><br>

Data with a large number of features, such as more features (columns) than observations (rows), may be reduced to a smaller subset of features that are most relevant to the prediction problem.<br><br>

The result is a matrix with a lower rank that is said to approximate the original matrix.<br><br>

To do this we can perform an SVD operation on the original data and select the top k largest singular values in Sigma. These columns can be selected from Sigma and the rows selected from V<sup>T</sup>.<br><br>

An approximation B of the original matrix A can then be reconstructed.<br>
<b>B = U . Sigma<sub>k</sub> . V<sup>T</sup><sub>k</sub></b><br><br>
In natural language processing, this approach can be used on matrices of word occurrences or word frequencies in documents and is called Latent Semantic Analysis or Latent Semantic Indexing.<br><br>
In practice, we can retain and work with a descriptive subset of the data called T. This is a dense summary of the matrix or a projection.<br>
<b>T = U . Sigma<sub>k</sub></b><br><br>
Further, this transform can be calculated and applied to the original matrix A as well as other similar matrices. Here V<sub>k</sub> is the transpose of V<sup>T</sup><sub>k</sub>, i.e. the first k columns of V.<br>

<b>T = A . V<sub>k</sub></b>



In [20]:

from numpy import array
from numpy import diag
from numpy import zeros
from scipy.linalg import svd
# define a matrix
A = array([
	[1,2,3,4,5,6,7,8,9,10],
	[11,12,13,14,15,16,17,18,19,20],
	[21,22,23,24,25,26,27,28,29,30]])
print(A)
# Singular-value decomposition
U, s, VT = svd(A)
# create m x n Sigma matrix
Sigma = zeros((A.shape[0], A.shape[1]))
# populate Sigma with m x m diagonal matrix (m = 3 singular values here)
Sigma[:A.shape[0], :A.shape[0]] = diag(s)
# select
n_elements = 2
Sigma = Sigma[:, :n_elements]
VT = VT[:n_elements, :]
# reconstruct
B = U.dot(Sigma.dot(VT))
print(B)
# transform
T = U.dot(Sigma)
print(T)
T = A.dot(VT.T)
print(T)

[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]
[[ 1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
 [11. 12. 13. 14. 15. 16. 17. 18. 19. 20.]
 [21. 22. 23. 24. 25. 26. 27. 28. 29. 30.]]
[[-18.52157747   6.47697214]
 [-49.81310011   1.91182038]
 [-81.10462276  -2.65333138]]
[[-18.52157747   6.47697214]
 [-49.81310011   1.91182038]
 [-81.10462276  -2.65333138]]


<b>Running the example first prints the defined matrix then the reconstructed approximation, followed by two equivalent transforms of the original matrix.</b><br><br>
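How good the approximation is depends on k. The sketch below, written with `numpy.linalg.svd` rather than the SciPy function, compares k = 1 and k = 2 for the same matrix. The helper `rank_k_approximation` is a hypothetical name; the error is measured in the Frobenius norm, for which the error of a truncated SVD equals the square root of the sum of the squared discarded singular values.

```python
import numpy as np

# Same values as the 3 x 10 matrix A in the example
A = np.arange(1, 31).reshape(3, 10).astype(float)

# Reduced SVD: U is 3 x 3, s has 3 entries, VT is 3 x 10
U, s, VT = np.linalg.svd(A, full_matrices=False)


def rank_k_approximation(k):
    """Hypothetical helper: B = U_k . Sigma_k . V^T_k using the top k singular values."""
    return U[:, :k] @ np.diag(s[:k]) @ VT[:k, :]


errors = {}
for k in (1, 2):
    B = rank_k_approximation(k)
    errors[k] = np.linalg.norm(A - B)            # Frobenius norm of the error
    discarded = np.sqrt(np.sum(s[k:] ** 2))      # from the dropped singular values
    print(f"k={k}: error={errors[k]:.6f}, from singular values={discarded:.6f}")
```

For this particular matrix the rows differ by a constant, so its rank is 2 and k = 2 already reproduces A up to round-off, which is why the reconstruction printed above equals the original.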

<b>scikit-learn</b> provides a TruncatedSVD class that implements this capability directly.<br>

When creating a TruncatedSVD instance, you must specify the number of desired features or components to keep. Once created, you can fit the transform (e.g. calculate V<sup>T</sup><sub>k</sub>) by calling the fit() function, then apply it to the original matrix by calling the transform() function. The result is the transform of A, called T in the above example.



In [21]:

from numpy import array
from sklearn.decomposition import TruncatedSVD
# define array
A = array([
	[1,2,3,4,5,6,7,8,9,10],
	[11,12,13,14,15,16,17,18,19,20],
	[21,22,23,24,25,26,27,28,29,30]])
print(A)
# svd
svd = TruncatedSVD(n_components=2)
svd.fit(A)
result = svd.transform(A)
print(result)


[[ 1  2  3  4  5  6  7  8  9 10]
 [11 12 13 14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27 28 29 30]]
[[18.52157747  6.47697214]
 [49.81310011  1.91182038]
 [81.10462276 -2.65333138]]


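<b>Running the example first prints the defined matrix, followed by the transformed version of the matrix.</b> The values match the earlier transform up to the sign of each column, because singular vectors are determined only up to sign.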