# Linear algebra for data science 
## Why do we need linear algebra for data science?

In data science, data are usually represented as vectors and matrices. We then rely on concepts from linear algebra to manipulate these vectors and matrices and to build algorithms such as:

- Principal Component Analysis (PCA)
- Singular Value Decomposition (SVD)
- Support vector machines
- Kernel methods
- Gradient descent

## Vectors and matrices


In [1]:
import numpy as np

x = [5, 7, -2, 2]
y = [4, 5, 12, 0]

How can we multiply these two lists pairwise, element by element?

In [3]:
product = []
for i in range(len(x)):
    product.append(x[i] * y[i])
product

[20, 35, -24, 0]

In [5]:
np.array(x) * np.array(y)

array([ 20,  35, -24,   0])

### Vectors
A vector is essentially an ordered collection of numbers that are written in a column.

**Vector addition**

We can add two vectors as long as they have the same size. The sum is computed component by component.
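
As a small illustration of component-wise addition, the sketch below adds two NumPy arrays holding the same values as `x` and `y` above. The names `u`, `v`, `w` and `total` are chosen only for this example.

```python
import numpy as np

# Example vectors (same values as x and y above)
u = np.array([5, 7, -2, 2])
v = np.array([4, 5, 12, 0])

# Addition is component-wise: total[i] = u[i] + v[i]
total = u + v
print(total)

# Vectors of different sizes cannot be added
w = np.array([1, 2, 3])
try:
    u + w
except ValueError as err:
    print("Cannot add vectors of sizes 4 and 3:", err)
```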

**Vector multiplication by a scalar**

We can also multiply a vector by a scalar. This is equivalent to multiplying each component by that scalar.

In [9]:
np.array(x) * 5

array([ 25,  35, -10,  10])

**Vector Hadamard product**

Recall the multiplication of two vectors that we saw at the beginning of this unit. What we computed in that example is called the Hadamard product. 

**Inner product**

While the Hadamard product can be quite useful, in linear algebra there is a much more commonly used product called the inner product, or dot product. We can compute the inner product between two vectors as long as the two vectors have the same length. It is defined as follows:

$\langle u, v \rangle = u_1 v_1 + u_2 v_2 + u_3 v_3 + \dots + u_n v_n$
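
The definition can be checked numerically. The sketch below computes the inner product once by summing the pairwise products and once with `np.dot`. The variable names are illustrative only.

```python
import numpy as np

u = np.array([5, 7, -2, 2])
v = np.array([4, 5, 12, 0])

# Sum of the pairwise products, following the definition term by term
inner_manual = sum(ui * vi for ui, vi in zip(u, v))

# The same value computed by NumPy
inner_numpy = np.dot(u, v)

# The inner product is also the sum of the entries of the Hadamard product
inner_from_hadamard = np.sum(u * v)

print(inner_manual, inner_numpy, inner_from_hadamard)
```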

### Matrices 

**Matrix addition**

Two matrices of the same shape are added element by element.

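**Matrix transpose**

The transpose swaps rows and columns: the entry in row i, column j of A becomes the entry in row j, column i of the transpose.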

**Matrix Hadamard product**

Two matrices of the same shape are multiplied element by element.

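**Matrix multiplication**

The entry in row i, column j of AB is the inner product of row i of A with column j of B, so the number of columns of A must equal the number of rows of B. For two 2×2 matrices:

row1·col1 ; row1·col2<br>
row2·col1 ; row2·col2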
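
The four matrix operations above can be sketched on a small example. The matrices `M` and `N` are arbitrary values chosen for illustration, and the explicit double loop is only there to show the "row times column" rule. In practice `np.dot` does this work.

```python
import numpy as np

M = np.array([[1, 2], [3, 4]])
N = np.array([[0, 1], [1, 0]])

added = M + N            # addition: element by element
transposed = M.T         # transpose: rows become columns
hadamard = M * N         # Hadamard product: element by element
product = np.dot(M, N)   # matrix product: rows of M times columns of N

# Matrix product written out: entry (i, j) is the inner product of
# row i of M with column j of N
manual = np.zeros((2, 2), dtype=int)
for i in range(2):
    for j in range(2):
        manual[i, j] = np.dot(M[i, :], N[:, j])

print(product)
print(np.array_equal(manual, product))
```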

**Properties of matrix multiplication**

The following properties are very important to keep in mind about matrix multiplication:

 - associative: A(BC)=(AB)C
 - distributive: A(B+C)=AB+AC
 - not commutative: in general, AB≠BA
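
These properties can be checked on concrete matrices. The sketch below uses three arbitrary integer matrices (`P`, `Q` and `R` are illustrative names), so the comparisons are exact. A single example does not prove the first two properties, but it does show that commutativity can fail.

```python
import numpy as np

P = np.array([[1, 2], [2, 1]])
Q = np.array([[1, 5], [0, 2]])
R = np.array([[0, 1], [3, -1]])

# Associative: P(QR) == (PQ)R
associative = np.array_equal(np.dot(P, np.dot(Q, R)), np.dot(np.dot(P, Q), R))

# Distributive: P(Q + R) == PQ + PR
distributive = np.array_equal(np.dot(P, Q + R), np.dot(P, Q) + np.dot(P, R))

# Not commutative: PQ and QP differ for this pair
commutative = np.array_equal(np.dot(P, Q), np.dot(Q, P))

print(associative, distributive, commutative)
```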


## Vectors and matrices in NumPy

### Matrices in NumPy


In [11]:
A = np.array([[1, 2], [2, 1]])
B = np.array([[1, 5], [0, 2]])
A + 5 * B

array([[ 6, 27],
       [ 2, 11]])

### Matrix multiplication using np.dot()

In [13]:
A * B  # element-wise (Hadamard) product, not the matrix product

array([[ 1, 10],
       [ 0,  2]])

In [15]:
np.dot(A, B)  # matrix product

array([[ 1,  9],
       [ 2, 12]])

### Vectors as one-dimensional arrays

In [16]:
a = np.array([0, 1, 2])

In [17]:
a

array([0, 1, 2])

In [18]:
a.T  # transposing a 1-D array has no effect

array([0, 1, 2])

In [22]:
np.dot(a,a)

5

In [19]:
b = np.array([[0], [1], [2]])

In [20]:
b

array([[0],
       [1],
       [2]])

In [23]:
b.T

array([[0, 1, 2]])

In [24]:
b.shape

(3, 1)

In [25]:
b.T.shape

(1, 3)

In [26]:
np.dot(b, b)  # fails: a (3, 1) matrix cannot be multiplied by a (3, 1) matrix

ValueError: shapes (3,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)

In [27]:
np.dot(b.T, b)

array([[5]])

In [28]:
np.dot(b, b.T)

array([[0, 0, 0],
       [0, 1, 2],
       [0, 2, 4]])
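
To move between the two representations, a 1-D array can be reshaped into an explicit column or row. The following is a minimal sketch. The names `col`, `row`, `inner` and `outer` are illustrative, and `np.outer` is used only as a cross-check.

```python
import numpy as np

a = np.array([0, 1, 2])      # 1-D array, shape (3,)

col = a.reshape(-1, 1)       # column vector, shape (3, 1)
row = a.reshape(1, -1)       # row vector, shape (1, 3)

inner = np.dot(row, col)     # (1, 3) times (3, 1) gives shape (1, 1)
outer = np.dot(col, row)     # (3, 1) times (1, 3) gives shape (3, 3)

# For 1-D inputs, np.outer builds the same 3x3 matrix directly
same_outer = np.array_equal(outer, np.outer(a, a))

print(col.shape, row.shape, inner.shape, outer.shape, same_outer)
```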

### NumPy broadcasting revisited

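Broadcasting applies when NumPy combines arrays of different shapes: a dimension of size 1 is stretched (its values are virtually repeated) to match the corresponding dimension of the other array.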

In [30]:
a = np.array([[0], [7]])
b = np.array([[1, 2], [0, 6]])
c = np.array([[2, 2]])
a.shape

(2, 1)

In [31]:
b.shape

(2, 2)

In [32]:
c.shape

(1, 2)

In [33]:
a + c

array([[2, 2],
       [9, 9]])

Here `a`, of shape (2, 1), is broadcast to `[[0, 0], [7, 7]]`, and `c`, of shape (1, 2), is broadcast to `[[2, 2], [2, 2]]`. The sum is then computed element by element: `[[0+2, 0+2], [7+2, 7+2]]`.
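
The stretching step can be made visible with `np.broadcast_to`, which returns the expanded view that NumPy uses implicitly. This is only a sketch to illustrate the mechanism. In normal code `a + c` is enough.

```python
import numpy as np

a = np.array([[0], [7]])   # shape (2, 1)
c = np.array([[2, 2]])     # shape (1, 2)

# Expand both arrays explicitly to the common shape (2, 2)
a_expanded = np.broadcast_to(a, (2, 2))
c_expanded = np.broadcast_to(c, (2, 2))
print(a_expanded)
print(c_expanded)

# Adding the expanded arrays gives the same result as a + c
same_sum = np.array_equal(a + c, a_expanded + c_expanded)
print(same_sum)
```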

In [34]:
a * b

array([[ 0,  0],
       [ 0, 42]])

## The inverse of a matrix 
### The identity matrix
The identity matrix is a square matrix that has 1’s on the diagonal and 0’s everywhere else.
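
In NumPy, an identity matrix can be built with `np.eye`. The sketch below also checks that multiplying by the identity leaves a matrix unchanged. The matrix `M` is an arbitrary example.

```python
import numpy as np

I2 = np.eye(2, dtype=int)          # 2x2 identity matrix
M = np.array([[3, 4], [2, 3]])

# Multiplying by the identity on either side returns M itself
left = np.dot(I2, M)
right = np.dot(M, I2)

print(I2)
print(np.array_equal(left, M), np.array_equal(right, M))
```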

### Defining the inverse
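Let A be a square n×n matrix. The inverse of A is the matrix $A^{-1}$ with the property $A A^{-1} = A^{-1} A = I$, where $I$ is the n×n identity matrix. For example:<br>
$A = \begin{pmatrix} 3 & 4 \\ 2 & 3 \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} 3 & -4 \\ -2 & 3 \end{pmatrix}$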


In [36]:
A = np.array([[3, 4], [2, 3]])
V = np.array([[3,-4], [-2, 3]])
A * V  # element-wise product: this does not give the identity matrix

array([[  9, -16],
       [ -4,   9]])

In [37]:
np.dot(A, V)

array([[1, 0],
       [0, 1]])

### Linear independence
When the columns of a square matrix are linearly independent (no column can be written as a linear combination of the others), the matrix is invertible. When the columns are linearly dependent, the matrix is not invertible.
### The rank of a matrix
The maximum number of linearly independent columns of a matrix is known as the rank of the matrix, a concept that we encounter a lot in linear algebra.

***Theorem*** A square matrix is invertible if and only if it has full rank.
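
The link between rank and invertibility can be sketched as follows. The matrix `D` has a second column equal to twice the first, so its columns are linearly dependent. The helper `is_invertible` is hypothetical, written only for this illustration. Note that `np.linalg.matrix_rank` computes a numerical rank based on a tolerance, so for nearly singular floating-point matrices the answer should be read with some care.

```python
import numpy as np

# Second column is 2 times the first: the columns are linearly dependent
D = np.array([[1, 2], [2, 4]])
rank_D = np.linalg.matrix_rank(D)

# Columns are linearly independent
E = np.array([[3, 4], [2, 3]])
rank_E = np.linalg.matrix_rank(E)

def is_invertible(M):
    """Hypothetical helper: True if M is square and has full rank."""
    rows, cols = M.shape
    return rows == cols and np.linalg.matrix_rank(M) == rows

print(rank_D, is_invertible(D))
print(rank_E, is_invertible(E))
```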

### Application: solving linear equations

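A system of linear equations can be written in matrix form as $Ax = b$. When A is invertible, the unique solution is $x = A^{-1} b$.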

## The linalg module in NumPy



In [38]:
A = np.array([[3, 4], [2, 3]])
A

array([[3, 4],
       [2, 3]])

In [39]:
A.T

array([[3, 2],
       [4, 3]])

In [40]:
np.linalg.inv(A)

array([[ 3., -4.],
       [-2.,  3.]])

In [41]:
np.dot(A, np.linalg.inv(A))

array([[1., 0.],
       [0., 1.]])

In [42]:
np.linalg.matrix_rank(A)

2

In [43]:
B = np.array([[1, 0], [0, 0]])
B

array([[1, 0],
       [0, 0]])

In [44]:
np.linalg.inv(B)  # not invertible

LinAlgError: Singular matrix

In [46]:
b = np.array([[1], [2]])  # column vector, shape (2, 1)
b = np.array([1, 2])      # 1-D array, shape (2,); both forms work with solve
x = np.linalg.solve(A, b)
x

array([-5.,  4.])

In [47]:
np.dot(A, x)

array([1., 2.])

## Exercise: Solving a system of linear equations



In [58]:
# Define A and b
A = np.array([[2, 1, 1], [-5, -3, 0], [1, 1, -1]])
b = np.array([-1, 1, -2])

In [54]:
#3. Rank of A
np.linalg.matrix_rank(A)  # returns 3, so A has full rank

3

In [60]:
#3. Inverse of A
A_inv = np.linalg.inv(A)
A_inv

array([[-3., -2., -3.],
       [ 5.,  3.,  5.],
       [ 2.,  1.,  1.]])

In [61]:
#4. Use the inverse of A to compute x
np.dot(A_inv, b)

array([  7., -12.,  -3.])

In [59]:
#5. Recompute x in a single step with solve
x = np.linalg.solve(A, b)
x

array([  7., -12.,  -3.])
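
As a sanity check, the solution can be substituted back into the system. The sketch below repeats the definitions so that it runs on its own, and it uses `np.allclose` rather than exact equality because the results are floating-point numbers.

```python
import numpy as np

A = np.array([[2, 1, 1], [-5, -3, 0], [1, 1, -1]])
b = np.array([-1, 1, -2])

x_inv = np.dot(np.linalg.inv(A), b)   # solution via the inverse
x_solve = np.linalg.solve(A, b)       # solution via solve

# Both approaches should agree up to rounding error
methods_agree = np.allclose(x_inv, x_solve)

# Substituting x back into the system should reproduce b
reproduces_b = np.allclose(np.dot(A, x_solve), b)

print(methods_agree, reproduces_b)
```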

In [63]:
#6. Add a fourth equation: A becomes a 4×3 matrix and b has length 4
A = np.array([[2, 1, 1], [-5, -3, 0], [1, 1, -1], [0, 5, -1]])
b = np.array([-1, 1, -2, 1])
x = np.linalg.solve(A, b)
x  # fails: A is not square, so solve cannot be used

LinAlgError: Last 2 dimensions of the array must be square
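
One way to avoid this error is to check the shape and rank of the matrix before calling `np.linalg.solve`. The function `solve_square_system` below is a hypothetical helper written for this illustration, not part of NumPy. It simply returns `None` when the matrix is not square or does not have full rank.

```python
import numpy as np

def solve_square_system(A, b):
    """Hypothetical helper: return x with Ax = b, or None if A is not
    square or does not have full rank."""
    A = np.asarray(A)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        return None  # not a square matrix
    if np.linalg.matrix_rank(A) < A.shape[0]:
        return None  # square but not invertible
    return np.linalg.solve(A, b)

# The 4x3 system from the last cell: not square, so no solution is attempted
A_tall = np.array([[2, 1, 1], [-5, -3, 0], [1, 1, -1], [0, 5, -1]])
b_tall = np.array([-1, 1, -2, 1])
result_tall = solve_square_system(A_tall, b_tall)

# The original 3x3 system (first three equations) is solved as before
result_square = solve_square_system(A_tall[:3], b_tall[:3])

print(result_tall)
print(result_square)
```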