# Python Linear Algebra Part 1: Vectors, Matrices, and Operations
## Full-Day Workshop for Users Learning Data Science with Python
## 2018 Timothy CL Lam
This material is meant for internal use. Parts of the content are copied from external sources, and it is not for commercial purposes.

# Vectors and Vector Arithmetic
## What is Vector
- A vector is a tuple of one or more values called scalars.
- Vectors are built from components, which are ordinary numbers.
- You can think of a vector as a list of numbers, and vector algebra as operations performed on the numbers in the list.

Below is sample code that creates a vector with 3 elements.

In [1]:
# create a vector
from numpy import array
# define vector
v = array([1, 2, 3])
print(v)

[1 2 3]


## Vector Arithmetic
Simple vector-vector arithmetic (addition, subtraction, multiplication) is performed element-wise on two vectors of equal length.

The example below defines two vectors with three elements each, then subtracts the second from the first.

In [2]:
# vector subtraction
from numpy import array
# define first vector
a = array([1, 2, 3])
print(a)
# define second vector
b = array([0.5, 0.5, 0.5])
print(b)
# subtract vectors
c = a - b
print(c)

[1 2 3]
[ 0.5  0.5  0.5]
[ 0.5  1.5  2.5]


## Vector Dot Product
- We can calculate the sum of the multiplied elements of two vectors of the same length to give a scalar.
- This is called the dot product.

In [3]:
# vector dot product
from numpy import array
# define first vector
a = array([1, 2, 3])
print(a)
# define second vector
b = array([1, 2, 3])
print(b)
# multiply vectors
c = a.dot(b)
print(c)

[1 2 3]
[1 2 3]
14
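
The definition above (multiply matching elements, then sum) can be written out directly. This is only a small sketch that reuses the same two vectors and compares the manual result with `dot()`. The names `manual` and `builtin` are illustrative.

```python
# dot product written out from its definition
from numpy import array

a = array([1, 2, 3])
b = array([1, 2, 3])
# multiply matching elements, then add the products
manual = (a * b).sum()
# built-in dot product for comparison
builtin = a.dot(b)
print(manual, builtin)
```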


## Vector-Scalar Multiplication
- A vector can be multiplied by a scalar, in effect scaling the magnitude of the vector.
- To keep notation simple, we will use lowercase s to represent the scalar value.

$$c = s * v$$

In [4]:
# vector-scalar multiplication
from numpy import array
# define vector
a = array([1, 2, 3])
print(a)
# define scalar
s = 0.5
print(s)
# multiplication
c = s * a
print(c)

[1 2 3]
0.5
[ 0.5  1.   1.5]


## Vector Norms
- Calculating the length or magnitude of a vector is often required, either directly as a regularization method in machine learning or as part of broader vector or matrix operations.
- The L1 norm is calculated as the sum of the absolute values of the vector.
- The L2 norm is calculated as the square root of the sum of the squared vector values.
- The max norm is calculated as the maximum absolute value in the vector.
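
The three definitions above can be checked by computing each norm by hand and comparing it with `norm()`. This is a minimal sketch. The vector `[1, -2, 3]` is an assumption chosen so that the absolute values matter.

```python
# the three norms computed from their definitions
from numpy import array, inf
from numpy.linalg import norm

a = array([1, -2, 3])
# L1: sum of absolute values
l1 = abs(a).sum()
# L2: square root of the sum of squares
l2 = (a ** 2).sum() ** 0.5
# max norm: largest absolute value
maxnorm = abs(a).max()
# each pair should agree
print(l1, norm(a, 1))
print(l2, norm(a))
print(maxnorm, norm(a, inf))
```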

## L1 Norm
- The length of a vector can be calculated using the L1 norm.
- This length is sometimes called the taxicab norm or the Manhattan norm.

![image.png](attachment:image.png)


- The L1 norm is calculated as the sum of the absolute vector values, where the absolute value of a scalar uses the notation $|a_1|$.
- In effect, the norm is a calculation of the Manhattan distance from the origin of the vector space.


### Operation of L1

- The L1 norm of a vector can be calculated in NumPy using the norm() function with a parameter to specify the norm order.
- In this case the order is **1**.

In [2]:
# vector L1 norm
from numpy import array
from numpy.linalg import norm
# define vector
a = array([1, 2, 3])
print(a)
# calculate norm
l1 = norm(a, 1)
print(l1)

[1 2 3]
6.0


- The L1 norm is often used when fitting machine learning algorithms as a regularization method, i.e. a method to keep the coefficients of the model small and, in turn, the model less complex.

## L2 Norm
- The L2 norm calculates the distance of the vector coordinate from the origin of the vector space.
- As such, it is also known as the Euclidean norm, as it is calculated as the Euclidean distance from the origin.
- The result is a non-negative distance value.
- The L2 norm is calculated as the square root of the sum of the squared vector values.

![image.png](attachment:image.png)

### Operation of L2

- The L2 norm of a vector can be calculated in NumPy using the norm() function with a parameter to specify the norm order.
- In this case the **default parameter** is used, since norm() computes the L2 norm of a vector when no order is given.

In [3]:
# vector L2 norm
from numpy import array
from numpy.linalg import norm
# define vector
a = array([1, 2, 3])
print(a)
# calculate norm
l2 = norm(a)
print(l2)

[1 2 3]
3.7416573867739413


- Like the L1 norm, the L2 norm is often used when fitting machine learning algorithms as a regularization method.
- By far, the L2 norm is more commonly used than other vector norms in machine learning.

## Max Norm

- The length of a vector can be calculated using the maximum norm, also called the max norm.
- The max norm is calculated as the maximum absolute value in the vector.

![image.png](attachment:image.png)

### Operation of Max Norm

- The max norm of a vector can be calculated in NumPy using the norm() function with a parameter to specify the norm order.
- In this case the order is **inf**.

In [None]:
# vector max norm
from math import inf
from numpy import array
from numpy.linalg import norm
# define vector
a = array([1, 2, 3])
print(a)
# calculate norm
maxnorm = norm(a, inf)
print(maxnorm)

# Matrices and Matrix Arithmetic
- A matrix is a two-dimensional array (a table) of scalars with one or more columns and one or more rows.

In [5]:
# create matrix
from numpy import array
A = array([[1, 2, 3], [4, 5, 6]])
print(A)

[[1 2 3]
 [4 5 6]]


In [6]:
# matrix addition
from numpy import array
# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)
# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)
# add matrices
C = A + B
print(C)

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 2  4  6]
 [ 8 10 12]]


## Matrix Multiplication (Hadamard Product)

- Two matrices with the same size can be multiplied together; this is often called element-wise matrix multiplication or the Hadamard product.
- **It is not the typical operation meant when referring to matrix multiplication**, therefore a different operator is often used, such as a circle:
![image.png](attachment:image.png)

In [7]:
# matrix Hadamard product
from numpy import array
# define first matrix
A = array([
[1, 2, 3],
[4, 5, 6]])
print(A)
# define second matrix
B = array([
[1, 2, 3],
[4, 5, 6]])
print(B)
# multiply matrices
C = A * B
print(C) 

[[1 2 3]
 [4 5 6]]
[[1 2 3]
 [4 5 6]]
[[ 1  4  9]
 [16 25 36]]


## Matrix-Matrix Multiplication
- Matrix multiplication, also called the matrix dot product, is more complicated than the previous operations.

![image.png](attachment:image.png)


- It involves a rule, as not all matrices can be multiplied together.
- The number of columns (n) in the first matrix (A) must equal the number of rows (m) in the second matrix (B).

![image.png](attachment:image.png)

- The matrix multiplication operation can be implemented in NumPy using the dot() function.
- It can also be calculated using the newer @ operator, since Python version 3.5.

In [9]:
# matrix dot product
from numpy import array
# define first matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
# define second matrix
B = array([
[1, 2],
[3, 4]])
print(B)
# multiply matrices
C = A.dot(B)
print(C)
# multiply matrices with the @ operator (requires Python 3.5 or later)
#D = A @ B
#print(D)

[[1 2]
 [3 4]
 [5 6]]
[[1 2]
 [3 4]]
[[ 7 10]
 [15 22]
 [23 34]]
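
The shape rule described above can be checked with a short sketch that reuses the same A (3 x 2) and B (2 x 2). It also shows that `@` gives the same result as `dot()`, and that reversing the order fails because B has 2 columns while A has 3 rows. The flag name `mismatch` is illustrative only.

```python
# shape rule for matrix multiplication
from numpy import array, array_equal

A = array([[1, 2], [3, 4], [5, 6]])  # 3 rows, 2 columns
B = array([[1, 2], [3, 4]])          # 2 rows, 2 columns
# columns of A (2) match rows of B (2), so A.B is defined
C = A.dot(B)
D = A @ B
print(C.shape)
print(array_equal(C, D))
# columns of B (2) do not match rows of A (3), so B.A is not defined
try:
    B.dot(A)
    mismatch = False
except ValueError as err:
    mismatch = True
    print("cannot multiply:", err)
```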


# Types of Matrices
- Square, symmetric, triangular, and diagonal matrices, which are much as their names suggest.
- Identity matrices, which are all zero values except along the main diagonal, where the values are 1.
- Orthogonal matrices, which generalize the idea of perpendicular vectors and have useful computational properties.

### Square Matrix
![image.png](attachment:image.png)

- Square matrices are readily added and multiplied together and are the basis of many simple linear transformations, such as rotations (as in the rotations of images).

### Symmetric Matrix
![image.png](attachment:image.png)
- A symmetric matrix is always square and equal to its own transpose.
- The transpose is an operation that swaps the rows and columns of a matrix.
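
The "equal to its own transpose" property can be tested directly. The sketch below is illustrative. The matrices `M` and `N` are made-up examples, not taken from the workshop material.

```python
# check whether a matrix is symmetric
from numpy import array, array_equal

# symmetric: mirrored across the main diagonal
M = array([
[1, 2, 3],
[2, 1, 4],
[3, 4, 1]])
# not symmetric
N = array([
[1, 2],
[3, 4]])
m_symmetric = array_equal(M, M.T)
n_symmetric = array_equal(N, N.T)
print(m_symmetric)
print(n_symmetric)
```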

### Triangular Matrix
Upper ![image.png](attachment:image.png)

- NumPy provides functions to calculate a triangular matrix from an existing square matrix.
- The tril() function calculates the lower triangular matrix from a given matrix.
- The triu() function calculates the upper triangular matrix from a given matrix.

In [10]:
# triangular matrices
from numpy import array
from numpy import tril
from numpy import triu
# define square matrix
M = array([
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
print(M)
# lower triangular matrix
lower = tril(M)
print(lower)
# upper triangular matrix
upper = triu(M)
print(upper)

[[1 2 3]
 [1 2 3]
 [1 2 3]]
[[1 0 0]
 [1 2 0]
 [1 2 3]]
[[1 2 3]
 [0 2 3]
 [0 0 3]]


### Diagonal Matrix
- A diagonal matrix is one where values outside of the main diagonal have a zero value, where the main diagonal is taken from the top left of the matrix to the bottom right.

![image.png](attachment:image.png)
- A diagonal matrix is often denoted with the variable D and may be represented as a full matrix or as a vector of values on the main diagonal.

- A diagonal matrix does not have to be square. In the case of a rectangular matrix, the diagonal would cover the dimension with the smallest length.

![image.png](attachment:image.png)

- NumPy provides the function diag(), which can extract the main diagonal of an existing matrix as a vector, or transform a vector into a diagonal matrix.
- The example below defines a 3 × 3 square matrix, extracts the main diagonal as a vector, and then creates a diagonal matrix from the extracted vector.

In [11]:
# diagonal matrix
from numpy import array
from numpy import diag
# define square matrix
M = array([
[1, 2, 3],
[1, 2, 3],
[1, 2, 3]])
print(M)
# extract diagonal vector
d = diag(M)
print(d)
# create diagonal matrix from vector
D = diag(d)
print(D)

[[1 2 3]
 [1 2 3]
 [1 2 3]]
[1 2 3]
[[1 0 0]
 [0 2 0]
 [0 0 3]]


### Identity Matrix
- An identity matrix is a square matrix that does not change a vector when multiplied.
- The values of an identity matrix are known.
- All of the scalar values along the main diagonal (top-left to bottom-right) have the value one, while all other values are zero.

![image.png](attachment:image.png)

- In NumPy, an identity matrix can be created with a specific size using the identity() function.
- The example below creates an I3 identity matrix, i.e. a 3 × 3 identity matrix.

In [12]:
# identity matrix
from numpy import identity
I = identity(3)
print(I)

[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
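
The statement that an identity matrix does not change a vector when multiplied can be confirmed with a short sketch. The vector `v` is an arbitrary example.

```python
# identity matrix leaves a vector unchanged
from numpy import array, identity, array_equal

I = identity(3)
v = array([1, 2, 3])
# multiply the identity matrix by the vector
result = I.dot(v)
print(result)
print(array_equal(result, v))
```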


### Orthogonal Matrix
- Two vectors are orthogonal when their dot product equals zero. This is intuitive when we consider that one line is orthogonal to another if it is perpendicular to it.
- If the length of each vector is also 1, the vectors are called orthonormal, because they are both orthogonal and normalized.
- An orthogonal matrix is a type of square matrix whose columns and rows are orthonormal unit vectors, i.e. mutually perpendicular with a length or magnitude of 1.
![image.png](attachment:image.png)

- Orthogonal matrices are used a lot for linear transformations, such as reflections and permutations.
- A simple 2 × 2 orthogonal matrix is listed below, which is an example of a reflection matrix or coordinate reflection.

In [13]:
# orthogonal matrix
from numpy import array
from numpy.linalg import inv
# define orthogonal matrix
Q = array([
[1, 0],
[0, -1]])
print(Q)
# inverse equivalence
V = inv(Q)
print(Q.T)
print(V)
# identity equivalence
I = Q.dot(Q.T)
print(I)

[[ 1  0]
 [ 0 -1]]
[[ 1  0]
 [ 0 -1]]
[[ 1.  0.]
 [-0. -1.]]
[[1 0]
 [0 1]]


- Running the example first prints the orthogonal matrix, then its transpose and its inverse, which are shown to be equivalent.
- Finally, the identity matrix is printed, which is calculated from the dot product of the orthogonal matrix with its transpose.
- **Orthogonal matrices are useful tools because their inverse is simply their transpose, which is computationally cheap and stable to calculate.**
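
The same two properties can be checked for a rotation, the other kind of transformation mentioned earlier. This is a sketch under the assumption of a 45-degree rotation. `allclose()` is used rather than exact equality because the entries are floating-point values.

```python
# rotation matrix as an orthogonal matrix
from numpy import array, cos, sin, pi, allclose, identity
from numpy.linalg import inv

theta = pi / 4  # 45 degrees, chosen only for illustration
Q = array([
[cos(theta), -sin(theta)],
[sin(theta), cos(theta)]])
# Q^T . Q should be the identity matrix
is_identity = allclose(Q.T.dot(Q), identity(2))
# the inverse should match the transpose
inverse_is_transpose = allclose(inv(Q), Q.T)
print(is_identity)
print(inverse_is_transpose)
```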

## Sparse Matrices
- A sparse matrix is a matrix that is comprised of mostly zero values.
- Sparse matrices are distinct from matrices with mostly non-zero values, which are referred to as dense matrices.
- The interest in sparsity arises because its exploitation can lead to enormous computational savings, and because many large matrix problems that occur in practice are sparse.
- The sparsity of a matrix can be quantified with a score, which is the number of zero values in the matrix divided by the total number of elements in the matrix.
![image.png](attachment:image.png)

In [37]:
# sparsity calculation
# only needed on Python 2 so that / is true division; it must come before other imports
from __future__ import division
from numpy import array
from numpy import count_nonzero
# create dense matrix
A = array([
[1, 0, 0, 1, 0, 0],
[0, 0, 2, 0, 0, 1],
[0, 0, 0, 2, 0, 0]])
print(A)
# calculate sparsity
sparsity = 1.0 - count_nonzero(A) / A.size
print(sparsity)

[[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
0.722222222222


### Example of Sparse Matrix
- Natural language processing for working with documents of text.
- Recommender systems for working with product usage within a catalog.
- Computer vision when working with images that contain lots of black pixels.
- If there are 100,000 words in the language model, then the feature vector has length 100,000, but for a short email message almost all the features will have count zero.

### Problem of High Sparsity 
- An example of a very large matrix that is too large to be stored in memory is a link matrix that shows the links from one website to another.
- An example of a smaller sparse matrix might be a word or term occurrence matrix for words in one book against all known words in English.
- In both cases, the matrix is sparse, with many more zero values than data values.
- It is wasteful to use general methods of linear algebra on such problems, because most of the memory and arithmetic is spent on zeros.


### How to Handle Sparsity
- The solution to representing and working with sparse matrices is to use an alternate data structure to represent the sparse data.


Data structures that are convenient for constructing a sparse matrix include:
- **Dictionary of Keys**. A dictionary is used where a row and column index is mapped to
a value.
- **List of Lists**. Each row of the matrix is stored as a list, with each sublist containing the
column index and the value.
- **Coordinate List**. A list of tuples is stored with each tuple containing the row index,
column index, and the value.

There are also data structures that are more suitable for performing efficient operations:

- **Compressed Sparse Row**. The sparse matrix is represented using three one-dimensional arrays for the non-zero values, the extents of the rows, and the column indexes.
- **Compressed Sparse Column**. The same as the Compressed Sparse Row method except the column indices are compressed and read first before the row indices.

In [39]:
# sparse matrix
from numpy import array
from scipy.sparse import csr_matrix
# create dense matrix
A = array([
[1, 0, 0, 1, 0, 0],
[0, 0, 2, 0, 0, 1],
[0, 0, 0, 2, 0, 0]])
print(A)
# convert to sparse matrix (CSR method)
S = csr_matrix(A)
print(S)
# reconstruct dense matrix
B = S.todense()
print(B)

[[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
  (0, 0)	1
  (0, 3)	1
  (1, 2)	2
  (1, 5)	1
  (2, 3)	2
[[1 0 0 1 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
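
The three one-dimensional arrays behind the Compressed Sparse Row format can be inspected on the same matrix. This is a sketch using the `data`, `indices`, and `indptr` attributes of SciPy's `csr_matrix`. The conversion to coordinate form with `tocoo()` is added only to show the (row, column, value) triples of the Coordinate List structure.

```python
# look inside the CSR representation
from numpy import array
from scipy.sparse import csr_matrix

A = array([
[1, 0, 0, 1, 0, 0],
[0, 0, 2, 0, 0, 1],
[0, 0, 0, 2, 0, 0]])
S = csr_matrix(A)
# non-zero values, read row by row
print(S.data)
# column index of each non-zero value
print(S.indices)
# row extents: row i occupies data[indptr[i]:indptr[i+1]]
print(S.indptr)
# coordinate list view: (row, column, value) triples
coo = S.tocoo()
triples = list(zip(coo.row.tolist(), coo.col.tolist(), coo.data.tolist()))
print(triples)
```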


# Matrix Operations
- The Transpose operation for flipping the dimensions of a matrix.
- The Inverse operation used in solving systems of linear equations.
- The Trace and Determinant operations used as shorthand notation in other matrix operations.

### Transpose
- A defined matrix can be transposed, which creates a new matrix with the number of columns
and rows flipped.
- We can transpose a matrix in NumPy by calling the T attribute
![image.png](attachment:image.png)

In [14]:
# transpose matrix
from numpy import array
# define matrix
A = array([
[1, 2],
[3, 4],
[5, 6]])
print(A)
# calculate transpose
C = A.T
print(C)

[[1 2]
 [3 4]
 [5 6]]
[[1 3 5]
 [2 4 6]]


### Inverse
- Matrix inversion is a process that finds another matrix that, when multiplied with the original matrix, **results in an identity matrix**.

![image.png](attachment:image.png)

In [15]:
# invert matrix
from numpy import array
from numpy.linalg import inv
# define matrix
A = array([
[1.0, 2.0],
[3.0, 4.0]])
print(A)
# invert matrix
B = inv(A)
print(B)
# multiply A and B
I = A.dot(B)
print(I)

[[1. 2.]
 [3. 4.]]
[[-2.   1. ]
 [ 1.5 -0.5]]
[[1.0000000e+00 0.0000000e+00]
 [8.8817842e-16 1.0000000e+00]]
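
Since the inverse is used in solving systems of linear equations, a small sketch of that use may help. It solves A x = b with the same A. The right-hand side `b` is an assumed example. NumPy's `solve()` is shown alongside because it solves the system without forming the inverse explicitly.

```python
# solve A x = b using the inverse and using solve()
from numpy import array, allclose
from numpy.linalg import inv, solve

A = array([
[1.0, 2.0],
[3.0, 4.0]])
b = array([5.0, 6.0])  # assumed right-hand side
# x = A^-1 . b
x_inv = inv(A).dot(b)
# same system solved directly
x_solve = solve(A, b)
print(x_inv)
print(x_solve)
# check that A . x reproduces b
print(allclose(A.dot(x_solve), b))
```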


### Trace
- A trace of a square matrix is the sum of the values on the main diagonal of the matrix (top-left
to bottom-right).
- We can calculate the trace of a matrix in NumPy using the trace() function
- Alone, the trace operation is not interesting, but it offers a simpler notation and it is used as an element in other key matrix operations.

In [17]:
# matrix trace
from numpy import array
from numpy import trace
# define matrix
A = array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
print(A)
# calculate trace
B = trace(A)
print(B)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
15
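
The definition of the trace as the sum of the main diagonal can be verified with `diag()`. This is a short sketch on the same matrix.

```python
# trace as the sum of the main diagonal
from numpy import array, diag, trace

A = array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
# extract the diagonal and add its values
diagonal_sum = diag(A).sum()
print(diagonal_sum, trace(A))
```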


### Determinant
- The determinant of a square matrix is a scalar representation of the volume of the matrix.
- The determinant of a square matrix is calculated from the elements of the matrix.
- More technically, the determinant is the product of all the eigenvalues of the matrix.
- The intuition for the determinant is that it describes the way a matrix will scale another matrix when they are multiplied together.
- For example, a determinant of 1 preserves the space of the other matrix.
- A determinant of 0 indicates that the matrix cannot be inverted.
- In NumPy, the determinant of a matrix can be calculated using the det() function.

In [18]:
# matrix determinant
from numpy import array
from numpy.linalg import det
# define matrix
A = array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
print(A)
# calculate determinant
B = det(A)
print(B)

[[1 2 3]
 [4 5 6]
 [7 8 9]]
0.0
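
Two of the claims above can be checked numerically: the determinant equals the product of the eigenvalues, and a matrix with determinant 0 cannot be inverted. The sketch below uses two small matrices chosen for illustration. `S` is singular because its second row is twice its first.

```python
# determinant, eigenvalues, and invertibility
from numpy import array, prod, isclose
from numpy.linalg import det, eigvals, inv, LinAlgError

A = array([
[1.0, 2.0],
[3.0, 4.0]])
# determinant should equal the product of the eigenvalues
d = det(A)
p = prod(eigvals(A))
print(isclose(d, p))
# singular matrix: second row is a multiple of the first
S = array([
[1.0, 2.0],
[2.0, 4.0]])
print(isclose(det(S), 0.0))
# inverting a singular matrix raises an error
try:
    inv(S)
    invertible = True
except LinAlgError:
    invertible = False
print(invertible)
```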


### Rank
- The rank of a matrix is the estimate of the number of linearly independent rows or columns in a matrix.
- An intuition for rank is to consider it the number of dimensions spanned by all of the vectors within a matrix.
- For example, a rank of 0 suggests all vectors span a point, a rank of 1 suggests all vectors span a line, and a rank of 2 suggests all vectors span a two-dimensional plane.
- The rank is estimated numerically, often using a matrix decomposition method.
- A common approach is to use the Singular-Value Decomposition, or SVD for short.
- NumPy provides the matrix_rank() function for calculating the rank of an array.

In [19]:
# vector rank
from numpy import array
from numpy.linalg import matrix_rank
# rank
v1 = array([1,2,3])
print(v1)
vr1 = matrix_rank(v1)
print(vr1)
# zero rank
v2 = array([0,0,0,0,0])
print(v2)
vr2 = matrix_rank(v2)
print(vr2)

[1 2 3]
1
[0 0 0 0 0]
0


- The next example makes it clear that the rank is not the number of dimensions of the matrix, but the number of linearly independent directions.

In [21]:
# matrix rank
from numpy import array
from numpy.linalg import matrix_rank
# rank 0
M0 = array([
[0,0],
[0,0]])
print(M0)
mr0 = matrix_rank(M0)
print(mr0)
# rank 1
M1 = array([
[1,2],
[1,2]])
print(M1)
mr1 = matrix_rank(M1)
print(mr1)
# rank 2
M2 = array([
[1,2],
[3,4]])
print(M2)
mr2 = matrix_rank(M2)
print(mr2)

[[0 0]
 [0 0]]
0
[[1 2]
 [1 2]]
1
[[1 2]
 [3 4]]
2
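
The SVD-based approach mentioned above can be sketched by counting the singular values that are larger than a small tolerance. The helper `rank_from_svd` is a hypothetical name, and the tolerance formula (largest singular value times the larger dimension times machine epsilon) is an assumption modelled on the default documented for `matrix_rank()`.

```python
# estimate rank by counting non-negligible singular values
from numpy import array, finfo
from numpy.linalg import svd, matrix_rank

def rank_from_svd(M):
    # singular values only, largest first
    s = svd(M, compute_uv=False)
    # values below this threshold are treated as zero
    tol = s.max() * max(M.shape) * finfo(s.dtype).eps
    return int((s > tol).sum())

M0 = array([[0, 0], [0, 0]])
M1 = array([[1, 2], [1, 2]])
M2 = array([[1, 2], [3, 4]])
for M in (M0, M1, M2):
    print(rank_from_svd(M), matrix_rank(M))
```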
