
**Why NumPy?**

- NumPy provides efficient storage (less memory) and better ways of handling data for mathematical operations.
- NumPy is meant for creating homogeneous n-dimensional arrays (n = 1, 2, 3, ...). Unlike Python lists, all elements of a NumPy array must be of the same type.
- The shape of an array can be changed at runtime as long as the total number of elements stays the same. For example, a 2 * 5 matrix can be converted into 5 * 2, and a 1 * 4 into 2 * 2. This is done by calling .reshape(...) on the array.
- Just as .reshape(x, y) can convert an array into a multi-dimensional array, .ravel() creates a one-dimensional array from any multi-dimensional array.
- NumPy n-dimensional arrays make it easy to perform mathematical operations on the data they hold.
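
The points above can be sketched in a few lines. This is only a minimal illustration; the variable names (mixed, grid, reshaped, flat) are made up for this example.

```python
import numpy as np

# A NumPy array holds a single dtype: mixing ints and a float upcasts everything to float
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)        # float64

# reshape keeps the same 10 elements but changes the shape from (2, 5) to (5, 2)
grid = np.arange(10).reshape(2, 5)
reshaped = grid.reshape(5, 2)
print(reshaped.shape)     # (5, 2)

# ravel flattens a multi-dimensional array back to one dimension
flat = reshaped.ravel()
print(flat.shape)         # (10,)
```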


In [None]:
import numpy as np

**Creating a Vector**

In [None]:
#Vector as a row
vector_row = np.array([1,2,3])

#Vector as a column
vector_col = np.array([[1],
                      [2],
                      [3]])

**Creating a Matrix**

In [None]:
matrix = np.array([[1,2],
                  [3,4],
                  [5,6]])
np.ndim(matrix)

2

NumPy also has a dedicated matrix data structure, which can be created using **np.mat**. However, it is rarely used: arrays are the de facto standard data structure in NumPy, and most NumPy operations return arrays rather than matrix objects. Note that the np.mat alias was removed in NumPy 2.0; on newer versions use np.asmatrix instead.

In [None]:
matrix_object = np.mat([[1, 2],
                        [1, 2],
                        [1, 2]])
matrix_object

matrix([[1, 2],
        [1, 2],
        [1, 2]])

**Creating a Sparse Matrix**

A sparse matrix or sparse array is a matrix in which most of the elements are zero. Sparse matrices store only the non-zero elements and assume all other values are zero. This can save significant memory and computation.

Many real-world datasets consist mostly of zeros, so a sparse matrix is a good fit for storing data with very few non-zero elements.

Matrices in which most of the values are non-zero are called **dense matrices**.

The amount of sparsity of a matrix can be calculated as follows:

**sparsity = count of zero elements / total elements**

Ex: A small 3 x 6 sparse matrix.

        (1, 0, 0, 1, 0, 0)
    A = (0, 0, 2, 0, 0, 1)
        (0, 0, 0, 2, 0, 0)

The example has 13 zero values out of the 18 elements in the matrix, giving it a sparsity score of 0.722, or about 72%.
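
The same calculation can be reproduced with NumPy. This is a small sketch for the example matrix A above; the names zero_count and sparsity are chosen here for illustration.

```python
import numpy as np

A = np.array([[1, 0, 0, 1, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])

# Number of zeros = total elements - non-zero elements
zero_count = A.size - np.count_nonzero(A)
sparsity = zero_count / A.size

print(zero_count, A.size, round(sparsity, 3))   # 13 18 0.722
```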
  


**Working with Sparse Matrix**

The zero values can be ignored and only the data or non-zero values in the sparse matrix need to be stored or acted upon.

There are multiple data structures that can be used to efficiently construct a sparse matrix; three common examples are listed below.

    Dictionary of Keys -> A dictionary is used where a row and column index is mapped to a value.
    List of Lists -> Each row of the matrix is stored as a list, with each sublist containing the column index and the value.
    Coordinate List -> A list of tuples is stored with each tuple containing the row index, column index, and the value.
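
The three construction formats can be sketched in plain Python. This is an illustration of the ideas only, not how SciPy stores them internally; the small matrix and the names dok, lil and coo are assumptions made for this example.

```python
dense = [[0, 0, 3],
         [4, 0, 0]]

# Dictionary of Keys: (row, column) -> value
dok = {(i, j): v
       for i, row in enumerate(dense)
       for j, v in enumerate(row) if v != 0}

# List of Lists: one list per row, holding (column, value) pairs
lil = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]

# Coordinate List: (row, column, value) tuples
coo = [(i, j, v)
       for i, row in enumerate(dense)
       for j, v in enumerate(row) if v != 0]

print(dok)   # {(0, 2): 3, (1, 0): 4}
print(lil)   # [[(2, 3)], [(0, 4)]]
print(coo)   # [(0, 2, 3), (1, 0, 4)]
```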

There are also data structures that are more suitable for performing efficient operations; two commonly used examples are listed below.

    Compressed Sparse Row -> The sparse matrix is represented using three one-dimensional arrays for the non-zero values, the extents of the rows, and the column indexes.
    Compressed Sparse Column -> The same as the Compressed Sparse Row method except the column indices are compressed and read first before the row indices.

The Compressed Sparse Row, also called **CSR format** or **Yale format**, is often used to represent sparse matrices in machine learning given the efficient access and matrix multiplication that it supports.
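
To make the three one-dimensional arrays concrete, the sketch below converts the 3 x 6 example matrix to CSR with SciPy and prints its data, indices and indptr attributes. The variable name A_csr is chosen for this example.

```python
import numpy as np
from scipy import sparse

A = np.array([[1, 0, 0, 1, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])

A_csr = sparse.csr_matrix(A)

print(A_csr.data)      # non-zero values, row by row: [1 1 2 1 2]
print(A_csr.indices)   # column index of each value:  [0 3 2 5 3]
print(A_csr.indptr)    # row extents:                 [0 2 4 5]

# Row i occupies data[indptr[i]:indptr[i + 1]]
row_1_values = A_csr.data[A_csr.indptr[1]:A_csr.indptr[2]]
print(row_1_values)    # [2 1]
```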

**Sparse Matrix representation in a 2-D Array format.**

A sparse matrix can be represented by a 2-D array with three rows:

    Row: Index of row, where non-zero element is located
    Column: Index of column, where non-zero element is located
    Value: Value of the non zero element located at index – (row,column)

In [None]:
sparseMatrix = [[0, 0, 3, 0, 4],
                [0, 0, 5, 7, 0],
                [0, 0, 0, 0, 0],
                [0, 2, 6, 0, 0]]

# count the non-zero elements
size = 0
for row in sparseMatrix:
    for value in row:
        if value != 0:
            size += 1
print(f"Size is {size}")

# compactMatrix has 3 rows (row index, column index, value) and
# one column per non-zero element of sparseMatrix
compactMatrix = [[0 for _ in range(size)] for _ in range(3)]

k = 0  # next column of compactMatrix to fill
for i, row in enumerate(sparseMatrix):
    for j, value in enumerate(row):
        if value != 0:
            print(f"k is {k} and element is {value}")
            compactMatrix[0][k] = i      # row index
            compactMatrix[1][k] = j      # column index
            compactMatrix[2][k] = value  # non-zero value
            k += 1

for row in compactMatrix:
    print(row)

Many NumPy and SciPy linear algebra functions that operate on NumPy arrays can also operate transparently on SciPy sparse arrays. Further, machine learning libraries that use NumPy data structures, such as scikit-learn for general machine learning and Keras for deep learning, can operate transparently on SciPy sparse arrays.

A dense matrix stored in a NumPy array can be converted into a sparse matrix using the CSR representation by calling the csr_matrix() function.

In [None]:
#Importing sparse from scipy

# SciPy is a free and open-source Python library used for scientific computing and technical computing.
# SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions,
# FFT, signal and image processing, ODE solvers and other tasks common in science and engineering.

from scipy import sparse

matrix = np.array(([0, 0],
                  [0, 1],
                  [3, 0]))


#compressed sparse row (CSR) matrix

matrix_sparse = sparse.csr_matrix(matrix)
print(matrix_sparse)



  (1, 1)	1
  (2, 0)	3


A sparse matrix can be converted back to a dense matrix by using the todense() method.

In [None]:
dense = matrix_sparse.todense()
print(dense)

[[0 0]
 [0 1]
 [3 0]]
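
One detail worth knowing: todense() returns a np.matrix object, while the related toarray() method returns a regular ndarray. The short sketch below compares the two; the names as_matrix and as_array are chosen for this example.

```python
import numpy as np
from scipy import sparse

matrix_sparse = sparse.csr_matrix(np.array([[0, 0],
                                            [0, 1],
                                            [3, 0]]))

as_matrix = matrix_sparse.todense()   # np.matrix
as_array = matrix_sparse.toarray()    # plain np.ndarray

print(type(as_matrix).__name__)   # matrix
print(type(as_array).__name__)    # ndarray
```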


**Selecting Elements**

In [None]:
vector = np.array([1, 2, 3, 4, 5])

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

Select all elements in vector

In [None]:
vector[:]

array([1, 2, 3, 4, 5])

Select all elements up to and including the third element of the vector

In [None]:
vector[:3]

array([1, 2, 3])

Select all elements after the third element

In [None]:
vector[3:]

array([4, 5])

Select last element

In [None]:
vector[-1]

5

Select the first 2 rows and all columns of the matrix

In [None]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])
matrix[:2,:]

array([[1, 2, 3],
       [4, 5, 6]])

Select all rows and the second column of the matrix

In [None]:
matrix[:, 1:2]

array([[2],
       [5],
       [8]])
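
The slice 1:2 above keeps the result two-dimensional. Using a plain integer index instead drops that axis, as this small sketch shows (column_1d and column_2d are names chosen for this example).

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

column_1d = matrix[:, 1]     # integer index: the column axis is dropped
column_2d = matrix[:, 1:2]   # slice: the column axis is kept

print(column_1d.shape)   # (3,)
print(column_2d.shape)   # (3, 1)
```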

**Describing the Matrix**

In [None]:
matrix = np.array([[1,2,3,4],
                   [5,6,7,8],
                   [9,10,11,12]
                   ])

In [None]:
#get the no of rows and columns
matrix.shape

(3, 4)

In [None]:
#get the no of elements (rows * columns)
matrix.size

12

In [None]:
#get no of dimensions
matrix.ndim

2

**NumPy Vectorize**

The vectorize class converts a function into one that applies to every element of an array or a slice of an array.

It acts like a for loop over the elements, so it is a convenience rather than a performance optimization.


NumPy broadcasting can be used to perform operations between arrays whose shapes are not the same. Shapes are compared starting from the trailing dimension, and each pair of dimensions must either match or be 1. The smaller array is then (virtually) repeated to make the shapes equal.

In [None]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

matrix = matrix + 100

matrix


array([[101, 102, 103],
       [104, 105, 106],
       [107, 108, 109]])

In [None]:
a = np.array([1, 2, 3])

b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])

c = a*b

print(c)

[[ 1  4  9]
 [ 4 10 18]
 [ 7 16 27]
 [10 22 36]]


In [None]:
z = np.zeros((3,3))
s = np.array([[1],[2],[3]])

print(s)

print(z)
z = z+s
print(z)

[[1]
 [2]
 [3]]
[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [2. 2. 2.]
 [3. 3. 3.]]
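
The broadcasting rule can also be checked on shapes that do and do not line up. The sketch below is illustrative only; the arrays and the names table and compatible are chosen for this example.

```python
import numpy as np

row = np.array([1, 2, 3])       # shape (3,)
col = np.array([[10], [20]])    # shape (2, 1)

# (2, 1) and (3,) broadcast to (2, 3): each size-1 dimension is stretched
table = row + col
print(table.shape)   # (2, 3)
print(table)

# (4, 3) and (4,) do not broadcast: the trailing dimensions are 3 and 4
try:
    np.ones((4, 3)) + np.ones(4)
    compatible = True
except ValueError:
    compatible = False
print(compatible)    # False
```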


In [None]:
#using vectorize


matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

#creating a normal function to perform an operation
add_100 = lambda i : i + 100

#vectorizing the function
vectorized_add_100 = np.vectorize(add_100)

vectorized_add_100(matrix)



array([[101, 102, 103],
       [104, 105, 106],
       [107, 108, 109]])
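
np.vectorize is most useful when the function only works on one scalar at a time, for instance because it contains an if/else. The sketch below uses a hypothetical function keep_if_large (the threshold of 5 is arbitrary) and compares the result with an array-native alternative, np.where.

```python
import numpy as np

def keep_if_large(x):
    # Scalar-only logic: return x if it exceeds 5, otherwise 0
    return x if x > 5 else 0

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Element-by-element application through np.vectorize
looped = np.vectorize(keep_if_large)(matrix)

# The same result computed on the whole array at once
native = np.where(matrix > 5, matrix, 0)

print(looped)
print(np.array_equal(looped, native))   # True
```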

**Finding Max and Min values in an array**

Numpy provides max and min functions to achieve this task

In [None]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 18, 9]])

print(f'max value in matrix is {np.max(matrix)}')
print(f'min value in matrix is {np.min(matrix)}')

max value in matrix is 18
min value in matrix is 1


![picture](https://drive.google.com/uc?export=view&id=168Gzl5e5I3PmgBJdRPx0V1m3yzT2f4f-)

Getting Maximum element in each column

In [None]:
np.max(matrix,axis=0)

array([ 7, 18,  9])

Getting Maximum element in each row


In [None]:
np.max(matrix, axis=1)

array([ 3,  6, 18])

**Average, Variance and Standard Deviation**

Calculating descriptive statistics of an array



In [None]:
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

#mean
print(f"Mean of matrix is {np.mean(matrix)}\n")

#variance

print(f"Variance of matrix is {np.var(matrix)}\n")

#Standard Deviation

print(f"S.D of matrix is {np.std(matrix)}\n")

Mean of matrix is 5.0

Variance of matrix is 6.666666666666667

S.D of matrix is 2.581988897471611
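
By default np.var and np.std divide by N (population statistics). Passing ddof=1 divides by N - 1 instead, which gives the sample variance. A brief sketch, with variable names chosen for this example:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# The squared deviations from the mean (5.0) sum to 60
population_var = np.var(matrix)        # 60 / 9
sample_var = np.var(matrix, ddof=1)    # 60 / 8

print(population_var)
print(sample_var)   # 7.5
```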



Descriptive statistics along a single axis



In [None]:
#Mean value in each column

print(f"Mean value in each column is {np.mean(matrix, axis = 0)}\n")

print(f"Mean value in each row is {np.mean(matrix, axis=1)}")

Mean value in each column is [4. 5. 6.]

Mean value in each row is [2. 5. 8.]


**Reshaping Arrays**

In [None]:
# 4 * 3 matrix

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9],
                   [10, 11, 12]])

#Reshaping into 2 * 6

# matrix.reshape(2, 6)
np.reshape(matrix, (2, 6))

array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12]])

In [None]:
# Reshaping into a 1-D array of 12 elements
np.reshape(matrix, 12)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])

In [None]:
# -1 can be used in reshape to autofill the no. of columns

np.reshape(matrix, (1, -1))

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]])
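
The -1 placeholder can stand in for either dimension, as long as the remaining size divides the number of elements evenly. A short sketch (the names column, three_rows and fits are chosen for this example):

```python
import numpy as np

matrix = np.arange(1, 13).reshape(4, 3)   # 12 elements

column = matrix.reshape(-1, 1)       # -1 is inferred as 12
three_rows = matrix.reshape(3, -1)   # -1 is inferred as 4

print(column.shape)       # (12, 1)
print(three_rows.shape)   # (3, 4)

# 12 elements cannot be split into 5 equal rows, so this raises ValueError
try:
    matrix.reshape(5, -1)
    fits = True
except ValueError:
    fits = False
print(fits)   # False
```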

**Transpose of a Matrix**

The transpose of a matrix is found by interchanging its rows and columns. It is denoted by the letter “T” in the superscript of the given matrix.

In [None]:
#NumPy's T attribute is used for finding the transpose of a matrix

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

#Finding Transpose
matrix.T

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

Transposing a 1-D vector has no effect, as it is just a collection of values with a single axis

In [None]:
np.array([1, 2, 3, 4, 5, 6, 7, 8]).T

array([1, 2, 3, 4, 5, 6, 7, 8])

However, a row vector can be transposed into a column vector by using an extra pair of brackets, which makes it a 2-D array

In [None]:
np.array([[1, 2, 3, 4, 5, 6, 7, 8]]).T

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])
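
When the 1-D array already exists, it can be turned into a column vector without retyping it with extra brackets. Two common options are sketched below; the variable names are chosen for this example.

```python
import numpy as np

vector = np.array([1, 2, 3])

# Option 1: reshape to one column, letting -1 infer the number of rows
col_a = vector.reshape(-1, 1)

# Option 2: insert a new axis of length 1 after the existing one
col_b = vector[:, np.newaxis]

print(col_a.shape)   # (3, 1)
print(col_b.shape)   # (3, 1)
```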

**Flattening a Matrix**

Transforming a Matrix into a 1-D Array

In [None]:
#Numpy flatten() method can be used to achieve this operation

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

flattened = matrix.flatten()
print(flattened)
print(flattened.shape)

[1 2 3 4 5 6 7 8 9]
(9,)


A row vector can be created using reshape instead of flatten()

In [None]:
new_flattened = np.reshape(matrix, (1, -1))
print(new_flattened)
print(new_flattened.shape) #Since row vector

[[1 2 3 4 5 6 7 8 9]]
(1, 9)


**Rank of a Matrix**

The rank of a matrix is the number of linearly independent rows in it. That is, if a row can be written as a linear combination of the other rows, it is not counted. Only the independent rows add to the rank of a matrix.

https://www.cliffsnotes.com/study-guides/algebra/linear-algebra/real-euclidean-vector-spaces/the-rank-of-a-matrix




In [None]:
#Numpy provides matrix_rank() to achieve this

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

np.linalg.matrix_rank(matrix) # Here linalg stands for linear algebra

2
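
The rank is 2 rather than 3 because the third row is a linear combination of the first two: row 3 = 2 * row 2 - row 1. A quick check, with the name combo chosen for this example:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# 2 * [4, 5, 6] - [1, 2, 3] = [7, 8, 9]
combo = 2 * matrix[1] - matrix[0]
print(np.array_equal(combo, matrix[2]))     # True

# For comparison, the identity matrix has three independent rows
print(np.linalg.matrix_rank(np.eye(3)))     # 3
```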

**Determinant of a Matrix**

https://www.mathsisfun.com/algebra/matrix-determinant.html

https://www.youtube.com/watch?v=Ip3X9LOh2dk

https://mathworld.wolfram.com/Determinant.html

In [None]:
#Numpy provides det() method to achieve this task

matrix  = np.array([[1, 2, 3],
                    [2, 4, 6],
                    [7, 8, 9]])

np.linalg.det(matrix)

0.0
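
The determinant above is zero because the second row is twice the first, so the rows are not independent. As a hedged sanity check, the sketch below compares the 2 x 2 formula ad - bc with np.linalg.det; np.isclose is used because det returns a floating-point result that may differ by rounding error. The names small, by_hand and singular are chosen for this example.

```python
import numpy as np

small = np.array([[1, 2],
                  [3, 4]])

by_hand = 1 * 4 - 2 * 3              # ad - bc = -2
by_numpy = np.linalg.det(small)
print(np.isclose(by_hand, by_numpy))   # True

singular = np.array([[1, 2, 3],
                     [2, 4, 6],
                     [7, 8, 9]])
print(np.isclose(np.linalg.det(singular), 0.0))   # True
```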

**Getting Diagonal of a Matrix**

In [None]:
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

matrix.diagonal()

array([1, 5, 9])

We can get the diagonal values away from the main diagonal using "offset" parameter

In [None]:
matrix.diagonal(offset=1)

array([2, 6])

In [None]:
matrix.diagonal(offset=-1)

array([4, 8])

**Trace of a Matrix**

The trace of a square matrix A, denoted tr(A), is defined to be the sum of the elements on the main diagonal (from the upper left to the lower right) of A.

In [None]:
#We can use Numpy trace method to achieve this task.

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

matrix.trace()

15

In [None]:
#Another way is to return the diagonal and use sum()

np.sum(matrix.diagonal())

15

**EigenValues and EigenVectors**

https://www.mathsisfun.com/algebra/eigenvalue.html

An eigenvector does not change direction in a transformation, i.e. after a linear transformation is applied, it changes only in scale and not in direction.

What makes a transformation linear is the following geometric rule: the origin must remain fixed, and all lines must remain lines.
As in one dimension, what makes a two-dimensional transformation linear is that it satisfies two properties:

**f(v + w) = f(v) + f(w)**

**f(cv) = c f(v)**

where v and w are vectors instead of numbers, and c is a scalar.

For a square matrix A, an Eigenvector and Eigenvalue make this equation true:
**Av = λv**, where **v** is the Eigenvector and **λ** is the Eigenvalue and **A** is the matrix.

Use of Eigenvalue and Eigenvector

One of the cool things is we can use matrices to do transformations in space, which is used a lot in computer graphics.

In that case the eigenvector is "the direction that doesn't change direction" !

And the eigenvalue is the scale of the stretch:

    1 means no change,
    2 means doubling in length,
    −1 means pointing backwards along the eigenvector's direction
    etc


In [None]:
#NumPy's linalg provides eig to achieve this

matrix = np.array([[1, -1, 3],
                   [1, 1, 6],
                   [3, 8, 9]])

eigenvalues, eigenvectors = np.linalg.eig(matrix)

print(f'{eigenvalues}'+'\n')

print(eigenvectors)

[13.55075847  0.74003145 -3.29078992]

[[-0.17622017 -0.96677403 -0.53373322]
 [-0.435951    0.2053623  -0.64324848]
 [-0.88254925  0.15223105  0.54896288]]
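
Each column of the eigenvectors array pairs with the eigenvalue at the same position. The sketch below checks the defining equation Av = λv for every pair; np.allclose is used because the results are floating-point. The name checks is chosen for this example.

```python
import numpy as np

matrix = np.array([[1, -1, 3],
                   [1, 1, 6],
                   [3, 8, 9]])

eigenvalues, eigenvectors = np.linalg.eig(matrix)

checks = []
for k in range(len(eigenvalues)):
    v = eigenvectors[:, k]      # k-th eigenvector is the k-th column
    lam = eigenvalues[k]        # matching eigenvalue
    checks.append(bool(np.allclose(matrix @ v, lam * v)))

print(checks)   # [True, True, True]
```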


**Finding Dot products**

https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces/dot-cross-products/v/vector-dot-product-and-vector-length

In [None]:
import numpy as np

vector_a = np.array([1, 2, 3])
vector_b = np.array([4, 5, 6])

print(vector_a @ vector_b)


#or


print(np.dot(vector_a, vector_b))



32
32
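
The dot product is simply the sum of the element-wise products: 1*4 + 2*5 + 3*6 = 32. A small sketch that spells this out (manual and elementwise are names chosen for this example):

```python
import numpy as np

vector_a = np.array([1, 2, 3])
vector_b = np.array([4, 5, 6])

# Multiply matching elements, then add them up
manual = sum(a * b for a, b in zip(vector_a, vector_b))
elementwise = np.sum(vector_a * vector_b)

print(manual, elementwise)   # 32 32
```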


**Adding and Subtracting Matrices**


We can use NumPy's add() and subtract() functions, or simply use the '+' and '-' operators

In [18]:
import numpy as np

matrix_a = np.array([[1, 1, 1],
                     [1, 1, 1],
                     [1, 1, 2]])

matrix_b = np.array([[1, 3, 1],
                     [1, 3, 1],
                     [1, 3, 8]])

#Addition
add_result = np.add(matrix_a, matrix_b) #or matrix_a + matrix_b
print(f'{add_result}\n')


#Subtraction
sub_result = np.subtract(matrix_a, matrix_b) #or matrix_a - matrix_b
print(sub_result)

[[ 2  4  2]
 [ 2  4  2]
 [ 2  4 10]]

[[ 0 -2  0]
 [ 0 -2  0]
 [ 0 -2 -6]]


**Matrix Multiplication**

We can use NumPy's dot(), matmul(), or just the '@' operator

In [20]:
matrix_a = np.array([[1, 1],
                     [1, 2]])

matrix_b = np.array([[1, 1],
                     [1, 2]])

dot_result = np.dot(matrix_a, matrix_b)
print(f'{dot_result}\n')

mat_mul_result = np.matmul(matrix_a, matrix_b)
print(f'{mat_mul_result}\n')

operator_result = matrix_a @ matrix_b

print(f'{operator_result}\n')

[[2 3]
 [3 5]]

[[2 3]
 [3 5]]

[[2 3]
 [3 5]]
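
Note that the '*' operator on arrays is element-wise multiplication, not matrix multiplication. The sketch below shows the difference on the same two matrices; the names elementwise and product are chosen for this example.

```python
import numpy as np

matrix_a = np.array([[1, 1],
                     [1, 2]])

matrix_b = np.array([[1, 1],
                     [1, 2]])

elementwise = matrix_a * matrix_b   # multiplies matching entries
product = matrix_a @ matrix_b       # rows of a times columns of b

print(elementwise)   # [[1 1]
                     #  [1 4]]
print(product)       # [[2 3]
                     #  [3 5]]
```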

