## 1.1 [Creating a Vector](https://github.com/f00-/machine-learning-with-python-cookbook-notes)

In [1]:
# Load library
import numpy as np

# Create a vector as a row
vector_row = np.array([1, 2, 3])

# Create a vector as a column
vector_column = np.array([[1],
                          [2],
                          [3]])

In [3]:
vector_row

array([1, 2, 3])

In [4]:
vector_column

array([[1],
       [2],
       [3]])

**see also**  
[Vectors, Math Is Fun](https://www.mathsisfun.com/algebra/vectors.html)

## 1.2 Creating a Matrix

In [5]:
# Load library
import numpy as np

# Create a matrix
matrix = np.array([[1, 2],
                   [1, 2],
                   [1, 2]])

In [6]:
matrix

array([[1, 2],
       [1, 2],
       [1, 2]])

**see also**  
[Matrix, Wolfram MathWorld](http://mathworld.wolfram.com/Matrix.html)

## 1.3 Creating a Sparse Matrix

In [7]:
# Load libraries
import numpy as np
from scipy import sparse

# Create a matrix
matrix = np.array([[0, 0],
                   [0, 1],
                   [3, 0]])

# Create compressed sparse row (CSR) matrix
matrix_sparse = sparse.csr_matrix(matrix)

A frequent situation in machine learning is having a huge amount of data; however, most of the elements in the data are zeros. For example, imagine a matrix where the columns are every movie on Netflix, the rows are every Netflix user, and the values are how many times a user has watched that particular movie. This matrix would have tens of thousands of columns and millions of rows! However, since most users do not watch most movies, the vast majority of elements would be zero.  

Sparse matrices only store nonzero elements and assume all other values will be zero, leading to significant computational savings. In our solution, we created a NumPy array with two nonzero values, then converted it into a sparse matrix. If we view the sparse matrix we can see that only the nonzero values are stored:

In [9]:
# View sparse matrix
print(matrix_sparse)

  (1, 1)	1
  (2, 0)	3
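
To make the storage savings concrete, here is a minimal sketch that compares the solution's matrix with a much wider one holding the same two nonzero values. The name `matrix_large` and the use of the `nnz` and `data` attributes are additions for illustration and are not part of the solution above.

```python
# Load libraries
import numpy as np
from scipy import sparse

# Small matrix from the solution
matrix_small = np.array([[0, 0],
                         [0, 1],
                         [3, 0]])

# Hypothetical wider matrix with the same two nonzero values
matrix_large = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                         [0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
                         [3, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

# Convert both to compressed sparse row (CSR) format
sparse_small = sparse.csr_matrix(matrix_small)
sparse_large = sparse.csr_matrix(matrix_large)

# Count the stored (nonzero) elements in each
stored_small = sparse_small.nnz
stored_large = sparse_large.nnz

print(stored_small, stored_large)
print(sparse_large.data)
```

Under these assumptions both matrices store only two values, so adding zero elements does not increase the number of stored entries.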


There are many different types of sparse matrices besides CSR, such as compressed sparse column, list of lists, and dictionary of keys.

**see also**  
[Sparse matrices, SciPy documentation](https://docs.scipy.org/doc/scipy/reference/sparse.html)  
[101 Ways to Store a Sparse Matrix](https://medium.com/@jmaxg3/101-ways-to-store-a-sparse-matrix-c7f2bf15a229)

## 1.4 Selecting Elements  
We need to select one or more elements in a vector or matrix.  
Like most things in Python, NumPy arrays are zero-indexed, meaning that the index of the first element is 0, not 1.

In [11]:
# Load library
import numpy as np

# Create row vector
vector = np.array([1, 2, 3, 4, 5, 6])

# Create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

In [12]:
# Select third element of vector
vector[2]

3

In [13]:
# Select second row, second column
matrix[1,1]

5

In [14]:
# Select all elements of a vector
vector[:]

array([1, 2, 3, 4, 5, 6])

In [15]:
# Select everything up to and including the third element
vector[:3]

array([1, 2, 3])

In [16]:
# Select everything after the third element
vector[3:]

array([4, 5, 6])

In [17]:
# Select the last element
vector[-1]

6

In [18]:
# Select the first two rows and all columns of a matrix
matrix[:2,:]

array([[1, 2, 3],
       [4, 5, 6]])

In [19]:
# Select all rows and the second column
matrix[:,1:2]

array([[2],
       [5],
       [8]])
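
One detail of the last selection is worth spelling out: slicing with `1:2` keeps the column as a two-dimensional array, whereas a plain integer index returns a one-dimensional array. The variable names below are illustrative only.

```python
# Load library
import numpy as np

# Create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Slice keeps two dimensions: a 3x1 column
column_2d = matrix[:, 1:2]

# Integer index drops a dimension: a flat array of length 3
column_1d = matrix[:, 1]

print(column_2d.shape)
print(column_1d.shape)
```

Both selections contain the values 2, 5, and 8; only their shapes differ.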

## 1.5 Describing a Matrix  
We want to describe the shape, size, and dimensions of a matrix.

In [20]:
# Load library
import numpy as np

# Create matrix
matrix = np.array([[1, 2, 3, 4],
                   [5, 6, 7, 8],
                   [9, 10, 11, 12]])

# View number of rows and columns
matrix.shape

(3, 4)

In [21]:
# View number of elements (rows * columns)
matrix.size

12

In [22]:
# View number of dimensions
matrix.ndim

2

## 1.6 - Applying Operations to Elements  
**Problem**  
You want to apply some function to multiple elements in an array.  
**Solution**   
Use NumPy's vectorize:

In [1]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# create function that adds 1000 to something
add_1000 = lambda i: i + 1000

# create vectorized function
vectorized_add_1000 = np.vectorize(add_1000)

# apply function to all elements in matrix
vectorized_add_1000(matrix)

array([[1001, 1002, 1003],
       [1004, 1005, 1006],
       [1007, 1008, 1009]])

**Discussion**  

NumPy’s vectorize class converts a function into a function that can apply to all elements in an array or slice of an array. It’s worth noting that vectorize is essentially a for loop over the elements and does not increase performance. Furthermore, NumPy arrays allow us to perform operations between arrays even if their dimensions are not the same (a process called broadcasting). For example, we can create a much simpler version of our solution using broadcasting:


In [2]:
# add 1000 to all elements
matrix + 1000

array([[1001, 1002, 1003],
       [1004, 1005, 1006],
       [1007, 1008, 1009]])
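
Broadcasting is not limited to scalars. As a small sketch, the example below adds a row vector and a column vector to the same matrix; the names `row_vector` and `column_vector` are introduced here for illustration.

```python
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# a 1-D array of length 3 is added to every row
row_vector = np.array([100, 200, 300])
added_by_row = matrix + row_vector

# a 3x1 array is added to every column
column_vector = np.array([[100],
                          [200],
                          [300]])
added_by_column = matrix + column_vector

print(added_by_row)
print(added_by_column)
```

In the first case each column receives its own offset; in the second case each row does.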

## 1.7 - Finding Maximum and Minimum Values

**Problem**  
You need to find the maximum or minimum value in an array.

**Solution**  
Use NumPy's max and min:


In [3]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# return maximum element
np.max(matrix)

9

In [4]:
# return minimum element
np.min(matrix)

1

**Discussion**

Often we want to know the maximum and minimum value in an array or subset of an array. This can be accomplished with the max and min methods. Using the axis parameter we can also apply the operation along a certain axis:


In [5]:
# find maximum element in each column
np.max(matrix, axis=0)

array([7, 8, 9])

In [6]:
# find maximum element in each row
np.max(matrix, axis=1)

array([3, 6, 9])
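
The same axis convention applies to `np.min`. The following sketch mirrors the two calls above; the variable names are illustrative.

```python
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# axis=0 collapses the rows, giving one minimum per column
column_minimums = np.min(matrix, axis=0)

# axis=1 collapses the columns, giving one minimum per row
row_minimums = np.min(matrix, axis=1)

print(column_minimums)
print(row_minimums)
```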

## 1.8 Calculating the Average, Variance, and Standard Deviation  

**Problem**  
You want to calculate some descriptive statistics about an array.

**Solution**  
Use NumPy's mean, var, and std:

In [7]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# return mean
np.mean(matrix)

5.0

In [8]:
# return variance
np.var(matrix)

6.666666666666667

In [9]:
# return standard deviation
np.std(matrix)

2.581988897471611

**Discussion**  
Just like with max and min, we can easily get descriptive statistics about the whole matrix or do calculations along a single axis:

In [10]:
# find the mean value in each column
np.mean(matrix, axis=0)

array([4., 5., 6.])
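
As a brief sketch of how these statistics relate, the standard deviation is the square root of the variance, and both default to the population formula (dividing by the number of elements). The `ddof=1` argument, which switches to the sample formula, goes beyond what the solution above shows and is included here only as an illustration.

```python
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# population variance and standard deviation (divide by n)
population_variance = np.var(matrix)
population_std = np.std(matrix)

# sample variance (divide by n - 1)
sample_variance = np.var(matrix, ddof=1)

# variance of each column
column_variances = np.var(matrix, axis=0)

print(np.isclose(population_std, np.sqrt(population_variance)))
print(sample_variance)
print(column_variances)
```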

## 1.9 Reshaping Arrays

**Problem**  
You want to change the shape (number of rows and columns) of an array without changing the element values.

**Solution**  
Use NumPy's reshape:

In [11]:
# load library
import numpy as np

# create 4x3 matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9],
                   [10, 11, 12]])

# reshape matrix into 2x6 matrix
matrix.reshape(2, 6)

array([[ 1,  2,  3,  4,  5,  6],
       [ 7,  8,  9, 10, 11, 12]])

**Discussion**  
reshape allows us to restructure an array so that we maintain the same data but it is organized as a different number of rows and columns. The only requirement is that the shape of the original and new matrix contain the same number of elements (i.e., the same size). We can see the size of a matrix using size:

In [13]:
matrix.size

12

One useful argument in reshape is -1, which effectively means “as many as needed,” so reshape(1, -1) means one row and as many columns as needed, while reshape(-1, 1) means one column and as many rows as needed:

In [14]:
matrix.reshape(1, -1)

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12]])

In [22]:
matrix.reshape(-1, 1)

array([[ 1],
       [ 2],
       [ 3],
       [ 4],
       [ 5],
       [ 6],
       [ 7],
       [ 8],
       [ 9],
       [10],
       [11],
       [12]])

In [15]:
matrix.reshape(12)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12])
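
The size requirement can be checked directly. In this sketch, a compatible shape using -1 succeeds, while a shape whose element count differs from 12 raises a `ValueError`. The variable names are illustrative.

```python
# load library
import numpy as np

# create 4x3 matrix (12 elements)
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9],
                   [10, 11, 12]])

# -1 lets NumPy infer the missing dimension: 12 / 3 = 4 columns
inferred = matrix.reshape(3, -1)
print(inferred.shape)

# 5 x 2 = 10 elements does not match 12, so reshape fails
try:
    matrix.reshape(5, 2)
    reshape_failed = False
except ValueError as error:
    reshape_failed = True
    print(error)
```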

## 1.10 Transposing a Vector or Matrix

**Problem**  
You need to transpose a vector or matrix.

**Solution**   
Use the T attribute:

In [16]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# transpose matrix
matrix.T

array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])

Transposing is a common operation in linear algebra where the column and row indices of each element are swapped. One nuanced point that is typically overlooked outside of a linear algebra class is that, technically, a vector cannot be transposed because it is just a collection of values:

In [17]:
# transpose vector
np.array([1, 2, 3, 4, 5, 6]).T

array([1, 2, 3, 4, 5, 6])

However, it is common to refer to transposing a vector as converting a row vector to a column vector (notice the second pair of brackets) or vice versa:

In [18]:
# transpose row vector
np.array([[1, 2, 3, 4, 5, 6]]).T

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6]])
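
Comparing shapes makes the difference explicit. The sketch below shows that `.T` leaves a one-dimensional array unchanged but swaps the dimensions of a two-dimensional row vector. Using `reshape(-1, 1)` to build a column from a one-dimensional array is one possible workaround, added here for illustration.

```python
# load library
import numpy as np

# one-dimensional array: transposing has no effect on its shape
vector = np.array([1, 2, 3, 4, 5, 6])
print(vector.shape, vector.T.shape)

# two-dimensional row vector: transposing yields a column vector
row_vector = np.array([[1, 2, 3, 4, 5, 6]])
print(row_vector.shape, row_vector.T.shape)

# turn the one-dimensional array into a column explicitly
column_vector = vector.reshape(-1, 1)
print(column_vector.shape)
```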

## 1.11 Flattening a Matrix  

**Problem**  
You need to transform a matrix into a one-dimensional array.

**Solution**  
Use flatten:

In [19]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# flatten matrix
matrix.flatten()

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

**Discussion**

flatten is a simple method to transform a matrix into a one-dimensional array. Alternatively, we can use reshape to create a row vector:

In [20]:
matrix.reshape(1, -1)

array([[1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [21]:
matrix.reshape(-1, 1)

array([[1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])
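
The two approaches differ in the shape of the result, and `flatten` always returns a copy of the data. A minimal sketch, with illustrative variable names:

```python
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# flatten gives a one-dimensional array of length 9
flat = matrix.flatten()

# reshape(1, -1) gives a two-dimensional array with a single row
row = matrix.reshape(1, -1)

print(flat.shape)
print(row.shape)

# flatten returns a copy, so changing it leaves the matrix untouched
flat[0] = 100
print(matrix[0, 0])
```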

## 1.12 Finding the Rank of a Matrix  

**Problem**  
You need to know the rank of a matrix.

**Solution**  
Use NumPy's linear algebra method matrix_rank:

In [23]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 1, 1],
                   [1, 1, 10],
                   [1, 1, 15]])

# return matrix rank
np.linalg.matrix_rank(matrix)

2

**See Also**  
[The Rank of a Matrix, CliffsNotes](https://www.cliffsnotes.com/study-guides/algebra/linear-algebra/real-euclidean-vector-spaces/the-rank-of-a-matrix)
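
As a rough illustration of why the result is 2 rather than 3: the rank counts linearly independent columns (or rows), and the first two columns of the solution's matrix are identical. The sketch below checks this and compares against a 3x3 identity matrix, which has full rank. The comparison matrix is an addition for illustration.

```python
# load library
import numpy as np

# matrix from the solution
matrix = np.array([[1, 1, 1],
                   [1, 1, 10],
                   [1, 1, 15]])

# the first two columns are identical, so they are not independent
duplicate_columns = np.array_equal(matrix[:, 0], matrix[:, 1])

# rank of the solution's matrix and of a 3x3 identity matrix
rank = np.linalg.matrix_rank(matrix)
identity_rank = np.linalg.matrix_rank(np.eye(3))

print(duplicate_columns, rank, identity_rank)
```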

## 1.13 Calculating the Determinant  

**Problem**  
You need to know the determinant of a matrix.

**Solution**  
Use NumPy's linear algebra method det:

In [24]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [2, 4, 6],
                   [3, 8, 9]])

# return the determinant of matrix
np.linalg.det(matrix)

0.0

**See Also**  
[The determinant | Essence of linear algebra, chapter 5, 3Blue1Brown](https://www.youtube.com/watch?v=Ip3X9LOh2dk)  
[Determinant, Wolfram MathWorld ](http://mathworld.wolfram.com/Determinant.html)

## 1.14 Getting the Diagonal of a Matrix

**Problem**  
You need to get the diagonal elements of a matrix.

**Solution**  
Use diagonal:

In [1]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [2, 4, 6],
                   [3, 8, 9]])

# return diagonal elements
matrix.diagonal()


array([1, 4, 9])

**Discussion**  
NumPy makes getting the diagonal elements of a matrix easy with diagonal. It is also possible to get a diagonal off from the main diagonal by using the offset parameter:

In [2]:
# return diagonal one above the main diagonal
matrix.diagonal(offset=1)

array([2, 6])

In [3]:
# return diagonal one below the main diagonal
matrix.diagonal(offset=-1)

array([2, 8])

## 1.15 Calculating the Trace of a Matrix

**Problem**  
You need to calculate the trace of a matrix

**Solution**  
Use trace:

In [4]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 2, 3],
                   [2, 4, 6],
                   [3, 8, 9]])

# return trace
matrix.trace()

14

**Discussion**  

The trace of a matrix is the sum of the diagonal elements and is often used under the hood in machine learning methods. Given a NumPy multidimensional array, we can calculate the trace using trace. We can also return the diagonal of a matrix and calculate its sum:

In [5]:
# return diagonal and sum elements
sum(matrix.diagonal())

14

## 1.16 Finding Eigenvalues and Eigenvectors

**Problem**  
You need to find the eigenvalues and eigenvectors of a square matrix.

**Solution**  
Use NumPy's linalg.eig:

In [25]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, -1, 3],
                   [1, 1, 6],
                   [3, 8, 9]])

# calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# view eigenvalues
eigenvalues

array([13.55075847,  0.74003145, -3.29078992])

In [27]:
# view eigenvectors
eigenvectors

array([[-0.17622017, -0.96677403, -0.53373322],
       [-0.435951  ,  0.2053623 , -0.64324848],
       [-0.88254925,  0.15223105,  0.54896288]])

**See Also**  
[Eigenvectors and Eigenvalues Explained Visually, Setosa.io ](http://setosa.io/ev/eigenvectors-and-eigenvalues/)  
[Eigenvectors and eigenvalues | Essence of linear algebra, Chapter 10, 3Blue1Brown ](https://www.youtube.com/watch?v=PFDu9oVAE-g)
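
One way to sanity-check the result is to confirm the defining relationship $Av = \lambda v$ for each pair. This sketch assumes, as the NumPy documentation states, that the eigenvectors are stored as the columns of the returned array.

```python
# load library
import numpy as np

# create matrix
matrix = np.array([[1, -1, 3],
                   [1, 1, 6],
                   [3, 8, 9]])

# calculate eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(matrix)

# check A v = lambda v for each eigenvalue/eigenvector pair
checks = []
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]          # i-th eigenvector is the i-th column
    checks.append(np.allclose(matrix @ v, eigenvalues[i] * v))

print(checks)
```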


## 1.17 Calculating Dot Products  

**Problem**  
You need to calculate the dot product of two vectors.

**Solution**  
Use NumPy's dot:

In [28]:
# load library
import numpy as np

# create two vectors
vector_a = np.array([1, 2, 3])
vector_b = np.array([4, 5, 6])

# calculate dot product
np.dot(vector_a, vector_b)


32

In [29]:
# in Python 3.5+ we can also use the @ operator:
vector_a @ vector_b

32

**See Also**  
[Vector dot product and vector length, Khan Academy ](https://www.khanacademy.org/math/linear-algebra/vectors-and-spaces/dot-cross-products/v/vector-dot-product-and-vector-length)  
[Dot Product, Paul’s Online Math Notes ](http://tutorial.math.lamar.edu/Classes/CalcII/DotProduct.aspx)
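
For reference, the dot product is the sum of the element-wise products, here 1×4 + 2×5 + 3×6 = 32. A minimal sketch that computes it by hand and compares it with `np.dot` (the name `manual_dot` is illustrative):

```python
# load library
import numpy as np

# create two vectors
vector_a = np.array([1, 2, 3])
vector_b = np.array([4, 5, 6])

# multiply element-wise, then sum the products
manual_dot = np.sum(vector_a * vector_b)

print(manual_dot)
print(manual_dot == np.dot(vector_a, vector_b))
```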


## 1.18 Adding and Subtracting Matrices

**Problem**  
You want to add or subtract two matrices.

**Solution**  
Use NumPy's add and subtract:

In [30]:
# load library
import numpy as np

# create matrices
matrix_a = np.array([[1, 1, 1],
                     [1, 1, 1],
                     [1, 1, 2]])

matrix_b = np.array([[1, 3, 1],
                     [1, 3, 1],
                     [1, 3, 8]])

# add two matrices
np.add(matrix_a, matrix_b)

array([[ 2,  4,  2],
       [ 2,  4,  2],
       [ 2,  4, 10]])

In [31]:
# subtract two matrices
np.subtract(matrix_a, matrix_b)

array([[ 0, -2,  0],
       [ 0, -2,  0],
       [ 0, -2, -6]])

**Discussion**  

Alternatively, we can simply use the + and - operators:

In [32]:
# add two matrices
matrix_a + matrix_b

array([[ 2,  4,  2],
       [ 2,  4,  2],
       [ 2,  4, 10]])

## 1.19 Multiplying Matrices  

**Problem**  
You want to multiply two matrices.

**Solution**  
Use NumPy's dot:

In [6]:
# load library
import numpy as np

# create matrices
matrix_a = np.array([[1, 1],
                     [1, 2]])

matrix_b = np.array([[1, 3],
                     [1, 2]])

# multiply two matrices
np.dot(matrix_a, matrix_b)

array([[2, 5],
       [3, 7]])

**Discussion**  

Alternatively, in Python 3.5+ we can use the @ operator:

In [7]:
# multiply two matrices
matrix_a @ matrix_b

array([[2, 5],
       [3, 7]])

**See Also**  

[Array vs Matrix Operations, MathWorks](https://www.mathworks.com/help/matlab/matlab_prog/array-vs-matrix-operations.html?requestedDomain=true)
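
Note that the `*` operator on NumPy arrays performs element-wise multiplication, not matrix multiplication. The sketch below contrasts the two on the same inputs; the variable names are illustrative.

```python
# load library
import numpy as np

# create matrices
matrix_a = np.array([[1, 1],
                     [1, 2]])

matrix_b = np.array([[1, 3],
                     [1, 2]])

# element-wise product: each element multiplied by its counterpart
elementwise_product = matrix_a * matrix_b

# matrix product: rows of matrix_a combined with columns of matrix_b
matrix_product = matrix_a @ matrix_b

print(elementwise_product)
print(matrix_product)
```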

## 1.20 Inverting a Matrix

**Problem**  
You want to calculate the inverse of a square matrix.

**Solution**  
Use NumPy's linear algebra inv method:

In [8]:
# load library
import numpy as np

# create matrix
matrix = np.array([[1, 4],
                   [2, 5]])

# calculate inverse of matrix
np.linalg.inv(matrix)

array([[-1.66666667,  1.33333333],
       [ 0.66666667, -0.33333333]])

**Discussion**  

The inverse of a square matrix, $A$, is a second matrix $A^{-1}$, such that:

$A A^{-1} = I$

where $I$ is the identity matrix. In NumPy we can use linalg.inv to calculate $A^{-1}$ if it exists. To see this in action, we can multiply a matrix by its inverse and the result is the identity matrix:


In [9]:
matrix @ np.linalg.inv(matrix)

array([[1., 0.],
       [0., 1.]])
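
Because of floating-point rounding, it is usually safer to compare against the identity with `np.allclose` than to rely on exact equality. The sketch below also shows what happens when the inverse does not exist. The singular matrix is a hypothetical example chosen so that its second row is twice its first.

```python
# load library
import numpy as np

# invertible matrix from the solution
matrix = np.array([[1, 4],
                   [2, 5]])

# compare A @ A^{-1} with the identity, allowing for rounding error
inverse = np.linalg.inv(matrix)
is_identity = np.allclose(matrix @ inverse, np.eye(2))
print(is_identity)

# hypothetical singular matrix: second row is twice the first
singular_matrix = np.array([[1, 2],
                            [2, 4]])

# linalg.inv raises LinAlgError when no inverse exists
try:
    np.linalg.inv(singular_matrix)
    inverse_exists = True
except np.linalg.LinAlgError:
    inverse_exists = False

print(inverse_exists)
```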

## 1.21 Generating Random Values

**Problem**  
You want to generate pseudorandom values.

**Solution**  
Use NumPy's random:

In [10]:
# load library
import numpy as np

# set seed
np.random.seed(0)

# generate three random floats between 0.0 and 1.0
np.random.random(3)

array([0.5488135 , 0.71518937, 0.60276338])

**Discussion**  

NumPy offers a wide variety of means to generate random numbers, many more than can be covered here. In our solution we generated floats; however, it is also common to generate integers:

In [11]:
# generate three random integers between 0 and 10
np.random.randint(0, 11, 3)

array([3, 7, 9])

Alternatively, we can generate numbers by drawing them from a distribution:

In [13]:
# draw three numbers from a normal distribution with mean 0.0
# and standard deviation of 1.0
np.random.normal(0.0, 1.0, 3)

array([-0.13309028, -0.1730696 , -1.76165167])

In [14]:
# draw three numbers from a logistic distribution with mean 0.0 and scale of 1.0
np.random.logistic(0.0, 1.0, 3)

array([ 1.46416405, -0.08013416, -0.4356214 ])

In [15]:
# draw three numbers greater than or equal to 1.0 and less than 2.0
np.random.uniform(1.0, 2.0, 3)

array([1.83607876, 1.33739616, 1.64817187])

Finally, it can sometimes be useful to return the same random numbers multiple times to get predictable, repeatable results. We can do this by setting the seed (an integer) of the pseudorandom generator. Random processes with the same seed will always produce the same output. We will use seeds throughout this book so that the code you see in the book and the code you run on your computer produce the same results.
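
A short sketch of that repeatability: resetting the seed before each draw yields identical values. The second half uses `np.random.default_rng`, NumPy's newer generator interface (NumPy 1.17 and later). It is not used in the solution above and is shown only as an alternative.

```python
# load library
import numpy as np

# draw three floats, reset the seed, and draw again
np.random.seed(0)
first_draw = np.random.random(3)

np.random.seed(0)
second_draw = np.random.random(3)

# same seed, same values
print(np.array_equal(first_draw, second_draw))

# alternative: independent generator objects seeded with the same integer
generator_a = np.random.default_rng(0)
generator_b = np.random.default_rng(0)
draw_a = generator_a.random(3)
draw_b = generator_b.random(3)

print(np.array_equal(draw_a, draw_b))
```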