# Basics of linear algebra for machine learning

## 01 - Introduction to linear algebra
### Linear algebra
Linear algebra is about linear combinations: using arithmetic on columns of numbers (vectors) and arrays of numbers (matrices) to create new columns and arrays of numbers. It was formalized in the 1800s to find unknowns in systems of linear equations.

A linear equation is a series of terms and mathematical operations where some terms are unknown, for example:  
$y = 4 x + 1$

They are called linear equations because they describe a line on a two-dimensional graph. We can line up a system of equations with two or more unknowns:  
- $y = 0.1 x_{1} + 0.4 x_{2}$  
- $y = 0.3 x_{1} + 0.9 x_{2}$  
- $y = 0.2 x_{1} + 0.3 x_{2}$

where
- the column of $y$ values is a column vector of outputs from the equation
- the two columns of float values are the data columns $a_{1}$ and $a_{2}$ forming the matrix $A$
- the two unknown values $x_{1}$ and $x_{2}$ are the coefficients of the equation and form a vector of unknowns $b$ to be solved

summarized in linear algebra as  
$y = A \cdot b$
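
As a minimal sketch of this notation, the three equations above can be evaluated as one matrix-vector product in NumPy. The values in `b` are hypothetical and chosen only for illustration, since the text leaves them as unknowns.

```python
import numpy as np

# Data columns a1 and a2 from the three equations above, stacked as matrix A
A = np.array([[0.1, 0.4],
              [0.3, 0.9],
              [0.2, 0.3]])

# Hypothetical coefficients x1 and x2 (an assumption, not given in the text)
b = np.array([1.0, 2.0])

# One matrix-vector product evaluates all three equations at once
y = A @ b
print(y)
```

Each entry of `y` is the output of one equation, so the result has one value per row of `A`.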

Such problems are challenging to solve because:
- there are usually more unknowns than there are equations to solve
- no single line can satisfy all of the equations without error

Interesting problems are often described by systems with an infinite number of solutions. This is the core of linear algebra as it relates to machine learning. The rest of the operations are about making such problems easier to understand and solve.

### Numerical linear algebra
Vector and matrix operations were initially implemented in FORTRAN with libraries such as:
- LAPACK
- BLAS
- ATLAS

Popular packages used today, for example in Python, build on top of these libraries.

### Linear algebra and statistics
- using vector and matrix notation (multivariate statistics)
- solving least squares and weighted least squares (linear regression)
- estimating means and variance of data matrices
- using the covariance matrix (multivariate Gaussian distributions)
- leveraging the concepts above for data reduction with principal component analysis

### Applications of linear algebra
- matrices in engineering (line of springs)
- graphs and networks (graph analysis)
- Markov matrices, population, economics (population growth)
- linear programming (simplex optimization method)
- Fourier series - linear algebra for functions (signal processing)
- linear algebra for statistics and probabilities (least squares for regression)
- computer graphics (translation, rescaling, rotation of images)

## 02 - Linear algebra and machine learning
Linear algebra is the mathematics of data. Often recommended as a prerequisite to machine learning, it can make more sense to first build context of the applied machine learning process.

### Reasons not to learn linear algebra
- it's not required in order to use machine learning as a tool to solve problems
- it's slow and might delay you achieving your goals
- it's a huge field and not all of it is relevant to machine learning

A breadth-first (results-first) approach can help build a skeleton and some context on which to build to deepen knowledge about how algorithms work or the math that underlies them.

### Linear algebra notation
You need to know how to read and write vector and matrix notation. It enables you to:
- describe operations on data precisely
- read descriptions of algorithms in textbooks
- implement machine learning algorithms faster and more efficiently
- interpret and implement new methods in research papers
- describe your own methods to other practitioners

### Linear algebra arithmetic
You need to know how to perform arithmetic operations: add, subtract and multiply scalars, vectors and matrices. Matrix multiplication and tensor multiplication are often non-intuitive at first. Understanding vector and matrix operations is required to effectively read and write matrix notation.

### Learn linear algebra for statistics
Linear algebra is heavily used in multivariate statistics. To read and interpret statistics, you need to know the notation and operations of linear algebra, such as vectors used for means and variance, or covariance matrices describing the relationships between multiple Gaussian variables. Principal component analysis also leverages such methods.
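
The statistics mentioned above can be sketched in NumPy as follows. The data matrix `X` is hypothetical, and the choice of the sample variance (`ddof=1`) is an assumption made so that the result agrees with `np.cov`.

```python
import numpy as np

# Hypothetical data matrix: 4 observations (rows) of 2 variables (columns)
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0],
              [4.0, 8.0]])

# Column-wise mean and sample variance (ddof=1 divides by n - 1)
means = X.mean(axis=0)
variances = X.var(axis=0, ddof=1)

# Covariance matrix; rowvar=False treats each column as a variable
covariance = np.cov(X, rowvar=False)

print(means)
print(variances)
print(covariance)
```

The diagonal of the covariance matrix holds the variance of each variable, and the off-diagonal entries describe how the two variables move together.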

### Learn matrix factorization
Matrix factorization is also called matrix decomposition. You need to know how to factorize a matrix and what it means. Matrix factorization is necessary for more complex operations in linear algebra (matrix inverse) and machine learning (least squares). Different matrix factorizations exist, such as singular-value decomposition. To read and interpret higher-order matrix operations, matrix factorization is required.
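
A minimal sketch of what factorizing a matrix means, using the singular-value decomposition from `np.linalg.svd`. The matrix `A` is hypothetical; the point is only that the factors multiply back to the original.

```python
import numpy as np

# Hypothetical 3 x 2 matrix to factorize
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Reduced singular-value decomposition: A = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Multiplying the factors back together recovers the original matrix
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))
```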

### Learn linear least squares
Matrix factorization can be used to solve linear least squares. Problems where there is no line able to fit the data without error can be solved using the least squares method, called linear least squares in linear algebra. Linear least squares are used in regression models, and in a range of machine learning algorithms.
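
As an illustration, `np.linalg.lstsq` can fit a line to points that no single line passes through exactly. The observations below are hypothetical, and adding a column of ones for the intercept is a modeling choice assumed for this sketch.

```python
import numpy as np

# Hypothetical noisy observations that no single line fits exactly
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Design matrix: one column for x and a column of ones for the intercept
A = np.column_stack([x, np.ones_like(x)])

# np.linalg.lstsq returns the coefficients minimizing the squared error
coefficients, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coefficients
print(slope, intercept)
```

The returned coefficients do not reproduce `y` exactly; they are the values that make the sum of squared differences between `A @ coefficients` and `y` as small as possible.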

### One more reason
Seeing how the operations work on real data will help you develop a strong intuition for the methods. You will experience knowledge buzz and mind-expanding moments.

## 03 - Examples of linear algebra in machine learning
Linear algebra is concerned with vectors, matrices and linear transforms. It is foundational to machine learning, from the notation used to describe how algorithms operate to their implementation in code. The relationship between linear algebra and machine learning is often left unexplained or abstract. Here are some examples of how linear algebra is leveraged in machine learning.

### Dataset and data files
Data is a matrix, which can be split into inputs (a matrix $X$) and outputs (a vector $y$). Each row has the same length (same number of columns): the data is vectorized and rows can be passed to a model one at a time or in a batch. The model can be pre-configured to expect rows of a fixed width.

### Images and photographs
An image is a table structure with a width and a height and one pixel value in each cell for black and white images, or three pixel values (red, green and blue) for color images. Operations such as cropping, scaling and shearing are described using linear algebra notation and operations.
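
To make this concrete, here is a small sketch that treats a tiny grayscale image as a NumPy array. The 4 x 4 image is hypothetical, and only cropping and a left-to-right flip are shown; scaling and shearing are left out.

```python
import numpy as np

# Hypothetical 4 x 4 grayscale image with one pixel value per cell
image = np.arange(16).reshape(4, 4)

# Cropping is slicing: keep the central 2 x 2 block
cropped = image[1:3, 1:3]

# Flipping left to right reverses the column order
flipped = image[:, ::-1]

print(cropped)
print(flipped[0])
```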

### One hot encoding
Categorical data can be one hot encoded so they are easier to work with and learn from by some machine learning techniques. One column is created for each category and a row for each example (e.g. if the categories are red, green and blue, we create a red column, a green column and a blue column). For each row in the dataset, we enter 1 in the column corresponding to the category and 0 in the others. Each row is encoded as a binary vector (0 or 1), which is an example of sparse representation.
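
The encoding described above can be sketched in NumPy by indexing the rows of an identity matrix. The `colors` column is hypothetical, and using `np.eye` is one convenient approach among several.

```python
import numpy as np

# Hypothetical categorical column with three categories
categories = ["red", "green", "blue"]
colors = ["red", "blue", "green", "red"]

# Map each category to an integer column index
indices = np.array([categories.index(color) for color in colors])

# Rows of the identity matrix are exactly the one hot vectors
one_hot = np.eye(len(categories), dtype=int)[indices]
print(one_hot)
```

Each row contains a single 1, in the column of its category.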

### Linear regression
Linear regression is used to describe the relationship between variables. Solving the linear regression problem means finding a set of coefficients that gives the best prediction of the output variable when each input variable is multiplied by its coefficient and the results are added together. It is usually solved using least squares optimization leveraging matrix factorization such as LU decomposition or singular-value decomposition.

It can be summarized using linear algebra notation:  
$y = A \cdot b$
where
- $y$ is the output variable
- $A$ is the dataset
- $b$ are the model coefficients

### Regularization
Simpler models often have smaller coefficient values. Regularization is leveraged to encourage a model to minimize the size of coefficients. Common implementations are the $L^{1}$ and $L^{2}$ forms. Both are a measure of the length of the coefficients as a vector, and leverage the vector norm.
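
Both norms can be computed with `np.linalg.norm`. The coefficient vector below is hypothetical; in a regularized model, a penalty based on one of these values would be added to the training loss.

```python
import numpy as np

# Hypothetical coefficient vector
coefficients = np.array([3.0, -4.0])

# L1 norm: sum of absolute values
l1 = np.linalg.norm(coefficients, ord=1)

# L2 norm: square root of the sum of squares
l2 = np.linalg.norm(coefficients, ord=2)

print(l1, l2)
```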

### Principal component analysis
Modeling data with many features is challenging. Principal component analysis is a dimensionality reduction method used to create projections of high-dimensional data for visualization and training models. It uses a matrix factorization method; more robust implementations leverage eigendecomposition and singular-value decomposition.
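
A minimal sketch of principal component analysis built on the singular-value decomposition. The data matrix is hypothetical, and keeping two components is an arbitrary choice for illustration; library implementations offer more options.

```python
import numpy as np

# Hypothetical data: 5 observations of 3 features
X = np.array([[2.5, 2.4, 0.5],
              [0.5, 0.7, 1.9],
              [2.2, 2.9, 0.8],
              [1.9, 2.2, 1.1],
              [3.1, 3.0, 0.3]])

# Center each feature so the components describe variance around the mean
X_centered = X - X.mean(axis=0)

# Rows of Vt are the principal directions, ordered by decreasing singular value
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project onto the first two principal components
n_components = 2
X_projected = X_centered @ Vt[:n_components].T

print(X_projected.shape)
```

The projected data keeps one row per observation but only `n_components` columns.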

### Singular-value decomposition
Singular-value decomposition is a dimensionality reduction method with applications in feature selection, visualization and noise reduction.

### Latent semantic analysis
Latent semantic analysis, also called latent semantic indexing, is a natural language processing method applied to document-term matrices (sparse representations of a text). It distills the representation down to its most relevant essence using matrix factorization methods such as singular-value decomposition.

### Recommender systems
The similarity between sparse customer behavior vectors is computed using distance measures (e.g. Euclidean distance) or dot products. Matrix factorization methods such as singular-value decomposition are used to distill user data to their essence for querying, searching and comparison.
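
As a small illustration, two customers can be compared with a dot product and a Euclidean distance. The purchase vectors are hypothetical, with one position per item.

```python
import numpy as np

# Hypothetical purchase vectors: 1 if the customer bought the item, else 0
customer_a = np.array([1, 0, 0, 1, 0, 1])
customer_b = np.array([1, 0, 0, 0, 0, 1])

# Dot product counts the items both customers bought
shared_items = customer_a @ customer_b

# Euclidean distance between the two behavior vectors
distance = np.linalg.norm(customer_a - customer_b)

print(shared_items, distance)
```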

### Deep learning
Artificial neural networks are nonlinear machine learning algorithms inspired by the way our brain processes information and have proved effective at a range of problems such as machine translation, photo captioning or speech recognition. Their execution leverages linear algebra structures (vectors, matrices and tensors of inputs and coefficients) multiplied and added together.
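
The "multiplied and added together" step can be sketched as a single layer. The weights, bias and inputs are hypothetical, and the ReLU activation is an assumption chosen for simplicity; the text does not name a specific activation.

```python
import numpy as np

# Hypothetical layer: 3 inputs, 2 outputs (weights and bias chosen arbitrarily)
inputs = np.array([1.0, 2.0, 3.0])
weights = np.array([[0.5, 1.0, 0.0],
                    [1.0, 1.0, -2.0]])
bias = np.array([0.5, 0.0])

# Linear part: multiply inputs by the coefficients and add the bias
linear = weights @ inputs + bias

# Nonlinear part: ReLU activation sets negative values to zero
outputs = np.maximum(linear, 0.0)
print(outputs)
```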

## 04 - Introduction to NumPy arrays
### NumPy n-dimensional array
NumPy is the preferred Python tool for linear algebra operations:
- the main structure is the `ndarray`, short for n-dimensional array
- data in an `ndarray` is referred to as an array
- data in an `ndarray` must be of the same type
- the type of an `ndarray` can be retrieved using the `.dtype` attribute of the array
- the shape (length of each dimension) of an `ndarray` can be retrieved using the `.shape` attribute of the array
- the function `array()` is used to create an `ndarray`

In [2]:
import numpy as np
#from collections.abc import Callable
from typing import Callable, Union

# Create arrays of integer, float and mixed types
array_int = np.array([1, 2, 3])
array_float = np.array([1.09, 2.87, 3.654])
array_mixed = np.array([1, 2.5, 3])

# Print arrays
print(f"array_int = {array_int}")
print(f"array_float = {array_float}")
print(f"array_mixed = {array_mixed}")
    
# Get the type of all arrays
print(f"\nType of array_int: {array_int.dtype}")
print(f"Type of array_float: {array_float.dtype}")
print(f"""Type of array_mixed: {array_mixed.dtype}
==> <array_mixed> was passed an array of mixed data types (integers and floats)
and NumPy forced all the `ndarray` to a float dtype""")

# Get the shape of all arrays
print(f"\nShape of array_int: {array_int.shape}")
print(f"Shape of array_float: {array_float.shape}")
print(f"Shape of array_mixed: {array_mixed.shape}")

array_int = [1 2 3]
array_float = [1.09  2.87  3.654]
array_mixed = [1.  2.5 3. ]

Type of array_int: int64
Type of array_float: float64
Type of array_mixed: float64
==> <array_mixed> was passed an array of mixed data types (integers and floats)
and NumPy forced all the `ndarray` to a float dtype

Shape of array_int: (3,)
Shape of array_float: (3,)
Shape of array_mixed: (3,)


### Functions to create arrays
- `empty()` creates an uninitialized array of the specified shape (NDH: the values look random, but they are just whatever was already in memory)
- `zeros()` creates an array of zeros of the specified shape
- `ones()` creates an array of ones of the specified shape

In [3]:
# Create arrays
array_empty = np.empty([3,3])
array_zeros = np.zeros([3,5])
array_ones = np.ones([3,5])

# Print arrays
print(f"array_empty =\n{array_empty}")
print(f"\narray_zeros =\n{array_zeros}")
print(f"\narray_ones =\n{array_ones}")

array_empty =
[[0.00000000e+000 0.00000000e+000 1.39067116e-309]
 [2.14263370e+160 5.92936410e-038 4.71865494e-090]
 [7.11289854e-038 1.38524311e-309 3.95252517e-322]]

array_zeros =
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]]

array_ones =
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]


In [4]:
# From scratch
# Create arrays with values generated by a function
import random
import time
random.seed(time.process_time())

def create_array_with_func(m: int, n: int, function: Callable[..., Union[int, float]], *function_arguments) -> list:
    """Creates a matrix of size m x n with values generated by the function passed

    Args:
        m (int): the number of rows of the created matrix
        n (int): the number of columns of the created matrix
        function (function): the function to apply to generate the number
        *function_arguments: positional arguments passed to `function` on every call

    Returns:
        list: a matrix of size m x n with values generated by the function passed
    """
    array = []
    for i in range(0, m):
        array.append([])
        for j in range(0, n):
            array[i].append(function(*function_arguments))
    return array

In [5]:
# Implement empty() from scratch
np.array(create_array_with_func(3, 3, np.random.uniform, 0, 7))

array([[1.15888136, 3.72483512, 4.68046778],
       [4.91495029, 1.92357603, 0.18129987],
       [5.23504097, 6.88588151, 2.71385321]])

In [6]:
# Implement zeros() from scratch
np.array(create_array_with_func(3, 5, lambda x: 0., None))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [7]:
# Implement ones() from scratch
np.array(create_array_with_func(3, 5, lambda x: 1., None))

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

### Combining arrays
Arrays can be stacked:
- vertically using `vstack()`: given two one-dimensional arrays of the same length, you get a new two-dimensional array with two rows
- horizontally using `hstack()`: given two one-dimensional arrays of possibly different lengths, you get a new one-dimensional array

In [8]:
# Same length
print("Same length:")
# Creating arrays
array_s01 = np.array([1, 2, 3])
array_s02 = np.array([4, 5, 6])

# Stacking
stack_sv = np.vstack([array_s01, array_s02])
stack_sh = np.hstack([array_s01, array_s02])

# Printing results
print(f"Vertical stack\n{stack_sv}")
print(f"\nHorizontal stack\n{stack_sh}")

# Different length
print("\nDifferent length:")
# Creating arrays
array_d01 = np.array([1, 2, 3])
array_d02 = np.array([4, 5, 6, 7])

# Stacking
print(f"Vertical stack")
try:
    stack_dv = np.vstack([array_d01, array_d02])
    print(stack_dv)
except ValueError as e:
    print(f"ValueError: {e}")
stack_dh = np.hstack([array_d01, array_d02])

# Printing results
print(f"\nHorizontal stack\n{stack_dh}")

Same length:
Vertical stack
[[1 2 3]
 [4 5 6]]

Horizontal stack
[1 2 3 4 5 6]

Different length:
Vertical stack
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 3 and the array at index 1 has size 4

Horizontal stack
[1 2 3 4 5 6 7]


In [9]:
# From scratch
def stack_arrays(stack_type: str, arrays) -> list:
    output = []
    it = iter(arrays)
    length = len(next(it))
    
    if stack_type == "h":
        for array in arrays:
            for item in array:
                output.append(item)
    elif stack_type == "v":
        # Check if no array has a different dimension than the first one
        if any(len(l) != length for l in it):
            return("Can't vertically stack arrays of different dimensions.")
        else:
            for array in arrays:
                output.append(array)
            
    return np.array(output)

In [10]:
# Same length
print("Same length:")
# Creating arrays
array_s01 = [1, 2, 3]
array_s02 = [4, 5, 6]

# Stacking
stack_sv = stack_arrays("v", [array_s01, array_s02])
stack_sh = stack_arrays("h", [array_s01, array_s02])

# Printing results
print(f"Vertical stack\n{stack_sv}")
print(f"\nHorizontal stack\n{stack_sh}")

# Different length
print("\nDifferent length:")
# Creating arrays
array_d01 = np.array([1, 2, 3])
array_d02 = np.array([4, 5, 6, 7])

stack_dv = stack_arrays("v", [array_d01, array_d02])
stack_dh = stack_arrays("h", [array_d01, array_d02])

print(f"Vertical stack\n{stack_dv}")
print(f"\nHorizontal stack\n{stack_dh}")

Same length:
Vertical stack
[[1 2 3]
 [4 5 6]]

Horizontal stack
[1 2 3 4 5 6]

Different length:
Vertical stack
Can't vertically stack arrays of different dimensions.

Horizontal stack
[1 2 3 4 5 6 7]


## 05 - Index, slice and reshape NumPy arrays
Machine learning data is represented as arrays - and in Python, almost always as NumPy arrays.

### From list to arrays
#### One-dimensional list to array
- the `array()` function can convert a one-dimensional Python list to a NumPy array

In [11]:
# Create Python list
python_list = [11, 22, 33, 44, 55]

# Create NumPy array
numpy_array = np.array(python_list)

# Print results
print(f"NumPy array:\n{numpy_array}")
print(f"NumPy array type: {type(numpy_array)}")
print(f"NumPy array data type: {numpy_array.dtype}")
print(f"Shape: {numpy_array.shape}")

NumPy array:
[11 22 33 44 55]
NumPy array type: <class 'numpy.ndarray'>
NumPy array data type: int64
Shape: (5,)


#### Two-dimensional list to array
Two-dimensional data is more likely in machine learning: it corresponds to a table where rows represent observations and columns represent features.

You can convert a list of lists where each list is a new observation to a NumPy array using the `array()` function as well.

In [12]:
# Create Python list
python_list = [[11, 22],
               [33, 44],
               [55, 66]]

# Create NumPy array
numpy_array = np.array(python_list)

# Print results
print(f"NumPy array:\n{numpy_array}")
print(f"NumPy array type: {type(numpy_array)}")
print(f"NumPy array data type: {numpy_array.dtype}")
print(f"Shape: {numpy_array.shape}")

NumPy array:
[[11 22]
 [33 44]
 [55 66]]
NumPy array type: <class 'numpy.ndarray'>
NumPy array data type: int64
Shape: (3, 2)


### Array indexing
You can access data in a NumPy array using indexing.

#### One-dimensional indexing
Indexing works like in Python, using:
- the square bracket operators (`[]`)
- a zero-offset index for the value to retrieve
- negative indexes to retrieve values offset from the end of the array

In [13]:
# Create NumPy array
python_list = [11, 22, 33, 44, 55]
numpy_array = np.array(python_list)

# Get data at 1st, 5th and 4th positions
print(f"Value at 1st position (index 0): {numpy_array[0]}")
print(f"Value at 5th position (index 4): {numpy_array[4]}")
print(f"Value at 4th position (index -2): {numpy_array[-2]}")

Value at 1st position (index 0): 11
Value at 5th position (index 4): 55
Value at 4th position (index -2): 44


In [14]:
# From scratch
python_list = [11, 22, 33, 44, 55]

# Get data at 1st, 5th and 4th positions
print(f"Value at 1st position (index 0): {python_list[0]}")
print(f"Value at 5th position (index 4): {python_list[4]}")
print(f"Value at 4th position (index -2): {python_list[-2]}")

Value at 1st position (index 0): 11
Value at 5th position (index 4): 55
Value at 4th position (index -2): 44


#### Two-dimensional indexing
- indexing two-dimensional data is similar to indexing one-dimensional data, except that a comma is used to separate the index for each dimension
- the column index value can be left blank to select all columns for the given row

In [15]:
# Create Python list
python_list = [[11, 22],
               [33, 44],
               [55, 66]]

# Create NumPy array
numpy_array = np.array(python_list)

# Get data in first row, second column and in third row, first column
print(f"Value at 1st row, 2nd column ([0, 1]): {numpy_array[0, 1]}")
print(f"Value at 3rd row, 1st column ([2, 0]): {numpy_array[2, 0]}")
print(f"All values of 2nd row ([1,]): {numpy_array[1,]}")

Value at 1st row, 2nd column ([0, 1]): 22
Value at 3rd row, 1st column ([2, 0]): 55
All values of 2nd row ([1,]): [33 44]


In [16]:
# From scratch
def get_value_from_index(indexes: list, arrays: list):
    if indexes[0] == ",":
        return arrays
    elif len(indexes) == 1:
        return arrays[indexes[0]]
    else:
        return get_value_from_index(indexes[1:], arrays[indexes[0]])
        

In [17]:
# Two dimensions
python_list = [[11, 22],
               [33, 44],
               [55, 66]]
print(f"Value at 1st row, 2nd column ([0, 1]): {get_value_from_index([0, 1], python_list)}")
print(f"Value at 3rd row, 1st column ([2, 0]): {get_value_from_index([2, 0], python_list)}")
print(f"All values of 2nd row ([1,]): {get_value_from_index([1, ','], python_list)}")

# n dimensions
python_list = [[[11, 12], [22, 34]],
               [[33, 13], [44, 35]],
               [[55, 14], [66, 36]]]

print(f"Value at 1st 1d, 2nd 2d, 2nd 3d ([0, 1, 1]): {get_value_from_index([0, 1, 1], python_list)}")
print(f"Value at 3rd 1d ([2,]): {get_value_from_index([2, ','], python_list)}")

Value at 1st row, 2nd column ([0, 1]): 22
Value at 3rd row, 1st column ([2, 0]): 55
All values of 2nd row ([1,]): [33, 44]
Value at 1st 1d, 2nd 2d, 2nd 3d ([0, 1, 1]): 34
Value at 3rd 1d ([2,]): [[55, 14], [66, 36]]


### Array slicing
The slice extends from the *from* index and ends one item before the *to* index: `data[from:to]`.

#### One-dimensional slicing
All data in an array dimension can be selected by specifying the slice with no indexes: `:`.

In [18]:
# Create NumPy array
python_list = [11, 22, 33, 44, 55]
numpy_array = np.array(python_list)

# Get all data using a slice with no indexes
print(f"Selecting all data with `:`:\n{numpy_array[:]}")

Selecting all data with `:`:
[11 22 33 44 55]


In [19]:
# From scratch
python_list = [11, 22, 33, 44, 55]
print(f"Selecting all data with `:`:\n{np.array(python_list[:])}")

Selecting all data with `:`:
[11 22 33 44 55]
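
Beyond selecting everything with `:`, the `data[from:to]` form can pick out part of an array. The sketch below reuses the same values as the cells above; the variable names are only illustrative.

```python
import numpy as np

data = np.array([11, 22, 33, 44, 55])

# From index 1 up to, but not including, index 4
middle = data[1:4]

# Omitting `from` starts at the beginning; omitting `to` runs to the end
first_two = data[:2]
last_two = data[-2:]

print(middle, first_two, last_two)
```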


#### Two-dimensional slicing
It is common to split your data into input variables $X$ and output variable $y$.


This can be done using slicing:
- we can select all rows and all columns except the last one by specifying `:` in the rows index and `:-1` in the columns index
- we can select all rows and the last column by specifying `:` in the rows index and `-1` in the columns index.

In [20]:
# Create Python list
python_list = [[11, 22, 33],
               [44, 55, 66],
               [77, 88, 99]]

# Create NumPy array
numpy_array = np.array(python_list)

# Get inputs
inputs_X = numpy_array[:, :-1]
output_y = numpy_array[:, -1]

# Print results
print(f"Inputs X:\n{inputs_X}")
print(f"Outputs y:\n{output_y}")

Inputs X:
[[11 22]
 [44 55]
 [77 88]]
Outputs y:
[33 66 99]


In [179]:
# From scratch
# Getting slices to slice each array
def get_slice(array, index):
    if index.startswith(":"):
        if len(index)==1:
            index_start = 0
            index_end = len(array)
        else:
            index_start = 0
            index_end = int(index[1:])
    elif index.endswith(":"):
        index_start = int(index[:-1])
        index_end = len(array)
    elif ":" in index:
        index_start, index_end = [int(i) for i in index.split(":")]
    elif "-" in index:
        index_start = int(index)
        index_end = len(array) - int(index)
    else:
        index_start = int(index)
        index_end = int(index) + 1
    return slice(index_start, index_end, 1)

# Getting slices to slice each array
# Tested on 2D and 3D arrays
def slice_array(output, arrays, *indexes):
    if len(indexes) >= 2:
        for array in arrays[get_slice(arrays, indexes[0])]:        
            slice_array(output, array[get_slice(array, indexes[1])], *indexes[2:])
    elif type(arrays[0]) == list:
        if len(arrays) > 1:
            output.append(arrays[0][get_slice(arrays[0], indexes[0])])
            print(f"Added {arrays[0][get_slice(arrays[0], indexes[0])]}")
        else:
            output.append(arrays[0][get_slice(arrays[0], indexes[0])][0])
            print(f"Added {arrays[0][get_slice(arrays[0], indexes[0])][0]}")

    elif type(arrays[0]) == int:
        if len(arrays) > 1:
            output.append(arrays)
            print(f"Added {arrays}")
        else:
            output.append(arrays[0])
            print(f"Added {arrays[0]}")
            
    return output
    

# Testing for 2D arrays
python_list = [[11, 22, 33],
               [44, 55, 66],
               [77, 88, 99]]

inputs_X = np.array(slice_array([], python_list, ":", ":-1", "1"))
output_y = np.array(slice_array([], python_list, ":", "-1", "1"))

# Print results
print(f"Inputs X:\n{inputs_X}")
print(f"Outputs y:\n{output_y}")


# Testing for 3D arrays
python_list = [[[11, 12], [22, 34]],
               [[33, 13], [44, 35]],
               [[55, 14], [66, 36]]]

inputs_X = np.array(slice_array([], python_list, ":", ":-1", "1"))
output_y = np.array(slice_array([], python_list, ":", "-1", "1"))

# Print results
print(f"Inputs X:\n{np.array(inputs_X)}")
print(f"Outputs y:\n{np.array(output_y)}")

Added [11, 22]
Added [44, 55]
Added [77, 88]
Added 33
Added 66
Added 99
Inputs X:
[[11 22]
 [44 55]
 [77 88]]
Outputs y:
[33 66 99]
Added 12
Added 13
Added 14
Added 34
Added 35
Added 36
Inputs X:
[12 13 14]
Outputs y:
[34 35 36]


#### Split train and test rows
It is common to split a loaded dataset into separate training and testing sets.

In NumPy, this can be done by slicing rows at a split point while keeping all columns (`:` in the second dimension index):
- the training dataset is every row up to the split point: `train = data[:split, :]`
- the testing dataset is every row from the split point onward: `test = data[split:, :]`

In [182]:
# Create Python list
python_list = [[11, 22, 33],
               [44, 55, 66],
               [77, 88, 99]]

# Create NumPy array
numpy_array = np.array(python_list)

# Split into train and test sets
split = 2
train = numpy_array[:split, :]
test = numpy_array[split:, :]

# Print results
print(f"Train:\n{train}")
print(f"Test:\n{test}")

Train:
[[11 22 33]
 [44 55 66]]
Test:
[[77 88 99]]


In [181]:
# From scratch
# Define split_train_test() function
def split_train_test(table: list, threshold: int):
    train = table[:threshold]
    test = table[threshold:]
    return train, test

# Create Python list
python_list = [[11, 22, 33],
               [44, 55, 66],
               [77, 88, 99]]

# Split into train and test sets
threshold = 2
train, test = split_train_test(python_list, threshold)

# Print results
print(f"Train:\n{np.array(train)}")
print(f"Test:\n{np.array(test)}")

Train:
[[11 22 33]
 [44 55 66]]
Test:
[[77 88 99]]


### Array reshaping
You may need to reshape your data. Some libraries like scikit-learn require that a one-dimensional array of output variables ($y$) be shaped as a two-dimensional array with one column and one outcome per row. Some algorithms like the long short-term memory recurrent neural network in Keras require inputs to be specified as a three-dimensional array representing samples, timesteps and features.

#### Data shape
The `shape` attribute returns a tuple of the length of each dimension of the array.

NDH: Shapes are returned from the outer list to the inner lists.

In [None]:
# Create NumPy arrays
array_01 = np.array([1, 2, 3, 4, 5])
array_02 = np.array([[1, 2, 3, 4, 5],
                     [6, 7, 8, 9, 10]])
array_03 = np.array([[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
                     [[11, 12, 13, 14, 15], [16, 17, 18, 19, 20]],
                     [[21, 22, 23, 24, 25], [26, 27, 28, 29, 30]]])

# Print shapes
print(f"Shape of array_01:\n{array_01.shape}")
print(f"Shape of array_02:\n{array_02.shape}")
print(f"corresponding to:\nRows:{array_02.shape[0]}\nColumns:{array_02.shape[1]}")
print(f"Shape of array_03:\n{array_03.shape}")

#### Reshape 1D to 2D array
It is common to need to reshape a one-dimensional array into a two-dimensional array with one column and multiple rows:
- the `reshape()` function takes a single argument that specifies the new shape of the array
- this single argument is a tuple with the length of the array as the first dimension and 1 as the second dimension

In [None]:
# Create NumPy arrays
array_01 = np.array([1, 2, 3, 4, 5])

# Print original shape
print(f"Original shape: {array_01.shape}")

# Reshape NumPy array
array_01 = array_01.reshape((array_01.shape[0], 1))

# Print new shape
print(f"New shape: {array_01.shape}")
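
As an alternative sketch, NumPy can infer one dimension when it is passed as `-1`, which avoids reading the length from `shape` by hand. The variable names here are illustrative.

```python
import numpy as np

array_1d = np.array([1, 2, 3, 4, 5])

# -1 asks NumPy to infer the number of rows from the array's size
column = array_1d.reshape(-1, 1)

print(column.shape)
```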

#### Reshape 2D to 3D array
It is common to need to reshape two-dimensional data where each row represents a sequence into a three-dimensional array for algorithms that expect multiple samples of one or more timesteps and one or more features.

In [None]:
# Create NumPy array
array_01 = np.array([[1, 2, 3, 4, 5],
                     [6, 7, 8, 9, 10]])

# Print original shape
print(f"Original shape: {array_01.shape}")

# Reshape NumPy array
array_01 = array_01.reshape((array_01.shape[0], array_01.shape[1], 1))

# Print new shape
print(f"New shape: {array_01.shape}")

## 06 - NumPy array broadcasting
Arrays with different sizes cannot be added, subtracted or generally used in arithmetic. NumPy overcomes this with array **broadcasting**: duplicating the smaller array so that it has the dimensionality and size of the larger array.

### Limitation with array arithmetic
You can perform array arithmetic such as addition and subtraction on NumPy arrays:
- two arrays can be added together
- values at each index are added together
- arithmetic can only be performed on arrays that have the same dimensions and dimensions with the same size

In [186]:
# Create arrays
array_01 = np.array([1, 2, 3])
array_02 = np.array([1, 2, 3])

# Print sum
print(array_01 + array_02)

[2 4 6]


In [195]:
# From scratch
# Create arrays
list_01 = [1, 2, 3]
list_02 = [1, 2, 3]

# Print sum
output = [list_01[i] + list_02[i] for i in range(len(list_01))]
print(np.array(output))

[2 4 6]


### Array broadcasting

Broadcasting allows array arithmetic between arrays with a different shape or size. The technique was developed for NumPy but has since been adopted by other libraries such as Theano, TensorFlow and Octave.
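
To see the duplication idea explicitly, `np.broadcast_to` can stretch a small array to a larger shape. This is only a sketch with hypothetical values; in normal arithmetic NumPy applies the same rule implicitly.

```python
import numpy as np

row = np.array([1, 2, 3])

# View `row` as if it had been repeated to fill a 2 x 3 array
stretched = np.broadcast_to(row, (2, 3))
print(stretched)

# Adding the stretched view matches what broadcasting does implicitly
matrix = np.array([[10, 20, 30],
                   [40, 50, 60]])
print(np.array_equal(matrix + row, matrix + stretched))
```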

### Broadcasting in NumPy
#### Scalar and one-dimensional array
If we have a one-dimensional array and a scalar $b$, then $b$ is broadcast across the one-dimensional array by duplicating it as many times as needed to match the array's length:

In [191]:
# Create array and scalar
array_01 = np.array([1, 2, 3])
scalar_01 = 3

# Print broadcast result
print(array_01 + scalar_01)

[4 5 6]


In [192]:
# From scratch
# Create array and scalar
list_01 = [1, 2, 3]
scalar_01 = 3

# Print broadcast result
output = []
for item in list_01:
    output.append(item + scalar_01)

print(np.array(output))

[4 5 6]


#### Scalar and two-dimensional array

If we have a two-dimensional array and a scalar $b$, then $b$ is broadcast across all dimensions of the two-dimensional array by duplicating it as many times as needed to match the array's shape:

In [193]:
# Create array and scalar
array_01 = np.array([[1, 2, 3],
                     [1, 2, 3]])
scalar_01 = 3

# Print broadcast result
print(array_01 + scalar_01)

[[4 5 6]
 [4 5 6]]


In [194]:
# From scratch
# Create array and scalar
list_01 = [[1, 2, 3],
           [1, 2, 3]]

scalar_01 = 3

# Print broadcast result
output = []
for array in list_01:
    row = []
    for item in array:
        row.append(item + scalar_01)
    output.append(row)

print(np.array(output))

[[4 5 6]
 [4 5 6]]


#### One-dimensional and two-dimensional arrays
If we have a one-dimensional array and a two-dimensional array, then the one-dimensional array is broadcast across each row of the two-dimensional array by creating a second copy, resulting in a new two-dimensional array:

In [196]:
# Create arrays
array_01 = np.array([[1, 2, 3],
                     [1, 2, 3]])
array_02 = np.array([2, 4, 6])

# Print broadcast result
print(array_01 + array_02)

[[3 6 9]
 [3 6 9]]


In [199]:
# From scratch
# Create two lists
list_01 = [[1, 2, 3],
           [1, 2, 3]]
list_02 = [2, 4, 6]

# Print broadcast result
output = []
for array in list_01:
    row = []
    for i in range(len(array)):
        row.append(array[i] + list_02[i])
    output.append(row)

print(np.array(output))

[[3 6 9]
 [3 6 9]]


### Limitations of broadcasting
Broadcasting doesn't work in all cases; it imposes a strict rule that must be satisfied:
- the sizes of each dimension in the two arrays must be equal, or one of them must be 1
- the dimensions are compared in reverse order, starting with the trailing dimension (e.g. looking at columns before rows in a two-dimensional case)
- NumPy will in effect pad missing dimensions with a size of 1 when comparing arrays (in the example below, the shape of `array_02` will effectively be interpreted by NumPy as `(1, 3)`)

In [None]:
# Create arrays
array_01 = np.array([[1, 2, 3],
                     [1, 2, 3]])
array_02 = np.array([2, 4, 6])

# Print arrays shapes
print(array_01.shape)
print(array_02.shape)
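
For contrast, here is a sketch of a case where the rule is not met, together with one possible fix. The arrays are hypothetical; reshaping to a column is only one way to make the shapes compatible.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [1, 2, 3]])   # shape (2, 3)
per_row = np.array([10, 20])      # shape (2,)

# Trailing dimensions are 3 and 2: neither equal nor 1, so this fails
try:
    matrix + per_row
    broadcast_failed = False
except ValueError as error:
    broadcast_failed = True
    print(f"ValueError: {error}")

# Reshaping to (2, 1) makes the trailing dimension 1, which can be broadcast
result = matrix + per_row.reshape(2, 1)
print(result)
```

After the reshape, each value of `per_row` is added to every element of the matching row of `matrix`.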