# Matrix Multiplication and Neural Networks

Carlos Afonso, [NYC Data Science Academy](https://nycdatascience.com/)

## Introduction

**Goal**

Learn concepts and skills that are important for **Data Science** / **Machine Learning**:

* *Maths:* Vector multiplication (Dot product), **Matrix multiplication**
* *Coding:* (`Python`) Variables, Lists, Loops, Functions, Libraries (`Numpy`)
* *General:* Data Science, Machine Learning, **Neural Networks**

**Outline**

* Number multiplication
* Vector multiplication
    * Hands-on 1
* **Matrix multiplication**
    * Hands-on 2
* **Neural Networks**
* Data Science, Machine Learning, Neural Networks

## Number Multiplication | `Variables`

Let's start with the "usual" multiplication operation that we're all familiar with: multiplying two numbers (scalars).

Example: $2 \times 3 = 6$

The same example in Python code:

In [40]:
2 * 3

6

The same example with **variables**:

In [41]:
a = 2
b = 3
a * b

6

In [42]:
a_times_b = a * b
a_times_b

6

In [43]:
# Use Python's print function to show a summary
# (Note: a line starting with # is a comment in Python)
print("a =", a)
print("b =", b)
print("a x b =", a_times_b)

a = 2
b = 3
a x b = 6


**Variables:**

* an important general concept in coding/programming
* used to hold (make reference to) values that we need to use in our program
* should have descriptive/meaningful names

## Vector Multiplication (Dot Product)

### Maths - Definitions

**Vector**

A **vector** is a "sequence" of numbers

* Example: $[2, 3]$

How do we multiply a vector by another vector?

* Example: how to multiply the vector $[2, 3]$ by the vector $[4, 5]$?

**Vector multiplication**

There are **different** methods for multiplying vectors and each is represented by a different operator/symbol. We are particularly interested in these two methods:

* **Element-wise** multiplication ($\odot$):

$[\color{red}{2}, \color{blue}{3}] \odot [\color{red}{4}, \color{blue}{5}] = [\color{red}{2 \times 4}, \color{blue}{3 \times 5}] = [\color{red}{8}, \color{blue}{15}]$

* **Dot Product** ($\cdot$):

$[\color{red}{2}, \color{blue}{3}] \cdot [\color{red}{4}, \color{blue}{5}] = (\color{red}{2 \times 4}) + (\color{blue}{3 \times 5}) = \color{red}{8} + \color{blue}{15} = 23$

**Notes**

* The Dot Product is the sum of the result of the element-wise multiplication.
* The Dot Product is the fundamental operation used in Matrix Multiplication.
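
For vectors of any length, the same rule can be written compactly. The symbols $n$ (the number of elements), $v_i$ and $r_i$ (the $i$-th elements of each vector) are introduced here only for illustration:

$\mathbf{v} \cdot \mathbf{r} = \sum_{i=1}^{n} v_i r_i = v_1 r_1 + v_2 r_2 + \dots + v_n r_n$

With $n = 2$, $\mathbf{v} = [2, 3]$ and $\mathbf{r} = [4, 5]$, this gives $(2 \times 4) + (3 \times 5) = 23$, matching the example above.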

### Code - Simple solution | `Lists`

The same Dot Product example in Python code:

In [45]:
# A vector can be represented as a list in Python
v = [2, 3]
r = [4, 5]

# We can access the elements of a list by their index
# - index 0 is the 1st element
# - index 1 is the 2nd element
v_dot_r = (v[0] * r[0]) + (v[1] * r[1])

# Show indexing
print("v[0] =", v[0])
print("v[1] =", v[1])

# Print result
print("v_dot_r =", v_dot_r)

v[0] = 2
v[1] = 3
v_dot_r = 23


### Code - General solution | `Loops`

The previous solution only works for lists/vectors of size 2.

We can use a `for` loop to create a more automated and general solution that works for lists/vectors of any size:

In [48]:
# First, let's test it with the same example as above
v = [2, 3]
r = [4, 5]

# Then, we can come back and test it with this other example
#v = [1, 2, 3]
#r = [4, 5, 6]

# Print v and r to have as reference in the output
print("v =", v)
print("r =", r)
print("-" * 25) # Print a separator

# A list to store the result of the element-wise multiplication
v_times_r = []

# Use a for loop to iterate through all the indexes/elements
# and perform element-wise multiplication, step-by-step
for i in range(len(v)):
    print('i =', i)
    print('v[i] =', v[i])
    print('r[i] =', r[i])
    print('v[i] * r[i] =', v[i] * r[i])
    v_times_r.append(v[i] * r[i])
    print("v_times_r =", v_times_r)
    print("-" * 25)

# The Dot Product is the sum of the result of the element-wise multiplication
v_dot_r = sum(v_times_r)
print("v_dot_r =", v_dot_r)

v = [2, 3]
r = [4, 5]
-------------------------
i = 0
v[i] = 2
r[i] = 4
v[i] * r[i] = 8
v_times_r = [8]
-------------------------
i = 1
v[i] = 3
r[i] = 5
v[i] * r[i] = 15
v_times_r = [8, 15]
-------------------------
v_dot_r = 23


### Code - Functional solution | `Functions`

We can define our own **function**:

In [26]:
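def dot_product(v, r):
    """Return the dot product of two equal-length lists of numbers."""
    # A list to store the result of the element-wise multiplication
    v_times_r = []

    for i in range(len(v)):
        v_times_r.append(v[i] * r[i])

    # The Dot Product is the sum of the result of the element-wise multiplication
    v_dot_r = sum(v_times_r)

    return v_dot_r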

In [49]:
# Example 1
v = [2, 3]
r = [4, 5]
dot_product(v, r)

23

In [50]:
# Example 2
v = [1, 1, 1]
r = [2, 2, 2]
dot_product(v, r)

6

**Functions**:

* an important general concept in coding/programming
* defined once and then used to perform a specific (well-defined) action that needs to be performed multiple times
* should have descriptive/meaningful names
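
As a possible refinement, here is a more compact sketch of the same idea. The name `dot_product_zip` is hypothetical (it is not part of the original notebook). It uses Python's built-in `zip` to pair up the elements, and it adds a length check, because the loop version above silently ignores extra elements of `r` when `v` is shorter and raises an `IndexError` when `v` is longer.

```python
def dot_product_zip(v, r):
    """Dot product of two equal-length sequences of numbers (illustrative helper)."""
    # The dot product is only defined for vectors of the same size
    if len(v) != len(r):
        raise ValueError("v and r must have the same length")

    # zip pairs the elements: (v[0], r[0]), (v[1], r[1]), ...
    return sum(a * b for a, b in zip(v, r))


print(dot_product_zip([2, 3], [4, 5]))        # 23
print(dot_product_zip([1, 1, 1], [2, 2, 2]))  # 6
```

Whether the extra check is worth it depends on the use case; for teaching purposes the explicit loop above shows each step more clearly.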

### Code - Numpy | `Libraries`

In [51]:
# Import numpy library
import numpy as np

In `numpy`, vectors are represented as `array` objects:

In [52]:
np.array([2, 3])

array([2, 3])

**Element-wise** multiplication of vectors (arrays) in `numpy`:

In [53]:
np.array([2, 3]) * np.array([4, 5])

array([ 8, 15])

**Dot product** of vectors (arrays) with `numpy`'s `dot` function:

In [54]:
np.dot(np.array([2, 3]), np.array([4, 5]))

23

`numpy` documentation:

* homepage: https://numpy.org/doc/stable/
* `dot` function: https://numpy.org/doc/stable/reference/generated/numpy.dot.html
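
To connect the two operations, the short sketch below checks that summing the element-wise product gives the same value as `np.dot`. The variable names are chosen here for illustration only.

```python
import numpy as np

v = np.array([2, 3])
r = np.array([4, 5])

# Element-wise multiplication, then sum of the products
elementwise = v * r
sum_of_products = elementwise.sum()

# Dot product computed directly
dot = np.dot(v, r)

print(elementwise)      # [ 8 15]
print(sum_of_products)  # 23
print(dot)              # 23
```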

## Hands-On 1: Dot Product

Compute the Dot Product of the vectors $[0, 1, 2]$ and $[3, 4, 5]$:

1. Manually, with "pen and paper" (now)
2. With Python code (home)

**Solution for Part 1**

$[\color{red}{0}, \color{green}{1}, \color{blue}{2}] \cdot [\color{red}{3}, \color{green}{4}, \color{blue}{5}] = (\color{red}{0 \times 3}) + (\color{green}{1 \times 4}) + (\color{blue}{2 \times 5}) = \color{red}{0} + \color{green}{4} + \color{blue}{10} = 14$

**Solution for Part 2**

In [33]:
# Use this code cell to write your solution for part 2


## Matrix Multiplication

### Maths - Definitions

**Matrix**

Above, we saw that a vector is a sequence of numbers (i.e., a 1-dimensional arrangement of numbers)

Now, a **matrix** is a sequence of vectors (i.e., a 2-dimensional arrangement of numbers)

If it helps, you can think of a matrix as a table of numbers with rows and columns (sort of like a table in Excel)

Here's an example of a 2 x 3 matrix (i.e., a matrix with 2 rows by 3 columns)

$\begin{bmatrix} 1 & 2 & 3 \\ 10 & 20 & 30 \end{bmatrix}$

**Matrix Multiplication**

Consider the matrices $\mathbf{X}$ and $\mathbf{Y}$:

$\mathbf{X} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$

$\mathbf{Y} = \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix}$

Both matrices, $\mathbf{X}$ and $\mathbf{Y}$, are 2 x 2 matrices (i.e., matrices with 2 rows by 2 columns).

The matrix multiplication $\mathbf{X} \mathbf{Y}$ results in another 2 x 2 matrix $\mathbf{Z}$:

$\mathbf{Z} = \mathbf{X} \mathbf{Y}$

$\mathbf{Z} = \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{bmatrix}$

$\begin{bmatrix}
    z_{11} & z_{12} \\ 
    z_{21} & z_{22} 
\end{bmatrix} = \begin{bmatrix} 
                    1 & 2 \\ 
                    3 & 4 
                \end{bmatrix} 
                \begin{bmatrix} 
                    10 & 20 \\ 
                    30 & 40 
                \end{bmatrix}$

**Matrix Multiplication process** to calculate each of the elements of $\mathbf{Z}$:

* Each element $z_{ij}$ is the dot product of row $i$ (a row vector) of the first/left matrix and column $j$ (a column vector) of the second/right matrix

***Step 1*** ($\color{orange}{z_{11}}$):

$\begin{bmatrix}
    \color{orange}{z_{11}} & z_{12} \\ 
    z_{21} & z_{22} 
\end{bmatrix} = \begin{bmatrix} 
                    \color{red}{1} & \color{red}{2} \\ 
                    3 & 4 
                \end{bmatrix} 
                \begin{bmatrix} 
                    \color{blue}{10} & 20 \\ 
                    \color{blue}{30} & 40 
                \end{bmatrix}$

$\color{orange}{z_{11}} 
    = \begin{bmatrix} \color{red}{1} & \color{red}{2} \end{bmatrix} 
      \begin{bmatrix} \color{blue}{10} \\ \color{blue}{30} \end{bmatrix} 
    = (\color{red}{1} \times \color{blue}{10}) + (\color{red}{2} \times \color{blue}{30}) 
    = \color{orange}{70}$

***Step 2*** ($\color{orange}{z_{12}}$):

$\begin{bmatrix}
    70 & \color{orange}{z_{12}} \\ 
    z_{21} & z_{22} 
\end{bmatrix} = \begin{bmatrix} 
                    \color{red}{1} & \color{red}{2} \\ 
                    3 & 4 
                \end{bmatrix} 
                \begin{bmatrix} 
                    10 & \color{blue}{20} \\ 
                    30 & \color{blue}{40} 
                \end{bmatrix}$

$\color{orange}{z_{12}} 
    = \begin{bmatrix} \color{red}{1} & \color{red}{2} \end{bmatrix} 
      \begin{bmatrix} \color{blue}{20} \\ \color{blue}{40} \end{bmatrix} 
    = (\color{red}{1} \times \color{blue}{20}) + (\color{red}{2} \times \color{blue}{40}) 
    = \color{orange}{100}$

***Step 3*** ($\color{orange}{z_{21}}$):

$\begin{bmatrix}
    70 & 100 \\ 
    \color{orange}{z_{21}} & z_{22} 
\end{bmatrix} = \begin{bmatrix}
                    1 & 2 \\
                    \color{red}{3} & \color{red}{4} 
                \end{bmatrix} 
                \begin{bmatrix} 
                    \color{blue}{10} & 20 \\ 
                    \color{blue}{30} & 40 
                \end{bmatrix}$

$\color{orange}{z_{21}} 
    = \begin{bmatrix} \color{red}{3} & \color{red}{4} \end{bmatrix} 
      \begin{bmatrix} \color{blue}{10} \\ \color{blue}{30} \end{bmatrix} 
    = (\color{red}{3} \times \color{blue}{10}) + (\color{red}{4} \times \color{blue}{30}) 
    = \color{orange}{150}$

***Step 4*** ($\color{orange}{z_{22}}$):

$\begin{bmatrix}
    70 & 100 \\ 
    150 & \color{orange}{z_{22}}  
\end{bmatrix} = \begin{bmatrix}
                    1 & 2 \\
                    \color{red}{3} & \color{red}{4} 
                \end{bmatrix} 
                \begin{bmatrix} 
                    10 & \color{blue}{20} \\ 
                    30 & \color{blue}{40} 
                \end{bmatrix}$

$\color{orange}{z_{22}} 
    = \begin{bmatrix} \color{red}{3} & \color{red}{4} \end{bmatrix} 
      \begin{bmatrix} \color{blue}{20} \\ \color{blue}{40} \end{bmatrix} 
    = (\color{red}{3} \times \color{blue}{20}) + (\color{red}{4} \times \color{blue}{40}) 
    = \color{orange}{220}$


**Result**

$\mathbf{Z} = \mathbf{X} \mathbf{Y} = \begin{bmatrix} 70 & 100 \\ 150 & 220 \end{bmatrix}$

**Summary**

$\mathbf{X} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$

$\mathbf{Y} = \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix}$

$\mathbf{Z} 
    = \mathbf{X} \mathbf{Y}
    = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} 
      \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix}
    = \begin{bmatrix} 
          (1 \times 10) + (2 \times 30) & (1 \times 20) + (2 \times 40) \\ 
          (3 \times 10) + (4 \times 30) & (3 \times 20) + (4 \times 40) 
      \end{bmatrix}
    = \begin{bmatrix} 70 & 100 \\ 150 & 220 \end{bmatrix}$
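
The step-by-step process above can be sketched in plain Python, with each matrix stored as a list of rows. The function name `matmul_lists` is hypothetical and the code is meant as a teaching sketch, not as a replacement for a numerical library.

```python
def matmul_lists(X, Y):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    n_rows = len(X)      # rows of the left matrix
    n_inner = len(Y)     # rows of the right matrix
    n_cols = len(Y[0])   # columns of the right matrix

    # Every row of X needs as many elements as Y has rows
    if any(len(row) != n_inner for row in X):
        raise ValueError("columns of X must match rows of Y")

    Z = []
    for i in range(n_rows):
        z_row = []
        for j in range(n_cols):
            # Column j of Y, collected as a list
            column_j = [Y[k][j] for k in range(n_inner)]
            # z_ij is the dot product of row i of X and column j of Y
            z_row.append(sum(a * b for a, b in zip(X[i], column_j)))
        Z.append(z_row)
    return Z


X = [[1, 2], [3, 4]]
Y = [[10, 20], [30, 40]]
print(matmul_lists(X, Y))  # [[70, 100], [150, 220]]
```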

**Note**

The example above uses "square" (2 x 2) matrices. Here is a more general example, with "rectangular" matrices:

$\mathbf{Z} (\color{red}{2} \times \color{blue}{4}) = \mathbf{X} (\color{red}{2} \times \color{green}{3}) \mathbf{Y} (\color{green}{3} \times \color{blue}{4})$

$\begin{bmatrix}
z_{11} & z_{12} & z_{13} & z_{14} \\ 
z_{21} & z_{22} & z_{23} & z_{24} 
\end{bmatrix} = 
    \begin{bmatrix} 
    x_{11} & x_{12} & x_{13} \\ 
    x_{21} & x_{22} & x_{23}
    \end{bmatrix} 
    \begin{bmatrix} 
    y_{11} & y_{12} & y_{13} & y_{14} \\ 
    y_{21} & y_{22} & y_{23} & y_{24} \\ 
    y_{31} & y_{32} & y_{33} & y_{34} 
    \end{bmatrix}$
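
In this general case, the number of columns of $\mathbf{X}$ must equal the number of rows of $\mathbf{Y}$, and each element is $z_{ij} = \sum_{k} x_{ik} y_{kj}$. The sketch below checks the shapes with `numpy`. The matrix values are arbitrary (generated with `np.arange`) and only the shapes matter here.

```python
import numpy as np

X = np.arange(6).reshape(2, 3)    # a 2 x 3 matrix
Y = np.arange(12).reshape(3, 4)   # a 3 x 4 matrix

Z = np.matmul(X, Y)
print(Z.shape)  # (2, 4)

# Swapping the order gives (3 x 4)(2 x 3): the inner sizes 4 and 2 differ
mismatch_raised = False
try:
    np.matmul(Y, X)
except ValueError as error:
    mismatch_raised = True
    print("Cannot multiply:", error)
```

This also shows that matrix multiplication is not commutative: $\mathbf{X} \mathbf{Y}$ can be defined even when $\mathbf{Y} \mathbf{X}$ is not.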

### Code - `Numpy`

In `numpy`, we can represent a matrix as a (2D) array object, and perform matrix multiplication with the `matmul` function.

In [34]:
X = np.array([[1, 2],
              [3, 4]])

Y = np.array([[10, 20],
              [30, 40]])

np.matmul(X, Y)

array([[ 70, 100],
       [150, 220]])

`numpy.matmul` documentation: https://numpy.org/doc/stable/reference/generated/numpy.matmul.html
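
As a side note, Python's `@` operator performs the same matrix multiplication on `numpy` arrays, while `*` performs element-wise multiplication. The short sketch below contrasts the two, using `.tolist()` only to keep the printed output compact.

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])

Y = np.array([[10, 20],
              [30, 40]])

# Matrix multiplication: same result as np.matmul(X, Y)
print((X @ Y).tolist())  # [[70, 100], [150, 220]]

# Element-wise multiplication: a different operation
print((X * Y).tolist())  # [[10, 40], [90, 160]]
```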

## Hands-On 2: Matrix Multiplication

Given the matrices $\mathbf{X} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $\mathbf{Y} = \begin{bmatrix} 5 & 6 \\ 7 & 0 \end{bmatrix}$, compute the matrix multiplication $\mathbf{Z} = \mathbf{X} \mathbf{Y}$:

1. Manually, with "pen and paper" (now)
2. With Python code (home)

**Solution for Part 1**

$\mathbf{Z} 
    = \mathbf{X} \mathbf{Y}
    = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} 
      \begin{bmatrix} 5 & 6 \\ 7 & 0 \end{bmatrix}
    = \begin{bmatrix} 
          (1 \times 5) + (2 \times 7) & (1 \times 6) + (2 \times 0) \\ 
          (3 \times 5) + (4 \times 7) & (3 \times 6) + (4 \times 0) 
      \end{bmatrix}
    = \begin{bmatrix} 19 & 6 \\ 43 & 18 \end{bmatrix}$

**Solution for Part 2**

In [35]:
# Use this code cell to write your solution for part 2


## Neural Networks

**Neural Network:**

* A network of connected neurons / nodes
* Maps inputs to outputs: `Input => Neural Network (model) => Output`
* **Value:** Use the trained model to predict outputs for new inputs (i.e., inputs that were not used in training)

Examples of Neural Network **applications**:

* Detect objects in an image (digits, people, animals, etc.)
* Classify emails as spam or not spam
* Detect signs of disease in medical images (e.g., X-rays)
* Time series predictions (sales, stocks, weather)

Each node implements a 2-step process to map inputs to an output:

* **Step 1:** Compute the weighted sum of the inputs to the node
    * using **matrix multiplication (dot product)** as represented in the figure below

![Image explaining Matrix Multiplications in Neural Networks](neural-net.svg)

* **Step 2:** Apply an **activation function** to the result of step 1 to determine the node's output
    * The expressions in the figure above, updated to include the activation functions $f$ and $g$:
        * $h_1 = f(w_{11} x_1 + w_{12} x_2)$
        * $y = g(z_1 h_1 + z_2 h_2)$
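
The two steps can be sketched with `numpy` for a tiny network with 2 inputs, 2 hidden nodes and 1 output. Everything specific here is an assumption made for illustration: the input and weight values are made up, row $i$ of `W` is assumed to hold the weights feeding hidden node $h_i$, and the sigmoid function is used for both $f$ and $g$ (other activation functions are equally possible).

```python
import numpy as np


def sigmoid(t):
    """A common activation function; squashes any number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))


x = np.array([1.0, 2.0])       # inputs x_1, x_2 (made-up values)
W = np.array([[0.5, -1.0],     # weights into h_1 (made-up values)
              [0.25, 0.75]])   # weights into h_2 (made-up values)
z = np.array([1.0, -2.0])      # weights z_1, z_2 into the output (made-up)

# Hidden layer: weighted sums (Step 1), then activation f (Step 2)
hidden_sums = W @ x
h = sigmoid(hidden_sums)

# Output node: weighted sum of h (Step 1), then activation g (Step 2)
y = sigmoid(z @ h)

print("weighted sums:", hidden_sums)
print("h =", h)
print("y =", y)
```

A single matrix multiplication `W @ x` computes the weighted sums for all hidden nodes at once, which is why matrix multiplication is so central to neural networks.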


**Notes**

* **Activation functions** are usually **non-linear functions** that allow the model to capture more complex patterns in the data
* Neural networks use a "divide and conquer" strategy, with each node solving one small component of the larger problem.
    * An individual node may have a meaningful interpretation
* **Backpropagation:** Method used during training to propagate the prediction error backward through the network, which indicates how each weight should be adjusted to reduce that error
* **Deep learning networks**: Neural networks with multiple hidden layers
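
To illustrate why non-linear activation functions matter, the sketch below compares two stacked layers with and without an activation. The weights are made-up values, and ReLU (which replaces negative values with 0) is just one possible choice of activation. Without an activation, the two layers collapse into a single matrix multiplication, so the extra layer adds nothing.

```python
import numpy as np

x = np.array([1, -1])      # made-up input
W1 = np.array([[1, 2],
               [3, -1]])   # made-up weights, layer 1
W2 = np.array([[2, 1]])    # made-up weights, layer 2


def relu(t):
    """ReLU activation: keeps positive values, replaces negatives with 0."""
    return np.maximum(t, 0)


# Two layers without activation ...
two_linear_layers = W2 @ (W1 @ x)
# ... are equivalent to one layer with the combined weights W2 W1
one_linear_layer = (W2 @ W1) @ x

# Two layers with a non-linear activation in between
with_activation = W2 @ relu(W1 @ x)

print(two_linear_layers.tolist())  # [2]
print(one_linear_layer.tolist())   # [2]
print(with_activation.tolist())    # [4]
```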

## Data Science, Machine Learning, Neural Networks

**Goal:** Find **data insights** that are **valuable** and **actionable**

**Process:**

* `Question > Data > Analysis/Modeling > Insight > Action` -- More classic, human-driven (classic statistics)
* `Data > Question > Analysis/Modeling > Insight > Action` -- More modern, data-driven (machine learning, neural networks)


## Q & A

Questions?

## To Learn More

https://nycdatascience.com/

## Solutions

### Solution for "Hands-on 1: Dot Product"

Compute the Dot Product of the vectors $[0, 1, 2]$ and $[3, 4, 5]$:

1. Manually, with "pen and paper" (live)
2. With Python code (home)

**Solution for Part 1**

$[\color{red}{0}, \color{green}{1}, \color{blue}{2}] \cdot [\color{red}{3}, \color{green}{4}, \color{blue}{5}] = (\color{red}{0 \times 3}) + (\color{green}{1 \times 4}) + (\color{blue}{2 \times 5}) = \color{red}{0} + \color{green}{4} + \color{blue}{10} = 14$

**Solution for Part 2**

* Using the `dot_product` function that we defined above:

In [36]:
dot_product([0, 1, 2], [3, 4, 5])

14

* Using `numpy`'s `dot` function:

In [37]:
import numpy as np

np.dot(np.array([0, 1, 2]), np.array([3, 4, 5]))

14

### Solution for "Hands-On 2: Matrix Multiplication"

Given the matrices $\mathbf{X} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $\mathbf{Y} = \begin{bmatrix} 5 & 6 \\ 7 & 0 \end{bmatrix}$, compute the matrix multiplication $\mathbf{Z} = \mathbf{X} \mathbf{Y}$:

1. Manually, with "pen and paper" (live)
2. With Python code (home)

**Solution for Part 1**

$\mathbf{Z} 
    = \mathbf{X} \mathbf{Y}
    = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} 
      \begin{bmatrix} 5 & 6 \\ 7 & 0 \end{bmatrix}
    = \begin{bmatrix} 
          (1 \times 5) + (2 \times 7) & (1 \times 6) + (2 \times 0) \\ 
          (3 \times 5) + (4 \times 7) & (3 \times 6) + (4 \times 0) 
      \end{bmatrix}
    = \begin{bmatrix} 19 & 6 \\ 43 & 18 \end{bmatrix}$

**Solution for Part 2**

Using `numpy`'s `matmul` function:

In [38]:
import numpy as np

X = np.array([[1, 2],
              [3, 4]])

Y = np.array([[5, 6],
              [7, 0]])

np.matmul(X, Y)

array([[19,  6],
       [43, 18]])