# Install the dependencies for this lesson

In [None]:
# Install numpy, matplotlib and scipy
!pip install numpy matplotlib scipy

## 1. Introduction and Recap:

### Recap of Vectors and Matrices:
Remember, vectors are one-dimensional arrays while matrices are two-dimensional. They are foundational in linear algebra, and in our context, crucial for understanding dot products and convolutions.
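
As a quick refresher, here is a minimal sketch of how the two look in numpy. The names `v` and `M` are placeholders chosen for this illustration only.

```python
import numpy as np

# A vector: one axis, three entries
v = np.array([1, 2, 3])

# A matrix: two axes, two rows and three columns
M = np.array([[1, 2, 3],
              [4, 5, 6]])

print(v.ndim, v.shape)  # number of axes and size along each axis
print(M.ndim, M.shape)
```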

### Importance of Dot Products and Convolutions in AI:
Dot products and convolutions form the heart of operations in neural networks, especially convolutional neural networks (CNNs). Understanding them is key to grasping the inner workings of these networks.


---
## 2. Dot Products:

### Definition:
The dot product of two vectors is a scalar value produced by multiplying corresponding entries of two vectors and summing up those products. 

### Calculation:
Given two vectors:
A = [a1, a2, ... , an]
B = [b1, b2, ... , bn]

The dot product is:
a1*b1 + a2*b2 + ... + an*bn

### Let's try it manually with two vectors:
A = [2, 3]
B = [4, 5]

Dot product = 2*4 + 3*5 = 8 + 15 = 23
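
Before reaching for a library, the definition can be written out in a few lines of plain Python. This is only a sketch to mirror the manual steps; the helper name `dot` is an assumption made for this example, not something the lesson defines.

```python
def dot(a, b):
    """Sum of the products of corresponding entries of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

result = dot([2, 3], [4, 5])
print(result)  # 23
```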

In [None]:
# First, import numpy
import numpy as np

# Let's define our vectors
A = np.array([2, 3])
B = np.array([4, 5])

# Compute the dot product using numpy
dot_product = np.dot(A, B)
dot_product

---
### Interpretation:

As we can see, the dot product we computed manually matches the one computed with numpy: both give 23. The result is a single scalar that measures how much one vector "goes in the direction" of the other, scaled by the lengths of both vectors. It is positive when the vectors point roughly the same way, zero when they are perpendicular, and negative when they point in roughly opposite directions. This idea comes up again when we look at neural networks and optimization techniques.
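
To get a feel for the "goes in the direction" idea, the sketch below compares one fixed vector against three others. The vectors and variable names are made up for this illustration.

```python
import numpy as np

u = np.array([1, 0])

same_direction = np.dot(u, np.array([2, 0]))       # points the same way as u
opposite_direction = np.dot(u, np.array([-2, 0]))  # points the opposite way
perpendicular = np.dot(u, np.array([0, 2]))        # at a right angle to u

print(same_direction, opposite_direction, perpendicular)  # 2 -2 0
```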

---
### Hands-on Exercise:

1. Compute the dot product of vectors [1, 2, 3] and [4, 5, 6] manually.
2. Validate your answer using numpy.
3. What is the geometric interpretation of the dot product when the angle between the vectors is 90 degrees? Why?


---
## Geometric Interpretation of the Dot Product:

The dot product also has a neat geometric interpretation, which gives a second, equivalent way to compute it.

For two vectors \( A \) and \( B \), the dot product can be written as:
$$ A \cdot B = |A| \times |B| \times \cos(\theta) $$

Where:
- |A| is the magnitude (or length) of vector `A`
- |B| is the magnitude of vector `B`
- θ is the angle between the vectors
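
The formula can be rearranged to recover the angle from the dot product. The sketch below does this for the vectors [2, 3] and [4, 5] used earlier, assuming `np.linalg.norm` for the magnitudes, and then checks that the geometric form agrees with the entry-by-entry result.

```python
import numpy as np

A = np.array([2, 3])
B = np.array([4, 5])

# Rearranging the formula: cos(theta) = (A . B) / (|A| |B|)
cos_theta = np.dot(A, B) / (np.linalg.norm(A) * np.linalg.norm(B))
theta_degrees = np.degrees(np.arccos(cos_theta))
print(theta_degrees)

# Plugging the angle back in recovers the entry-by-entry dot product
reconstructed = np.linalg.norm(A) * np.linalg.norm(B) * np.cos(np.radians(theta_degrees))
print(np.isclose(reconstructed, np.dot(A, B)))  # True
```

The angle is small here because the two vectors point in nearly the same direction.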

### Case: When the angle θ is 90 degrees:

If two vectors are perpendicular to each other, the angle between them is 90 degrees (π/2 radians), and the cosine of 90 degrees is 0.

Thus, the dot product becomes:
$$ A \cdot B = |A| \times |B| \times 0 = 0 $$

This means that when two vectors are perpendicular, their dot product is 0. Such vectors are called **orthogonal**, and this property is used throughout linear algebra.

### Let's explore this with an example:


In [None]:
# Define two vectors that are orthogonal in 2D space
A = np.array([1, 0])
B = np.array([0, 1])

# Compute the dot product using numpy
dot_product_orthogonal = np.dot(A, B)
dot_product_orthogonal

As we can see, the dot product of two orthogonal vectors is indeed 0.
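
With floating-point vectors, a dot product that is zero on paper may come out as a tiny rounding error instead. A hedged sketch of a more robust check, using `np.isclose` and two illustrative vectors `u` and `w` that are not part of the lesson:

```python
import numpy as np

theta = np.radians(30)
u = np.array([np.cos(theta), np.sin(theta)])    # unit vector at 30 degrees
w = np.array([-np.sin(theta), np.cos(theta)])   # u rotated by 90 degrees

dot_uw = np.dot(u, w)
print(dot_uw)                   # zero, or a tiny rounding error
print(np.isclose(dot_uw, 0.0))  # True
```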

### Visualization:

It's always helpful to visualize these concepts. Let's plot the two vectors to see their orthogonality.

In [None]:
import matplotlib.pyplot as plt

# Plot the two vectors
fig, ax = plt.subplots(figsize=(5,5))
ax.quiver(0, 0, A[0], A[1], angles='xy', scale_units='xy', scale=1, color='r', label="Vector A")
ax.quiver(0, 0, B[0], B[1], angles='xy', scale_units='xy', scale=1, color='b', label="Vector B")
ax.set_xlim(-2, 2)
ax.set_ylim(-2, 2)
ax.axvline(x=0, color='grey', lw=1)
ax.axhline(y=0, color='grey', lw=1)
ax.legend()
ax.set_title("Orthogonal Vectors A and B")
plt.show()


---
The red and blue arrows represent our vectors `A` and `B`, respectively. As observed, they are perpendicular to each other.

### Summary:

When the angle between two vectors is 90 degrees (making them perpendicular or orthogonal), their dot product is 0. This property has important implications in various areas of mathematics and computer science, especially in the context of neural networks and other machine learning algorithms where orthogonality often plays a key role.


---
## Hands-on Exercise Solutions:

### 1. Compute the dot product of vectors [1, 2, 3] and [4, 5, 6] manually:

Given two vectors:
$$ C = [c_1, c_2, c_3] $$
$$ D = [d_1, d_2, d_3] $$

The dot product is:

$$ C \cdot D = c_1 \times d_1 + c_2 \times d_2 + c_3 \times d_3 $$

Using the given vectors:
$$ C = [1, 2, 3] $$
$$ D = [4, 5, 6] $$

The dot product becomes:
$$ C \cdot D = 1 \times 4 + 2 \times 5 + 3 \times 6 = 4 + 10 + 18 = 32 $$

---

### 2. Validate your answer using numpy:


In [None]:
# Define the vectors
C = np.array([1, 2, 3])
D = np.array([4, 5, 6])

# Compute the dot product using numpy
dot_product_CD = np.dot(C, D)
dot_product_CD


---
The dot products computed manually and with numpy match! The result is indeed 32.

### 3. What is the geometric interpretation of the dot product when the angle between the vectors is 90 degrees? Why?

As previously discussed, when the angle between two vectors is 90 degrees, the vectors are considered orthogonal or perpendicular. Geometrically, this means they do not share any component in the direction of each other. In the context of the dot product, this is captured by the cosine of the angle between them. 

For an angle of 90 degrees (or \( \pi/2 \) radians), the cosine value is 0. Hence, the dot product formula:
$$ A \cdot B = |A| \times |B| \times \cos(\theta) $$

Becomes:
$$ A \cdot B = |A| \times |B| \times 0 = 0 $$

Thus, when two vectors are orthogonal, their dot product is 0. This property reinforces the idea that the vectors don't have any shared directionality.
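
The "shared directionality" idea can be made concrete by dividing the dot product by the length of one vector, which gives the length of the other vector's component along it. This is a sketch only; `scalar_projection` and `east` are hypothetical names introduced for this example.

```python
import numpy as np

def scalar_projection(a, b):
    """Length of the component of a that lies along the direction of b."""
    return np.dot(a, b) / np.linalg.norm(b)

east = np.array([1, 0])

print(scalar_projection(np.array([3, 4]), east))  # 3.0: three units lie along east
print(scalar_projection(np.array([0, 4]), east))  # 0.0: nothing lies along east
```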


---
## 3. Convolutions:

### Introduction to Convolutions:
Convolutions are foundational operations in image processing and particularly important in Convolutional Neural Networks (CNNs). They involve the process of sliding a smaller matrix (often called a kernel or filter) over a larger matrix to produce a new matrix (output). This process can be likened to sliding a flashlight over a large image to "illuminate" or "focus" on small portions at a time. As the flashlight moves over the image, it creates a new image representing what it "sees" or processes.

### Mathematics of Convolution:
At each position of the kernel over the input matrix, a local operation is performed. This local operation involves summing up the products of overlapping elements between the kernel and the portion of the larger matrix it covers.

Visualize this process:
Imagine you have a large matrix (the image) and a smaller matrix (the kernel). At each step, you align the kernel with a portion of the image, multiply the overlapping elements together, and then sum these products to produce a single number in the output matrix. This process is repeated for each position the kernel can occupy over the image.

### 1D Convolutions:
While we often use 2D convolutions for images (since images are 2D arrays of pixels), understanding 1D convolutions can simplify the concept.

Consider two 1D arrays:
- A larger array representing our "image"
- A smaller array representing our "kernel"

To compute the convolution, we will "slide" the kernel over the image, compute the product of overlapping elements, and sum these products.

For example:
Imagine a 1D "image" array [2, 1, 2, 1] and a "kernel" array [1, 0]. The convolution operation will involve sliding the kernel over the image and computing the dot product at each position.

Let's manually compute this for the first position:
$$ (2 \times 1) + (1 \times 0) = 2 $$

The result for this position in the output array is 2.
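
The remaining positions follow the same pattern. Here is a minimal plain-Python sketch of the full sliding process for this example; the variable names are chosen for illustration.

```python
image = [2, 1, 2, 1]
kernel = [1, 0]

output = []
# The kernel fits at len(image) - len(kernel) + 1 = 3 positions
for start in range(len(image) - len(kernel) + 1):
    window = image[start:start + len(kernel)]
    # Dot product of the kernel with the part of the image it covers
    output.append(sum(w * k for w, k in zip(window, kernel)))

print(output)  # [2, 1, 2]
```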

---


In [None]:
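# Define our 1D image and kernel
image_1D = np.array([2, 1, 2, 1])
kernel_1D = np.array([1, 0])

# Slide the kernel over the image and take the dot product at each position.
# np.correlate uses the kernel as given; np.convolve would reverse it first
# and return [1, 2, 1], which would not match the manual calculation above.
convolution_1D = np.correlate(image_1D, kernel_1D, mode='valid')
convolution_1D


---
The output array [2, 1, 2] holds one value per kernel position. The first entry is the 2 we computed by hand, and the kernel [1, 0] simply copies the left element of each pair it covers.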
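
One detail to keep in mind when checking sliding dot products in numpy: `np.correlate` slides the kernel as given, while `np.convolve` implements the mathematical definition of convolution, which reverses the kernel before sliding. Deep learning libraries typically use the no-flip version and still call it convolution. The sketch below compares the two on the example arrays.

```python
import numpy as np

image = np.array([2, 1, 2, 1])
kernel = np.array([1, 0])

slide_as_is = np.correlate(image, kernel, mode='valid')     # kernel used as given
flip_then_slide = np.convolve(image, kernel, mode='valid')  # kernel reversed first

print(slide_as_is)      # [2 1 2]
print(flip_then_slide)  # [1 2 1]

# Reversing the kernel by hand makes np.convolve agree with np.correlate
print(np.convolve(image, kernel[::-1], mode='valid'))  # [2 1 2]
```

For a symmetric kernel such as [1, 2, 1] the two functions return the same result, so the difference is easy to miss.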

### Hands-on Exercise for 1D Convolution:

1. Given the 1D "image" array [3, 4, 2, 1] and "kernel" array [0, 1], compute the convolution manually for each position.
2. Validate your result using numpy (for example with `np.correlate` and `mode='valid'`).
3. Interpret the resulting convolution array. How does the choice of kernel values influence the output?

---
## 2D Convolutions:

2D convolutions are a direct extension of the 1D convolution process, but instead of working with 1D arrays, we work with 2D matrices. This is particularly important for image processing, as images are naturally 2D arrays of pixel values.

### Introduction to 2D Convolutions:

Imagine an actual image as a large grid of numbers, where each number corresponds to the pixel intensity at that point. Similarly, our kernel will be a smaller grid of numbers. The convolution process involves sliding this kernel over our image grid and at each position, summing up the products of overlapping elements.

### Visual Representation:

Let's use a simple example to visually represent this:

Image Matrix:

\begin{bmatrix}
1 & 2 & 3 \\
0 & 1 & 0 \\
2 & 1 & 0 \\
\end{bmatrix}


Kernel:

\begin{bmatrix}
0 & 1 \\
2 & 1 \\
\end{bmatrix}


In the first position, we would compute the sum as follows:
$$ (1 \times 0) + (2 \times 1) + (0 \times 2) + (1 \times 1) = 3 $$

This value, 3, would be the top-left value in our resulting convolution matrix.
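
The other positions can be filled in the same way. The following sketch loops over every position the kernel can occupy and repeats the multiply-and-sum step; the variable names are illustrative and the loop is written for clarity rather than speed.

```python
import numpy as np

image = np.array([[1, 2, 3],
                  [0, 1, 0],
                  [2, 1, 0]])
kernel = np.array([[0, 1],
                   [2, 1]])

k_rows, k_cols = kernel.shape
out_rows = image.shape[0] - k_rows + 1
out_cols = image.shape[1] - k_cols + 1
output = np.zeros((out_rows, out_cols), dtype=image.dtype)

for i in range(out_rows):
    for j in range(out_cols):
        window = image[i:i + k_rows, j:j + k_cols]
        # Multiply overlapping entries and add them up
        output[i, j] = np.sum(window * kernel)

print(output)
# [[3 5]
#  [6 2]]
```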

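### Let's compute this 2D convolution using scipy:


In [None]:
from scipy import signal

# Define our 2D image and kernel
image_2D = np.array([[1, 2, 3], [0, 1, 0], [2, 1, 0]])
kernel_2D = np.array([[0, 1], [2, 1]])

# Slide the kernel over the image with scipy. correlate2d uses the kernel as
# given, matching the manual calculation; convolve2d would flip the kernel
# in both directions first and give a different result (5 in the top-left).
convolution_2D = signal.correlate2d(image_2D, kernel_2D, mode='valid')
convolution_2D


---
The resulting 2x2 matrix, [[3, 5], [6, 2]], holds one value for each position the kernel can occupy over the image. Its top-left entry is the 3 we computed by hand.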

### Interpretation:
This matrix provides a "filtered" representation of the original image. The kernel effectively acts as a feature detector, emphasizing certain features in the input image and de-emphasizing others. The specific nature of the emphasis depends on the values in the kernel.
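
As a small illustration of the feature-detector idea, the sketch below applies a "right neighbour minus left neighbour" kernel to a tiny image with one vertical edge. Both the image and the kernel are made up for this example, and `signal.correlate2d` is used so that the kernel is applied as written.

```python
import numpy as np
from scipy import signal

# A tiny image: dark (0) on the left, bright (1) on the right
step_image = np.array([[0, 0, 1, 1],
                       [0, 0, 1, 1],
                       [0, 0, 1, 1]])

# Hypothetical kernel: right neighbour minus left neighbour
difference_kernel = np.array([[-1, 1]])

response = signal.correlate2d(step_image, difference_kernel, mode='valid')
print(response)
# [[0 1 0]
#  [0 1 0]
#  [0 1 0]]
```

The output is nonzero only in the column where the brightness changes, so this kernel emphasizes vertical edges and ignores flat regions.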

### Hands-on Exercise for 2D Convolution:

1. Take the following 2D "image" matrix:

\begin{bmatrix}
4 & 3 & 2 & 1 \\
2 & 1 & 0 & 3 \\
3 & 2 & 1 & 4 \\
0 & 0 & 1 & 2 \\
\end{bmatrix}

and the kernel:

\begin{bmatrix}
1 & 0 \\
0 & -1 \\
\end{bmatrix}


2. Compute the convolution manually for at least one position.
3. Validate the entire convolution result using scipy (for example with `signal.correlate2d` and `mode='valid'`).
4. Discuss: What do you observe in the resulting convolution matrix? Can you infer the purpose of this kernel?

---

