![torch_logo.png](attachment:torch_logo.png)

# ~ PoC AI Pool 2025 ~
- ## Day 3: Neural Network
    - ### Module 3: Neural Network with torch
-----------

## Introduction


[PyTorch](https://pytorch.org/) is one of the most widely used frameworks for Machine Learning, especially Deep Learning.\
Whether for computer vision or language processing, PyTorch allows you to build state-of-the-art AI models.


Developed by Meta AI, PyTorch is now part of the Linux Foundation and is completely [open-source](https://github.com/pytorch/pytorch).\
When you hear about deep learning, PyTorch is never far away: it is present in Tesla cars and is used by OpenAI, Google, AMD, Nvidia, AWS, Microsoft, Meta, Netflix and many others!

As you can see, PyTorch has distinguished itself as the AI framework par excellence.

---


### Why use PyTorch

But do you know what PyTorch is for?

Yesterday, you dove into the theory of machine learning and neural networks.

Now imagine that you have millions of parameters, a complex architecture and that you have to create a neural network from scratch every time.

Well, **PyTorch allows you to build neural networks very easily**: forget about the mathematics behind it and build complex neural networks with a few functions and parameters.

Excuse me, **don't entirely forget the mathematics**, it's important sometimes ;)

In [1]:
import torch 
import math
import numpy as np

## Part 1: The Tensor
### What is a Tensor?

In PyTorch, data is stored in the form of tensors.

Concretely, a **Tensor is an object** that is similar to an array or a matrix.\
Actually, tensors are **similar to arrays in Numpy** with a **few differences**.

The main strength of tensors is that **they can run on GPUs** or other hardware acceleration devices.\
You may already know this, but AI models are often accelerated using graphics cards.

In addition, **Tensors are optimized to calculate gradients** in the gradient descent algorithm.\
If you remember, gradient descent is the algorithm that adjusts the weights of our neural network and thus makes it learn new things.

In short, Tensors are used to encode the input and output data of our neural networks as well as the weights of our networks.\
They have the advantage of being able to run on a GPU and to be optimized for gradient descent.
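
To make the gradient part more concrete, here is a minimal sketch (not part of the exercises) of a tensor tracking a gradient. The function $y = x^2$ and the starting value `3.0` are arbitrary choices made for this illustration.

```python
import torch

# A scalar tensor that asks PyTorch to record operations for gradient computation
x = torch.tensor(3.0, requires_grad=True)

# Any computation built from x is tracked
y = x ** 2

# backward() computes dy/dx and stores it in x.grad
y.backward()

print(x.grad)  # dy/dx = 2 * x = 6
```

This is the mechanism gradient descent relies on: once a gradient is available for each weight, the weight can be nudged in the direction that lowers the loss.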

### Step 1 - Build a Tensor from data

In this first exercise you will have to **create a tensor** from the array ``data``.\
Be careful, the tensor must be built from the array and thus contain the same data.
> You might want to take a look at the [documentation](https://pytorch.org/docs/stable/tensors.html)

In [None]:
data = [[1, 2],[3, 4]]

#TODO : Tensorise the data with torch
tensor_data = ...

# Print the info of the tensor
print(tensor_data)
print("-"*20)
print(tensor_data.shape)

assert torch.is_tensor(tensor_data), "Your object is not a tensor"

### Step 2 - Build a Tensor from shape

One of the many interesting features of PyTorch is that you can generate Tensors from shapes.\
This can be useful when you want to initialize neural network weights for example.

Create a Tensor **filled with 0** and of shape ``(2, 3)``.

In [None]:
#TODO : Create a tensor of shape (2, 3) filled with zeros
shape = ...

tensor_shape_zeros = ...

print(tensor_shape_zeros)
print("-"*20)
print(tensor_shape_zeros.shape)

assert tensor_shape_zeros.sum().item() == 0, "Your tensor is not filled with zeros."
assert list(tensor_shape_zeros.shape) == [2, 3], "Your tensor does not have a shape (2, 3)"

### Step 3 - Print Tensor's attributes

You now know how to create a tensor.\
Create a tensor with random values, of any size you want!\
Display **four pieces of information** about this tensor:

* The values themselves
* Its shape
* The data type of the tensor
* The device on which the tensor is stored

Don't hesitate to take a look at the [documentation](https://pytorch.org/docs/stable/tensor_attributes.html)

In [None]:
#TODO : initialise a tensor with random values
tensor = ...

############################################
#TODO : print the info of the tensor
...

### Step 4 - Use GPU if available

If you look carefully at which device your tensor is stored on, it will be on your CPU even if you have a GPU.
And this is normal.\
If you don't indicate that you want to store your tensor on your GPU then it will use the CPU by default.

Look at the [documentation](https://pytorch.org/docs/stable/generated/torch.Tensor.to.html) about this.\
To check whether CUDA is available, [check this](https://pytorch.org/docs/stable/generated/torch.cuda.is_available.html#torch.cuda.is_available).
For Mac users, check whether MPS is [available](https://pytorch.org/docs/stable/notes/mps.html).

Add a condition to check whether a GPU is available; if so, move your tensor to your GPU.
For Mac users, do the same with MPS.

In [None]:
tensor_device = torch.rand(3, 2)

print("before :", tensor_device.device)
#TODO: check if cuda or mps is available and move the tensor to cuda or mps if it is ; ~4 lines
...

print("after :",tensor_device.device)

### Step 5 - Apply an arithmetic operation to a Tensor

In the same way as for a numpy array, one can easily apply arithmetic operations on a Tensor.

Multiply the data of the Tensor by **42**.

In [None]:
tensor = torch.ones((3, 3), dtype=torch.float)
print(tensor)

#TODO: multiply the tensor by 42
tensor = ...
print(tensor)
assert int(tensor.sum().item()) == 378, 'The tensor is not multiplied by 42.'

### Step 6 - Reshape a Tensor

Again, in the same way as with a numpy array, you can reshape your Tensor using the ``reshape`` method.

Turn your tensor of shape ``(3, 9)`` into a tensor of shape ``(3, 3, 3)``.

In [None]:
tensor = ...
print(tensor)

### TODO: code here (~ 1 line)
tensor = ...
print("-"*50)
print(tensor)
assert list(tensor.shape) == [3, 3, 3], "Your tensor does not have a shape (3, 3, 3)"

## Part 2: Neural Network Layers
### What is a Neural Network Layer?

In PyTorch, **neural network layers** are the building blocks of deep learning models. 

There are two fundamental types of layers that are widely used in AI:

1. **Linear Layers (`torch.nn.Linear`)**  
   A **Linear Layer** performs a simple mathematical operation:
   $$
   y = x W^T + b
   $$

   Linear layers are used to map input features to output features and are common in fully connected layers of neural networks.

2. **Convolutional Layers (`torch.nn.Conv2d`)**  
   A **Convolutional Layer** is primarily used for images. Instead of processing individual features, it processes small patches of the input using filters or kernels:
   - **Kernels**: Small learnable tensors that slide over the input, detecting patterns like edges or textures.
   - **Key Parameters**:
     - **Kernel Size**: The size of the sliding window (e.g., 3x3).
     - **Stride**: The step size of the kernel.
     - **Padding**: Adds extra borders around the input to control output size.

   These layers are essential in tasks like image classification, object detection, and image segmentation.
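
The linear layer formula can be checked by hand with plain tensors. The sketch below does not use `torch.nn` at all; the values of `x`, `W` and `b` are made up so that the result is easy to verify.

```python
import torch

# Illustrative values, chosen by hand so the result is easy to check
x = torch.tensor([[1.0, 2.0, 3.0]])      # shape (1, 3): 1 example, 3 input features
W = torch.tensor([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 1.0]])      # shape (2, 3): 2 output features, 3 input features
b = torch.tensor([0.5, -1.0])            # shape (2,): one bias per output feature

# y = x W^T + b
y = x @ W.T + b

print(y)        # tensor([[1.5000, 4.0000]])
print(y.shape)  # torch.Size([1, 2])
```

Note how the weight matrix has one row per output feature and one column per input feature, which is why it is transposed in the formula.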

### Layers in PyTorch

In PyTorch, layers are defined in the `torch.nn` module; you can find its documentation [here](https://pytorch.org/docs/stable/nn.html).

In the next sections, we will explore these layers in depth with practical examples. Good luck!

In [8]:
import torch.nn as nn

### Part 2.1 Linear Layers 



You might want to take a look at the [documentation](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html#torch.nn.Linear) of `linear layers`!
### Step 1 - Build a simple layer


In this first exercise you will have to **create a layer** that takes 3 features and outputs 2 features!

In [None]:
# TODO: Create a linear layer with 3 inputs and 2 outputs
layer = ...

print(layer)
print("-"*30)
print(layer.weight)
print("-"*30)
print(layer.bias)

assert list(layer.weight.shape) == [2, 3], "The weight of the layer is not the right shape"

### Step 2 - Apply a Linear Layer to a Tensor

- Your goal here is to create a tensor with random values of shape (1, 4).
- Pass this tensor to a linear layer that outputs 2 values!


In [None]:
#TODO: Create a tensor with shape (1, 4) and pass it through the layer
tensor = ...

layer = ...

output = ...

print("input tensor :",tensor)
print("-"*30)
print("output tensor :",output)

assert list(output.shape) == [1, 2], "The output of the layer is not the right shape"

*Congratulations, you just made your first forward operation of a neural network! (Remember the forward function you made yesterday: it is the same as here :)*
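
If you are curious about what happens inside that forward operation, the sketch below compares a layer's output with the formula $y = x W^T + b$ written by hand. The sizes (5 inputs, 3 outputs, batch of 2) are arbitrary and differ from the exercise on purpose.

```python
import torch
import torch.nn as nn

# Sizes are arbitrary, chosen only for this illustration
layer = nn.Linear(5, 3)
x = torch.rand(2, 5)

# What the layer computes when it is called
output = layer(x)

# The same computation written by hand with the layer's own parameters
manual = x @ layer.weight.T + layer.bias

print(output.shape)
print(torch.allclose(output, manual, atol=1e-6))
```

Both results should agree up to floating-point rounding, which is why `torch.allclose` is used instead of a strict equality.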

### Step 3 - Chain two Linear Layers

In the same way as above, create two layers and forward a tensor through them!

- The input size of the **first** layer needs to be 3 and the output size of the **second** layer needs to be 3.
- Try to find out which value you need to put between the two.

In [None]:
#TODO: Create a neural network with 2 linear layers

tensor = ...
layer_1 = ...
layer_2 = ...

assert layer_1.out_features == layer_2.in_features, "The dimensions between the two layers are not compatible and need to be the same"

#TODO: Create a forward function that takes a tensor as input and passes it through the two layers ~3 lines
def forward(x):
    ...

output = forward(tensor)

print("output tensor :",output)
assert list(output.shape) == [1, 3], "The output of the forward function is not the right shape"


### Step 4 - Understand Batch

- Create two tensors, one of shape (1, 3) and the other of shape (3, 3).
- Pass them to your model and compare the results. What do you notice?

In [None]:
#TODO: Create two tensors with different shapes and pass them through the forward function
tensor_shape_1 = ...
tensor_shape_3 = ...

result_1 = forward(tensor_shape_1)
result_3 = forward(tensor_shape_3)

print("Tensor with a batch of 1 :",result_1)
print("-"*30)
print("Tensor with a batch of 3 :",result_3)

assert list(result_1.shape) == [1, 3], "The output of the forward function is not the right shape"
assert list(result_3.shape) == [3, 3], "The output of the forward function is not the right shape"

As you can see, the first dimension of the output's shape matches the first dimension of the tensor we pass to the model, which is basically the number of examples we send in. It doesn’t have to be just one: this is where PyTorch really shines with parallelization! All the examples are processed at the same time, making everything faster and more efficient.

Here is an image to help you understand:

![understand_batch.png](attachment:understand_batch.png)
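
As a small sanity check of the batch idea, the sketch below sends a batch through a layer in one call, then sends each example separately, and compares the two. The layer sizes are arbitrary and only serve the illustration.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 2)      # arbitrary sizes for the illustration
batch = torch.rand(3, 3)     # 3 examples with 3 features each

# One call processes the whole batch
batched_output = layer(batch)

# Processing each example on its own, then stacking the results
single_outputs = torch.cat([layer(example.unsqueeze(0)) for example in batch])

print(batched_output.shape)
print(torch.allclose(batched_output, single_outputs, atol=1e-6))
```

Each row of the batched output corresponds to one example, so batching changes how fast the work is done, not what is computed.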

### Part 2.2 Convolutional layer 

Here is the link to the [documentation](https://pytorch.org/docs/stable/nn.html#convolution-layers) of convolutional layers (all types, choose the one you want)!

### Step 1 - Create a layer
Your goal here is to create a Conv2d layer with 1 input channel and 1 output channel, and with a kernel size of 3!


In [None]:
# TODO: Initialize a Conv2d layer
conv_layer = ...

# Print layer weights
print("Layer", conv_layer)
print("-"*60)
print("Weights:", conv_layer.weight)
print("-"*60)

print("Weight Shape:", conv_layer.weight.shape)

Interesting, yeah? Why do we have a shape like this? Let me break it down for you:

We have a Conv2D layer with:

- ***in = 1***: This means there is 1 input channel. So, our input image is likely a grayscale image, which only has one color channel *(for example, 3 input channels would represent an RGB image)*.

- ***out = 1***: This means we have 1 output channel. This is the number of filters that will be applied to the image, and here we only have 1 filter.

- ***kernel size = 3***: This refers to the size of the filter, which is a 3x3 grid of numbers.

*Now, the weight tensor shape is [1, 1, 3, 3]. Here’s what each part means:*

- ***The first 1***: This represents the number of output channels (filters). We only have one filter, so it’s 1.

- ***The second 1***: This represents the number of input channels. Since we are using a grayscale image (1 channel), it’s also 1.

- ***The 3 (third dimension)***: This is the height of the filter, which is 3 pixels.

- ***The 3 (fourth dimension)***: This is the width of the filter, also 3 pixels.

So, in simple terms, the filter is a 3x3 matrix that will scan over the input image (which also has 1 channel), and it will produce 1 output channel after applying the filter.
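
To see how this reading of the weight shape generalises, here is a short sketch with a hypothetical layer that takes an RGB image (3 channels) and applies 8 filters. These numbers are assumptions made for the example, not values used in the exercises.

```python
import torch.nn as nn

# Hypothetical layer: RGB input (3 channels), 8 filters, 3x3 kernel
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)

# Weight shape is (out_channels, in_channels, kernel_height, kernel_width)
print(conv.weight.shape)  # torch.Size([8, 3, 3, 3])

# One bias value per output channel
print(conv.bias.shape)    # torch.Size([8])
```

Each of the 8 filters has its own 3x3 grid for every input channel, which is why the first two dimensions grow with the channel counts.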



*We can also see `stride=(1, 1)`, which simply represents how many pixels the kernel moves at each step, in height and width.*

Let's explain it with a visualisation

![convolution_gif.gif](attachment:convolution_gif.gif)

### Step 2 - Pass a tensor to this layer

Your goal here is to reproduce the gif above! (with random values)

Create a tensor with:

- batch_size = 1
- channel = 1 
- height = 5 
- width = 5

Pass this tensor to the layer you made above.

In [None]:
#TODO: Create a tensor  and pass it through the Conv2d layer
tensor = ...

output = ...

print("Output:", output)
print("-"*80)
print("Output Shape:", output.shape)

assert list(output.shape) == [1, 1, 3, 3], "The output of the Conv2d layer is not the right shape"

### Step 3 - Create a layer with padding

Your goal is now to recreate the same layer as in step 1, but with `padding = 1`.

Pass the tensor created above through the new layer.



In [None]:
#TODO: Initialize a Conv2d layer with padding and pass a tensor through it
layers_with_padding = ...

new_tensor = ...

output = layers_with_padding(new_tensor)

print("Output with padding:", output)
print("-"*80)
print("Output Shape with padding:", output.shape)

assert list(output.shape) == [1, 1, 5, 5], "The output of the Conv2d layer with padding is not the right shape"

As you can see, the size of the image didn't change!

This is what padding is for: it adds a border of extra values (zeros by default) around the image *(matrix)*, so that the kernel also processes and analyses the edges of the image.
Here is an image to help you understand:

![padding_example.png](attachment:padding_example.png)
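
The effect of kernel size, stride and padding on the output size can be summarised by a small formula. The helper below is a hypothetical function written for this illustration (it is not part of PyTorch) and assumes the default dilation of 1.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Output height (or width) of a convolution, assuming dilation = 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

# Without padding, a 3x3 kernel shrinks a 5x5 image
print(conv_output_size(5, kernel_size=3))             # 3

# A padding of 1 compensates for the 3x3 kernel
print(conv_output_size(5, kernel_size=3, padding=1))  # 5

# A larger stride skips positions, so the output gets smaller
print(conv_output_size(7, kernel_size=3, stride=2))   # 3
```

The same formula applies independently to the height and to the width of the image.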

### Step 4 - Chain two layers together!

Just like you did with the linear layers, now try it with convolutional layers!

Your goal here is to pass a 5x5 grayscale tensor through the first convolutional layer, using a kernel size of 3 and a padding of… (take a guess!). For the second layer, make sure it takes an image of the same size as the tensor, but with 3 channels (so it’s not grayscale anymore, but with a depth of 3!) and outputs 1 channel.

In [None]:
#TODO: create the two conv layers

tensor = ...

conv_1 = ...
conv_2 = ...

#TODO: Create a forward function that takes a tensor as input and passes it through the two layers ~3 lines
def forward(x):
    ...

output = forward(tensor)
print("Input tensor:",tensor)
print("-"*70)
print("Output tensor:",output)

assert conv_1.out_channels == conv_2.in_channels, "The dimensions of the two layers are not compatible and need to be the same (hint: it's 3...)"
assert list(output.shape) == [1, 1, 5, 5], "The output of the forward function is not the right shape"

### Step 5 - Get a 1D tensor from an image

Imagine you want to take the *image* output by the forward function coded above and flatten it into a 2D tensor.

Why flatten a tensor, you may ask?

It’s because the Conv2D layer outputs a 4D tensor *(batch size, channels, height, width)*, but the Linear layer expects a 2D tensor *(batch size, features)*.

Flattening the tensor converts the 4D shape into a 2D shape, allowing it to be passed correctly to the Linear layer. This process combines all the spatial information from the convolutional layers into a single long vector of features, which can then be used by the fully connected layers for further processing.
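
To visualise the shape change, the sketch below flattens a tensor by hand with the `reshape` method seen in Part 1. This is only an illustration of the idea, not the dedicated function the exercise asks you to find, and the sizes are arbitrary.

```python
import torch

# Pretend this is the output of a convolutional layer:
# a batch of 2 images, 3 channels, 4x4 pixels (arbitrary sizes)
feature_maps = torch.rand(2, 3, 4, 4)

batch_size = feature_maps.shape[0]

# Keep the batch dimension and merge everything else: 3 * 4 * 4 = 48 features
flat = feature_maps.reshape(batch_size, -1)

print(flat.shape)  # torch.Size([2, 48])
```

The number of features per example (48 here) is the value a following Linear layer would need as its input size.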

Take a look at the PyTorch [documentation](https://pytorch.org/docs/stable/nn.html) and find the function that can flatten the tensor for you!

In [None]:
#TODO flatten the output tensor
flatten_output = ...

print("Output tensor:", output)
print("-"*70)
print("Flatten output tensor:",flatten_output)

### Step 6 - Combine convolutional with linear Layers !

Your goal here is to start with a 5x5 *image* and end up with a tensor of shape (1, 2)!

Add as many layers as you want in between to create a real neural network!

In [None]:
#TODO : create a tensor with shape (1, 1, 5, 5)

input_tensor = ...

#TODO : create conv(s) layer(s)
conv_1 = ...

#TODO : create a flatten layer
flatten = ...

#TODO : create linear(s) layer(s)
linear = ...

#TODO: Create a forward function that takes a tensor as input and passes it through the layers ~4 lines
def forward(x):
    ...

output = forward(input_tensor)

print("Input tensor:",input_tensor)
print("-"*70)
print("Output tensor:",output)

assert list(output.shape) == [1, 2], "The output of the forward function is not the right shape"

Well done! Now that you’ve made it through all of that, let’s take a little break with some simple functions that will be very useful for you as you create complex neural networks!

___

Yesterday, you discovered many algorithms (depending on how far you got, but don’t worry, we’re not diving into the theory *you can always go back to yesterday to refresh your memory!*)

All the algorithms used for machine learning, especially in deep learning and neural networks, are already built into PyTorch! (Pretty exciting, right? What a funny joke, all that theory just to call it with torch… but the math is important!)

Let’s start with the loss functions in PyTorch, and here’s a simple documentation to get you started:

### Step 1 - Initialise the loss functions

You need to initialise 4 values ***(as tensors! Remember the first exercise of today)*** here:

- a prediction of a linear regression model and its actual target
- a prediction of a logistic regression model and also its actual target

Let's start with the linear regression (li-r) model.

In [None]:
#TODO : initialise the tensor values for li-r model
lir_y_pred = ...
lir_y = ...

#TODO : compute the mean squared error
lir_mse = ...

output = lir_mse(...)

print("MSE:",output)

And now the logistic regression (lo-r) model.

In [None]:
#TODO : initialise the tensor values for lo-r model
lor_y_pred = ...
lor_y = ...

#TODO : compute the binary cross entropy
lor_bce = ...

output = lor_bce(...)

print("BCE:",output)

Note that gradient descent is already implemented in torch: you just need to compute your loss and call *backward* on it!
Here's the [documentation](https://pytorch.org/docs/stable/generated/torch.Tensor.backward.html)
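
As a rough sketch of what "compute the loss and call backward" looks like, here is one gradient descent step on a single weight. The data, the starting weight and the learning rate are made-up values, and the mean squared error is written by hand instead of using a built-in loss.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])
y = 3 * x                                   # targets generated with a "true" weight of 3

w = torch.tensor(2.0, requires_grad=True)   # the weight we want to learn
learning_rate = 0.1                         # arbitrary value for the illustration

# Forward pass and mean squared error, written by hand
y_pred = w * x
loss = ((y_pred - y) ** 2).mean()

# backward() fills w.grad with d(loss)/d(w)
loss.backward()
gradient = w.grad.item()
print(gradient)  # about -9.33

# One gradient descent step, done outside of gradient tracking
with torch.no_grad():
    w -= learning_rate * w.grad

print(w.item())  # about 2.93, closer to 3 than before
```

The gradient is negative, so subtracting it increases the weight, which moves it toward the value that generated the targets.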

### Step 2 - Activation functions

Just like loss functions, activation functions are also available in torch! Let's create ....
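
Before using the ready-made versions, it can help to see what these two functions compute. The sketch below writes ReLU and softmax by hand on a made-up vector; it is an illustration of the math, not the `torch.nn` API the exercise expects.

```python
import torch

x = torch.tensor([-1.0, 0.0, 2.0])

# ReLU: negative values become 0, the rest is unchanged
relu_output = torch.clamp(x, min=0)
print(relu_output)  # tensor([0., 0., 2.])

# Softmax: exponentiate, then normalise so the values sum to 1
softmax_output = torch.exp(x) / torch.exp(x).sum()
print(softmax_output)
print(softmax_output.sum())
```

The largest input keeps the largest softmax value, so the output can be read as a probability for each position.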


In [None]:
#TODO : initialise the tensor values for the model
model_layers_relu = ...
model_layers_softmax = ...

#TODO : compute the relu and softmax
relu = ...
softmax = ...

output_relu = relu(model_layers_relu)
output_softmax = softmax(model_layers_softmax)

print("ReLU:",output_relu)
print("-"*70)
print("Softmax:",output_softmax)


---
Well done, you now know the basic concepts needed to create a model with torch!
Dive into the vision part to create your first linear model!
Here is the [notebook](<../2 - Vision-Models/2.1 - Minst/Minst.ipynb>)