<a href="https://colab.research.google.com/github/FedericoSabbadini/DeepLearning/blob/main/PyTorch/PyTorch_Intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduzione a PyTorch

## **What is PyTorch?**

PyTorch is an open-source deep learning framework developed by Facebook’s AI Research team (FAIR). It's particularly popular for its flexibility and usability in research and production alike. Unlike TensorFlow's older versions, which used static computation graphs, PyTorch uses dynamic computation graphs, making it easier to debug and more intuitive for Python programmers.

In this notebook, we will explore PyTorch's key features, followed by comparisons to Keras and TensorFlow.

**Resources**

- [PyTorch Documentation](https://pytorch.org/docs/)
- [Keras Documentation](https://keras.io/)


<br>

## **Installing PyTorch**

Since we're on Colab, we have nothing to do.
But if you are interested in running it locally, you can follow the instructions from [PyTorch's official website](https://pytorch.org/get-started/locally/) to choose the correct version for CPU or GPU.


```bash
pip install torch torchvision torchaudio
```



---

### **Tensors: The Building Block of PyTorch**

In PyTorch, tensors are the fundamental data structure, analogous to arrays in NumPy but with the added advantage that they can run on GPUs. In this section, we’ll explore various ways to create tensors and some basic operations that can be performed on them.

In [169]:
import torch
import numpy as np

DATA = [[1.0, 2.0], [3.0, 4.0]]

np_array = np.array(DATA)
print(f"NumPy array: \n {np_array}")

tensor_from_list = torch.tensor(DATA, dtype=torch.float32)
print(f"\nTensor from list:\n {tensor_from_list}")

tensor_from_numpy = torch.tensor(np_array)
print(f"\nTensor from NumPy array:\n {tensor_from_numpy}")


NumPy array: 
 [[1. 2.]
 [3. 4.]]

Tensor from list:
 tensor([[1., 2.],
        [3., 4.]])

Tensor from NumPy array:
 tensor([[1., 2.],
        [3., 4.]], dtype=torch.float64)


**Creating Tensors with Special Initialization**

PyTorch provides several functions to create tensors with specific initial values.

In [170]:
# Creating a tensor of zeros
zeros_tensor = torch.zeros((2,3))
print(f"Tensor of zeros:\n {zeros_tensor}")

# Creating a tensor of ones
ones_tensor = torch.ones((2,3))
print(f"\nTensor of ones:\n {ones_tensor}")

# Creating a tensor with random values
random_tensor = torch.rand(2,3)
print(f"\nRandom tensor:\n {random_tensor}")

rand_like_tensor = torch.rand_like(random_tensor) # causale che eredita le dimensioni da un altro
print(f"\nRandom tensor with the same shape as the previous tensor:\n {rand_like_tensor}")

Tensor of zeros:
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

Tensor of ones:
 tensor([[1., 1., 1.],
        [1., 1., 1.]])

Random tensor:
 tensor([[0.7804, 0.3562, 0.5606],
        [0.1965, 0.4059, 0.3193]])

Random tensor with the same shape as the previous tensor:
 tensor([[0.7134, 0.7668, 0.6382],
        [0.2820, 0.7273, 0.0621]])


**Moving Tensors Between Devices (CPU and GPU)**

One of the key advantages of PyTorch is its seamless support for GPU acceleration. PyTorch allows tensors to be created on or moved between devices like CPUs and GPUs. This is done using the ```torch.device()``` object and the ```to()``` method. If a GPU is available, computations *can* be much faster.

In [171]:
from time import time

# Check if GPU is available
device = 'cuda:0' if torch.cuda.is_available() else 'cpu' #:0 significa la prima gpu disponibile, più efficiente
print(f"Using device: {device}\n")

Using device: cuda:0



In [172]:
!nvidia-smi

Fri Oct  3 10:54:29 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   76C    P0             30W /   70W |     126MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

In [173]:
tensor_on_gpu = torch.rand((2, 3), device=device )
print("Tensor on GPU (if available):\n", tensor_on_gpu)


# Moving a tensor from CPU to GPU
tensor_cpu = torch.ones((2, 3))

s = time()
result = tensor_cpu ** 2 * tensor_cpu ** 5
print(f"\nTime taken on CPU: {round((time() - s)*1000, 6)} ms")

tensor_gpu = tensor_cpu.to(device)
s = time()
result = tensor_gpu ** 2 * tensor_gpu ** 5
print(f"Time taken on GPU: {round((time() - s)*1000, 6)} ms")

Tensor on GPU (if available):
 tensor([[0.2963, 0.5811, 0.1909],
        [0.7734, 0.1190, 0.3650]], device='cuda:0')

Time taken on CPU: 0.183582 ms
Time taken on GPU: 0.247478 ms



---

### **Automatic Differentiation: PyTorch’s Autograd**

In deep learning, we often need to calculate gradients during backpropagation to update the weights of a neural network. [PyTorch’s autograd module](https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html) is responsible for automatically computing the gradients of tensors during the backward pass. It does this by building a [dynamic computational graph](https://pytorch.org/blog/computational-graphs-constructed-in-pytorch/), where nodes represent operations and edges represent the flow of data.

PyTorch tracks every operation on tensors with ```requires_grad=True``` to enable automatic differentiation.


Quando addestriamo un modello, come una rete neurale, dobbiamo calcolare i gradienti per ottimizzare i pesi. Per fare ciò, dobbiamo conoscere come i pesi influenzano l'output, e questo richiede il tracciamento delle operazioni che vengono eseguite sui tensori.

PyTorch costruisce un grafo computazionale dinamico: ogni operazione che esegui su un tensore con requires_grad=True (come nel tuo esempio) viene registrata in un grafo. Questo grafo rappresenta le dipendenze tra le variabili, cioè come ogni variabile dipende dalle altre. Quando calcoli la perdita (ad esempio, la differenza tra le previsioni e i valori reali), PyTorch può eseguire il backpropagation e calcolare i gradienti lungo il grafo.
Durante la fase di inferenza (quando il modello è già addestrato e viene utilizzato per fare previsioni), non è necessario calcolare i gradienti. In questa fase, vuoi solo eseguire le operazioni in modo più veloce. Ecco perché puoi disabilitare il tracciamento dei gradienti con torch.no_grad() per evitare un overhead computazionale inutile.

In [174]:
import torch

x = torch.tensor([2.0, 3.0])

y = x[0] ** 3 + x[1] * 2

print(f"Results: {y} \n")

print(f"Gradients of x: {x.grad}")
print(f"Backward Function of y: {y.grad_fn}")

Results: 14.0 

Gradients of x: None
Backward Function of y: None


In [175]:
# Create a tensor with requires_grad=True
a = torch.tensor([1.0, 2.0, 3.0], requires_grad=True) # si costruisce il grafo. per eseguirlo fare a.backward()

# Perform operations with gradient tracking
s = time()
b = a ** 2
print(f"With gradient tracking {round((time() - s)*1000, 6)} ms")

# Disable gradient tracking
with torch.no_grad():
    s = time()
    c = a ** 2
    print(f"Without gradient tracking {round((time() - s)*1000, 6)} ms")


With gradient tracking 0.168324 ms
Without gradient tracking 0.027895 ms



---

### **Building a Simple Neural Network in PyTorch**

In this section, we will walk through the process of creating a simple neural network using PyTorch. All the components needed to build the network are contained in the [torch.nn](https://pytorch.org/docs/stable/nn.html) package.

In [176]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Input

INPUT = np.array([[1.0, 2.0]])

# Define a simple network using Keras
model = Sequential([
    Input((2,)),
    Dense(4, activation='relu'),
    Dense(1)
])

print(model.summary())

s = time()
print(f"\nForward Pass: {model.predict(INPUT)} in {round((time() - s)*1000, 6)} ms")

None
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 155ms/step

Forward Pass: [[0.19302483]] in 238.548517 ms


In [177]:
import torch
import torch.nn as nn
import torch.optim as optim

INPUT = torch.tensor([[1.0, 2.0]])

net = nn.Sequential(
    nn.Linear(2, 4), # richiede definizione numero ingressi e uscite
    nn.ReLU(),
    nn.Linear(4, 1)
)

print(net)

s = time()
print(f"\nForward Pass: {net(INPUT)} in {round((time() - s)*1000, 6)} ms")

Sequential(
  (0): Linear(in_features=2, out_features=4, bias=True)
  (1): ReLU()
  (2): Linear(in_features=4, out_features=1, bias=True)
)

Forward Pass: tensor([[0.5272]], grad_fn=<AddmmBackward0>) in 1.049757 ms


PyTorch neural networks are typically defined by subclassing torch.nn.Module, which represents a base class for all neural networks in PyTorch. Layers are defined in the ```__init__()``` method, and the forward pass is implemented in the ```forward()``` method.

In [178]:
class SimpleNet(nn.Module):

  def __init__(self, input_dim, output_dim, hidden_dim): # costruttore, definizione dei layer
    super(SimpleNet, self).__init__()
    self.fc1 = nn.Linear(input_dim, hidden_dim)
    self.fc2 = nn.Linear(hidden_dim, output_dim)
    self.relu = nn.ReLU()

  def forward(self, x): # come fluiscono le informazioni nella nn tra i layer
    x = self.fc1(x)
    x = self.relu(x)
    x = self.fc2(x)
    return x

net = SimpleNet(input_dim=2, output_dim=1, hidden_dim=4)

print(net)

s = time()
print(f"\nForward Pass: {net(INPUT)} in {round((time() - s)*1000, 6)} ms")

SimpleNet(
  (fc1): Linear(in_features=2, out_features=4, bias=True)
  (fc2): Linear(in_features=4, out_features=1, bias=True)
  (relu): ReLU()
)

Forward Pass: tensor([[-0.4968]], grad_fn=<AddmmBackward0>) in 1.214743 ms


**Training a Neural Network in PyTorch**

PyTorch is a powerful deep learning library that gives us a high degree of manual control over every step of the training process. Unlike Keras, which abstracts many of the internal workings behind easy-to-use functions, PyTorch allows us to customize every part of the model’s behavior. This can be especially useful when we need to fine-tune specific aspects of the training or modify the underlying logic to fit complex or non-standard tasks.

We will begin by setting up a basic neural network model, define the [loss function](https://pytorch.org/docs/stable/nn.html#loss-functions), and then proceed with training the network on a dataset. As we go, we'll manually implement essential components such as forward passes, backpropagation, and weight updates with [optimizer](https://pytorch.org/docs/stable/optim.html).

In [179]:
# Define loss function and optimizer

net = SimpleNet(input_dim=2, output_dim=1, hidden_dim=4)

criterion = nn.MSELoss()
optimizer = optim.SGD(net.parameters(), lr=0.01)

# Train the network to understand if the input is positive or negative

train_dataset = [
    (torch.tensor([1.0, 2.0]), torch.tensor([1.0])),
    (torch.tensor([-3.0, -4.0]), torch.tensor([0.0])),
    (torch.tensor([5.0, 6.0]), torch.tensor([1.0])),
    (torch.tensor([-5.0, -6.0]), torch.tensor([0.0])),
]

NUM_EPOCHS = 5

In [180]:
# !!!!!!!!!!! comandi per addestramento di una rete !!!!!!!!!!!!!!!
for epoch in range(NUM_EPOCHS):
    for tensor, target in train_dataset:net.train
    net.train()
    output = net(tensor)
    loss = criterion(output, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    print(f"Epoch {epoch+1}, Loss: {round(loss.item(), 4)}")

Epoch 1, Loss: 1.1193
Epoch 2, Loss: 0.0009
Epoch 3, Loss: 0.0
Epoch 4, Loss: 0.0
Epoch 5, Loss: 0.0


## **PyTorch vs. TensorFlow/Keras**


| Feature               | PyTorch                                    | TensorFlow/Keras                          |
|-----------------------|--------------------------------------------|-------------------------------------------|
| **API Level**          | Low-level, very flexible                   | High-level (Keras) or low-level (TF core) |
| **Computation Graph**  | Dynamic (eager execution)                  | Dynamic (with TensorFlow 2.x)             |
| **Ease of Use**        | More manual, but powerful                  | Keras is very user-friendly               |
| **Community**          | Growing rapidly, dominant in research      | Strong support, widely adopted in industry|
| **Ecosystem**          | Fewer add-ons (though fast-growing)        | Large ecosystem (e.g., TensorFlow Hub)    |

<br>

## **Conclusion**

- PyTorch provides exceptional flexibility, making it a favorite for researchers.
- TensorFlow has a mature ecosystem, but PyTorch’s dynamic nature is great for debugging and custom models.
- Keras is best for beginners or when rapid prototyping is necessary.

