# [Training a Neural Network with PyTorch](https://campus.datacamp.com/courses/introduction-to-deep-learning-with-pytorch/training-a-neural-network-with-pytorch?ex=1)

Third chapter in the Introduction to Deep Learning with PyTorch DataCamp course.

## 1 - A Deeper Dive Into Loading Data

Use `TensorDataset` to prepare data for PyTorch models. It stores features (X) and labels (y) as tensors, making them easy to manage.

In [None]:
import torch
from torch.utils.data import TensorDataset
import numpy as np

# Features: 4 x 3 matrix
# Four individuals, and 3 features
inputs = np.array([
    [0.5, 3.4, 6.7],
    [1.2, 5.0, 7.3],
    [34.2, 44.0, 12.3],
    [0.4, 6.7, 2.2]
])
print(inputs, '\n')

# Labels: 1-D array with 4 entries (one per individual)
# 0 = species one
# 1 = species two
labels = np.array([
    0, 0, 1, 0
])
print(labels)

[[ 0.5  3.4  6.7]
 [ 1.2  5.   7.3]
 [34.2 44.  12.3]
 [ 0.4  6.7  2.2]] 

[0 0 1 0]


In [22]:
# Instantiate dataset class
dataset = TensorDataset(
    torch.tensor(inputs),
    torch.tensor(labels)
)

# Access an individual sample with square-bracket indexing
input_sample, label_sample = dataset[0]
print(f"""
    Input sample: {input_sample} \n
    Label sample: {label_sample}
"""
)


    Input sample: tensor([0.5000, 3.4000, 6.7000], dtype=torch.float64) 

    Label sample: 0
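
The sample above comes back as `float64` because NumPy arrays default to double precision, while `nn.Linear` layers hold `float32` weights by default. The following is a minimal sketch, assuming the same toy arrays as above, of casting at construction time so the dataset can be fed to a model directly. The names `features` and `targets` are illustrative and not part of the course code.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

inputs = np.array([
    [0.5, 3.4, 6.7],
    [1.2, 5.0, 7.3],
    [34.2, 44.0, 12.3],
    [0.4, 6.7, 2.2],
])
labels = np.array([0, 0, 1, 0])

# torch.tensor keeps NumPy's float64 unless a dtype is given explicitly
features = torch.tensor(inputs, dtype=torch.float32)
# Class labels are cast to 64-bit integers (torch.long)
targets = torch.tensor(labels, dtype=torch.long)

dataset = TensorDataset(features, targets)

# A TensorDataset supports len() and integer indexing
n_samples = len(dataset)
sample_x, sample_y = dataset[0]
print(n_samples, sample_x.dtype, sample_y.dtype)
```

Casting once when the dataset is built avoids repeating `.float()` and `.long()` calls later in the training code.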



* After creating the dataset with the `TensorDataset` class, use the `DataLoader` class to manage data loading during training.

* Since deep learning models require large datasets, batching helps process multiple samples at once, making training more efficient.

* Shuffling randomizes the data order at each epoch, helping improve model generalization.

* **Epoch**: one full pass through the training dataloader.
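
To make the notions of shuffling and epochs concrete, here is a small sketch using a hypothetical six-sample dataset in which each feature value doubles as a sample id. It is an illustration under those assumptions, not course code. It records the order in which samples arrive during each of three epochs.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical toy data: the single feature value doubles as a sample id
sample_ids = torch.arange(6, dtype=torch.float32).unsqueeze(1)  # shape [6, 1]
targets = torch.zeros(6, dtype=torch.long)
dataset = TensorDataset(sample_ids, targets)

loader = DataLoader(dataset, batch_size=2, shuffle=True)

epoch_orders = []
for epoch in range(3):
    seen = []
    # One full pass over the loader is one epoch
    for batch_x, _ in loader:
        seen.extend(int(v) for v in batch_x.squeeze(1).tolist())
    epoch_orders.append(seen)
    print(f"epoch {epoch}: {seen}")
```

The printed order generally changes from epoch to epoch, but every epoch still visits each of the six samples exactly once.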

In [25]:
from torch.utils.data import DataLoader

# Define batch size
# Determines how many samples are included in each iteration.
batch_size = 2

# Define shuffle:
# Randomize the data order at each epoch
shuffle = True

# Create a DataLoader:
# Easy to iterate through the dataset in batches
dataloader = DataLoader(
    dataset=dataset,
    batch_size=batch_size,
    shuffle=shuffle
)

# Iterate over the dataloader
for batch_inputs, batch_labels in dataloader:
    print('Batch inputs: \n', batch_inputs)
    print('Batch labels: \n', batch_labels, '\n')

Batch inputs: 
 tensor([[34.2000, 44.0000, 12.3000],
        [ 0.5000,  3.4000,  6.7000]], dtype=torch.float64)
Batch labels: 
 tensor([1, 0]) 

Batch inputs: 
 tensor([[1.2000, 5.0000, 7.3000],
        [0.4000, 6.7000, 2.2000]], dtype=torch.float64)
Batch labels: 
 tensor([0, 0]) 



* In real-world deep learning, datasets are much larger, and batch sizes are typically 32 or more for better computational efficiency.

* The `DataLoader` class is essential for efficiently handling large datasets. Batching speeds up training, keeps memory usage bounded, and averages gradients over several samples, which makes updates less noisy than single-sample updates.
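
As a rough illustration of the batch sizes mentioned above, the sketch below builds a synthetic dataset of 1000 random samples (made up purely for demonstration) and checks how many batches of 32 one epoch contains. It also shows the `drop_last` argument of `DataLoader`, which discards a final incomplete batch.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

# Synthetic data for illustration only: 1000 samples, 3 features, binary labels
n_samples, n_features = 1000, 3
features = torch.randn(n_samples, n_features)
targets = torch.randint(0, 2, (n_samples,))
dataset = TensorDataset(features, targets)

batch_size = 32
loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
# drop_last=True skips the final batch when it has fewer than batch_size samples
loader_full_only = DataLoader(dataset, batch_size=batch_size, shuffle=True, drop_last=True)

n_batches = len(loader)                  # ceil(1000 / 32)
n_full_batches = len(loader_full_only)   # floor(1000 / 32)
last_batch_size = [x.shape[0] for x, _ in loader][-1]
print(n_batches, n_full_batches, last_batch_size)
```

With 1000 samples and a batch size of 32, the last batch holds only the 8 leftover samples unless `drop_last=True` is set.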

## 2 - Writing a First Training Loop 

Training a neural network involves the following steps:
- 1. Create a model
- 2. Choose a loss function
- 3. Define a dataset
- 4. Set an optimizer
- 5. Run a training loop
    - 5.1. Calculate loss (forward pass)
    - 5.2. Compute gradients (backpropagation)
    - 5.3. Update model parameters

### 2.1 - Binary Classification Example

In [63]:
import torch.nn as nn
from torch.nn import CrossEntropyLoss
from torch.utils.data import TensorDataset, DataLoader
import torch.optim as optim

# 1) Create a model
model = nn.Sequential(
    nn.Linear(3, 5),
    nn.Linear(5, 5),
    nn.Linear(5, 2)
)

# 2) Choose a loss function
criterion = CrossEntropyLoss()

# 3) Define a dataset
dataset = TensorDataset(
    torch.tensor(inputs).float(),
    torch.tensor(labels).long()
)

dataloader = DataLoader(
    dataset=dataset,
    batch_size=2,
    shuffle=True
)

# 4) Set an optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001)

Looping through the entire dataset once is called an epoch, and we train over multiple epochs (set by the `num_epochs` variable).

In [66]:
# 5) Training loop
num_epochs = 5

for epoch in range(num_epochs): # For each epoch, we loop through the dataloader
    
    for data in dataloader: # Each iteration of the dataloader provides a batch of samples
        
        # Reset the gradients to zero, because PyTorch accumulates gradients across backward passes by default
        optimizer.zero_grad()

        # Get the features and targets from the dataloader
        feature, target = data

        # Run a forward pass
        pred = model(feature)

        # Compute loss and gradients
        loss = criterion(pred, target)
        loss.backward()

        # Update the parameters
        optimizer.step()
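
The loop above does not report anything while it runs. A common way to monitor training is to average the batch losses within each epoch and, once training ends, to turn the model outputs into class predictions. The sketch below is one possible way to do this, assuming the same toy data and architecture as above. `epoch_losses` and `predicted_classes` are illustrative names, not part of the course code.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(0)

features = torch.tensor([
    [0.5, 3.4, 6.7],
    [1.2, 5.0, 7.3],
    [34.2, 44.0, 12.3],
    [0.4, 6.7, 2.2],
])
targets = torch.tensor([0, 0, 1, 0])

model = nn.Sequential(nn.Linear(3, 5), nn.Linear(5, 5), nn.Linear(5, 2))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001)
loader = DataLoader(TensorDataset(features, targets), batch_size=2, shuffle=True)

num_epochs = 5
epoch_losses = []
for epoch in range(num_epochs):
    running_loss = 0.0
    for feature, target in loader:
        optimizer.zero_grad()
        loss = criterion(model(feature), target)
        loss.backward()
        optimizer.step()
        # .item() extracts the loss as a plain Python float
        running_loss += loss.item()
    # Average over the number of batches in this epoch
    epoch_losses.append(running_loss / len(loader))
    print(f"epoch {epoch + 1}: mean loss = {epoch_losses[-1]:.4f}")

# Inference: no gradients needed; pick the class with the largest logit
with torch.no_grad():
    predicted_classes = model(features).argmax(dim=1)
print(predicted_classes)
```

With only four samples and five epochs the loss will not necessarily fall much. The point is the bookkeeping pattern, not the resulting accuracy.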

### 2.2 - Regression Example

In [87]:
import torch.nn as nn
from torch.nn import MSELoss
from torch.utils.data import TensorDataset, DataLoader
import torch.optim as optim

# Redefine inputs and labels for a regression task
reg_inputs = np.array([
    [0.5, 3.4, 6.7],
    [1.2, 5.0, 7.3],
    [34.2, 44.0, 12.3],
    [0.4, 6.7, 2.2]
])

reg_labels = np.array([1.3, 4.5, 3.4, 2.2])

# 1) Create a model
model = nn.Sequential(
    nn.Linear(3, 5),
    nn.Linear(5, 1),
)

# 2) Choose a loss function
criterion = MSELoss()

# 3) Define a dataset
dataset = TensorDataset(
    torch.tensor(reg_inputs).float(),
    torch.tensor(reg_labels).float()
)

dataloader = DataLoader(
    dataset=dataset,
    batch_size=2,
    shuffle=True
)

# 4) Set an optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001)

# 5) Training loop
num_epochs = 5

for epoch in range(num_epochs): # For each epoch, we loop through the dataloader
    
    for data in dataloader: # Each iteration of the dataloader provides a batch of samples
        
        # Reset the gradients to zero, because PyTorch accumulates gradients across backward passes by default
        optimizer.zero_grad()

        # Get the features and targets from the dataloader
        feature, target = data

        # Run a forward pass
        pred = model(feature)
        print(pred, pred.shape)
        print(target, target.shape)

        # Compute loss and gradients
        loss = criterion(pred, target.unsqueeze(1))
        print(loss, '\n')
        loss.backward()

        # Update the parameters
        optimizer.step()
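
The `target.unsqueeze(1)` call in the loop above matters because the model output has shape `[batch, 1]` while the labels have shape `[batch]`. The following sketch, using hand-picked numbers rather than course data, shows how `MSELoss` gives a different value when the shapes do not match, since broadcasting then compares every prediction with every target.

```python
import warnings
import torch
import torch.nn as nn

pred = torch.tensor([[1.0], [2.0]])   # shape [2, 1], like the model output
target = torch.tensor([1.0, 4.0])     # shape [2], like a batch of labels

criterion = nn.MSELoss()

# Matching shapes: the errors are (1-1) and (2-4), so MSE = (0 + 4) / 2
loss_matched = criterion(pred, target.unsqueeze(1))

# Mismatched shapes broadcast to [2, 2]: the errors are
# (1-1), (1-4), (2-1), (2-4), so the mean is (0 + 9 + 1 + 4) / 4.
# PyTorch emits a UserWarning about the size mismatch, silenced here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    loss_broadcast = criterion(pred, target)

print(loss_matched.item(), loss_broadcast.item())
```

The printed shapes in the training output that follows (`torch.Size([2, 1])` for predictions and `torch.Size([2])` for targets) are exactly this situation, which is why the target is reshaped before the loss is computed.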

tensor([[ 1.5600],
        [16.6975]], grad_fn=<AddmmBackward0>) torch.Size([2, 1])
tensor([1.3000, 3.4000]) torch.Size([2])
tensor(88.4456, grad_fn=<MseLossBackward0>) 

tensor([[-1.5703],
        [-2.6054]], grad_fn=<AddmmBackward0>) torch.Size([2, 1])
tensor([2.2000, 4.5000]) torch.Size([2])
tensor(32.3509, grad_fn=<MseLossBackward0>) 

tensor([[-1.4400],
        [-1.9352]], grad_fn=<AddmmBackward0>) torch.Size([2, 1])
tensor([1.3000, 4.5000]) torch.Size([2])
tensor(24.4597, grad_fn=<MseLossBackward0>) 

tensor([[-13.8494],
        [ -0.8286]], grad_fn=<AddmmBackward0>) torch.Size([2, 1])
tensor([3.4000, 2.2000]) torch.Size([2])
tensor(153.3571, grad_fn=<MseLossBackward0>) 

tensor([[1.5015],
        [1.5108]], grad_fn=<AddmmBackward0>) torch.Size([2, 1])
tensor([2.2000, 1.3000]) torch.Size([2])
tensor(0.2662, grad_fn=<MseLossBackward0>) 

tensor([[1.7618],
        [8.0952]], grad_fn=<AddmmBackward0>) torch.Size([2, 1])
tensor([4.5000, 3.4000]) torch.Size([2])
tensor(14.7713, grad_f