# PyTorch Workflow

Explore an example PyTorch end-to-end workflow.

Resources:
* Ground truth notebook - https://github.com/mrdbourke/pytorch-deep-learning/blob/main/01_pytorch_workflow.ipynb
* Book version: https://www.learnpytorch.io/01_pytorch_workflow/
* Ask a question: https://github.com/mrdbourke/pytorch-deep-learning/discussions/

![](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/01_a_pytorch_workflow.png)

In [None]:
what_were_covering = {1: "data (prepare and load)",
                      2: "build model",
                      3: "fitting the model to data (training)",
                      4: "making predictions and evaluating a model (inference)",
                      5: "saving and loading a model",
                      6: "putting it all together"}
what_were_covering

In [None]:
import torch
from torch import nn # neural network module
import matplotlib.pyplot as plt
import numpy as np
from pathlib import Path

# check PyTorch version
torch.__version__

In [None]:
!nvidia-smi

## 1. Data (preparing and loading)

In machine learning, data can be almost anything:

* Excel spreadsheet
* Images
* Videos
* Audio
* DNA
* Text
* ... 
 
ML is a game of two parts:
1. Get data into a numerical representation.
2. Build a model to learn patterns in that numerical representation.

![](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/01-machine-learning-a-game-of-two-parts.png)

To showcase this, let's create some *known* data using the linear regression formula.

We'll use a linear regression formula to make a straight line with *known* **parameters**.

\begin{equation}
Y_i = f(X_i, \beta) + e_i
\end{equation}
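
For the straight line built in this notebook, the general formula above can be sketched in a simpler form. This assumes a single input feature, a linear $f$ and no noise term $e_i$; the symbols $w$ and $b$ are shorthand introduced for this illustration and correspond to the `weight` and `bias` values set in the next cell:

\begin{equation}
y_i = w \cdot x_i + b, \qquad w = 0.7, \quad b = 0.3
\end{equation}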

In [None]:
# Create *known* parameters
weight = 0.7
bias = 0.3

# Create data
start = 0
end = 1
step = 0.02
X = torch.arange(start, end, step).unsqueeze(dim=1)
y = weight * X + bias

X[:10], y[:10], len(X), len(y)

### Splitting data into training and test sets
(One of the most important concepts in machine learning in general)

In [None]:
# Create a train/test split
train_split = int(0.8 * len(X))
X_train, y_train = X[:train_split], y[:train_split]
X_test, y_test = X[train_split:], y[train_split:]

In [None]:
print(len(X_train), len(y_train), len(X_test), len(y_test))
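
The split above takes the first 80% of the samples in order. As a hedged alternative (not what this notebook uses), the indices can be shuffled first with `torch.randperm` so that the test set is drawn from the whole range. The names ending in `_shuffled` are made up for this sketch:

```python
import torch

torch.manual_seed(42)  # make the shuffle reproducible

# Recreate the same toy data as above
weight, bias = 0.7, 0.3
X = torch.arange(0, 1, 0.02).unsqueeze(dim=1)
y = weight * X + bias

# Shuffle the indices, then take the first 80% for training
shuffled_indices = torch.randperm(len(X))
split_index = int(0.8 * len(X))
train_indices = shuffled_indices[:split_index]
test_indices = shuffled_indices[split_index:]

X_train_shuffled, y_train_shuffled = X[train_indices], y[train_indices]
X_test_shuffled, y_test_shuffled = X[test_indices], y[test_indices]

print(len(X_train_shuffled), len(X_test_shuffled))
```

For a straight line, the ordered split makes the model extrapolate beyond the training range, while a shuffled split tests interpolation. Which one is appropriate depends on the problem.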

How might we better visualize our data?

This is where the data explorer's motto comes in!

"Visualize, visualize, ..."

In [None]:
# Helper function to plot y against X
def plot_predictions(train_data=X_train,
                     train_labels=y_train, 
                     test_data=X_test, 
                     test_labels=y_test, 
                     predictions=None):
    
    """ Plots training data, test data and compares predictions. """
    
    plt.figure(figsize=(10, 7))
    plt.scatter(train_data, train_labels, c='b', label='Training data')
    plt.scatter(test_data, test_labels, c='g', label='Testing data')
    if predictions is not None:
        plt.scatter(test_data, predictions, c='r', label='Predictions')
    plt.legend()
    plt.show()  


In [None]:
# Create plot of training and test data
plot_predictions(train_data=X_train, 
                 train_labels=y_train, 
                 test_data=X_test, 
                 test_labels=y_test)

## 2. Build model

Our first PyTorch model!

Because we are going to be building classes throughout the course, I'd recommend getting familiar with OOP in Python. To do so, you can use the following resource from Real Python...

What the model does:
* Start with random values (weight & bias)
* Look at training data and adjust the random values to better represent (or get closer to) the ideal values (the weight & bias values we used to create the data)

How does it do so?

Through two main algorithms:
1. Gradient descent
2. Backpropagation
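
To make the two algorithms concrete, here is a minimal sketch of them written out by hand for a single parameter. It is an illustration under simplifying assumptions (no bias, mean squared error as the loss, a made-up learning rate), not the training code used later in this notebook:

```python
import torch

torch.manual_seed(42)

# Toy data from a line with a known slope (bias left out to keep the sketch short)
true_weight = 0.7
x = torch.arange(0, 1, 0.02)
target = true_weight * x

# Start from a random guess
w = torch.randn(1, requires_grad=True)
start_error = abs(w.item() - true_weight)
learning_rate = 0.1  # hypothetical value chosen for this sketch

for _ in range(100):
    loss = ((w * x - target) ** 2).mean()  # mean squared error
    loss.backward()                        # backpropagation: compute d(loss)/d(w)
    with torch.no_grad():
        w -= learning_rate * w.grad        # gradient descent: step against the gradient
    w.grad.zero_()                         # reset the gradient for the next iteration

end_error = abs(w.item() - true_weight)
print(f"start error: {start_error:.4f}, end error: {end_error:.4f}")
```

Backpropagation supplies the gradient and gradient descent decides how to use it. In practice `torch.optim` takes care of the update step.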

In [None]:
# Create a linear regression model class
class LinearRegressionModel(nn.Module):
    """
    This class represents a linear regression model in PyTorch. It inherits from the `nn.Module` class
    and implements the necessary methods for defining the computation in the model.
    
    Attributes:
        weights (nn.Parameter): The weight parameter of the linear regression model.
        bias (nn.Parameter): The bias parameter of the linear regression model.
    """
    
    def __init__(self):
        super().__init__()
        
        self.weights = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float32))
        self.bias = nn.Parameter(torch.randn(1, requires_grad=True, dtype=torch.float32))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """ Forward method to define the computation in the model. """
        return self.weights * x + self.bias

* Subclass `nn.Module` (it contains all the building blocks for neural networks)
* Initialize `model.parameters` to be used in various computations (these could be different layers from `torch.nn`, single parameters, hard-coded values or functions)
* `requires_grad=True` means PyTorch will track the gradients of this specific parameter for use with `torch.autograd` and gradient descent (for many `torch.nn` modules, `requires_grad=True` is set by default)
* Any subclass of `nn.Module` needs to override `forward()` (this defines the forward computation of the model)

### PyTorch model building essentials

* `torch.nn` - Contains all of the building blocks for computational graphs (a neural network can be considered a computational graph)
* `torch.nn.Parameter` - The parameters our model should try to learn. Often a PyTorch layer from `torch.nn` will set these for us
* `torch.nn.Module` - The base class for all neural network modules. If you subclass it, you should override `forward()`
* `torch.optim` - This is where the optimizers in PyTorch live, they will help with gradient descent
* `def forward()` - All `nn.Module` subclasses require you to override `forward()`, this method defines what happens in the forward computation
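
As noted above, a layer from `torch.nn` will often create the parameters for us. A minimal sketch of the same kind of model built with `nn.Linear` instead of hand-written `nn.Parameter`s might look like this (the class name `LinearRegressionModelV2` and the attribute name `linear_layer` are assumptions made for this example):

```python
import torch
from torch import nn

class LinearRegressionModelV2(nn.Module):  # hypothetical name for this sketch
    def __init__(self):
        super().__init__()
        # nn.Linear creates the weight and bias parameters for us
        self.linear_layer = nn.Linear(in_features=1, out_features=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear_layer(x)

torch.manual_seed(42)
model_v2 = LinearRegressionModelV2()
sample_input = torch.tensor([[0.5], [1.0]])
sample_output = model_v2(sample_input)

print(model_v2.state_dict().keys())
print(sample_output.shape)
```

Both parameters have `requires_grad=True` by default, so nothing extra is needed for them to be trained.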

### Checking the contents of our PyTorch model

Now we've created a model, let's see what's inside...

We can check our model's parameters (what's inside our model) using `.parameters()`.

In [None]:
# Create a random seed
torch.manual_seed(42)

# Create an instance of the LinearRegressionModel
model_0 = LinearRegressionModel()

# Check the model's parameters
list(model_0.parameters())

In [None]:
# List named parameters
model_0.state_dict()

### Making predictions using `torch.inference_mode()`

To check our model's predictive power, let's see how well it predicts `y_test` based on `X_test`.

When we pass data through our model, it's going to run through the `forward()` method.

In [None]:
# Make predictions with the model
with torch.inference_mode():
    y_preds = model_0(X_test)

y_preds
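
To see what `torch.inference_mode()` changes, the sketch below compares a regular forward pass with one made inside the context manager. It uses a small `nn.Linear` layer as a stand-in model; `layer` and `sample` are names made up for this illustration:

```python
import torch
from torch import nn

layer = nn.Linear(in_features=1, out_features=1)  # stand-in model for this sketch
sample = torch.tensor([[0.5]])

# Regular forward pass: the output is attached to the autograd graph
tracked_output = layer(sample)

# Forward pass under inference mode: no graph is recorded
with torch.inference_mode():
    untracked_output = layer(sample)

print(tracked_output.requires_grad)    # True
print(untracked_output.requires_grad)  # False
print(untracked_output.is_inference()) # True
```

The predicted values are the same in both cases. Only the gradient bookkeeping is skipped, which is why inference mode is a good fit when making predictions.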

In [None]:
plot_predictions(predictions=y_preds)

## 3. Train model


The whole idea of training is for a model to move from some *unknown* parameters (these may be random) to some *known* parameters.  

Or in other words from poor representation of the data to a better representation of the data.

One way to measure how poor or how wrong our model predictions are, is to use a loss function.

* Note: Loss function may also be called cost function or criterion in different areas. For our case, we're going to refer to it as a loss function.

Things we need to train:

* **Loss function**: A function to measure how wrong the predictions are compared to the ideal output. The lower the better. 
* **Optimizer**: Takes into account the loss of the model and adjusts the model's parameters (e.g. weight & bias in our case) to reduce the loss.
    * Inside the optimizer you'll often have to set two parameters:
        * `params` - the model parameters you'd like to optimize, for example `params=model_0.parameters()`
        * `lr` - the learning rate is a hyperparameter that defines how big/small the optimizer changes the parameters with each step (a small `lr` results in small changes, a large `lr` results in large changes)

And specifically for PyTorch, we need:
* A training loop
* A testing loop
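
As a small illustration of what a loss function measures, the sketch below computes the mean absolute error by hand and compares it with `nn.L1Loss`, which computes the same quantity. The prediction and target values are made up for this example:

```python
import torch
from torch import nn

predictions = torch.tensor([0.5, 1.0, 1.5])
targets = torch.tensor([1.0, 1.0, 1.0])

# Mean absolute error written out by hand
manual_mae = (predictions - targets).abs().mean()

# The same quantity via the built-in loss function
builtin_mae = nn.L1Loss()(predictions, targets)

print(manual_mae, builtin_mae)
```

Both values are the average distance between each prediction and its target, so a perfect model would give a loss of 0.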

In [None]:
# Setup a loss function
loss_fn = nn.L1Loss()

# Setup an optimizer (SGD)
optimizer = torch.optim.SGD(
    model_0.parameters(), 
    lr=0.01)

**Q**: Which loss function and optimizer should I use?  
**A**: This will be problem specific. But with experience, you'll get an idea of what works and what doesn't for your particular problem set.

For example, for a regression problem (like this one), a loss function of `nn.L1Loss()` and an optimizer like `torch.optim.SGD()` will suffice.  
But for a classification problem, like classifying whether a photo is of a dog or a cat, you'll likely want to use a loss function of `nn.BCELoss()` (binary cross entropy loss).
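
For comparison with the regression loss, here is a minimal sketch of `nn.BCELoss()` on a toy dog-vs-cat example. The probabilities and labels are invented for illustration; `nn.BCELoss()` expects its inputs to be probabilities between 0 and 1 (for example, sigmoid outputs):

```python
import torch
from torch import nn

# Hypothetical predicted probabilities that each photo is a dog
predicted_probs = torch.tensor([0.9, 0.2, 0.7])
# Ground truth labels: 1.0 = dog, 0.0 = cat
labels = torch.tensor([1.0, 0.0, 1.0])

bce_loss = nn.BCELoss()(predicted_probs, labels)

# The same value from the binary cross entropy formula
manual_bce = -(labels * predicted_probs.log()
               + (1 - labels) * (1 - predicted_probs).log()).mean()

print(bce_loss, manual_bce)
```

Confident, correct predictions (such as 0.9 for a dog) contribute little to the loss, while confident, wrong ones are penalized heavily.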

### Building a training (and testing) loop in PyTorch

A couple of things we need in a training loop:

0. Loop through the data and do the following: 
1. Forward pass (this involves data moving through our model's `forward()` functions) to make predictions on data - also called forward propagation
2. Calculate the loss (compare forward pass predictions to ground truth labels)
3. Optimizer zero grad
4. Loss backward - move backwards through the network to calculate the gradients of each of the parameters of our model with respect to the loss (**backpropagation**)
5. Optimizer step - Use the optimizer to adjust our model's parameters to try and improve the loss (**gradient descent**)
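
The steps above can be sketched as a single training step on a stand-in model. Everything prefixed with `sketch_` or `step_` is a made-up name, chosen so that this example doesn't interfere with `model_0`, `loss_fn` or `optimizer`. The last lines check that plain SGD (no momentum) updates a parameter as `new = old - lr * gradient`:

```python
import torch
from torch import nn

torch.manual_seed(42)

sketch_model = nn.Linear(in_features=1, out_features=1)  # stand-in model
sketch_loss_fn = nn.L1Loss()
learning_rate = 0.01
sketch_optimizer = torch.optim.SGD(sketch_model.parameters(), lr=learning_rate)

step_inputs = torch.tensor([[0.0], [0.5], [1.0]])
step_labels = 0.7 * step_inputs + 0.3

weight_before = sketch_model.weight.detach().clone()

step_preds = sketch_model(step_inputs)             # 1. forward pass
step_loss = sketch_loss_fn(step_preds, step_labels) # 2. calculate the loss
sketch_optimizer.zero_grad()                       # 3. clear old gradients
step_loss.backward()                               # 4. backpropagation
sketch_optimizer.step()                            # 5. gradient descent update

# Plain SGD updates each parameter as: new = old - lr * gradient
expected_weight = weight_before - learning_rate * sketch_model.weight.grad
print(torch.allclose(sketch_model.weight, expected_weight))
```

Zeroing the gradients before `backward()` matters because PyTorch accumulates gradients across calls by default.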

In [None]:
torch.manual_seed(42)

# An epoch is one loop through the entire dataset ... (this is a hyperparameter because we set it ourselves)
epochs = 200

epoch_count = []
loss_values = []
test_loss_values = []

## Training the model
# 0. Loop through the data
for epoch in range(epochs):
    model_0.train() # Set the model to training mode (the default; matters for layers such as dropout and batch norm)

    y_pred = model_0(X_train) # Make predictions

    loss = loss_fn(y_pred, y_train) # Calculate the loss (predictions first, then targets)

    optimizer.zero_grad() # Zero the gradients

    loss.backward() # Calculate gradients

    optimizer.step() # Update the weights

    # Testing / Evaluating the model
    model_0.eval() # Set the model to evaluation mode (turns off training-only behavior such as dropout)
    with torch.inference_mode(): # Turns off gradient tracking
        test_preds = model_0(X_test)

        test_loss = loss_fn(test_preds, y_test)

    if epoch % 10 == 0:
        epoch_count.append(epoch)
        loss_values.append(loss.item()) # Store a plain Python float rather than a tensor attached to the graph
        test_loss_values.append(test_loss.item())
        
        print(f"Epoch: {epoch} --- Loss: {loss} --- Test loss: {test_loss}") 
        print(f"Model state_dict: {model_0.state_dict()}")

In [None]:
# Plot the loss values over the epochs
plt.figure(figsize=(10, 7))
plt.plot(epoch_count, [float(value) for value in loss_values], label='training loss')
plt.plot(epoch_count, [float(value) for value in test_loss_values], label='test loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and test loss curves')
plt.legend()
plt.show()


In [None]:
with torch.inference_mode():
    y_preds_after_training = model_0(X_test)

In [None]:
plot_predictions(predictions=y_preds_after_training)

## Saving a model in PyTorch

There are three main methods you should know about for saving and loading models in PyTorch.

1. `torch.save()` - allows you to save a PyTorch object in Python's pickle format
2. `torch.load()` - allows you to load a saved PyTorch object
3. `torch.nn.Module.load_state_dict()` - allows you to load a model's saved state dictionary
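
A minimal sketch of how the three methods fit together is shown below. It uses a stand-in `nn.Linear` model and a temporary directory so that nothing is left on disk; the file name `sketch_model.pth` and the variable names are assumptions made for this example:

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

original = nn.Linear(in_features=1, out_features=1)  # stand-in model for this sketch

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = Path(tmp_dir) / "sketch_model.pth"  # hypothetical file name

    # 1. torch.save(): write the state_dict to disk
    torch.save(obj=original.state_dict(), f=save_path)

    # 2. torch.load(): read the saved object back into memory
    loaded_state_dict = torch.load(f=save_path)

# 3. load_state_dict(): copy the saved values into a fresh instance
restored = nn.Linear(in_features=1, out_features=1)
restored.load_state_dict(loaded_state_dict)

print(torch.equal(original.weight, restored.weight))
```

Saving only the `state_dict()` keeps the file independent of the class definition's location, at the cost of having to recreate the model instance before loading.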

In [None]:
# 1. Create a directory to save the model
MODEL_PATH = Path('models')
MODEL_PATH.mkdir(parents=True, exist_ok=True)

# 2. Create model save path
MODEL_NAME = "01_pytorch_workflow_model_0.pth"
MODEL_SAVE_PATH = MODEL_PATH / MODEL_NAME

MODEL_SAVE_PATH

# 3. Save model's state_dict
print(f"Saving model to: {MODEL_SAVE_PATH}")
torch.save(
    obj=model_0.state_dict(), 
    f=MODEL_SAVE_PATH)

## Loading a PyTorch model

Since we saved our model's `state_dict()` rather than the entire model, we'll create a new instance of our model class and load the saved `state_dict()` into that.

In [None]:
# To load in a saved state_dict we have to instantiate a new instance of our model class
loaded_model_0 = LinearRegressionModel()

# Load the saved state_dict of model_0 (this will update the new instance with updated parameters)
loaded_model_0.load_state_dict(torch.load(f=MODEL_SAVE_PATH))

In [None]:
loaded_model_0.state_dict(), model_0.state_dict()

In [None]:
# Make some predictions with our loaded model
loaded_model_0.eval()
with torch.inference_mode():
    loaded_model_preds = loaded_model_0(X_test)

loaded_model_preds

In [None]:
# Compare loaded model preds with original model preds
y_preds_after_training == loaded_model_preds
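
The `==` comparison above returns one boolean per element. When a single yes/no answer is more convenient, `torch.equal()` and `torch.allclose()` are possible alternatives. The sketch below uses made-up tensors rather than the actual predictions:

```python
import torch

first_preds = torch.tensor([[0.85], [0.86], [0.87]])  # made-up values
second_preds = first_preds.clone()

elementwise = first_preds == second_preds              # one boolean per element
all_equal = torch.equal(first_preds, second_preds)     # single bool: same shape and values
all_close = torch.allclose(first_preds, second_preds)  # single bool: equal within a tolerance

print(elementwise)
print(all_equal, all_close)
```

`torch.allclose()` can be the more forgiving choice when predictions were computed on different devices and tiny floating point differences are expected.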