<a href="https://colab.research.google.com/github/mrdbourke/pytorch-deep-learning/blob/main/extras/exercises/01_pytorch_workflow_exercises.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 01. PyTorch Workflow Exercise Template

The following is a template for the PyTorch workflow exercises.

It's only starter code and it's your job to fill in the blanks.

Because of the flexibility of PyTorch, there may be more than one way to answer the question.

Don't worry about trying to be *right*; just try writing code that answers the question.

You can see one form of [solutions on GitHub](https://github.com/mrdbourke/pytorch-deep-learning/tree/main/extras/solutions) (but try the exercises below yourself first!).

In [10]:
# Import necessary libraries
import torch
from torch import nn
from pathlib import Path
import matplotlib.pyplot as plt

In [2]:
# Setup device-agnostic code
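
One common pattern for this cell, shown here as a minimal sketch rather than the required answer, is to pick `"cuda"` when a GPU is visible and fall back to `"cpu"` otherwise. The variable names `device` and `example` are assumptions; use whatever names you prefer.

```python
import torch

# Use the GPU if PyTorch can see one, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Tensors (and models) can then be moved with .to(device).
example = torch.arange(0, 1, 0.01).to(device)
print(f"Using device: {device} | example tensor lives on: {example.device}")
```

Data and model need to live on the same device, so whichever name you choose, pass it to `.to()` for both.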

## 1. Create a straight line dataset using the linear regression formula (`weight * X + bias`).
  * Set `weight=0.3` and `bias=0.9`; there should be at least 100 datapoints total.
  * Split the data into 80% training, 20% testing.
  * Plot the training and testing data so it becomes visual.

Your output of the below cell should look something like:
```
Number of X samples: 100
Number of y samples: 100
First 10 X & y samples:
X: tensor([0.0000, 0.0100, 0.0200, 0.0300, 0.0400, 0.0500, 0.0600, 0.0700, 0.0800,
        0.0900])
y: tensor([0.9000, 0.9030, 0.9060, 0.9090, 0.9120, 0.9150, 0.9180, 0.9210, 0.9240,
        0.9270])
```

Of course the numbers in `X` and `y` may be different but ideally they're created using the linear regression formula.

In [3]:
# Create the data parameters
weight = 0.3
bias = 0.9

# Make X and y using the linear regression formula
X = torch.arange(0, 1, 0.01)
y = weight*X + bias

print(f"Number of X samples: {len(X)}")
print(f"Number of y samples: {len(y)}")
print(f"First 10 X & y samples:\nX: {X[:10]}\ny: {y[:10]}")

Number of X samples: 100
Number of y samples: 100
First 10 X & y samples:
X: tensor([0.0000, 0.0100, 0.0200, 0.0300, 0.0400, 0.0500, 0.0600, 0.0700, 0.0800,
        0.0900])
y: tensor([0.9000, 0.9030, 0.9060, 0.9090, 0.9120, 0.9150, 0.9180, 0.9210, 0.9240,
        0.9270])


In [4]:
# Split the data into training and testing
num_train = int(.8*len(X))

train_X, train_y, test_X, test_y = X[:num_train], y[:num_train], X[num_train:], y[num_train:]
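
As a quick sanity check on the split above, the sketch below wraps the same slicing in a small helper and prints the resulting sizes. The helper name `split_ordered` and its `train_fraction` argument are hypothetical, introduced only for illustration.

```python
import torch

def split_ordered(X, y, train_fraction=0.8):
    """Slice the first `train_fraction` of samples for training, the rest for testing."""
    num_train = int(train_fraction * len(X))
    return X[:num_train], y[:num_train], X[num_train:], y[num_train:]

# Same data parameters as the exercise
weight, bias = 0.3, 0.9
X = torch.arange(0, 1, 0.01)
y = weight * X + bias

train_X, train_y, test_X, test_y = split_ordered(X, y)
print(len(train_X), len(train_y), len(test_X), len(test_y))
```

Because the slice is ordered, the test set contains only the largest `X` values, which is why the test points sit at the right-hand end of the plot.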

In [5]:
# Plot the training and testing data
plt.scatter(train_X, train_y, c="g", s=4, label="Train")
plt.scatter(test_X, test_y, c="r", s=4, label="Test")
plt.legend()
plt.show()

## 2. Build a PyTorch model by subclassing `nn.Module`. 
  * Inside should be a randomly initialized `nn.Parameter()` with `requires_grad=True`, one for `weights` and one for `bias`. 
  * Implement the `forward()` method to compute the linear regression function you used to create the dataset in 1. 
  * Once you've constructed the model, make an instance of it and check its `state_dict()`.
  * **Note:** If you'd like to use `nn.Linear()` instead of `nn.Parameter()` you can.
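
For the `nn.Linear()` route mentioned in the note, a minimal sketch could look like the following. The class name `LinearModelV2` and the attribute name `linear_layer` are hypothetical, and because `nn.Linear` expects a trailing feature dimension, this version assumes inputs of shape `(N, 1)` (for example via `X.unsqueeze(dim=1)`).

```python
import torch
from torch import nn

class LinearModelV2(nn.Module):
    """Linear regression using nn.Linear (hypothetical name, for illustration)."""
    def __init__(self):
        super().__init__()
        # One input feature, one output feature: y = weight * x + bias
        self.linear_layer = nn.Linear(in_features=1, out_features=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear_layer(x)

model_v2 = LinearModelV2()
X_col = torch.arange(0, 1, 0.01).unsqueeze(dim=1)  # shape (100, 1)
print(model_v2.state_dict().keys())
print(model_v2(X_col).shape)
```

With this version the `state_dict()` keys are prefixed by the layer's attribute name, so they differ from the `weights` and `bias` keys of the `nn.Parameter()` version.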

In [14]:
# Create PyTorch linear regression model by subclassing nn.Module
class LinearModel(nn.Module):
    """Implements a one-dimensional linear regression model"""
    def __init__(self):
        super().__init__()
        
        self.weights = nn.Parameter(torch.randn(1, dtype=torch.float), requires_grad=True) # Weight parameter
        self.bias = nn.Parameter(torch.randn(1, dtype=torch.float), requires_grad=True) # Bias parameter
        
        
    def forward(self, x):
        return self.weights*x + self.bias
    
linreg = LinearModel()
linreg.state_dict()

OrderedDict([('weights', tensor([0.5821])), ('bias', tensor([0.4041]))])

In [7]:
# Instantiate the model and put it to the target device


## 3. Create a loss function and optimizer using `nn.L1Loss()` and `torch.optim.SGD(params, lr)` respectively. 
  * Set the learning rate of the optimizer to be 0.01 and the parameters to optimize should be the model parameters from the model you created in 2.
  * Write a training loop to perform the appropriate training steps for 300 epochs.
  * The training loop should test the model on the test dataset every 20 epochs.

In [22]:
# Create the loss function and optimizer
l1_loss = nn.L1Loss()
optim = torch.optim.SGD(linreg.parameters(), lr=0.01)

In [25]:
# Training loop


# Train model for 300 epochs
epochs=300

# Send data to target device


for epoch in range(epochs):
    ### Training

    # Put model in train mode
    linreg.train()

    # 1. Forward pass (call the model directly rather than .forward() so any hooks run)
    pred_y = linreg(train_X)

    # 2. Calculate loss
    loss = l1_loss(pred_y, train_y)

    # 3. Zero gradients
    optim.zero_grad()

    # 4. Backpropagation
    loss.backward()

    # 5. Step the optimizer
    optim.step()

    ### Perform testing every 20 epochs
    if epoch % 20 == 0:

        # Put model in evaluation mode and setup inference context
        linreg.eval()
        with torch.inference_mode():
            # 1. Forward pass
            test_pred_y = linreg(test_X)
            # 2. Calculate test loss
            test_loss = l1_loss(test_pred_y, test_y)
            # Print out what's happening
            print(f"Epoch: {epoch} | Train loss: {loss:.3f} | Test loss: {test_loss:.3f}")

Epoch: 0 | Train loss: 0.361 | Test loss: 0.203
Epoch: 20 | Train loss: 0.131 | Test loss: 0.065
Epoch: 40 | Train loss: 0.079 | Test loss: 0.156
Epoch: 60 | Train loss: 0.071 | Test loss: 0.160
Epoch: 80 | Train loss: 0.064 | Test loss: 0.147
Epoch: 100 | Train loss: 0.057 | Test loss: 0.131
Epoch: 120 | Train loss: 0.050 | Test loss: 0.116
Epoch: 140 | Train loss: 0.043 | Test loss: 0.101
Epoch: 160 | Train loss: 0.036 | Test loss: 0.085
Epoch: 180 | Train loss: 0.029 | Test loss: 0.068
Epoch: 200 | Train loss: 0.023 | Test loss: 0.052
Epoch: 220 | Train loss: 0.016 | Test loss: 0.036
Epoch: 240 | Train loss: 0.009 | Test loss: 0.020
Epoch: 260 | Train loss: 0.002 | Test loss: 0.004
Epoch: 280 | Train loss: 0.006 | Test loss: 0.010
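
One way to check what the loop above has learned is to compare the trained parameters with the values used to generate the data (`weight=0.3`, `bias=0.9`). The sketch below re-declares the model and reruns a compact version of the training steps so it is self-contained. The fixed seed and the names `model`, `loss_fn`, `optimizer`, `losses`, and `learned` are assumptions added for illustration, and the learned values will differ with other seeds.

```python
import torch
from torch import nn

torch.manual_seed(42)  # assumption: fixed seed so repeated runs match

# Same data and 80/20 ordered split as the exercise
weight, bias = 0.3, 0.9
X = torch.arange(0, 1, 0.01)
y = weight * X + bias
train_X, train_y = X[:80], y[:80]

class LinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1))
        self.bias = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return self.weights * x + self.bias

model = LinearModel()
loss_fn = nn.L1Loss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

losses = []  # training loss per epoch
for epoch in range(300):
    model.train()
    loss = loss_fn(model(train_X), train_y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Pull the learned scalars out of the state dict for comparison
learned = {name: param.item() for name, param in model.state_dict().items()}
print(f"Target  -> weights: {weight}, bias: {bias}")
print(f"Learned -> weights: {learned['weights']:.4f}, bias: {learned['bias']:.4f}")
print(f"Train loss: first epoch {losses[0]:.3f}, last epoch {losses[-1]:.3f}")
```

How close the learned values get depends on the random starting point, since 300 steps at `lr=0.01` can only move each parameter a limited distance.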


## 4. Make predictions with the trained model on the test data.
  * Visualize these predictions against the original training and testing data (**note:** you may need to make sure the predictions are *not* on the GPU if you want to use non-CUDA-enabled libraries such as matplotlib to plot).

In [26]:
# Make predictions with the model
linreg.eval()

with torch.inference_mode():
    pred_y = linreg(test_X)
    print(pred_y, test_y)

tensor([1.1356, 1.1387, 1.1417, 1.1447, 1.1478, 1.1508, 1.1538, 1.1569, 1.1599,
        1.1630, 1.1660, 1.1690, 1.1721, 1.1751, 1.1781, 1.1812, 1.1842, 1.1872,
        1.1903, 1.1933]) tensor([1.1400, 1.1430, 1.1460, 1.1490, 1.1520, 1.1550, 1.1580, 1.1610, 1.1640,
        1.1670, 1.1700, 1.1730, 1.1760, 1.1790, 1.1820, 1.1850, 1.1880, 1.1910,
        1.1940, 1.1970])


In [None]:
# Plot the predictions (these may need to be on a specific device)
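
A minimal plotting sketch for this cell is shown below. The helper name `plot_predictions` and the stand-in predictions are assumptions for illustration; in the notebook you would pass the model's predictions from the cell above. Tensors are moved to the CPU and converted to NumPy before plotting, since matplotlib cannot read GPU tensors.

```python
import torch
import matplotlib.pyplot as plt

def plot_predictions(train_X, train_y, test_X, test_y, predictions=None):
    """Scatter training data, testing data and (optionally) predictions on one figure."""
    fig, ax = plt.subplots(figsize=(10, 7))
    ax.scatter(train_X.cpu().numpy(), train_y.cpu().numpy(), c="g", s=4, label="Train")
    ax.scatter(test_X.cpu().numpy(), test_y.cpu().numpy(), c="r", s=4, label="Test")
    if predictions is not None:
        # detach() drops any autograd history before converting to NumPy
        ax.scatter(test_X.cpu().numpy(), predictions.detach().cpu().numpy(),
                   c="b", s=4, label="Predictions")
    ax.legend()
    return fig, ax

# Same data and split as the exercise
X = torch.arange(0, 1, 0.01)
y = 0.3 * X + 0.9
train_X, train_y, test_X, test_y = X[:80], y[:80], X[80:], y[80:]

# Stand-in predictions (hypothetical values, not real model output)
fake_preds = test_y - 0.004
fig, ax = plot_predictions(train_X, train_y, test_X, test_y, predictions=fake_preds)
```

In a script, call `plt.show()` afterwards; in a notebook the figure renders at the end of the cell.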


## 5. Save your trained model's `state_dict()` to file.
  * Create a new instance of your model class you made in 2. and load in the `state_dict()` you just saved to it.
  * Perform predictions on your test data with the loaded model and confirm they match the original model predictions from 4.

In [28]:
from pathlib import Path

# 1. Create models directory 
MODEL_PATH = Path("models")
MODEL_PATH.mkdir(parents=True, exist_ok=True)
# 2. Create model save path 
MODEL_NAME = "pytorch_linreg.pth"
MODEL_SAVE_PATH = MODEL_PATH / MODEL_NAME
# 3. Save the model state dict
torch.save(obj=linreg.state_dict(), f=MODEL_SAVE_PATH)

In [29]:
# Create new instance of model and load saved state dict (make sure to put it on the target device)
loaded_linreg = LinearModel()

loaded_linreg.load_state_dict(torch.load(f=MODEL_SAVE_PATH))

<All keys matched successfully>

In [30]:
# Make predictions with loaded model and compare them to the previous
loaded_linreg.eval()
with torch.inference_mode():
    pred_y_loaded = loaded_linreg(test_X)
pred_y, pred_y_loaded

(tensor([1.1356, 1.1387, 1.1417, 1.1447, 1.1478, 1.1508, 1.1538, 1.1569, 1.1599,
         1.1630, 1.1660, 1.1690, 1.1721, 1.1751, 1.1781, 1.1812, 1.1842, 1.1872,
         1.1903, 1.1933]),
 tensor([1.1356, 1.1387, 1.1417, 1.1447, 1.1478, 1.1508, 1.1538, 1.1569, 1.1599,
         1.1630, 1.1660, 1.1690, 1.1721, 1.1751, 1.1781, 1.1812, 1.1842, 1.1872,
         1.1903, 1.1933]))
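
To confirm the match programmatically rather than by eye, the two prediction tensors can be compared directly. The sketch below is self-contained: it re-declares the model, saves to a temporary directory (an assumption, to avoid touching `models/`), reloads, and compares with `torch.equal`. The names `original`, `loaded`, and `predictions_match` are hypothetical.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

class LinearModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.weights = nn.Parameter(torch.randn(1))
        self.bias = nn.Parameter(torch.randn(1))

    def forward(self, x):
        return self.weights * x + self.bias

# Last 20% of the exercise data serves as the test inputs
test_X = torch.arange(0, 1, 0.01)[80:]
original = LinearModel()

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = Path(tmp_dir) / "pytorch_linreg.pth"
    torch.save(obj=original.state_dict(), f=save_path)

    # Fresh instance with different random parameters, then overwrite them
    loaded = LinearModel()
    loaded.load_state_dict(torch.load(f=save_path))

original.eval()
loaded.eval()
with torch.inference_mode():
    original_preds = original(test_X)
    loaded_preds = loaded(test_X)

# torch.equal returns a single bool: same shape and same values
predictions_match = torch.equal(original_preds, loaded_preds)
print(f"Predictions match: {predictions_match}")
```

`torch.equal` is appropriate here because the loaded parameters are bit-for-bit copies; when comparing models that were trained separately, `torch.allclose` with a tolerance is usually the better fit.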