# Project 2: A minimal model training experiment

**Goal**:

- Create a [PyTorch LightningModule](https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html) named `ImageClassifier` that holds a convolutional network with a ResNet18 backbone.
- Learn to adapt the dataset class to be compatible with the model.
- Train a model using the dataset class and the dataloader created in the previous project.
- Understand the concept of fine-tuning and the benefits of starting from a pre-trained model.
- Understand the benefits and options of a dataloader.
- Learn how to visualize model predictions.

**Acceptance Criteria**:

- Implement a test that checks that a simple `ImageClassifier` can predict on an image that has the correct shape.
- The `ImageClassifier` can be trained on the CIFAR10 dataset, showing decreasing loss and increasing accuracy over several epochs.

## Step 1: Create a model using `LightningModule`

`LightningModule` is a convenient and structured way to implement a PyTorch model, as well as its training and validation behaviors. For more information, please refer to the [documentation](https://pytorch-lightning.readthedocs.io/en/stable/common/lightning_module.html) for `LightningModule`.

In [3]:
# Install the dependencies:
!pip install torch==1.12.0 torchvision==0.13.0 pytorch-lightning==1.6.4


In [8]:
import pytorch_lightning as pl
from torchvision.models import resnet18

class ImageClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = resnet18(num_classes=10, weights=None)
    
    def forward(self, x):
        return self.net(x)

In [9]:
model = ImageClassifier()

### Testing the forward pass of the model

A model is a trainable function, so it is callable. Calling `model(x)` runs its forward pass, which is defined in `forward()`. We can test this by passing an image to the model.

In [13]:
assert callable(model)  # is the model callable?
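
To illustrate what "callable" means here, the following is a minimal sketch using a hypothetical `TinyModel` (not part of this project). It assumes only standard PyTorch behavior: calling a module instance dispatches to its `forward()` method, so both calls below should give the same result.

```python
import torch
from torch import nn


class TinyModel(nn.Module):
    """Hypothetical two-output linear model, used only for illustration."""

    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(x)


tiny = TinyModel()
x = torch.ones(1, 4)  # a batch with one 4-dimensional input

out_call = tiny(x)             # preferred: goes through nn.Module.__call__
out_forward = tiny.forward(x)  # calls forward() directly, skipping any hooks

print(torch.equal(out_call, out_forward))
```

Calling the instance (`tiny(x)`) is generally preferred over calling `forward()` directly, because it also runs any hooks registered on the module.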

#### What is a valid input?

To test the forward pass of the model, we need to pass a valid input: a tensor of shape `(b, 3, 32, 32)`, where `b` is the batch size (an integer) and `3, 32, 32` matches the 32×32 RGB images of CIFAR10.

**Your Task**: Fix the code below to pass the test.

**Tip:** `torch.from_numpy()` converts a NumPy array to a tensor.
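
As a hint, here is a minimal sketch of the conversion on a small, hypothetical array (the `(3, 8, 8)` shape is chosen only for illustration and is not the shape the model expects). It shows `torch.from_numpy()` keeping the array's dtype and `unsqueeze(0)` adding a leading batch dimension.

```python
import numpy as np
import torch

# Hypothetical small array standing in for a single channels-first image.
array = np.ones(shape=(3, 8, 8), dtype=np.float32)

tensor = torch.from_numpy(array)  # shares memory with `array`, keeps float32
batch = tensor.unsqueeze(0)       # add a batch dimension in front

print(tensor.shape, batch.shape, batch.dtype)
```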


In [None]:
import numpy as np
import torch

# TODO: The test is broken. Please fix it.
def test_model_can_predict_on_a_random_image():
    input_image = np.ones(shape=(3, 224, 224), dtype=np.float32)
    output = model(input_image)  # run the model on the input image


test_model_can_predict_on_a_random_image()

## Step 2: Prepare the dataset and data loader

Please reuse the `CustomDataset` and dataloader from the previous project.

In [None]:
# TODO: copy over the code from the previous project.

## Step 3: Training the model

We can now add the training behavior to the model. To train a model using backpropagation, we need to define a loss function, an optimizer, and a training step. In the class below, these are `loss()`, `configure_optimizers()`, and `training_step()`, respectively.

In [None]:
from torch.nn import functional as F


class ImageClassifier(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.net = resnet18(num_classes=10, weights=None)
    
    def forward(self, x):
        return self.net(x)
    
    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=0.001)
    
    def loss(self, y_hat, y):
        return F.cross_entropy(y_hat, y)
    
    def training_step(self, batch, batch_idx):
        x, y = batch
        y_hat = self(x)
        loss = self.loss(y_hat, y)
        return {'loss': loss}
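
To make the loss concrete, here is a minimal sketch of `F.cross_entropy` on hand-made inputs. The logits and labels are hypothetical values chosen for illustration. The function takes raw, unnormalized scores of shape `(b, num_classes)` and integer class labels of shape `(b,)`. When every class gets the same score, the loss equals `ln(10)` for 10 classes, which is roughly where an untrained 10-class classifier tends to start.

```python
import math

import torch
from torch.nn import functional as F

# Hypothetical batch of 4 predictions over 10 classes, all scores equal.
logits = torch.zeros(4, 10)
# Integer class labels, one per example (dtype int64).
targets = torch.tensor([0, 3, 5, 9])

loss = F.cross_entropy(logits, targets)  # scalar tensor, averaged over the batch

print(loss.item(), math.log(10))
```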

In [None]:
def test_training_step_works():
    # TODO: Use the data loader to get a batch of data.
    #    Then, call `training_step` and assert that the return value is a dictionary with a
    #    single 'loss' key holding a scalar tensor, e.g. {'loss': tensor(0.5)}.
    ...