# 07. PyTorch Experiment Tracking

Machine learning is very experimental.

Working out which experiments are worth pursuing is where **experiment tracking** comes in: it helps you figure out what doesn't work so you can figure out what **does** work.

In this notebook, we're going to see an example of programmatically tracking experiments.

Resources:
* [Book version of notebook](https://www.learnpytorch.io/07_pytorch_experiment_tracking/)
* [Ask a question](https://github.com/mrdbourke/pytorch-deep-learning/discussions)
* [Extra-curriculum](https://madewithml.com/courses/mlops/experiment-tracking/)

In [1]:
import torch
import torchvision

torch.__version__, torchvision.__version__

('1.13.0+cu117', '0.14.0+cu117')

In [2]:
# Setup device-agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [3]:
# Set seeds
def set_seeds(seed: int = 42):
    """Sets random seeds for torch operations.

    Args:
        seed (int, optional): Random seed to set. Defaults to 42.
    """
    # Set the seed for general torch operations
    torch.manual_seed(seed)
    # Set the seed for CUDA torch operations (ones that happen on the GPU)
    torch.cuda.manual_seed(seed)

In [4]:
set_seeds()

## 1. Get data

We want to get the pizza, steak and sushi images so we can run experiments building FoodVision Mini and see which model performs best.

In [5]:
import os
import zipfile
import requests
from pathlib import Path


# Example source: https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip
def download_data(source: str,
                  destination: str,
                  remove_source: bool = True) -> Path:
    """Download a zipped dataset from source and unzip it to destination."""
    # Setup path to data folder
    data_path = Path('data/')
    image_path = data_path / destination

    # If the image folder already exists, skip the download; otherwise create it
    if image_path.is_dir():
        print(f'[INFO] {image_path} already exist, skipping download')
    else:
        print(f'[INFO] {image_path} does not exist, creating')
        image_path.mkdir(parents=True, exist_ok=True)

        # Download the target data
        target_file = Path(source).name
        with open(data_path / target_file, 'wb') as f:
            request = requests.get(source)
            print(f'[INFO] Downloading {target_file} from {source}')
            f.write(request.content)

        # Unzip target file
        with zipfile.ZipFile(data_path / target_file, 'r') as f:
            print(f'[INFO] Unzipping {target_file} data')
            f.extractall(image_path)

        # Remove zip file
        if remove_source:
            os.remove(data_path / target_file)

    return image_path

In [6]:
image_path = download_data(source='https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip',
                           destination='pizza_steak_sushi')

image_path

[INFO] data\pizza_steak_sushi already exist, skipping download


WindowsPath('data/pizza_steak_sushi')
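
After downloading, a quick sanity check is to count how many images ended up in each class folder. The sketch below assumes the unzipped archive follows a `train/<class_name>/*.jpg` layout (an assumption about the data, not something printed above), and `count_images_per_class` is a hypothetical helper rather than part of `going_modular`.

```python
from pathlib import Path
from typing import Dict


def count_images_per_class(split_dir: Path, pattern: str = "*.jpg") -> Dict[str, int]:
    """Counts files matching `pattern` in each immediate subfolder of `split_dir`.

    Hypothetical helper: assumes one subfolder per class (e.g. train/pizza/).
    """
    counts = {}
    for class_dir in sorted(p for p in Path(split_dir).iterdir() if p.is_dir()):
        counts[class_dir.name] = len(list(class_dir.glob(pattern)))
    return counts


# Example usage (the count only runs if the folder exists on disk)
train_split = Path("data/pizza_steak_sushi/train")
if train_split.is_dir():
    print(count_images_per_class(train_split))
```

Roughly equal counts per class suggest the split is balanced, which makes accuracy easier to interpret later on.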

## 2. Create DataSets and DataLoaders

### 2.1 Create DataLoaders with manual transforms

The goal of transforms is to ensure your custom data is prepared in a reproducible way and in a format that suits pretrained models.

In [7]:
# Setup directories
train_dir = image_path / 'train'
test_dir = image_path / 'test'

train_dir, test_dir

(WindowsPath('data/pizza_steak_sushi/train'),
 WindowsPath('data/pizza_steak_sushi/test'))

In [8]:
# Setup ImageNet normalization levels
# See here: https://pytorch.org/vision/stable/models.html
from going_modular.data_setup import create_dataloaders
from torchvision import transforms


normalize = transforms.Normalize(mean=[.485, .456, .406],
                                 std=[.229, .224, .225])

# Create transform pipeline manually
manual_transforms = transforms.Compose([
    transforms.Resize(size=(224, 224)),
    transforms.ToTensor(),
    normalize
])
print(f'Manually created transforms: {manual_transforms}')

# Create DataLoaders
train_dataloader, test_dataloader, class_names = create_dataloaders(train_dir=train_dir,
                                                                    test_dir=test_dir,
                                                                    transform=manual_transforms,
                                                                    batch_size=32)

train_dataloader, test_dataloader, class_names

Manually created transforms: Compose(
    Resize(size=(224, 224), interpolation=bilinear, max_size=None, antialias=None)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)


(<torch.utils.data.dataloader.DataLoader at 0x17f3b902860>,
 <torch.utils.data.dataloader.DataLoader at 0x17f3b903400>,
 ['pizza', 'steak', 'sushi'])
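
`transforms.Normalize` applies `(value - mean) / std` to each channel independently. The plain-Python sketch below works through that arithmetic for a single RGB pixel that `ToTensor()` has already scaled to `[0, 1]`. `normalize_pixel` is a hypothetical helper for illustration only, not part of torchvision.

```python
from typing import Sequence, Tuple

# ImageNet channel statistics, as used in the manual transform above
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def normalize_pixel(pixel: Sequence[float],
                    mean: Sequence[float] = IMAGENET_MEAN,
                    std: Sequence[float] = IMAGENET_STD) -> Tuple[float, ...]:
    """Applies (value - mean) / std per channel to one RGB pixel in [0, 1]."""
    return tuple((value - m) / s for value, m, s in zip(pixel, mean, std))


# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel((0.485, 0.456, 0.406)))  # (0.0, 0.0, 0.0)

# A pure white pixel lands more than two standard deviations above the mean
print(normalize_pixel((1.0, 1.0, 1.0)))
```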

### 2.2 Create DataLoaders using automatically created transforms

The same principle applies for automatic transforms: we want our custom data in the same format as the data a pretrained model was trained on.

In [9]:
# Setup directories
import torchvision
train_dir = image_path / 'train'
test_dir = image_path / 'test'

# Setup pretrained weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT  # "DEFAULT" = best available

# Get transforms from weights (these are the transforms that were used to obtain this particular set of weights)
automatic_transforms = weights.transforms()
print(f'Automatically created transforms: {automatic_transforms}')

# Create DataLoaders
train_dataloader, test_dataloader, class_names = create_dataloaders(train_dir=train_dir,
                                                                    test_dir=test_dir,
                                                                    transform=automatic_transforms,
                                                                    batch_size=32)

train_dataloader, test_dataloader, class_names

Automatically created transforms: ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)


(<torch.utils.data.dataloader.DataLoader at 0x17f3b902620>,
 <torch.utils.data.dataloader.DataLoader at 0x17f3b902b60>,
 ['pizza', 'steak', 'sushi'])
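
Unlike the manual pipeline, which resizes straight to 224x224, the automatic transforms report `resize_size=[256]` followed by `crop_size=[224]`. The sketch below works through the geometry under the assumption that the shorter side is scaled to 256 with the aspect ratio kept and a centred 224x224 square is then cut out. `resize_then_center_crop_box` is a hypothetical helper, and the exact rounding inside torchvision may differ by a pixel.

```python
from typing import Tuple


def resize_then_center_crop_box(width: int,
                                height: int,
                                resize_size: int = 256,
                                crop_size: int = 224) -> Tuple[Tuple[int, int], Tuple[int, int, int, int]]:
    """Returns the resized (width, height) and the (left, top, right, bottom) crop box.

    Sketch only: scales the shorter side to `resize_size` (aspect ratio kept),
    then centres a `crop_size` x `crop_size` square in the resized image.
    """
    short_side = min(width, height)
    # Integer arithmetic avoids floating point surprises in the scaled sizes
    new_width = width * resize_size // short_side
    new_height = height * resize_size // short_side
    left = (new_width - crop_size) // 2
    top = (new_height - crop_size) // 2
    return (new_width, new_height), (left, top, left + crop_size, top + crop_size)


# A 512x384 image becomes 341x256, then the middle 224x224 region is kept
print(resize_then_center_crop_box(512, 384))  # ((341, 256), (58, 16, 282, 240))
```

Cropping after an aspect-preserving resize avoids stretching the image, at the cost of discarding a border around the edges.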

## 3. Get a pretrained model, freeze the base layers and change the classifier head

In [10]:
# Get the pretrained weights for EfficientNet_B0
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT

# Setup the model with the pretrained weights and send it to the target device
model = torchvision.models.efficientnet_b0(weights=weights).to(device)

In [11]:
# Freeze all base layers by setting their requires_grad attribute to False
for param in model.features.parameters():
    param.requires_grad = False

In [12]:
# Change the classifier head
from torch import nn
model.classifier = nn.Sequential(
    nn.Dropout(p=.2, inplace=True),
    nn.Linear(in_features=1280, out_features=len(class_names))
).to(device)
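
With the base layers frozen, the only trainable weights live in the new classifier head. A fully connected layer has `in_features * out_features` weights plus `out_features` biases, so the size of the head can be worked out by hand. `linear_param_count` below is a hypothetical helper for the arithmetic, not a PyTorch function.

```python
def linear_param_count(in_features: int, out_features: int, bias: bool = True) -> int:
    """Number of learnable parameters in a fully connected (linear) layer."""
    return in_features * out_features + (out_features if bias else 0)


class_names = ["pizza", "steak", "sushi"]

# 1280 inputs x 3 classes = 3840 weights, plus 3 biases
head_params = linear_param_count(in_features=1280, out_features=len(class_names))
print(head_params)  # 3843
```

The dropout layer adds no parameters, so this figure is the full trainable count for the head.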

In [13]:
from torchinfo import summary


summary(model=model,
        input_size=(32, 3, 224, 224),
        verbose=0,
        col_names=['input_size', 'output_size', 'num_params', 'trainable'],
        row_settings=['var_names'])

Layer (type (var_name))                                      Input Shape               Output Shape              Param #                   Trainable
EfficientNet (EfficientNet)                                  [32, 3, 224, 224]         [32, 3]                   --                        Partial
├─Sequential (features)                                      [32, 3, 224, 224]         [32, 1280, 7, 7]          --                        False
│    └─Conv2dNormActivation (0)                              [32, 3, 224, 224]         [32, 32, 112, 112]        --                        False
│    │    └─Conv2d (0)                                       [32, 3, 224, 224]         [32, 32, 112, 112]        (864)                     False
│    │    └─BatchNorm2d (1)                                  [32, 32, 112, 112]        [32, 32, 112, 112]        (64)                      False
│    │    └─SiLU (2)                                         [32, 32, 112, 112]        [32, 32, 112, 112]        --         

## 4. Train a single model and track results

In [14]:
# Define loss function and optimizer
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params=model.parameters(),
                             lr=.001)

To track experiments, we're going to use [TensorBoard](https://www.tensorflow.org/tensorboard?hl=zh-tw).

To interact with TensorBoard, we can use PyTorch's [SummaryWriter](https://pytorch.org/docs/stable/tensorboard.html).
  * Also see the [`SummaryWriter` class documentation](https://pytorch.org/docs/stable/tensorboard.html#torch.utils.tensorboard.writer.SummaryWriter).

In [15]:
# Setup a SummaryWriter
from torch.utils.tensorboard import SummaryWriter


writer = SummaryWriter()
writer

<torch.utils.tensorboard.writer.SummaryWriter at 0x17f43573970>

In [16]:
from tqdm.auto import tqdm
from typing import Dict, List, Tuple
from going_modular.engine import train_step, test_step


def train(model: torch.nn.Module,
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device) -> Dict[str, List[float]]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch models through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Args:
      model: A PyTorch model to be trained and tested.
      train_dataloader: A DataLoader instance for the model to be trained on.
      test_dataloader: A DataLoader instance for the model to be tested on.
      optimizer: A PyTorch optimizer to help minimize the loss function.
      loss_fn: A PyTorch loss function to calculate loss on both datasets.
      epochs: An integer indicating how many epochs to train for.
      device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
      A dictionary of training and testing loss as well as training and
      testing accuracy metrics. Each metric has a value in a list for 
      each epoch.
      In the form: {train_loss: [...],
                    train_acc: [...],
                    test_loss: [...],
                    test_acc: [...]} 
      For example if training for epochs=2: 
                   {train_loss: [2.0616, 1.0537],
                    train_acc: [0.3945, 0.3945],
                    test_loss: [1.2641, 1.5706],
                    test_acc: [0.3400, 0.2973]} 
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
               }

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                           dataloader=train_dataloader,
                                           loss_fn=loss_fn,
                                           optimizer=optimizer,
                                           device=device)
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)

        # Print out what's happening
        print(
            f"Epoch: {epoch+1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

        ### New: Experiment tracking ###
        writer.add_scalars(main_tag='Loss',
                           tag_scalar_dict={'train_loss': train_loss,
                                            'test_loss': test_loss},
                           global_step=epoch)

        writer.add_scalars(main_tag='Accuracy',
                           tag_scalar_dict={'train_acc': train_acc,
                                            'test_acc': test_acc},
                           global_step=epoch)

        writer.add_graph(model=model,
                         input_to_model=torch.randn(32, 3, 224, 224).to(device))

    # Close the writer
    writer.close()
    ### End new ###

    # Return the filled results at the end of the epochs
    return results

ImportError: cannot import name 'train_step' from partially initialized module 'going_modular.engine' (most likely due to a circular import) (d:\Side_project\Udemy\PyTorch_for_Deep_Learning_in_2023\going_modular\engine.py)

In [None]:
# Train model
# Note: not using engine.train(), since we updated the train() function above
set_seeds()

results = train(model=model,
                train_dataloader=train_dataloader,
                test_dataloader=test_dataloader,
                optimizer=optimizer,
                loss_fn=loss_fn,
                epochs=5,
                device=device)

  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0744 | train_acc: 0.4219 | test_loss: 0.8657 | test_acc: 0.7737
Epoch: 2 | train_loss: 0.8918 | train_acc: 0.6641 | test_loss: 0.7681 | test_acc: 0.7945
Epoch: 3 | train_loss: 0.7432 | train_acc: 0.7500 | test_loss: 0.6500 | test_acc: 0.8655
Epoch: 4 | train_loss: 0.6621 | train_acc: 0.8867 | test_loss: 0.6412 | test_acc: 0.8456
Epoch: 5 | train_loss: 0.6875 | train_acc: 0.7617 | test_loss: 0.6665 | test_acc: 0.8144
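
The dictionary returned by `train()` can also be inspected directly, without TensorBoard. As a minimal sketch, the hypothetical helper `best_epoch` below picks out the epoch with the highest value of a metric, using the test accuracy values copied from the log above.

```python
from typing import Dict, List, Tuple


def best_epoch(results: Dict[str, List[float]], metric: str = "test_acc") -> Tuple[int, float]:
    """Returns the 1-indexed epoch with the highest value of `metric`, and that value."""
    values = results[metric]
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Test accuracy per epoch, copied from the training log above
logged_results = {"test_acc": [0.7737, 0.7945, 0.8655, 0.8456, 0.8144]}
print(best_epoch(logged_results))  # (3, 0.8655)
```

Here test accuracy peaks at epoch 3 and drops afterwards, which is the kind of pattern that is easier to spot once results are tracked per epoch.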


## 5. View our model's results with TensorBoard

There are a few ways to view TensorBoard results; see them [here](https://www.learnpytorch.io/07_pytorch_experiment_tracking/#5-view-our-models-results-in-tensorboard).

In [None]:
# Let's view our experiments from within the notebook
%load_ext tensorboard
%tensorboard --logdir runs

The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard


Reusing TensorBoard on port 6006 (pid 25292), started 0:00:27 ago. (Use '!kill 25292' to kill it.)

## 6. Create a function to prepare a `SummaryWriter()` instance

By default, our `SummaryWriter()` saves to the directory set by its `log_dir` parameter, which defaults to `runs/CURRENT_DATETIME_HOSTNAME`.

What if we wanted to save different experiments to different folders?

In essence, one experiment = one folder.

For example, we'd like to track:
* Experiment date/timestamp
* Experiment name
* Model name
* Extra - is there anything else that should be tracked?

Let's create a function to create a `SummaryWriter()` instance to take all of these things into account.

So ideally we end up tracking experiments to a directory:

`runs/YYYY-MM-DD/experiment_name/model_name/extra`

In [None]:
from typing import Optional

from torch.utils.tensorboard.writer import SummaryWriter


def create_writer(experiment_name: str,
                  model_name: str,
                  extra: Optional[str] = None) -> SummaryWriter:
    """Create a torch.utils.tensorboard.writer.SummaryWriter() instance tracking to a specific directory."""
    from datetime import datetime
    import os

    # Get timestamp of current date in YYYY-MM-DD format
    timestamp = datetime.now().strftime('%Y-%m-%d')

    if extra:
        # Create log directory path
        log_dir = os.path.join('runs', timestamp, experiment_name, model_name, extra)
    else:
        log_dir = os.path.join('runs', timestamp, experiment_name, model_name)

    print(f'[INFO] Created SummaryWriter saving to {log_dir}')

    return SummaryWriter(log_dir=log_dir)
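
The directory logic inside `create_writer()` can also be pulled out into a small pure function, which makes it easy to check without creating any event files. This is a minimal sketch: `build_log_dir` and its `today` and `root` parameters are hypothetical names introduced here, and the date in the usage example is an arbitrary fixed value.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def build_log_dir(experiment_name: str,
                  model_name: str,
                  extra: Optional[str] = None,
                  today: Optional[date] = None,
                  root: str = "runs") -> Path:
    """Builds runs/YYYY-MM-DD/experiment_name/model_name[/extra] without touching disk."""
    today = today or date.today()
    parts = [today.strftime("%Y-%m-%d"), experiment_name, model_name]
    if extra:
        parts.append(extra)
    return Path(root).joinpath(*parts)


# Passing a fixed date keeps the result reproducible
example_dir = build_log_dir("data_10_percent", "effnetb0", "5_epochs", today=date(2023, 7, 17))
print(example_dir.as_posix())  # runs/2023-07-17/data_10_percent/effnetb0/5_epochs
```

Keeping the date injectable means two runs on different days can be compared in tests without patching the system clock.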

In [None]:
example_writer = create_writer(experiment_name='data_10_percent',
                               model_name='effnetb0',
                               extra='5_epochs')

example_writer

[INFO] Created SummaryWriter saving to runs\2023-07-17\data_10_percent\effnetb0\5_epochs


<torch.utils.tensorboard.writer.SummaryWriter at 0x1c85fc410c0>

### 6.1 Update the `train()` function to include a `writer` parameter

In [None]:
from tqdm.auto import tqdm
from typing import Dict, List, Tuple
from going_modular.engine import train_step, test_step


def train(model: torch.nn.Module,
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device,
          writer: torch.utils.tensorboard.writer.SummaryWriter) -> Dict[str, List[float]]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch models through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Args:
      model: A PyTorch model to be trained and tested.
      train_dataloader: A DataLoader instance for the model to be trained on.
      test_dataloader: A DataLoader instance for the model to be tested on.
      optimizer: A PyTorch optimizer to help minimize the loss function.
      loss_fn: A PyTorch loss function to calculate loss on both datasets.
      epochs: An integer indicating how many epochs to train for.
      device: A target device to compute on (e.g. "cuda" or "cpu").
      writer: A SummaryWriter() instance to log model results to.

    Returns:
      A dictionary of training and testing loss as well as training and
      testing accuracy metrics. Each metric has a value in a list for 
      each epoch.
      In the form: {train_loss: [...],
                    train_acc: [...],
                    test_loss: [...],
                    test_acc: [...]} 
      For example if training for epochs=2: 
                   {train_loss: [2.0616, 1.0537],
                    train_acc: [0.3945, 0.3945],
                    test_loss: [1.2641, 1.5706],
                    test_acc: [0.3400, 0.2973]} 
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
               }

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                           dataloader=train_dataloader,
                                           loss_fn=loss_fn,
                                           optimizer=optimizer,
                                           device=device)
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)

        # Print out what's happening
        print(
            f"Epoch: {epoch+1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

        ### New: Experiment tracking ###
        # Only log results if a writer was provided
        if writer:
            writer.add_scalars(main_tag='Loss',
                               tag_scalar_dict={'train_loss': train_loss,
                                                'test_loss': test_loss},
                               global_step=epoch)

            writer.add_scalars(main_tag='Accuracy',
                               tag_scalar_dict={'train_acc': train_acc,
                                                'test_acc': test_acc},
                               global_step=epoch)

            writer.add_graph(model=model,
                             input_to_model=torch.randn(32, 3, 224, 224).to(device))

    # Close the writer once all epochs have been logged
    if writer:
        writer.close()
    ### End new ###

    # Return the filled results at the end of the epochs
    return results

## 7. Setting up a series of modelling experiments

* Set up two modelling experiments with EffNetB0 on the pizza, steak, sushi data: train one model for 5 epochs and another for 10 epochs.

### 7.1 What kind of experiments should you run?

The number of machine learning experiments you can run is like the number of different models you can build... almost limitless.

However, you can't test everything...

So what should you test?
* Change the number of epochs
* Change the number of hidden layers/units
* Change the amount of data (right now we're using 10% of the Food101 dataset for pizza, steak, sushi)
* Change the learning rate
* Try different kinds of data augmentation
* Choose a different model architecture

This is why transfer learning is so powerful: it gives you a working model that you can apply to your own problem.

### 7.2 What experiments are we going to run?

We're going to turn three dials:
1. Model size - EffNetB0 vs EffNetB2 (in terms of number of parameters)
2. Dataset size - 10% of pizza, steak, sushi images vs 20% (generally more data = better results)
3. Training time - 5 epochs vs 10 epochs (generally longer training time = better results, up to a point)

To begin, we're still keeping things relatively small so that our experiments run quickly.

**Our goal:** a model that is well performing but still small enough to run on a mobile device or web browser, so FoodVision Mini can come to life.

With unlimited compute and time, you should almost always choose the biggest model and the biggest dataset you can.
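
Turning three dials with two settings each gives 2 x 2 x 2 = 8 combinations. As a rough sketch, the grid can be enumerated with `itertools.product`. The string names below (`effnetb0`, `data_10_percent` and so on) are a naming scheme assumed for illustration, and the loop order is only one possible choice.

```python
from itertools import product

# The three dials, each with two settings (names are illustrative assumptions)
model_names = ["effnetb0", "effnetb2"]
dataset_names = ["data_10_percent", "data_20_percent"]
epoch_counts = [5, 10]

# One dictionary per experiment, numbered from 1
experiments = [
    {"number": number, "model": model, "dataset": dataset, "epochs": epochs}
    for number, (dataset, epochs, model) in enumerate(
        product(dataset_names, epoch_counts, model_names), start=1)
]

for experiment in experiments:
    print(experiment)

print(f"Total experiments: {len(experiments)}")  # Total experiments: 8
```

Each entry maps naturally onto one log folder, in keeping with the one experiment = one folder idea.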


### 7.3 Download different datasets

We want two datasets:
1. [Pizza, steak, sushi 10%](https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip)
2. [Pizza, steak, sushi 20%](https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip)

In [18]:
# Download 10 percent and 20 percent datasets
data_10_percent_path = download_data(source='https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip',
                                     destination='pizza_steak_sushi')

data_20_percent_path = download_data(source='https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip',
                                     destination='pizza_steak_sushi_20_percent')

[INFO] data\pizza_steak_sushi already exist, skipping download
[INFO] data\pizza_steak_sushi_20_percent already exist, skipping download
