<a href="https://colab.research.google.com/github/mrdbourke/pytorch-deep-learning/blob/main/extras/exercises/07_pytorch_experiment_tracking_exercise_template.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 07. PyTorch Experiment Tracking Exercise Template

Welcome to the 07. PyTorch Experiment Tracking exercise template notebook.

> **Note:** There may be more than one solution to each of the exercises. This notebook only shows one possible example.

## Resources

1. These exercises/solutions are based on [section 07. PyTorch Transfer Learning](https://www.learnpytorch.io/07_pytorch_experiment_tracking/) of the Learn PyTorch for Deep Learning course by Zero to Mastery.
2. See a live [walkthrough of the solutions (errors and all) on YouTube](https://youtu.be/cO_r2FYcAjU).
3. See [other solutions on the course GitHub](https://github.com/mrdbourke/pytorch-deep-learning/tree/main/extras/solutions).

> **Note:** The first section of this notebook is dedicated to getting various helper functions and datasets used for the exercises. The exercises start at the heading "Exercise 1: ...".

### Get various imports and helper functions

We'll need to make sure we have `torch` v.1.12+ and `torchvision` v0.13+.

In [1]:
from helper_functions import plot_predictions, plot_decision_boundary, accuracy_fn
from going_modular import engine, data_loaders
from going_modular.engine import train_step, test_step
from going_modular.utils import save_model
import mlxtend
from mlxtend.plotting import plot_confusion_matrix
import numpy as np
import os
import pandas as pd
from pathlib import Path

import matplotlib.pyplot as plt
import pandas as pd
from PIL import Image
import random
import requests
import sklearn
from sklearn.datasets import make_circles
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_blobs
from torchinfo import summary
import torch
from torch import nn
from torch.utils.tensorboard import SummaryWriter
from torchmetrics import Accuracy, ConfusionMatrix
import torchvision
from torchvision import datasets

from torchvision import transforms
from torchvision.transforms import ToTensor
from torch.utils.data import DataLoader, Dataset
from timeit import default_timer as timer
from tqdm.auto import tqdm
from typing import Tuple, Dict, List
writer = SummaryWriter()
import zipfile



  from .autonotebook import tqdm as notebook_tqdm


In [None]:
# # For this notebook to run with updated APIs, we need torch 1.12+ and torchvision 0.13+
# try:
#     import torch
#     import torchvision
#     assert int(torch.__version__.split(".")[1]) >= 12, "torch version should be 1.12+"
#     assert int(torchvision.__version__.split(".")[1]) >= 13, "torchvision version should be 0.13+"
#     print(f"torch version: {torch.__version__}")
#     print(f"torchvision version: {torchvision.__version__}")
# except:
#     print(f"[INFO] torch/torchvision versions not as required, installing nightly versions.")
#     !pip3 install -U --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cu113
#     import torch
#     import torchvision
#     print(f"torch version: {torch.__version__}")
#     print(f"torchvision version: {torchvision.__version__}")

torch version: 1.13.0.dev20220622+cu113
torchvision version: 0.14.0.dev20220622+cu113


In [2]:
print(f"torch version: {torch.__version__}")
print(f"torchvision version: {torchvision.__version__}")

torch version: 2.3.0
torchvision version: 0.18.0


In [3]:
# Make sure we have a GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

In [None]:
# # Get regular imports 
# import matplotlib.pyplot as plt
# import torch
# import torchvision

# from torch import nn
# from torchvision import transforms

# # Try to get torchinfo, install it if it doesn't work
# try:
#     from torchinfo import summary
# except:
#     print("[INFO] Couldn't find torchinfo... installing it.")
#     !pip install -q torchinfo
#     from torchinfo import summary

# # Try to import the going_modular directory, download it from GitHub if it doesn't work
# try:
#     from going_modular.going_modular import data_setup, engine
# except:
#     # Get the going_modular scripts
#     print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
#     !git clone https://github.com/mrdbourke/pytorch-deep-learning
#     !mv pytorch-deep-learning/going_modular .
#     !rm -rf pytorch-deep-learning
#     from going_modular.going_modular import data_setup, engine

In [4]:
# Set seeds
def set_seeds(seed: int=42):
    """Sets random sets for torch operations.

    Args:
        seed (int, optional): Random seed to set. Defaults to 42.
    """
    # Set the seed for general torch operations
    torch.manual_seed(seed)
    # Set the seed for CUDA torch operations (ones that happen on the GPU)
    torch.cuda.manual_seed(seed)

In [8]:
import os
import zipfile

from pathlib import Path

import requests

def download_data(source: str, 
                  destination: str,
                  data_dir = '../data',
                  remove_source: bool = True) -> Path:
    """Downloads a zipped dataset from source and unzips to destination.

    Args:
        source (str): A link to a zipped file containing data.
        destination (str): A target directory to unzip data to.
        remove_source (bool): Whether to remove the source after downloading and extracting.
    
    Returns:
        pathlib.Path to downloaded data.
    
    Example usage:
        download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                      destination="pizza_steak_sushi")
    """
    # Setup path to data folder
    data_path = Path(data_dir)
    image_path = data_path / destination

    # If the image folder doesn't exist, download it and prepare it... 
    if image_path.is_dir():
        print(f"[INFO] {image_path} directory exists, skipping download.")
    else:
        print(f"[INFO] Did not find {image_path} directory, creating one...")
        image_path.mkdir(parents=True, exist_ok=True)
        
        # Download pizza, steak, sushi data
        target_file = Path(source).name
        with open(data_path / target_file, "wb") as f:
            request = requests.get(source)
            print(f"[INFO] Downloading {target_file} from {source}...")
            f.write(request.content)

        # Unzip pizza, steak, sushi data
        with zipfile.ZipFile(data_path / target_file, "r") as zip_ref:
            print(f"[INFO] Unzipping {target_file} data...") 
            zip_ref.extractall(image_path)

        # Remove .zip file
        if remove_source:
            os.remove(data_path / target_file)
    
    return image_path

image_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                           destination="pizza_steak_sushi_1")
image_path

[INFO] ..\data\pizza_steak_sushi_1 directory exists, skipping download.


WindowsPath('../data/pizza_steak_sushi_1')

In [10]:
from torch.utils.tensorboard import SummaryWriter
def create_writer(experiment_name: str, 
                  model_name: str, 
                  extra: str=None):
    """Creates a torch.utils.tensorboard.writer.SummaryWriter() instance saving to a specific log_dir.

    log_dir is a combination of runs/timestamp/experiment_name/model_name/extra.

    Where timestamp is the current date in YYYY-MM-DD format.

    Args:
        experiment_name (str): Name of experiment.
        model_name (str): Name of model.
        extra (str, optional): Anything extra to add to the directory. Defaults to None.

    Returns:
        torch.utils.tensorboard.writer.SummaryWriter(): Instance of a writer saving to log_dir.

    Example usage:
        # Create a writer saving to "runs/2022-06-04/data_10_percent/effnetb2/5_epochs/"
        writer = create_writer(experiment_name="data_10_percent",
                               model_name="effnetb2",
                               extra="5_epochs")
        # The above is the same as:
        writer = SummaryWriter(log_dir="runs/2022-06-04/data_10_percent/effnetb2/5_epochs/")
    """
    from datetime import datetime
    import os

    # Get timestamp of current date (all experiments on certain day live in same folder)
    timestamp = datetime.now().strftime("%Y-%m-%d") # returns current date in YYYY-MM-DD format

    if extra:
        # Create log directory path
        log_dir = os.path.join("runs", timestamp, experiment_name, model_name, extra)
    else:
        log_dir = os.path.join("runs", timestamp, experiment_name, model_name)
        
    print(f"[INFO] Created SummaryWriter, saving to: {log_dir}...")
    return SummaryWriter(log_dir=log_dir)

In [11]:
# Create a test writer
writer = create_writer(experiment_name="test_experiment_name",
                       model_name="this_is_the_model_name",
                       extra="add_a_little_extra_if_you_want")

[INFO] Created SummaryWriter, saving to: runs\2024-09-19\test_experiment_name\this_is_the_model_name\add_a_little_extra_if_you_want...


In [12]:
# from typing import Dict, List
# from tqdm.auto import tqdm

# from going_modular.going_modular.engine import train_step, test_step

# Add writer parameter to train()
def train(model: torch.nn.Module, 
          train_dataloader: torch.utils.data.DataLoader, 
          test_dataloader: torch.utils.data.DataLoader, 
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device, 
          writer: torch.utils.tensorboard.writer.SummaryWriter # new parameter to take in a writer
          ) -> Dict[str, List]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch models through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Stores metrics to specified writer log_dir if present.

    Args:
      model: A PyTorch model to be trained and tested.
      train_dataloader: A DataLoader instance for the model to be trained on.
      test_dataloader: A DataLoader instance for the model to be tested on.
      optimizer: A PyTorch optimizer to help minimize the loss function.
      loss_fn: A PyTorch loss function to calculate loss on both datasets.
      epochs: An integer indicating how many epochs to train for.
      device: A target device to compute on (e.g. "cuda" or "cpu").
      writer: A SummaryWriter() instance to log model results to.

    Returns:
      A dictionary of training and testing loss as well as training and
      testing accuracy metrics. Each metric has a value in a list for 
      each epoch.
      In the form: {train_loss: [...],
                train_acc: [...],
                test_loss: [...],
                test_acc: [...]} 
      For example if training for epochs=2: 
              {train_loss: [2.0616, 1.0537],
                train_acc: [0.3945, 0.3945],
                test_loss: [1.2641, 1.5706],
                test_acc: [0.3400, 0.2973]} 
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
    }

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                          dataloader=train_dataloader,
                                          loss_fn=loss_fn,
                                          optimizer=optimizer,
                                          device=device)
        test_loss, test_acc = test_step(model=model,
          dataloader=test_dataloader,
          loss_fn=loss_fn,
          device=device)

        # Print out what's happening
        print(
          f"Epoch: {epoch+1} | "
          f"train_loss: {train_loss:.4f} | "
          f"train_acc: {train_acc:.4f} | "
          f"test_loss: {test_loss:.4f} | "
          f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)


        ### New: Use the writer parameter to track experiments ###
        # See if there's a writer, if so, log to it
        if writer:
            # Add results to SummaryWriter
            writer.add_scalars(main_tag="Loss", 
                               tag_scalar_dict={"train_loss": train_loss,
                                                "test_loss": test_loss},
                               global_step=epoch)
            writer.add_scalars(main_tag="Accuracy", 
                               tag_scalar_dict={"train_acc": train_acc,
                                                "test_acc": test_acc}, 
                               global_step=epoch)

            # Close the writer
            writer.close()
        else:
            pass
    ### End new ###

    # Return the filled results at the end of the epochs
    return results

### Download data

Using the same data from https://www.learnpytorch.io/07_pytorch_experiment_tracking/

In [None]:
# # Download 10 percent and 20 percent training data (if necessary)
# data_10_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
#                                      destination="pizza_steak_sushi")

# data_20_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip",
#                                      destination="pizza_steak_sushi_20_percent")

[INFO] data/pizza_steak_sushi directory exists, skipping download.
[INFO] data/pizza_steak_sushi_20_percent directory exists, skipping download.


In [13]:
data_10_percent_path = Path('../data/pizza_steak_sushi')
data_20_percent_path = Path('../data/pizza_steak_sushi_20_percent')
# Setup training directory paths
train_dir_10_percent = data_10_percent_path / "train"
train_dir_20_percent = data_20_percent_path / "train"

# Setup testing directory paths (note: use the same test dataset for both to compare the results)
test_dir = data_10_percent_path / "test"

# Check the directories
print(f"Training directory 10%: {train_dir_10_percent}")
print(f"Training directory 20%: {train_dir_20_percent}")
print(f"Testing directory: {test_dir}")

Training directory 10%: ..\data\pizza_steak_sushi\train
Training directory 20%: ..\data\pizza_steak_sushi_20_percent\train
Testing directory: ..\data\pizza_steak_sushi\test


In [14]:
from torchvision import transforms

# Create a transform to normalize data distribution to be inline with ImageNet
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], # values per colour channel [red, green, blue]
                                 std=[0.229, 0.224, 0.225])

# Create a transform pipeline
simple_transform = transforms.Compose([
                                       transforms.Resize((224, 224)),
                                       transforms.ToTensor(), # get image values between 0 & 1
                                       normalize
])

### Turn data into DataLoaders 

In [16]:
BATCH_SIZE = 32

# Create 10% training and test DataLoaders
train_dataloader_10_percent, test_dataloader, class_names = data_loaders.create_dataloaders(train_dir=train_dir_10_percent,
                                                                                          test_dir=test_dir,
                                                                                          transform=simple_transform,
                                                                                          batch_size=BATCH_SIZE)

# Create 20% training and test DataLoaders
train_dataloader_20_percent, test_dataloader, class_names = data_loaders.create_dataloaders(train_dir=train_dir_20_percent,
                                                                                          test_dir=test_dir,
                                                                                          transform=simple_transform,
                                                                                          batch_size=BATCH_SIZE)

# Find the number of samples/batches per dataloader (using the same test_dataloader for both experiments)
print(f"Number of batches of size {BATCH_SIZE} in 10 percent training data: {len(train_dataloader_10_percent)}")
print(f"Number of batches of size {BATCH_SIZE} in 20 percent training data: {len(train_dataloader_20_percent)}")
print(f"Number of batches of size {BATCH_SIZE} in testing data: {len(train_dataloader_10_percent)} (all experiments will use the same test set)")
print(f"Number of classes: {len(class_names)}, class names: {class_names}")

Number of batches of size 32 in 10 percent training data: 8
Number of batches of size 32 in 20 percent training data: 15
Number of batches of size 32 in testing data: 8 (all experiments will use the same test set)
Number of classes: 3, class names: ['pizza', 'steak', 'sushi']


## Exercise 1: Pick a larger model from [`torchvision.models`](https://pytorch.org/vision/main/models.html) to add to the list of experiments (for example, EffNetB3 or higher)

* How does it perform compared to our existing models?
* **Hint:** You'll need to set up an exerpiment similar to [07. PyTorch Experiment Tracking section 7.6](https://www.learnpytorch.io/07_pytorch_experiment_tracking/#76-create-experiments-and-set-up-training-code).

In [23]:
# TODO: your code

In [None]:
# TODO: your code
# effnetb7_weights = torchvision.models.EfficientNet_B7_Weights.DEFAULT
# effnetb7 = torchvision.models.efficientnet_b7(weights=effnetb7_weights)

print (summary (model= effnetb7,
       input_size = (32, 3, 600, 600),
       col_names = ['input_size', 'output_size', 'num_params', 'trainable'],
       col_width = 20,
       row_settings = ['var_names']))

print (f' number of input_features of final layter {len (effnetb7.classifier.state_dict()["1.weight"][0])}')

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [32, 3, 600, 600]    [32, 1000]           --                   True
├─Sequential (features)                                      [32, 3, 600, 600]    [32, 2560, 19, 19]   --                   True
│    └─Conv2dNormActivation (0)                              [32, 3, 600, 600]    [32, 64, 300, 300]   --                   True
│    │    └─Conv2d (0)                                       [32, 3, 600, 600]    [32, 64, 300, 300]   1,728                True
│    │    └─BatchNorm2d (1)                                  [32, 64, 300, 300]   [32, 64, 300, 300]   128                  True
│    │    └─SiLU (2)                                         [32, 64, 300, 300]   [32, 64, 300, 300]   --                   --
│    └─Sequential (1)                                        [32, 64, 300, 300]   [32, 32, 300

In [26]:
#  2560
OUT_FEATURES = len (class_names)
def create_effnetb7():
    weights = torchvision.models.EfficientNet_B7_Weights.DEFAULT
    model = torchvision.models.efficientnet_b7(weights=weights).to(device)

    for param in model.features.parameters():
        param.requires_grad=False

    set_seeds()

    model.classifier= nn.Sequential(
            nn.Dropout(p=0.2),
            nn.Linear(in_features = 2560, out_features = OUT_FEATURES)
    ).to(device)

    model.name = 'efnenetb7'
    print (f'[INFO] created new {model.name} model')
    return model

In [29]:
effnetb7 = create_effnetb7()

summary(model = effnetb7,
         input_size = (32, 3, 600, 600),
         col_names=['input_size', 'output_size', 'num_params', 'trainable'],
         col_width = 20,
         row_settings = ['var_names'])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [32, 3, 600, 600]    [32, 3]              --                   Partial
├─Sequential (features)                                      [32, 3, 600, 600]    [32, 2560, 19, 19]   --                   False
│    └─Conv2dNormActivation (0)                              [32, 3, 600, 600]    [32, 64, 300, 300]   --                   False
│    │    └─Conv2d (0)                                       [32, 3, 600, 600]    [32, 64, 300, 300]   (1,728)              False
│    │    └─BatchNorm2d (1)                                  [32, 64, 300, 300]   [32, 64, 300, 300]   (128)                False
│    │    └─SiLU (2)                                         [32, 64, 300, 300]   [32, 64, 300, 300]   --                   --
│    └─Sequential (1)                                        [32, 64, 300, 300]   [32, 

In [31]:
num_epochs = [5, 10]
models = ['effnetb7']
train_dataloaders = {'data_10_percent': train_dataloader_10_percent,
                     'data_20_percent':train_dataloader_20_percent}

In [35]:
%%time
from going_modular.utils import save_model
set_seeds (seed=42)
experiment_num = 0

for dataloader_name, train_data_loader in train_dataloaders.items():
    for epochs in num_epochs:
        for model_name in models:
            experiment_num +=1
            print(f'Experiment no: {experiment_num}')
            print(f'Model: {model_name}')
            print(f'Dataloader: {dataloader_name}')
            print (f'number of epochs: {epochs}')

            if model_name == 'effnetb7':
                model = create_effnetb7()
            else:
                print ('no model found')
                break

            loss_fn = nn.CrossEntropyLoss()
            optimizer = torch.optim.Adam(params=model.parameters(), lr = 0.001)

            train (model = model,
                train_dataloader=train_data_loader,
                test_dataloader = test_dataloader,
                optimizer = optimizer,
                loss_fn = loss_fn,
                epochs = epochs,
                device = device,
                writer= create_writer(experiment_name = dataloader_name,
                    model_name = model_name,
                    extra = f'{epochs}_epochs'))
            save_filepath = f'07ex_{model_name}_{dataloader_name}_{epochs}_epochs.pth'
            save_model (model = model,
                target_dir = 'models',
                model_name = save_filepath)

            print ('_'*50 + '\n')

Experiment no: 1
Model: effnetb7
Dataloader: data_10_percent
number of epochs: 5
[INFO] created new efnenetb7 model
[INFO] Created SummaryWriter, saving to: runs\2024-09-20\data_10_percent\effnetb7\5_epochs...


 20%|██        | 1/5 [00:14<00:58, 14.61s/it]

Epoch: 1 | train_loss: 1.0192 | train_acc: 0.6094 | test_loss: 0.9639 | test_acc: 0.6402


 40%|████      | 2/5 [00:19<00:27,  9.03s/it]

Epoch: 2 | train_loss: 0.8261 | train_acc: 0.8398 | test_loss: 0.8761 | test_acc: 0.6515


 60%|██████    | 3/5 [00:24<00:14,  7.30s/it]

Epoch: 3 | train_loss: 0.7251 | train_acc: 0.8594 | test_loss: 0.7926 | test_acc: 0.6932


 80%|████████  | 4/5 [00:30<00:06,  6.63s/it]

Epoch: 4 | train_loss: 0.7154 | train_acc: 0.7891 | test_loss: 0.7136 | test_acc: 0.7443


100%|██████████| 5/5 [00:35<00:00,  7.07s/it]

Epoch: 5 | train_loss: 0.5877 | train_acc: 0.8281 | test_loss: 0.6531 | test_acc: 0.7955
[INFO] Saving model to: models\07ex_effnetb7_data_10_percent_5_epochs.pth





__________________________________________________

Experiment no: 2
Model: effnetb7
Dataloader: data_10_percent
number of epochs: 10
[INFO] created new efnenetb7 model
[INFO] Created SummaryWriter, saving to: runs\2024-09-20\data_10_percent\effnetb7\10_epochs...


 10%|█         | 1/10 [00:04<00:39,  4.37s/it]

Epoch: 1 | train_loss: 1.0192 | train_acc: 0.6094 | test_loss: 0.9639 | test_acc: 0.6402


 20%|██        | 2/10 [00:08<00:33,  4.17s/it]

Epoch: 2 | train_loss: 0.8261 | train_acc: 0.8398 | test_loss: 0.8761 | test_acc: 0.6515


 30%|███       | 3/10 [00:12<00:28,  4.04s/it]

Epoch: 3 | train_loss: 0.7251 | train_acc: 0.8594 | test_loss: 0.7926 | test_acc: 0.6932


 40%|████      | 4/10 [00:16<00:24,  4.02s/it]

Epoch: 4 | train_loss: 0.7154 | train_acc: 0.7891 | test_loss: 0.7136 | test_acc: 0.7443


 50%|█████     | 5/10 [00:20<00:20,  4.02s/it]

Epoch: 5 | train_loss: 0.5877 | train_acc: 0.8281 | test_loss: 0.6531 | test_acc: 0.7955


 60%|██████    | 6/10 [00:24<00:15,  3.95s/it]

Epoch: 6 | train_loss: 0.5404 | train_acc: 0.8320 | test_loss: 0.6005 | test_acc: 0.8163


 70%|███████   | 7/10 [00:27<00:11,  3.89s/it]

Epoch: 7 | train_loss: 0.4993 | train_acc: 0.8477 | test_loss: 0.5570 | test_acc: 0.8059


 80%|████████  | 8/10 [00:31<00:07,  3.82s/it]

Epoch: 8 | train_loss: 0.4658 | train_acc: 0.9688 | test_loss: 0.5364 | test_acc: 0.8059


 90%|█████████ | 9/10 [00:35<00:03,  3.71s/it]

Epoch: 9 | train_loss: 0.4606 | train_acc: 0.8438 | test_loss: 0.5224 | test_acc: 0.8059


100%|██████████| 10/10 [00:39<00:00,  3.90s/it]

Epoch: 10 | train_loss: 0.3910 | train_acc: 0.9688 | test_loss: 0.5040 | test_acc: 0.8059
[INFO] Saving model to: models\07ex_effnetb7_data_10_percent_10_epochs.pth





__________________________________________________

Experiment no: 3
Model: effnetb7
Dataloader: data_20_percent
number of epochs: 5
[INFO] created new efnenetb7 model
[INFO] Created SummaryWriter, saving to: runs\2024-09-20\data_20_percent\effnetb7\5_epochs...


 20%|██        | 1/5 [00:11<00:45, 11.46s/it]

Epoch: 1 | train_loss: 0.9590 | train_acc: 0.6604 | test_loss: 0.8629 | test_acc: 0.7443


 40%|████      | 2/5 [00:19<00:28,  9.40s/it]

Epoch: 2 | train_loss: 0.6992 | train_acc: 0.8625 | test_loss: 0.6864 | test_acc: 0.8466


 60%|██████    | 3/5 [00:27<00:17,  8.57s/it]

Epoch: 3 | train_loss: 0.5729 | train_acc: 0.8917 | test_loss: 0.5823 | test_acc: 0.8570


 80%|████████  | 4/5 [00:35<00:08,  8.56s/it]

Epoch: 4 | train_loss: 0.4894 | train_acc: 0.8625 | test_loss: 0.5209 | test_acc: 0.8371


100%|██████████| 5/5 [00:43<00:00,  8.68s/it]

Epoch: 5 | train_loss: 0.4465 | train_acc: 0.9104 | test_loss: 0.4860 | test_acc: 0.8371
[INFO] Saving model to: models\07ex_effnetb7_data_20_percent_5_epochs.pth





__________________________________________________

Experiment no: 4
Model: effnetb7
Dataloader: data_20_percent
number of epochs: 10
[INFO] created new efnenetb7 model
[INFO] Created SummaryWriter, saving to: runs\2024-09-20\data_20_percent\effnetb7\10_epochs...


 10%|█         | 1/10 [00:07<01:03,  7.05s/it]

Epoch: 1 | train_loss: 0.9590 | train_acc: 0.6604 | test_loss: 0.8629 | test_acc: 0.7443


 20%|██        | 2/10 [00:14<00:56,  7.07s/it]

Epoch: 2 | train_loss: 0.6992 | train_acc: 0.8625 | test_loss: 0.6864 | test_acc: 0.8466


 30%|███       | 3/10 [00:21<00:49,  7.09s/it]

Epoch: 3 | train_loss: 0.5729 | train_acc: 0.8917 | test_loss: 0.5823 | test_acc: 0.8570


 40%|████      | 4/10 [00:29<00:44,  7.42s/it]

Epoch: 4 | train_loss: 0.4894 | train_acc: 0.8625 | test_loss: 0.5209 | test_acc: 0.8371


 50%|█████     | 5/10 [00:39<00:43,  8.63s/it]

Epoch: 5 | train_loss: 0.4465 | train_acc: 0.9104 | test_loss: 0.4860 | test_acc: 0.8371


 60%|██████    | 6/10 [00:48<00:34,  8.74s/it]

Epoch: 6 | train_loss: 0.4158 | train_acc: 0.9083 | test_loss: 0.4666 | test_acc: 0.8267


 70%|███████   | 7/10 [00:56<00:25,  8.42s/it]

Epoch: 7 | train_loss: 0.3705 | train_acc: 0.9229 | test_loss: 0.4546 | test_acc: 0.8267


 80%|████████  | 8/10 [01:03<00:15,  7.96s/it]

Epoch: 8 | train_loss: 0.3514 | train_acc: 0.9042 | test_loss: 0.4509 | test_acc: 0.8466


 90%|█████████ | 9/10 [01:10<00:07,  7.66s/it]

Epoch: 9 | train_loss: 0.3196 | train_acc: 0.9292 | test_loss: 0.4471 | test_acc: 0.8466


100%|██████████| 10/10 [01:18<00:00,  7.90s/it]

Epoch: 10 | train_loss: 0.3169 | train_acc: 0.9125 | test_loss: 0.4383 | test_acc: 0.8059
[INFO] Saving model to: models\07ex_effnetb7_data_20_percent_10_epochs.pth





__________________________________________________

CPU times: total: 13min 31s
Wall time: 3min 23s


In [36]:
%load_ext tensorboard
%tensorboard --logdir runs

Reusing TensorBoard on port 6006 (pid 87220), started 3 days, 1:48:04 ago. (Use '!kill 87220' to kill it.)

## Exercise 2. Introduce data augmentation to the list of experiments using the 20% pizza, steak, sushi training and test datasets, does this change anything?
    
* For example, you could have one training DataLoader that uses data augmentation (e.g. `train_dataloader_20_percent_aug` and `train_dataloader_20_percent_no_aug`) and then compare the results of two of the same model types training on these two DataLoaders.
* **Note:** You may need to alter the `create_dataloaders()` function to be able to take a transform for the training data and the testing data (because you don't need to perform data augmentation on the test data). See [04. PyTorch Custom Datasets section 6](https://www.learnpytorch.io/04_pytorch_custom_datasets/#6-other-forms-of-transforms-data-augmentation) for examples of using data augmentation or the script below for an example:

```python
# Note: Data augmentation transform like this should only be performed on training data
train_transform_data_aug = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.TrivialAugmentWide(),
    transforms.ToTensor(),
    normalize
])

# Create a helper function to visualize different augmented (and not augmented) images
def view_dataloader_images(dataloader, n=10):
    if n > 10:
        print(f"Having n higher than 10 will create messy plots, lowering to 10.")
        n = 10
    imgs, labels = next(iter(dataloader))
    plt.figure(figsize=(16, 8))
    for i in range(n):
        # Min max scale the image for display purposes
        targ_image = imgs[i]
        sample_min, sample_max = targ_image.min(), targ_image.max()
        sample_scaled = (targ_image - sample_min)/(sample_max - sample_min)

        # Plot images with appropriate axes information
        plt.subplot(1, 10, i+1)
        plt.imshow(sample_scaled.permute(1, 2, 0)) # resize for Matplotlib requirements
        plt.title(class_names[labels[i]])
        plt.axis(False)

# Have to update `create_dataloaders()` to handle different augmentations
import os
from torch.utils.data import DataLoader
from torchvision import datasets

NUM_WORKERS = os.cpu_count() # use maximum number of CPUs for workers to load data 

# Note: this is an update version of data_setup.create_dataloaders to handle
# differnt train and test transforms.
def create_dataloaders(
    train_dir, 
    test_dir, 
    train_transform, # add parameter for train transform (transforms on train dataset)
    test_transform,  # add parameter for test transform (transforms on test dataset)
    batch_size=32, num_workers=NUM_WORKERS
):
    # Use ImageFolder to create dataset(s)
    train_data = datasets.ImageFolder(train_dir, transform=train_transform)
    test_data = datasets.ImageFolder(test_dir, transform=test_transform)

    # Get class names
    class_names = train_data.classes

    # Turn images into data loaders
    train_dataloader = DataLoader(
        train_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )
    test_dataloader = DataLoader(
        test_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )

    return train_dataloader, test_dataloader, class_names
```

In [22]:
# TODO: your code
# effnetb7_weights = torchvision.models.EfficientNet_B7_Weights.DEFAULT
# effnetb7 = torchvision.models.efficientnet_b7(weights=effnetb7_weights)

print (summary (model= effnetb7,
       input_size = (32, 3, 600, 600),
       col_names = ['input_size', 'output_size', 'num_params', 'trainable'],
       col_width = 20,
       row_settings = ['var_names']))

print (f' number of input_features of final layter {len (effnetb7.classifier.state_dict()["1.weight"][0])}')

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [32, 3, 600, 600]    [32, 1000]           --                   True
├─Sequential (features)                                      [32, 3, 600, 600]    [32, 2560, 19, 19]   --                   True
│    └─Conv2dNormActivation (0)                              [32, 3, 600, 600]    [32, 64, 300, 300]   --                   True
│    │    └─Conv2d (0)                                       [32, 3, 600, 600]    [32, 64, 300, 300]   1,728                True
│    │    └─BatchNorm2d (1)                                  [32, 64, 300, 300]   [32, 64, 300, 300]   128                  True
│    │    └─SiLU (2)                                         [32, 64, 300, 300]   [32, 64, 300, 300]   --                   --
│    └─Sequential (1)                                        [32, 64, 300, 300]   [32, 32, 300

## Exercise 3. Scale up the dataset to turn FoodVision Mini into FoodVision Big using the entire [Food101 dataset from `torchvision.models`](https://pytorch.org/vision/stable/generated/torchvision.datasets.Food101.html#torchvision.datasets.Food101)
    
* You could take the best performing model from your various experiments or even the EffNetB2 feature extractor we created in this notebook and see how it goes fitting for 5 epochs on all of Food101.
* If you try more than one model, it would be good to have the model's results tracked.
* If you load the Food101 dataset from `torchvision.models`, you'll have to create PyTorch DataLoaders to use it in training.
* **Note:** Due to the larger amount of data in Food101 compared to our pizza, steak, sushi dataset, this model will take longer to train.

In [None]:
# TODO: your code