<a href="https://colab.research.google.com/github/mrdbourke/pytorch-deep-learning/blob/main/extras/exercises/07_pytorch_experiment_tracking_exercise_template.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 07. PyTorch Experiment Tracking Exercise Template

Welcome to the 07. PyTorch Experiment Tracking exercise template notebook.

> **Note:** There may be more than one solution to each of the exercises. This notebook only shows one possible example.

## Resources

1. These exercises/solutions are based on [section 07. PyTorch Transfer Learning](https://www.learnpytorch.io/07_pytorch_experiment_tracking/) of the Learn PyTorch for Deep Learning course by Zero to Mastery.
2. See a live [walkthrough of the solutions (errors and all) on YouTube](https://youtu.be/cO_r2FYcAjU).
3. See [other solutions on the course GitHub](https://github.com/mrdbourke/pytorch-deep-learning/tree/main/extras/solutions).

> **Note:** The first section of this notebook is dedicated to getting various helper functions and datasets used for the exercises. The exercises start at the heading "Exercise 1: ...".

# Initialization

Initially Colab has

* torch version: 2.1.0+cu121
* torchvision version: 0.16.0+cu121

Try the code bellow.



In [None]:
# # For this notebook to run with updated APIs, we need torch 1.12+ and torchvision 0.13+

# import torch
# import torchvision

# # Update code
# torch_version_str = '.'.join(torch.__version__.split('.')[0:2])
# torchvision_version_str = '.'.join(torchvision.__version__.split('.')[0:2])
# assert float(torch_version_str) >= 1.12, "torch version should be 1.12+"
# assert float(torchvision_version_str) >= 0.13, "torchvision version should be 0.13+"


# print(torch.__version__, torchvision.__version__)

In [None]:
# Check Colab Execution

try:
  from google.colab import drive
  drive.mount('/content/drive')
  IN_COLAB = True
except:
  IN_COLAB = False

IN_COLAB

In [None]:
%%time

if IN_COLAB:
    # updgrade torch, there are some bugs with init weights in torch 2.1.0
    !pip install torch torchaudio torchdata torchtext torchvision -U
else:
    !pipenv install

### Get various imports and helper functions

We'll need to make sure we have `torch` v.1.12+ and `torchvision` v0.13+.

In [None]:
try:
    import torch
    import torchvision

    # Update code
    torch_version_str = '.'.join(torch.__version__.split('.')[0:2])
    torchvision_version_str = '.'.join(torchvision.__version__.split('.')[0:2])
    assert float(torch_version_str) >= 1.12, "torch version should be 1.12+"
    assert float(torchvision_version_str) >= 0.13, "torchvision version should be 0.13+"
except:
  pass

torch_version_str, torchvision_version_str

In [None]:
# For this notebook to run with updated APIs, we need torch 1.12+ and torchvision 0.13+
try:
    import torch
    import torchvision

    # Update code
    torch_version_str = '.'.join(torch.__version__.split('.')[0:2])
    torchvision_version_str = '.'.join(torchvision.__version__.split('.')[0:2])
    assert float(torch_version_str) >= 1.12, "torch version should be 1.12+"
    assert float(torchvision_version_str) >= 0.13, "torchvision version should be 0.13+"

    # Previous code
    # assert int(torch.__version__.split(".")[1]) >= 12, "torch version should be 1.12+"
    # assert int(torchvision.__version__.split(".")[1]) >= 13, "torchvision version should be 0.13+"
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
except:
    print(f"[INFO] torch/torchvision versions not as required, installing nightly versions.")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")

print(torch.__version__, torchvision.__version__)

In [None]:
 # Make sure we have a GPU
 device = "cuda" if torch.cuda.is_available() else "cpu"
 device

In [None]:
# Get regular imports
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
    from torchinfo import summary
except:
    print("[INFO] Couldn't find torchinfo... installing it.")
    !pip install -q torchinfo
    from torchinfo import summary

# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
except:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine

In [None]:
# Set seeds
def set_seeds(seed: int=42):
    """Sets random sets for torch operations.

    Args:
        seed (int, optional): Random seed to set. Defaults to 42.
    """
    # Set the seed for general torch operations
    torch.manual_seed(seed)
    # Set the seed for CUDA torch operations (ones that happen on the GPU)
    torch.cuda.manual_seed(seed)

In [None]:
# Get a summary of the model (uncomment for full output)
def get_summary(model, input_size=(32, 3, 224, 224)):
    print(summary(model,
            input_size=(32, 3, 224, 224), # make sure this is "input_size", not "input_shape" (batch_size, color_channels, height, width)
            verbose=0,
            col_names=["input_size", "output_size", "num_params", "trainable"],
            col_width=20,
            row_settings=["var_names"]
    ))

In [None]:
# Download the data
import os
import zipfile

from pathlib import Path

import requests

def download_data(source: str,
                  destination: str,
                  remove_source: bool = True) -> Path:
    """Downloads a zipped dataset from source and unzips to destination.

    Args:
        source (str): A link to a zipped file containing data.
        destination (str): A target directory to unzip data to.
        remove_source (bool): Whether to remove the source after downloading and extracting.

    Returns:
        pathlib.Path to downloaded data.

    Example usage:
        download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                      destination="pizza_steak_sushi")
    """
    # Setup path to data folder
    data_path = Path("data/")
    image_path = data_path / destination

    # If the image folder doesn't exist, download it and prepare it...
    if image_path.is_dir():
        print(f"[INFO] {image_path} directory exists, skipping download.")
    else:
        print(f"[INFO] Did not find {image_path} directory, creating one...")
        image_path.mkdir(parents=True, exist_ok=True)

        # Download pizza, steak, sushi data
        target_file = Path(source).name
        with open(data_path / target_file, "wb") as f:
            request = requests.get(source)
            print(f"[INFO] Downloading {target_file} from {source}...")
            f.write(request.content)

        # Unzip pizza, steak, sushi data
        with zipfile.ZipFile(data_path / target_file, "r") as zip_ref:
            print(f"[INFO] Unzipping {target_file} data...")
            zip_ref.extractall(image_path)

        # Remove .zip file
        if remove_source:
            os.remove(data_path / target_file)

    return image_path

image_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                           destination="pizza_steak_sushi")
image_path

In [None]:
if IN_COLAB:
  RUNS_DIR = Path('/content/drive/MyDrive/dev/data/01_python_data_packages/07_pythorch/02_pythorch-deep-learning-course/runs')
  MODELS_DIR = Path('/content/drive/MyDrive/dev/data/01_python_data_packages/07_pythorch/02_pythorch-deep-learning-course/models')
else:
  RUNS_DIR = Path('runs')
  MODELS_DIR = Path('models')


In [None]:
# Writer creation
from torch.utils.tensorboard import SummaryWriter


def create_writer(experiment_name: str,
                  model_name: str,
                  tag: str = None,
                  extra: str = None):
    """Creates a torch.utils.tensorboard.writer.SummaryWriter() instance saving to a specific log_dir.

    log_dir is a combination of runs/timestamp/experiment_name/model_name/extra.

    Where timestamp is the current date in YYYY-MM-DD format.

    Args:
        experiment_name (str): Name of experiment.
        model_name (str): Name of model.
        extra (str, optional): Anything extra to add to the directory. Defaults to None.

    Returns:
        torch.utils.tensorboard.writer.SummaryWriter(): Instance of a writer saving to log_dir.

    Example usage:
        # Create a writer saving to "runs/2022-06-04/data_10_percent/effnetb2/5_epochs/"
        writer = create_writer(experiment_name="data_10_percent",
                               model_name="effnetb2",
                               extra="5_epochs")
        # The above is the same as:
        writer = SummaryWriter(log_dir="runs/2022-06-04/data_10_percent/effnetb2/5_epochs/")
    """
    from datetime import datetime
    import os

    timestamp = datetime.now().strftime("%Y-%m-%d")
    # Get timestamp of current date (all experiments on certain day live in same folder)
    if tag:
        # returns current date in YYYY-MM-DD format
        timestamp = timestamp + '_' + tag

    if extra:
        # Create log directory path
        log_dir = os.path.join(RUNS_DIR, timestamp,
                               experiment_name, model_name, extra)
    else:
        log_dir = os.path.join(RUNS_DIR, timestamp,
                               experiment_name, model_name)

    print(f"[INFO] Created SummaryWriter, saving to: {log_dir}...")
    return SummaryWriter(log_dir=log_dir)

In [None]:
# Create a test writer
# writer = create_writer(experiment_name="test_experiment_name",
#                        model_name="this_is_the_model_name",
#                        extra="add_a_little_extra_if_you_want")

In [None]:
# Update train function with writer
from typing import Dict, List
from tqdm.auto import tqdm

from going_modular.going_modular.engine import train_step, test_step

# Add writer parameter to train()
def train(model: torch.nn.Module,
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device,
          writer: torch.utils.tensorboard.writer.SummaryWriter # new parameter to take in a writer
          ) -> Dict[str, List]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch models through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Stores metrics to specified writer log_dir if present.

    Args:
      model: A PyTorch model to be trained and tested.
      train_dataloader: A DataLoader instance for the model to be trained on.
      test_dataloader: A DataLoader instance for the model to be tested on.
      optimizer: A PyTorch optimizer to help minimize the loss function.
      loss_fn: A PyTorch loss function to calculate loss on both datasets.
      epochs: An integer indicating how many epochs to train for.
      device: A target device to compute on (e.g. "cuda" or "cpu").
      writer: A SummaryWriter() instance to log model results to.

    Returns:
      A dictionary of training and testing loss as well as training and
      testing accuracy metrics. Each metric has a value in a list for
      each epoch.
      In the form: {train_loss: [...],
                train_acc: [...],
                test_loss: [...],
                test_acc: [...]}
      For example if training for epochs=2:
              {train_loss: [2.0616, 1.0537],
                train_acc: [0.3945, 0.3945],
                test_loss: [1.2641, 1.5706],
                test_acc: [0.3400, 0.2973]}
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
    }

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                          dataloader=train_dataloader,
                                          loss_fn=loss_fn,
                                          optimizer=optimizer,
                                          device=device)
        test_loss, test_acc = test_step(model=model,
          dataloader=test_dataloader,
          loss_fn=loss_fn,
          device=device)

        # Print out what's happening
        print(
          f"Epoch: {epoch+1} | "
          f"train_loss: {train_loss:.4f} | "
          f"train_acc: {train_acc:.4f} | "
          f"test_loss: {test_loss:.4f} | "
          f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)


        ### New: Use the writer parameter to track experiments ###
        # See if there's a writer, if so, log to it
        if writer:
            # Add results to SummaryWriter
            writer.add_scalars(main_tag="Loss",
                               tag_scalar_dict={"train_loss": train_loss,
                                                "test_loss": test_loss},
                               global_step=epoch)
            writer.add_scalars(main_tag="Accuracy",
                               tag_scalar_dict={"train_acc": train_acc,
                                                "test_acc": test_acc},
                               global_step=epoch)

            # Close the writer
            writer.close()
        else:
            pass
    ### End new ###

    # Return the filled results at the end of the epochs
    return results

### Start init here ctrl + f8

### Download data

Using the same data from https://www.learnpytorch.io/07_pytorch_experiment_tracking/

In [None]:
# Download 10 percent and 20 percent training data (if necessary)
data_10_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                                     destination="pizza_steak_sushi")

data_20_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip",
                                     destination="pizza_steak_sushi_20_percent")

In [None]:
# Setup training directory paths
train_dir_10_percent = data_10_percent_path / "train"
train_dir_20_percent = data_20_percent_path / "train"

# Setup testing directory paths (note: use the same test dataset for both to compare the results)
test_dir = data_10_percent_path / "test"

# Check the directories
print(f"Training directory 10%: {train_dir_10_percent}")
print(f"Training directory 20%: {train_dir_20_percent}")
print(f"Testing directory: {test_dir}")

In [None]:
# Creation transforms
from torchvision import transforms

# Create a transform to normalize data distribution to be inline with ImageNet
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], # values per colour channel [red, green, blue]
                                 std=[0.229, 0.224, 0.225])

# Create a transform pipeline
simple_transform = transforms.Compose([
                                       transforms.Resize((224, 224)),
                                       transforms.ToTensor(), # get image values between 0 & 1
                                       normalize
])

### Turn data into DataLoaders

In [None]:
BATCH_SIZE = 32

# Create 10% training and test DataLoaders
train_dataloader_10_percent, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir_10_percent,
                                                                                          test_dir=test_dir,
                                                                                          transform=simple_transform,
                                                                                          batch_size=BATCH_SIZE)

# Create 20% training and test DataLoaders
train_dataloader_20_percent, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir_20_percent,
                                                                                          test_dir=test_dir,
                                                                                          transform=simple_transform,
                                                                                          batch_size=BATCH_SIZE)

# Find the number of samples/batches per dataloader (using the same test_dataloader for both experiments)
print(f"Number of batches of size {BATCH_SIZE} in 10 percent training data: {len(train_dataloader_10_percent)}")
print(f"Number of batches of size {BATCH_SIZE} in 20 percent training data: {len(train_dataloader_20_percent)}")
print(f"Number of batches of size {BATCH_SIZE} in testing data: {len(train_dataloader_10_percent)} (all experiments will use the same test set)")
print(f"Number of classes: {len(class_names)}, class names: {class_names}")

### Create models

In [None]:
import torchvision
from torch import nn

# Get num out features (one for each class pizza, steak, sushi)
OUT_FEATURES = len(class_names)

# Create an EffNetB0 feature extractor
def create_effnetb0():
    # 1. Get the base mdoel with pretrained weights and send to target device
    weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
    model = torchvision.models.efficientnet_b0(weights=weights).to(device)

    # 2. Freeze the base model layers
    for param in model.features.parameters():
        param.requires_grad = False

    # 3. Set the seeds
    set_seeds()

    # 4. Change the classifier head
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2),
        nn.Linear(in_features=1280, out_features=OUT_FEATURES)
    ).to(device)

    # 5. Give the model a name
    model.name = "effnetb0"
    print(f"[INFO] Created new {model.name} model.")
    return model

# Create an EffNetB2 feature extractor
def create_effnetb2():
    # 1. Get the base model with pretrained weights and send to target device
    weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT
    model = torchvision.models.efficientnet_b2(weights=weights).to(device)

    # 2. Freeze the base model layers
    for param in model.features.parameters():
        param.requires_grad = False

    # 3. Set the seeds
    set_seeds()

    # 4. Change the classifier head
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.3),
        nn.Linear(in_features=1408, out_features=OUT_FEATURES)
    ).to(device)

    # 5. Give the model a name
    model.name = "effnetb2"
    print(f"[INFO] Created new {model.name} model.")
    return model

# Create an EffNetB3 feature extractor
def create_effnetb3():
    # 1. Get the base mdoel with pretrained weights and send to target device
    weights = torchvision.models.EfficientNet_B3_Weights.DEFAULT
    print(device)
    model = torchvision.models.efficientnet_b3(weights=weights).to(device)

    # 2. Freeze the base model layers
    for param in model.features.parameters():
        param.requires_grad = False

    # 3. Set the seeds
    set_seeds()

    # 4. Change the classifier head
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2),
        nn.Linear(in_features=1536, out_features=OUT_FEATURES)
    ).to(device)

    # 5. Give the model a name
    model.name = "effnetb3"
    print(f"[INFO] Created new {model.name} model.")
    return model

# Create an EffNetB5 feature extractor
def create_effnetb5():
    # 1. Get the base model with pretrained weights and send to target device
    weights = torchvision.models.EfficientNet_B5_Weights.DEFAULT
    model = torchvision.models.efficientnet_b5(weights=weights).to(device)

    # 2. Freeze the base model layers
    for param in model.features.parameters():
        param.requires_grad = False

    # 3. Set the seeds
    set_seeds()

    # 4. Change the classifier head
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2),
        nn.Linear(in_features=2048, out_features=OUT_FEATURES)
    ).to(device)

    # 5. Give the model a name
    model.name = "effnetb5"
    print(f"[INFO] Created new {model.name} model.")
    return model

## Exercise 1: Pick a larger model from [`torchvision.models`](https://pytorch.org/vision/main/models.html) to add to the list of experiments (for example, EffNetB3 or higher)

* How does it perform compared to our existing models?
* **Hint:** You'll need to set up an exerpiment similar to [07. PyTorch Experiment Tracking section 7.6](https://www.learnpytorch.io/07_pytorch_experiment_tracking/#76-create-experiments-and-set-up-training-code).

In [None]:
# 1. Create epochs list
num_epochs = [5, 10]

# 2. Create models list (need to create a new model for each experiment)
models = {}  # ["effnetb3", "effnetb5"]
models["effnetb0"] = create_effnetb0
models["effnetb2"] = create_effnetb2
models["effnetb3"] = create_effnetb3
models["effnetb5"] = create_effnetb5

# 3. Create dataloaders dictionary for various dataloaders
train_dataloaders = {"data_10_percent": train_dataloader_10_percent,
                     "data_20_percent": train_dataloader_20_percent}

%%time
from going_modular.going_modular.utils import save_model

# 1. Set the random seeds
set_seeds(seed=42)

# 2. Keep track of experiment numbers
experiment_number = 0

# 3. Loop through each DataLoader
for dataloader_name, train_dataloader in train_dataloaders.items():

    # 4. Loop through each number of epochs
    for epochs in num_epochs:

        # 5. Loop through each model name and create a new model based on the name
        for model_name, model_init in models.items():

            # 6. Create information print outs
            experiment_number += 1
            print(f"[INFO] Experiment number: {experiment_number}")
            print(f"[INFO] Model: {model_name}")
            print(f"[INFO] DataLoader: {dataloader_name}")
            print(f"[INFO] Number of epochs: {epochs}")

            # 7. Select the model
            # if model_name == "effnetb0":
            #     model = create_effnetb0() # creates a new model each time (important because we want each experiment to start from scratch)
            # else:
            #     model = create_effnetb2() # creates a new model each time (important because we want each experiment to start from scratch)

            # break
            model = model_init()

            # 8. Create a new loss and optimizer for every model
            loss_fn = nn.CrossEntropyLoss()
            optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)

            # 9. Train target model with target dataloaders and track experiments
            train(model=model,
                  train_dataloader=train_dataloader,
                  test_dataloader=test_dataloader,
                  optimizer=optimizer,
                  loss_fn=loss_fn,
                  epochs=epochs,
                  device=device,
                  writer=create_writer(experiment_name=dataloader_name,
                                       model_name=model_name,
                                       extra=f"{epochs}_epochs"))

            # 10. Save the model to file so we can get back the best model
            save_filepath = f"07_{model_name}_{dataloader_name}_{epochs}_epochs.pth"
            save_model(model=model,
                       target_dir=MODELS_DIR,
                       model_name=save_filepath)
            print("-"*50 + "\n")

In [None]:
1 /0

## Exercise 2. Introduce data augmentation to the list of experiments using the 20% pizza, steak, sushi training and test datasets, does this change anything?
    
* For example, you could have one training DataLoader that uses data augmentation (e.g. `train_dataloader_20_percent_aug` and `train_dataloader_20_percent_no_aug`) and then compare the results of two of the same model types training on these two DataLoaders.
* **Note:** You may need to alter the `create_dataloaders()` function to be able to take a transform for the training data and the testing data (because you don't need to perform data augmentation on the test data). See [04. PyTorch Custom Datasets section 6](https://www.learnpytorch.io/04_pytorch_custom_datasets/#6-other-forms-of-transforms-data-augmentation) for examples of using data augmentation or the script below for an example:

```python
# Note: Data augmentation transform like this should only be performed on training data
train_transform_data_aug = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.TrivialAugmentWide(),
    transforms.ToTensor(),
    normalize
])

# Create a helper function to visualize different augmented (and not augmented) images
def view_dataloader_images(dataloader, n=10):
    if n > 10:
        print(f"Having n higher than 10 will create messy plots, lowering to 10.")
        n = 10
    imgs, labels = next(iter(dataloader))
    plt.figure(figsize=(16, 8))
    for i in range(n):
        # Min max scale the image for display purposes
        targ_image = imgs[i]
        sample_min, sample_max = targ_image.min(), targ_image.max()
        sample_scaled = (targ_image - sample_min)/(sample_max - sample_min)

        # Plot images with appropriate axes information
        plt.subplot(1, 10, i+1)
        plt.imshow(sample_scaled.permute(1, 2, 0)) # resize for Matplotlib requirements
        plt.title(class_names[labels[i]])
        plt.axis(False)

# Have to update `create_dataloaders()` to handle different augmentations
import os
from torch.utils.data import DataLoader
from torchvision import datasets

NUM_WORKERS = os.cpu_count() # use maximum number of CPUs for workers to load data

# Note: this is an update version of data_setup.create_dataloaders to handle
# differnt train and test transforms.
def create_dataloaders(
    train_dir,
    test_dir,
    train_transform, # add parameter for train transform (transforms on train dataset)
    test_transform,  # add parameter for test transform (transforms on test dataset)
    batch_size=32, num_workers=NUM_WORKERS
):
    # Use ImageFolder to create dataset(s)
    train_data = datasets.ImageFolder(train_dir, transform=train_transform)
    test_data = datasets.ImageFolder(test_dir, transform=test_transform)

    # Get class names
    class_names = train_data.classes

    # Turn images into data loaders
    train_dataloader = DataLoader(
        train_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )
    test_dataloader = DataLoader(
        test_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )

    return train_dataloader, test_dataloader, class_names
```

### AUG transform


In [None]:
train_transform_data_aug = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.TrivialAugmentWide(),
    transforms.ToTensor(),
    normalize
])

In [None]:
# view dataloader images func
# Create a helper function to visualize different augmented (and not augmented) images
def view_dataloader_images(dataloader, n=10):
    if n > 10:
        print(f"Having n higher than 10 will create messy plots, lowering to 10.")
        n = 10
    imgs, labels = next(iter(dataloader))
    plt.figure(figsize=(16, 8))
    for i in range(n):
        # Min max scale the image for display purposes
        targ_image = imgs[i]
        sample_min, sample_max = targ_image.min(), targ_image.max()
        sample_scaled = (targ_image - sample_min)/(sample_max - sample_min)

        # Plot images with appropriate axes information
        plt.subplot(1, 10, i+1)
        plt.imshow(sample_scaled.permute(1, 2, 0)) # resize for Matplotlib requirements
        plt.title(class_names[labels[i]])
        plt.axis(False)

# Have to update `create_dataloaders()` to handle different augmentations
import os
from torch.utils.data import DataLoader
from torchvision import datasets


### New Dataloader

In [None]:
# New version of dataloaders.
# Note: this is an update version of data_setup.create_dataloaders to handle
# differnt train and test transforms.

NUM_WORKERS = os.cpu_count() # use maximum number of CPUs for workers to load data

def create_dataloaders(
    train_dir,
    test_dir,
    train_transform, # add parameter for train transform (transforms on train dataset)
    test_transform,  # add parameter for test transform (transforms on test dataset)
    batch_size=32, num_workers=NUM_WORKERS
):
    # Use ImageFolder to create dataset(s)
    train_data = datasets.ImageFolder(train_dir, transform=train_transform)
    test_data = datasets.ImageFolder(test_dir, transform=test_transform)

    # Get class names
    class_names = train_data.classes

    # Turn images into data loaders
    train_dataloader = DataLoader(
        train_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )
    test_dataloader = DataLoader(
        test_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )

    return train_dataloader, test_dataloader, class_names

In [None]:
BATCH_SIZE = 32

# Create 20% training and test DataLoaders
train_dataloader_20_percent_aug, test_dataloader, class_names = create_dataloaders(train_dir=train_dir_20_percent,
                                                                                   test_dir=test_dir,
                                                                                   train_transform=train_transform_data_aug,
                                                                                   test_transform=simple_transform,
                                                                                   batch_size=BATCH_SIZE)

# Find the number of samples/batches per dataloader (using the same test_dataloader for both experiments)
print(f"Number of batches of size {BATCH_SIZE} in 20 percent training data aug: {len(train_dataloader_20_percent_aug)}")
print(f"Number of classes: {len(class_names)}, class names: {class_names}")

### Setup experiment. AUG dataloader Effnetb3, effnetb5

In [None]:
# TODO: your code

# 1. Create epochs list
num_epochs = [5, 10]

# 2. Create models list (need to create a new model for each experiment)
models = {} #["effnetb3", "effnetb5"]
models["effnetb0"] = create_effnetb0
models["effnetb2"] = create_effnetb2
models["effnetb3"] = create_effnetb3
models["effnetb5"] = create_effnetb5

train_dataloaders = {"data_10_percent": train_dataloader_10_percent,
                     "data_20_percent": train_dataloader_20_percent,
                     "data_20_percent_aug": train_dataloader_20_percent_aug}


In [None]:
view_dataloader_images(train_dataloader_20_percent_aug)

In [None]:
view_dataloader_images(train_dataloader_20_percent)

In [30]:
%%time
from going_modular.going_modular.utils import save_model

# 1. Set the random seeds
set_seeds(seed=42)

# 2. Keep track of experiment numbers
experiment_number = 0

# 3. Loop through each DataLoader
for dataloader_name, train_dataloader in train_dataloaders.items():

    # 4. Loop through each number of epochs
    for epochs in num_epochs:

        # 5. Loop through each model name and create a new model based on the name
        for model_name, model_init in models.items():

            # 6. Create information print outs
            experiment_number += 1
            print(f"[INFO] Experiment number: {experiment_number}")
            print(f"[INFO] Model: {model_name}")
            print(f"[INFO] DataLoader: {dataloader_name}")
            print(f"[INFO] Number of epochs: {epochs}")
            print(f"[INFO] Number of epochs: {epochs}")

            # 7. Select the model
            # if model_name == "effnetb0":
            #     model = create_effnetb0() # creates a new model each time (important because we want each experiment to start from scratch)
            # else:
            #     model = create_effnetb2() # creates a new model each time (important because we want each experiment to start from scratch)

            # break
            model = model_init()
            # 8. Create a new loss and optimizer for every model
            loss_fn = nn.CrossEntropyLoss()
            optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)

            # 9. Train target model with target dataloaders and track experiments
            train(model=model,
                  train_dataloader=train_dataloader,
                  test_dataloader=test_dataloader,
                  optimizer=optimizer,
                  loss_fn=loss_fn,
                  epochs=epochs,
                  device=device,
                  writer=create_writer(experiment_name=dataloader_name,
                                       model_name=model_name,
                                       extra=f"{epochs}_epochs"))

            # 10. Save the model to file so we can get back the best model
            save_filepath = f"07_{model_name}_{dataloader_name}_{epochs}_epochs.pth"
            save_model(model=model,
                       target_dir=MODELS_DIR,
                       model_name=save_filepath)
            print("-"*50 + "\n")

[INFO] Experiment number: 1
[INFO] Model: effnetb0
[INFO] DataLoader: data_10_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5


Downloading: "https://download.pytorch.org/models/efficientnet_b0_rwightman-7f5810bc.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-7f5810bc.pth
100%|██████████| 20.5M/20.5M [00:00<00:00, 51.9MB/s]


[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_10_percent/effnetb0/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0528 | train_acc: 0.4961 | test_loss: 0.8721 | test_acc: 0.5672
Epoch: 2 | train_loss: 0.9239 | train_acc: 0.6016 | test_loss: 0.7691 | test_acc: 0.7424
Epoch: 3 | train_loss: 0.7694 | train_acc: 0.6992 | test_loss: 0.6394 | test_acc: 0.9167
Epoch: 4 | train_loss: 0.7055 | train_acc: 0.7617 | test_loss: 0.6587 | test_acc: 0.8570
Epoch: 5 | train_loss: 0.7055 | train_acc: 0.7773 | test_loss: 0.5753 | test_acc: 0.8769
[INFO] Saving model to: models/07_effnetb0_data_10_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 2
[INFO] Model: effnetb2
[INFO] DataLoader: data_10_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5


Downloading: "https://download.pytorch.org/models/efficientnet_b2_rwightman-c35c1473.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b2_rwightman-c35c1473.pth
100%|██████████| 35.2M/35.2M [00:00<00:00, 81.2MB/s]


[INFO] Created new effnetb2 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_10_percent/effnetb2/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0928 | train_acc: 0.3711 | test_loss: 0.9427 | test_acc: 0.6809
Epoch: 2 | train_loss: 0.8966 | train_acc: 0.6680 | test_loss: 0.8358 | test_acc: 0.7831
Epoch: 3 | train_loss: 0.8247 | train_acc: 0.6875 | test_loss: 0.7738 | test_acc: 0.8153
Epoch: 4 | train_loss: 0.7367 | train_acc: 0.7812 | test_loss: 0.7567 | test_acc: 0.8153
Epoch: 5 | train_loss: 0.7271 | train_acc: 0.7617 | test_loss: 0.6825 | test_acc: 0.8759
[INFO] Saving model to: models/07_effnetb2_data_10_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 3
[INFO] Model: effnetb3
[INFO] DataLoader: data_10_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
cuda


Downloading: "https://download.pytorch.org/models/efficientnet_b3_rwightman-b3899882.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b3_rwightman-b3899882.pth
100%|██████████| 47.2M/47.2M [00:00<00:00, 170MB/s]


[INFO] Created new effnetb3 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_10_percent/effnetb3/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0684 | train_acc: 0.3984 | test_loss: 0.9963 | test_acc: 0.6998
Epoch: 2 | train_loss: 0.9321 | train_acc: 0.6680 | test_loss: 0.8963 | test_acc: 0.7538
Epoch: 3 | train_loss: 0.7735 | train_acc: 0.8789 | test_loss: 0.7664 | test_acc: 0.8561
Epoch: 4 | train_loss: 0.6910 | train_acc: 0.8438 | test_loss: 0.6939 | test_acc: 0.8561
Epoch: 5 | train_loss: 0.6281 | train_acc: 0.8398 | test_loss: 0.6207 | test_acc: 0.9271
[INFO] Saving model to: models/07_effnetb3_data_10_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 4
[INFO] Model: effnetb5
[INFO] DataLoader: data_10_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5


Downloading: "https://download.pytorch.org/models/efficientnet_b5_lukemelas-1a07897c.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b5_lukemelas-1a07897c.pth
100%|██████████| 117M/117M [00:01<00:00, 77.4MB/s]


[INFO] Created new effnetb5 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_10_percent/effnetb5/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0473 | train_acc: 0.4844 | test_loss: 0.9087 | test_acc: 0.8551
Epoch: 2 | train_loss: 0.8696 | train_acc: 0.8672 | test_loss: 0.7947 | test_acc: 0.9072
Epoch: 3 | train_loss: 0.7919 | train_acc: 0.7500 | test_loss: 0.6972 | test_acc: 0.9271
Epoch: 4 | train_loss: 0.6788 | train_acc: 0.8008 | test_loss: 0.5708 | test_acc: 0.9375
Epoch: 5 | train_loss: 0.6117 | train_acc: 0.8008 | test_loss: 0.5220 | test_acc: 0.9062
[INFO] Saving model to: models/07_effnetb5_data_10_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 5
[INFO] Model: effnetb0
[INFO] DataLoader: data_10_percent
[INFO] Number of epochs: 10
[INFO] Number of epochs: 10
[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_10_percent/effnetb0/10_epochs...


  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0528 | train_acc: 0.4961 | test_loss: 0.8721 | test_acc: 0.5672
Epoch: 2 | train_loss: 0.9239 | train_acc: 0.6016 | test_loss: 0.7691 | test_acc: 0.7424
Epoch: 3 | train_loss: 0.7694 | train_acc: 0.6992 | test_loss: 0.6394 | test_acc: 0.9167
Epoch: 4 | train_loss: 0.7055 | train_acc: 0.7617 | test_loss: 0.6587 | test_acc: 0.8570
Epoch: 5 | train_loss: 0.7055 | train_acc: 0.7773 | test_loss: 0.5753 | test_acc: 0.8769
Epoch: 6 | train_loss: 0.5831 | train_acc: 0.8008 | test_loss: 0.5428 | test_acc: 0.8873
Epoch: 7 | train_loss: 0.5511 | train_acc: 0.9258 | test_loss: 0.4998 | test_acc: 0.9271
Epoch: 8 | train_loss: 0.4796 | train_acc: 0.9336 | test_loss: 0.4753 | test_acc: 0.8968
Epoch: 9 | train_loss: 0.4571 | train_acc: 0.9336 | test_loss: 0.5106 | test_acc: 0.8570
Epoch: 10 | train_loss: 0.5389 | train_acc: 0.7930 | test_loss: 0.4924 | test_acc: 0.8570
[INFO] Saving model to: models/07_effnetb0_data_10_percent_10_epochs.pth
------------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0928 | train_acc: 0.3711 | test_loss: 0.9427 | test_acc: 0.6809
Epoch: 2 | train_loss: 0.8966 | train_acc: 0.6680 | test_loss: 0.8358 | test_acc: 0.7831
Epoch: 3 | train_loss: 0.8247 | train_acc: 0.6875 | test_loss: 0.7738 | test_acc: 0.8153
Epoch: 4 | train_loss: 0.7367 | train_acc: 0.7812 | test_loss: 0.7567 | test_acc: 0.8153
Epoch: 5 | train_loss: 0.7271 | train_acc: 0.7617 | test_loss: 0.6825 | test_acc: 0.8759
Epoch: 6 | train_loss: 0.6050 | train_acc: 0.7852 | test_loss: 0.6646 | test_acc: 0.8873
Epoch: 7 | train_loss: 0.5356 | train_acc: 0.9180 | test_loss: 0.5977 | test_acc: 0.8864
Epoch: 8 | train_loss: 0.5072 | train_acc: 0.9219 | test_loss: 0.6129 | test_acc: 0.8769
Epoch: 9 | train_loss: 0.5476 | train_acc: 0.7891 | test_loss: 0.6190 | test_acc: 0.9072
Epoch: 10 | train_loss: 0.4853 | train_acc: 0.9336 | test_loss: 0.5119 | test_acc: 0.9384
[INFO] Saving model to: models/07_effnetb2_data_10_percent_10_epochs.pth
------------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0684 | train_acc: 0.3984 | test_loss: 0.9963 | test_acc: 0.6998
Epoch: 2 | train_loss: 0.9321 | train_acc: 0.6680 | test_loss: 0.8963 | test_acc: 0.7538
Epoch: 3 | train_loss: 0.7735 | train_acc: 0.8789 | test_loss: 0.7664 | test_acc: 0.8561
Epoch: 4 | train_loss: 0.6910 | train_acc: 0.8438 | test_loss: 0.6939 | test_acc: 0.8561
Epoch: 5 | train_loss: 0.6281 | train_acc: 0.8398 | test_loss: 0.6207 | test_acc: 0.9271
Epoch: 6 | train_loss: 0.6859 | train_acc: 0.7344 | test_loss: 0.5627 | test_acc: 0.9072
Epoch: 7 | train_loss: 0.6017 | train_acc: 0.7812 | test_loss: 0.5790 | test_acc: 0.8655
Epoch: 8 | train_loss: 0.6265 | train_acc: 0.7852 | test_loss: 0.5853 | test_acc: 0.8456
Epoch: 9 | train_loss: 0.4877 | train_acc: 0.9297 | test_loss: 0.5064 | test_acc: 0.8968
Epoch: 10 | train_loss: 0.5234 | train_acc: 0.7930 | test_loss: 0.4918 | test_acc: 0.9167
[INFO] Saving model to: models/07_effnetb3_data_10_percent_10_epochs.pth
------------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0473 | train_acc: 0.4844 | test_loss: 0.9087 | test_acc: 0.8551
Epoch: 2 | train_loss: 0.8696 | train_acc: 0.8672 | test_loss: 0.7947 | test_acc: 0.9072
Epoch: 3 | train_loss: 0.7919 | train_acc: 0.7500 | test_loss: 0.6972 | test_acc: 0.9271
Epoch: 4 | train_loss: 0.6788 | train_acc: 0.8008 | test_loss: 0.5708 | test_acc: 0.9375
Epoch: 5 | train_loss: 0.6117 | train_acc: 0.8008 | test_loss: 0.5220 | test_acc: 0.9062
Epoch: 6 | train_loss: 0.5607 | train_acc: 0.7852 | test_loss: 0.5133 | test_acc: 0.8466
Epoch: 7 | train_loss: 0.5049 | train_acc: 0.9180 | test_loss: 0.4327 | test_acc: 0.9167
Epoch: 8 | train_loss: 0.5518 | train_acc: 0.8008 | test_loss: 0.4313 | test_acc: 0.9271
Epoch: 9 | train_loss: 0.4469 | train_acc: 0.9492 | test_loss: 0.4118 | test_acc: 0.9375
Epoch: 10 | train_loss: 0.4547 | train_acc: 0.8320 | test_loss: 0.4263 | test_acc: 0.8968
[INFO] Saving model to: models/07_effnetb5_data_10_percent_10_epochs.pth
------------------------------------

  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9576 | train_acc: 0.6188 | test_loss: 0.6536 | test_acc: 0.8655
Epoch: 2 | train_loss: 0.6833 | train_acc: 0.8396 | test_loss: 0.5426 | test_acc: 0.9375
Epoch: 3 | train_loss: 0.5723 | train_acc: 0.8812 | test_loss: 0.4492 | test_acc: 0.9375
Epoch: 4 | train_loss: 0.5319 | train_acc: 0.8250 | test_loss: 0.4709 | test_acc: 0.8674
Epoch: 5 | train_loss: 0.4489 | train_acc: 0.8771 | test_loss: 0.3799 | test_acc: 0.9081
[INFO] Saving model to: models/07_effnetb0_data_20_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 10
[INFO] Model: effnetb2
[INFO] DataLoader: data_20_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
[INFO] Created new effnetb2 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent/effnetb2/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9830 | train_acc: 0.5521 | test_loss: 0.7681 | test_acc: 0.8551
Epoch: 2 | train_loss: 0.7320 | train_acc: 0.7979 | test_loss: 0.6457 | test_acc: 0.8561
Epoch: 3 | train_loss: 0.6011 | train_acc: 0.8458 | test_loss: 0.5454 | test_acc: 0.9375
Epoch: 4 | train_loss: 0.4874 | train_acc: 0.9083 | test_loss: 0.5095 | test_acc: 0.9186
Epoch: 5 | train_loss: 0.4786 | train_acc: 0.8396 | test_loss: 0.4505 | test_acc: 0.9479
[INFO] Saving model to: models/07_effnetb2_data_20_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 11
[INFO] Model: effnetb3
[INFO] DataLoader: data_20_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
cuda
[INFO] Created new effnetb3 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent/effnetb3/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9738 | train_acc: 0.5917 | test_loss: 0.8418 | test_acc: 0.8352
Epoch: 2 | train_loss: 0.7481 | train_acc: 0.7667 | test_loss: 0.6553 | test_acc: 0.8352
Epoch: 3 | train_loss: 0.5785 | train_acc: 0.8688 | test_loss: 0.5550 | test_acc: 0.8456
Epoch: 4 | train_loss: 0.4719 | train_acc: 0.9000 | test_loss: 0.4871 | test_acc: 0.8655
Epoch: 5 | train_loss: 0.4465 | train_acc: 0.9229 | test_loss: 0.4533 | test_acc: 0.8352
[INFO] Saving model to: models/07_effnetb3_data_20_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 12
[INFO] Model: effnetb5
[INFO] DataLoader: data_20_percent
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
[INFO] Created new effnetb5 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent/effnetb5/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9715 | train_acc: 0.6479 | test_loss: 0.7849 | test_acc: 0.8864
Epoch: 2 | train_loss: 0.7314 | train_acc: 0.8292 | test_loss: 0.6222 | test_acc: 0.8873
Epoch: 3 | train_loss: 0.5959 | train_acc: 0.8583 | test_loss: 0.5076 | test_acc: 0.9271
Epoch: 4 | train_loss: 0.5381 | train_acc: 0.8604 | test_loss: 0.4148 | test_acc: 0.9176
Epoch: 5 | train_loss: 0.4818 | train_acc: 0.8750 | test_loss: 0.3982 | test_acc: 0.9271
[INFO] Saving model to: models/07_effnetb5_data_20_percent_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 13
[INFO] Model: effnetb0
[INFO] DataLoader: data_20_percent
[INFO] Number of epochs: 10
[INFO] Number of epochs: 10
[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent/effnetb0/10_epochs...


  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9576 | train_acc: 0.6188 | test_loss: 0.6536 | test_acc: 0.8655
Epoch: 2 | train_loss: 0.6833 | train_acc: 0.8396 | test_loss: 0.5426 | test_acc: 0.9375
Epoch: 3 | train_loss: 0.5723 | train_acc: 0.8812 | test_loss: 0.4492 | test_acc: 0.9375
Epoch: 4 | train_loss: 0.5319 | train_acc: 0.8250 | test_loss: 0.4709 | test_acc: 0.8674
Epoch: 5 | train_loss: 0.4489 | train_acc: 0.8771 | test_loss: 0.3799 | test_acc: 0.9081
Epoch: 6 | train_loss: 0.4661 | train_acc: 0.8479 | test_loss: 0.3473 | test_acc: 0.8977
Epoch: 7 | train_loss: 0.3623 | train_acc: 0.9083 | test_loss: 0.3067 | test_acc: 0.9072
Epoch: 8 | train_loss: 0.3393 | train_acc: 0.9125 | test_loss: 0.2854 | test_acc: 0.9583
Epoch: 9 | train_loss: 0.3388 | train_acc: 0.9000 | test_loss: 0.3584 | test_acc: 0.8778
Epoch: 10 | train_loss: 0.3593 | train_acc: 0.8542 | test_loss: 0.2844 | test_acc: 0.9176
[INFO] Saving model to: models/07_effnetb0_data_20_percent_10_epochs.pth
------------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9830 | train_acc: 0.5521 | test_loss: 0.7681 | test_acc: 0.8551
Epoch: 2 | train_loss: 0.7320 | train_acc: 0.7979 | test_loss: 0.6457 | test_acc: 0.8561
Epoch: 3 | train_loss: 0.6011 | train_acc: 0.8458 | test_loss: 0.5454 | test_acc: 0.9375
Epoch: 4 | train_loss: 0.4874 | train_acc: 0.9083 | test_loss: 0.5095 | test_acc: 0.9186
Epoch: 5 | train_loss: 0.4786 | train_acc: 0.8396 | test_loss: 0.4505 | test_acc: 0.9479
Epoch: 6 | train_loss: 0.4264 | train_acc: 0.8792 | test_loss: 0.4464 | test_acc: 0.9186
Epoch: 7 | train_loss: 0.3489 | train_acc: 0.9458 | test_loss: 0.3760 | test_acc: 0.9072
Epoch: 8 | train_loss: 0.3337 | train_acc: 0.9333 | test_loss: 0.3938 | test_acc: 0.9384
Epoch: 9 | train_loss: 0.3379 | train_acc: 0.8938 | test_loss: 0.4188 | test_acc: 0.9489
Epoch: 10 | train_loss: 0.3585 | train_acc: 0.8917 | test_loss: 0.3552 | test_acc: 0.9280
[INFO] Saving model to: models/07_effnetb2_data_20_percent_10_epochs.pth
------------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9738 | train_acc: 0.5917 | test_loss: 0.8418 | test_acc: 0.8352
Epoch: 2 | train_loss: 0.7481 | train_acc: 0.7667 | test_loss: 0.6553 | test_acc: 0.8352
Epoch: 3 | train_loss: 0.5785 | train_acc: 0.8688 | test_loss: 0.5550 | test_acc: 0.8456
Epoch: 4 | train_loss: 0.4719 | train_acc: 0.9000 | test_loss: 0.4871 | test_acc: 0.8655
Epoch: 5 | train_loss: 0.4465 | train_acc: 0.9229 | test_loss: 0.4533 | test_acc: 0.8352
Epoch: 6 | train_loss: 0.4528 | train_acc: 0.8667 | test_loss: 0.4066 | test_acc: 0.8352
Epoch: 7 | train_loss: 0.3704 | train_acc: 0.9104 | test_loss: 0.4197 | test_acc: 0.8248
Epoch: 8 | train_loss: 0.3970 | train_acc: 0.8771 | test_loss: 0.4359 | test_acc: 0.8456
Epoch: 9 | train_loss: 0.3913 | train_acc: 0.8562 | test_loss: 0.3926 | test_acc: 0.8655
Epoch: 10 | train_loss: 0.3185 | train_acc: 0.9146 | test_loss: 0.3882 | test_acc: 0.8447
[INFO] Saving model to: models/07_effnetb3_data_20_percent_10_epochs.pth
------------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9715 | train_acc: 0.6479 | test_loss: 0.7849 | test_acc: 0.8864
Epoch: 2 | train_loss: 0.7314 | train_acc: 0.8292 | test_loss: 0.6222 | test_acc: 0.8873
Epoch: 3 | train_loss: 0.5959 | train_acc: 0.8583 | test_loss: 0.5076 | test_acc: 0.9271
Epoch: 4 | train_loss: 0.5381 | train_acc: 0.8604 | test_loss: 0.4148 | test_acc: 0.9176
Epoch: 5 | train_loss: 0.4818 | train_acc: 0.8750 | test_loss: 0.3982 | test_acc: 0.9271
Epoch: 6 | train_loss: 0.4227 | train_acc: 0.8938 | test_loss: 0.4513 | test_acc: 0.8674
Epoch: 7 | train_loss: 0.4408 | train_acc: 0.8646 | test_loss: 0.3525 | test_acc: 0.9271
Epoch: 8 | train_loss: 0.4163 | train_acc: 0.8958 | test_loss: 0.3616 | test_acc: 0.8759
Epoch: 9 | train_loss: 0.4017 | train_acc: 0.8979 | test_loss: 0.3520 | test_acc: 0.9062
Epoch: 10 | train_loss: 0.3844 | train_acc: 0.8938 | test_loss: 0.3780 | test_acc: 0.8864
[INFO] Saving model to: models/07_effnetb5_data_20_percent_10_epochs.pth
------------------------------------

  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9665 | train_acc: 0.5708 | test_loss: 0.6638 | test_acc: 0.8759
Epoch: 2 | train_loss: 0.7166 | train_acc: 0.7979 | test_loss: 0.5489 | test_acc: 0.9072
Epoch: 3 | train_loss: 0.6131 | train_acc: 0.7979 | test_loss: 0.4284 | test_acc: 0.9479
Epoch: 4 | train_loss: 0.5944 | train_acc: 0.7792 | test_loss: 0.4607 | test_acc: 0.8873
Epoch: 5 | train_loss: 0.4907 | train_acc: 0.8542 | test_loss: 0.3948 | test_acc: 0.9290
[INFO] Saving model to: models/07_effnetb0_data_20_percent_aug_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 18
[INFO] Model: effnetb2
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
[INFO] Created new effnetb2 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent_aug/effnetb2/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9968 | train_acc: 0.4958 | test_loss: 0.7718 | test_acc: 0.8655
Epoch: 2 | train_loss: 0.7824 | train_acc: 0.7250 | test_loss: 0.6388 | test_acc: 0.9167
Epoch: 3 | train_loss: 0.6271 | train_acc: 0.8458 | test_loss: 0.5410 | test_acc: 0.9072
Epoch: 4 | train_loss: 0.5617 | train_acc: 0.8500 | test_loss: 0.5007 | test_acc: 0.9176
Epoch: 5 | train_loss: 0.5188 | train_acc: 0.8583 | test_loss: 0.4340 | test_acc: 0.9271
[INFO] Saving model to: models/07_effnetb2_data_20_percent_aug_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 19
[INFO] Model: effnetb3
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
cuda
[INFO] Created new effnetb3 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent_aug/effnetb3/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9770 | train_acc: 0.6250 | test_loss: 0.8357 | test_acc: 0.8352
Epoch: 2 | train_loss: 0.7626 | train_acc: 0.7854 | test_loss: 0.6803 | test_acc: 0.8248
Epoch: 3 | train_loss: 0.6138 | train_acc: 0.8333 | test_loss: 0.5690 | test_acc: 0.8561
Epoch: 4 | train_loss: 0.5070 | train_acc: 0.8812 | test_loss: 0.5007 | test_acc: 0.8447
Epoch: 5 | train_loss: 0.4791 | train_acc: 0.8833 | test_loss: 0.4643 | test_acc: 0.8248
[INFO] Saving model to: models/07_effnetb3_data_20_percent_aug_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 20
[INFO] Model: effnetb5
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 5
[INFO] Created new effnetb5 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent_aug/effnetb5/5_epochs...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9675 | train_acc: 0.6167 | test_loss: 0.8103 | test_acc: 0.8447
Epoch: 2 | train_loss: 0.7756 | train_acc: 0.7917 | test_loss: 0.6391 | test_acc: 0.8968
Epoch: 3 | train_loss: 0.6511 | train_acc: 0.8000 | test_loss: 0.5296 | test_acc: 0.9167
Epoch: 4 | train_loss: 0.6160 | train_acc: 0.8063 | test_loss: 0.4310 | test_acc: 0.9072
Epoch: 5 | train_loss: 0.5626 | train_acc: 0.8458 | test_loss: 0.4016 | test_acc: 0.9271
[INFO] Saving model to: models/07_effnetb5_data_20_percent_aug_5_epochs.pth
--------------------------------------------------

[INFO] Experiment number: 21
[INFO] Model: effnetb0
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 10
[INFO] Number of epochs: 10
[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30/data_20_percent_aug/effnetb0/10_epochs...


  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9665 | train_acc: 0.5708 | test_loss: 0.6638 | test_acc: 0.8759
Epoch: 2 | train_loss: 0.7166 | train_acc: 0.7979 | test_loss: 0.5489 | test_acc: 0.9072
Epoch: 3 | train_loss: 0.6131 | train_acc: 0.7979 | test_loss: 0.4284 | test_acc: 0.9479
Epoch: 4 | train_loss: 0.5944 | train_acc: 0.7792 | test_loss: 0.4607 | test_acc: 0.8873
Epoch: 5 | train_loss: 0.4907 | train_acc: 0.8542 | test_loss: 0.3948 | test_acc: 0.9290
Epoch: 6 | train_loss: 0.5208 | train_acc: 0.8396 | test_loss: 0.3487 | test_acc: 0.9489
Epoch: 7 | train_loss: 0.4420 | train_acc: 0.8562 | test_loss: 0.2987 | test_acc: 0.9176
Epoch: 8 | train_loss: 0.3916 | train_acc: 0.8875 | test_loss: 0.2933 | test_acc: 0.9479
Epoch: 9 | train_loss: 0.3971 | train_acc: 0.8792 | test_loss: 0.3595 | test_acc: 0.8883
Epoch: 10 | train_loss: 0.4040 | train_acc: 0.8729 | test_loss: 0.2793 | test_acc: 0.9072
[INFO] Saving model to: models/07_effnetb0_data_20_percent_aug_10_epochs.pth
--------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9968 | train_acc: 0.4958 | test_loss: 0.7718 | test_acc: 0.8655
Epoch: 2 | train_loss: 0.7824 | train_acc: 0.7250 | test_loss: 0.6388 | test_acc: 0.9167
Epoch: 3 | train_loss: 0.6271 | train_acc: 0.8458 | test_loss: 0.5410 | test_acc: 0.9072
Epoch: 4 | train_loss: 0.5617 | train_acc: 0.8500 | test_loss: 0.5007 | test_acc: 0.9176
Epoch: 5 | train_loss: 0.5188 | train_acc: 0.8583 | test_loss: 0.4340 | test_acc: 0.9271
Epoch: 6 | train_loss: 0.4827 | train_acc: 0.8646 | test_loss: 0.4190 | test_acc: 0.9280
Epoch: 7 | train_loss: 0.4158 | train_acc: 0.9062 | test_loss: 0.3653 | test_acc: 0.9176
Epoch: 8 | train_loss: 0.4195 | train_acc: 0.8500 | test_loss: 0.3931 | test_acc: 0.9081
Epoch: 9 | train_loss: 0.4191 | train_acc: 0.8792 | test_loss: 0.4055 | test_acc: 0.9384
Epoch: 10 | train_loss: 0.4349 | train_acc: 0.8438 | test_loss: 0.3346 | test_acc: 0.9384
[INFO] Saving model to: models/07_effnetb2_data_20_percent_aug_10_epochs.pth
--------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9770 | train_acc: 0.6250 | test_loss: 0.8357 | test_acc: 0.8352
Epoch: 2 | train_loss: 0.7626 | train_acc: 0.7854 | test_loss: 0.6803 | test_acc: 0.8248
Epoch: 3 | train_loss: 0.6138 | train_acc: 0.8333 | test_loss: 0.5690 | test_acc: 0.8561
Epoch: 4 | train_loss: 0.5070 | train_acc: 0.8812 | test_loss: 0.5007 | test_acc: 0.8447
Epoch: 5 | train_loss: 0.4791 | train_acc: 0.8833 | test_loss: 0.4643 | test_acc: 0.8248
Epoch: 6 | train_loss: 0.5046 | train_acc: 0.8750 | test_loss: 0.4232 | test_acc: 0.8665
Epoch: 7 | train_loss: 0.4166 | train_acc: 0.8771 | test_loss: 0.4212 | test_acc: 0.8551
Epoch: 8 | train_loss: 0.4411 | train_acc: 0.8542 | test_loss: 0.4289 | test_acc: 0.8864
Epoch: 9 | train_loss: 0.4411 | train_acc: 0.8562 | test_loss: 0.3958 | test_acc: 0.8759
Epoch: 10 | train_loss: 0.4003 | train_acc: 0.8542 | test_loss: 0.3921 | test_acc: 0.8456
[INFO] Saving model to: models/07_effnetb3_data_20_percent_aug_10_epochs.pth
--------------------------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9675 | train_acc: 0.6167 | test_loss: 0.8103 | test_acc: 0.8447
Epoch: 2 | train_loss: 0.7756 | train_acc: 0.7917 | test_loss: 0.6391 | test_acc: 0.8968
Epoch: 3 | train_loss: 0.6511 | train_acc: 0.8000 | test_loss: 0.5296 | test_acc: 0.9167
Epoch: 4 | train_loss: 0.6160 | train_acc: 0.8063 | test_loss: 0.4310 | test_acc: 0.9072
Epoch: 5 | train_loss: 0.5626 | train_acc: 0.8458 | test_loss: 0.4016 | test_acc: 0.9271
Epoch: 6 | train_loss: 0.4790 | train_acc: 0.8625 | test_loss: 0.4578 | test_acc: 0.8674
Epoch: 7 | train_loss: 0.5202 | train_acc: 0.8354 | test_loss: 0.3568 | test_acc: 0.8968
Epoch: 8 | train_loss: 0.5361 | train_acc: 0.8104 | test_loss: 0.3598 | test_acc: 0.9062
Epoch: 9 | train_loss: 0.4960 | train_acc: 0.8521 | test_loss: 0.3431 | test_acc: 0.9062
Epoch: 10 | train_loss: 0.4608 | train_acc: 0.8542 | test_loss: 0.3572 | test_acc: 0.8864
[INFO] Saving model to: models/07_effnetb5_data_20_percent_aug_10_epochs.pth
--------------------------------

In [31]:
# TODO: your code

# 1. Create epochs list
num_epochs = [5, 10]

# 2. Create models list (need to create a new model for each experiment)
models = {} #["effnetb3", "effnetb5"]
models["effnetb0"] = create_effnetb0
models["effnetb2"] = create_effnetb2
models["effnetb3"] = create_effnetb3
models["effnetb5"] = create_effnetb5

# 3. Create dataloaders dictionary for various dataloaders
# train_dataloaders = {"data_10_percent": train_dataloader_10_percent,
#                      "data_20_percent": train_dataloader_20_percent}

num_lr = [0.001, 0.0001]

train_dataloaders = {"data_20_percent_aug": train_dataloader_20_percent_aug}


### Exp 2.1 Changing learning rate

In [None]:
%%time
from going_modular.going_modular.utils import save_model

# 1. Set the random seeds
set_seeds(seed=42)

# 2. Keep track of experiment numbers
experiment_number = 0

# 3. Loop through each DataLoader
for dataloader_name, train_dataloader in train_dataloaders.items():

    # 4. Loop through each number of epochs
    for epochs in num_epochs:

        # 5. Loop through each model name and create a new model based on the name
        for model_name, model_init in models.items():

            for lr in num_lr:

                # 6. Create information print outs
                experiment_number += 1
                print(f"[INFO] Experiment number: {experiment_number}")
                print(f"[INFO] Model: {model_name}")
                print(f"[INFO] DataLoader: {dataloader_name}")
                print(f"[INFO] Number of epochs: {epochs}")
                print(f"[INFO] Number of epochs: {lr}")


                # 7. Select the model
                # if model_name == "effnetb0":
                #     model = create_effnetb0() # creates a new model each time (important because we want each experiment to start from scratch)
                # else:
                #     model = create_effnetb2() # creates a new model each time (important because we want each experiment to start from scratch)

                # break
                model = model_init()
                # 8. Create a new loss and optimizer for every model
                loss_fn = nn.CrossEntropyLoss()
                optimizer = torch.optim.Adam(params=model.parameters(), lr=lr)

                # 9. Train target model with target dataloaders and track experiments
                train(model=model,
                    train_dataloader=train_dataloader,
                    test_dataloader=test_dataloader,
                    optimizer=optimizer,
                    loss_fn=loss_fn,
                    epochs=epochs,
                    device=device,
                    writer=create_writer(experiment_name=dataloader_name,
                                        model_name=model_name,
                                        tag='lr',
                                        extra=f"{epochs}_epochs_{lr}_learning_rate"))

                # 10. Save the model to file so we can get back the best model
                save_filepath = f"07_{model_name}_{dataloader_name}_{epochs}_epochs_{lr}_learning_rate.pth"
                save_model(model=model,
                        target_dir=MODELS_DIR,
                        model_name=save_filepath)
                print("-"*50 + "\n")

[INFO] Experiment number: 1
[INFO] Model: effnetb0
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.001
[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb0/5_epochs_0.001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9665 | train_acc: 0.5708 | test_loss: 0.6638 | test_acc: 0.8759
Epoch: 2 | train_loss: 0.7166 | train_acc: 0.7979 | test_loss: 0.5489 | test_acc: 0.9072
Epoch: 3 | train_loss: 0.6131 | train_acc: 0.7979 | test_loss: 0.4284 | test_acc: 0.9479
Epoch: 4 | train_loss: 0.5944 | train_acc: 0.7792 | test_loss: 0.4607 | test_acc: 0.8873
Epoch: 5 | train_loss: 0.4907 | train_acc: 0.8542 | test_loss: 0.3948 | test_acc: 0.9290
[INFO] Saving model to: models/07_effnetb0_data_20_percent_aug_5_epochs_0.001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 2
[INFO] Model: effnetb0
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.0001
[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb0/5_epochs_0.0001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0827 | train_acc: 0.3771 | test_loss: 1.0462 | test_acc: 0.3608
Epoch: 2 | train_loss: 1.0396 | train_acc: 0.4938 | test_loss: 1.0308 | test_acc: 0.5256
Epoch: 3 | train_loss: 1.0201 | train_acc: 0.5500 | test_loss: 0.9804 | test_acc: 0.6278
Epoch: 4 | train_loss: 1.0184 | train_acc: 0.5312 | test_loss: 0.9751 | test_acc: 0.6411
Epoch: 5 | train_loss: 0.9804 | train_acc: 0.5625 | test_loss: 0.9246 | test_acc: 0.7131
[INFO] Saving model to: models/07_effnetb0_data_20_percent_aug_5_epochs_0.0001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 3
[INFO] Model: effnetb2
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.001
[INFO] Created new effnetb2 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb2/5_epochs_0.001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9968 | train_acc: 0.4958 | test_loss: 0.7718 | test_acc: 0.8655
Epoch: 2 | train_loss: 0.7824 | train_acc: 0.7250 | test_loss: 0.6388 | test_acc: 0.9167
Epoch: 3 | train_loss: 0.6271 | train_acc: 0.8458 | test_loss: 0.5410 | test_acc: 0.9072
Epoch: 4 | train_loss: 0.5617 | train_acc: 0.8500 | test_loss: 0.5007 | test_acc: 0.9176
Epoch: 5 | train_loss: 0.5188 | train_acc: 0.8583 | test_loss: 0.4340 | test_acc: 0.9271
[INFO] Saving model to: models/07_effnetb2_data_20_percent_aug_5_epochs_0.001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 4
[INFO] Model: effnetb2
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.0001
[INFO] Created new effnetb2 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb2/5_epochs_0.0001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.1069 | train_acc: 0.3333 | test_loss: 1.0673 | test_acc: 0.4328
Epoch: 2 | train_loss: 1.0873 | train_acc: 0.3917 | test_loss: 1.0546 | test_acc: 0.4848
Epoch: 3 | train_loss: 1.0521 | train_acc: 0.4604 | test_loss: 1.0301 | test_acc: 0.4962
Epoch: 4 | train_loss: 1.0191 | train_acc: 0.5354 | test_loss: 1.0047 | test_acc: 0.6506
Epoch: 5 | train_loss: 0.9945 | train_acc: 0.5854 | test_loss: 0.9720 | test_acc: 0.6913
[INFO] Saving model to: models/07_effnetb2_data_20_percent_aug_5_epochs_0.0001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 5
[INFO] Model: effnetb3
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.001
cuda
[INFO] Created new effnetb3 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb3/5_epochs_0.001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9770 | train_acc: 0.6250 | test_loss: 0.8357 | test_acc: 0.8352
Epoch: 2 | train_loss: 0.7626 | train_acc: 0.7854 | test_loss: 0.6803 | test_acc: 0.8248
Epoch: 3 | train_loss: 0.6138 | train_acc: 0.8333 | test_loss: 0.5690 | test_acc: 0.8561
Epoch: 4 | train_loss: 0.5070 | train_acc: 0.8812 | test_loss: 0.5007 | test_acc: 0.8447
Epoch: 5 | train_loss: 0.4791 | train_acc: 0.8833 | test_loss: 0.4643 | test_acc: 0.8248
[INFO] Saving model to: models/07_effnetb3_data_20_percent_aug_5_epochs_0.001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 6
[INFO] Model: effnetb3
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.0001
cuda
[INFO] Created new effnetb3 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb3/5_epochs_0.0001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.1033 | train_acc: 0.3604 | test_loss: 1.1002 | test_acc: 0.2983
Epoch: 2 | train_loss: 1.0686 | train_acc: 0.4792 | test_loss: 1.0671 | test_acc: 0.4650
Epoch: 3 | train_loss: 1.0373 | train_acc: 0.4667 | test_loss: 1.0307 | test_acc: 0.5682
Epoch: 4 | train_loss: 1.0104 | train_acc: 0.5917 | test_loss: 0.9872 | test_acc: 0.6799
Epoch: 5 | train_loss: 0.9813 | train_acc: 0.6542 | test_loss: 0.9693 | test_acc: 0.6913
[INFO] Saving model to: models/07_effnetb3_data_20_percent_aug_5_epochs_0.0001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 7
[INFO] Model: effnetb5
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.001
[INFO] Created new effnetb5 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb5/5_epochs_0.001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9675 | train_acc: 0.6167 | test_loss: 0.8103 | test_acc: 0.8447
Epoch: 2 | train_loss: 0.7756 | train_acc: 0.7917 | test_loss: 0.6391 | test_acc: 0.8968
Epoch: 3 | train_loss: 0.6511 | train_acc: 0.8000 | test_loss: 0.5296 | test_acc: 0.9167
Epoch: 4 | train_loss: 0.6160 | train_acc: 0.8063 | test_loss: 0.4310 | test_acc: 0.9072
Epoch: 5 | train_loss: 0.5626 | train_acc: 0.8458 | test_loss: 0.4016 | test_acc: 0.9271
[INFO] Saving model to: models/07_effnetb5_data_20_percent_aug_5_epochs_0.001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 8
[INFO] Model: effnetb5
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 5
[INFO] Number of epochs: 0.0001
[INFO] Created new effnetb5 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb5/5_epochs_0.0001_learning_rate...


  0%|          | 0/5 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0712 | train_acc: 0.4208 | test_loss: 1.0433 | test_acc: 0.6288
Epoch: 2 | train_loss: 1.0546 | train_acc: 0.4792 | test_loss: 1.0045 | test_acc: 0.7206
Epoch: 3 | train_loss: 1.0235 | train_acc: 0.5146 | test_loss: 0.9740 | test_acc: 0.7216
Epoch: 4 | train_loss: 1.0103 | train_acc: 0.6125 | test_loss: 0.9300 | test_acc: 0.8021
Epoch: 5 | train_loss: 0.9831 | train_acc: 0.6479 | test_loss: 0.9111 | test_acc: 0.8125
[INFO] Saving model to: models/07_effnetb5_data_20_percent_aug_5_epochs_0.0001_learning_rate.pth
--------------------------------------------------

[INFO] Experiment number: 9
[INFO] Model: effnetb0
[INFO] DataLoader: data_20_percent_aug
[INFO] Number of epochs: 10
[INFO] Number of epochs: 0.001
[INFO] Created new effnetb0 model.
[INFO] Created SummaryWriter, saving to: runs/2024-04-30_lr/data_20_percent_aug/effnetb0/10_epochs_0.001_learning_rate...


  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9665 | train_acc: 0.5708 | test_loss: 0.6638 | test_acc: 0.8759
Epoch: 2 | train_loss: 0.7166 | train_acc: 0.7979 | test_loss: 0.5489 | test_acc: 0.9072
Epoch: 3 | train_loss: 0.6131 | train_acc: 0.7979 | test_loss: 0.4284 | test_acc: 0.9479
Epoch: 4 | train_loss: 0.5944 | train_acc: 0.7792 | test_loss: 0.4607 | test_acc: 0.8873
Epoch: 5 | train_loss: 0.4907 | train_acc: 0.8542 | test_loss: 0.3948 | test_acc: 0.9290
Epoch: 6 | train_loss: 0.5208 | train_acc: 0.8396 | test_loss: 0.3487 | test_acc: 0.9489
Epoch: 7 | train_loss: 0.4420 | train_acc: 0.8562 | test_loss: 0.2987 | test_acc: 0.9176
Epoch: 8 | train_loss: 0.3916 | train_acc: 0.8875 | test_loss: 0.2933 | test_acc: 0.9479
Epoch: 9 | train_loss: 0.3971 | train_acc: 0.8792 | test_loss: 0.3595 | test_acc: 0.8883
Epoch: 10 | train_loss: 0.4040 | train_acc: 0.8729 | test_loss: 0.2793 | test_acc: 0.9072
[INFO] Saving model to: models/07_effnetb0_data_20_percent_aug_10_epochs_0.001_learning_rate.pth
------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0827 | train_acc: 0.3771 | test_loss: 1.0462 | test_acc: 0.3608
Epoch: 2 | train_loss: 1.0396 | train_acc: 0.4938 | test_loss: 1.0308 | test_acc: 0.5256
Epoch: 3 | train_loss: 1.0201 | train_acc: 0.5500 | test_loss: 0.9804 | test_acc: 0.6278
Epoch: 4 | train_loss: 1.0184 | train_acc: 0.5312 | test_loss: 0.9751 | test_acc: 0.6411
Epoch: 5 | train_loss: 0.9804 | train_acc: 0.5625 | test_loss: 0.9246 | test_acc: 0.7131
Epoch: 6 | train_loss: 0.9556 | train_acc: 0.6458 | test_loss: 0.8909 | test_acc: 0.8248
Epoch: 7 | train_loss: 0.9048 | train_acc: 0.7146 | test_loss: 0.8471 | test_acc: 0.8447
Epoch: 8 | train_loss: 0.8874 | train_acc: 0.7333 | test_loss: 0.8462 | test_acc: 0.8248
Epoch: 9 | train_loss: 0.8641 | train_acc: 0.7646 | test_loss: 0.8465 | test_acc: 0.7945
Epoch: 10 | train_loss: 0.8526 | train_acc: 0.7688 | test_loss: 0.7899 | test_acc: 0.8655
[INFO] Saving model to: models/07_effnetb0_data_20_percent_aug_10_epochs_0.0001_learning_rate.pth
-----------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9968 | train_acc: 0.4958 | test_loss: 0.7718 | test_acc: 0.8655
Epoch: 2 | train_loss: 0.7824 | train_acc: 0.7250 | test_loss: 0.6388 | test_acc: 0.9167
Epoch: 3 | train_loss: 0.6271 | train_acc: 0.8458 | test_loss: 0.5410 | test_acc: 0.9072
Epoch: 4 | train_loss: 0.5617 | train_acc: 0.8500 | test_loss: 0.5007 | test_acc: 0.9176
Epoch: 5 | train_loss: 0.5188 | train_acc: 0.8583 | test_loss: 0.4340 | test_acc: 0.9271
Epoch: 6 | train_loss: 0.4827 | train_acc: 0.8646 | test_loss: 0.4190 | test_acc: 0.9280
Epoch: 7 | train_loss: 0.4158 | train_acc: 0.9062 | test_loss: 0.3653 | test_acc: 0.9176
Epoch: 8 | train_loss: 0.4195 | train_acc: 0.8500 | test_loss: 0.3931 | test_acc: 0.9081
Epoch: 9 | train_loss: 0.4191 | train_acc: 0.8792 | test_loss: 0.4055 | test_acc: 0.9384
Epoch: 10 | train_loss: 0.4349 | train_acc: 0.8438 | test_loss: 0.3346 | test_acc: 0.9384
[INFO] Saving model to: models/07_effnetb2_data_20_percent_aug_10_epochs_0.001_learning_rate.pth
------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.1069 | train_acc: 0.3333 | test_loss: 1.0673 | test_acc: 0.4328
Epoch: 2 | train_loss: 1.0873 | train_acc: 0.3917 | test_loss: 1.0546 | test_acc: 0.4848
Epoch: 3 | train_loss: 1.0521 | train_acc: 0.4604 | test_loss: 1.0301 | test_acc: 0.4962
Epoch: 4 | train_loss: 1.0191 | train_acc: 0.5354 | test_loss: 1.0047 | test_acc: 0.6506
Epoch: 5 | train_loss: 0.9945 | train_acc: 0.5854 | test_loss: 0.9720 | test_acc: 0.6913
Epoch: 6 | train_loss: 0.9730 | train_acc: 0.5687 | test_loss: 0.9473 | test_acc: 0.7936
Epoch: 7 | train_loss: 0.9314 | train_acc: 0.7042 | test_loss: 0.9234 | test_acc: 0.7945
Epoch: 8 | train_loss: 0.9252 | train_acc: 0.6312 | test_loss: 0.9227 | test_acc: 0.7737
Epoch: 9 | train_loss: 0.9149 | train_acc: 0.6521 | test_loss: 0.8925 | test_acc: 0.8049
Epoch: 10 | train_loss: 0.8892 | train_acc: 0.7021 | test_loss: 0.8419 | test_acc: 0.8958
[INFO] Saving model to: models/07_effnetb2_data_20_percent_aug_10_epochs_0.0001_learning_rate.pth
-----------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9770 | train_acc: 0.6250 | test_loss: 0.8357 | test_acc: 0.8352
Epoch: 2 | train_loss: 0.7626 | train_acc: 0.7854 | test_loss: 0.6803 | test_acc: 0.8248
Epoch: 3 | train_loss: 0.6138 | train_acc: 0.8333 | test_loss: 0.5690 | test_acc: 0.8561
Epoch: 4 | train_loss: 0.5070 | train_acc: 0.8812 | test_loss: 0.5007 | test_acc: 0.8447
Epoch: 5 | train_loss: 0.4791 | train_acc: 0.8833 | test_loss: 0.4643 | test_acc: 0.8248
Epoch: 6 | train_loss: 0.5046 | train_acc: 0.8750 | test_loss: 0.4232 | test_acc: 0.8665
Epoch: 7 | train_loss: 0.4166 | train_acc: 0.8771 | test_loss: 0.4212 | test_acc: 0.8551
Epoch: 8 | train_loss: 0.4411 | train_acc: 0.8542 | test_loss: 0.4289 | test_acc: 0.8864
Epoch: 9 | train_loss: 0.4411 | train_acc: 0.8562 | test_loss: 0.3958 | test_acc: 0.8759
Epoch: 10 | train_loss: 0.4003 | train_acc: 0.8542 | test_loss: 0.3921 | test_acc: 0.8456
[INFO] Saving model to: models/07_effnetb3_data_20_percent_aug_10_epochs_0.001_learning_rate.pth
------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.1033 | train_acc: 0.3604 | test_loss: 1.1002 | test_acc: 0.2983
Epoch: 2 | train_loss: 1.0686 | train_acc: 0.4792 | test_loss: 1.0671 | test_acc: 0.4650
Epoch: 3 | train_loss: 1.0373 | train_acc: 0.4667 | test_loss: 1.0307 | test_acc: 0.5682
Epoch: 4 | train_loss: 1.0104 | train_acc: 0.5917 | test_loss: 0.9872 | test_acc: 0.6799
Epoch: 5 | train_loss: 0.9813 | train_acc: 0.6542 | test_loss: 0.9693 | test_acc: 0.6913
Epoch: 6 | train_loss: 0.9659 | train_acc: 0.6438 | test_loss: 0.9360 | test_acc: 0.7121
Epoch: 7 | train_loss: 0.9249 | train_acc: 0.7375 | test_loss: 0.9098 | test_acc: 0.7633
Epoch: 8 | train_loss: 0.9221 | train_acc: 0.6937 | test_loss: 0.9018 | test_acc: 0.7737
Epoch: 9 | train_loss: 0.8971 | train_acc: 0.7521 | test_loss: 0.8785 | test_acc: 0.8229
Epoch: 10 | train_loss: 0.8762 | train_acc: 0.7521 | test_loss: 0.8486 | test_acc: 0.7633
[INFO] Saving model to: models/07_effnetb3_data_20_percent_aug_10_epochs_0.0001_learning_rate.pth
-----------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 0.9675 | train_acc: 0.6167 | test_loss: 0.8103 | test_acc: 0.8447
Epoch: 2 | train_loss: 0.7756 | train_acc: 0.7917 | test_loss: 0.6391 | test_acc: 0.8968
Epoch: 3 | train_loss: 0.6511 | train_acc: 0.8000 | test_loss: 0.5296 | test_acc: 0.9167
Epoch: 4 | train_loss: 0.6160 | train_acc: 0.8063 | test_loss: 0.4310 | test_acc: 0.9072
Epoch: 5 | train_loss: 0.5626 | train_acc: 0.8458 | test_loss: 0.4016 | test_acc: 0.9271
Epoch: 6 | train_loss: 0.4790 | train_acc: 0.8625 | test_loss: 0.4578 | test_acc: 0.8674
Epoch: 7 | train_loss: 0.5202 | train_acc: 0.8354 | test_loss: 0.3568 | test_acc: 0.8968
Epoch: 8 | train_loss: 0.5361 | train_acc: 0.8104 | test_loss: 0.3598 | test_acc: 0.9062
Epoch: 9 | train_loss: 0.4960 | train_acc: 0.8521 | test_loss: 0.3431 | test_acc: 0.9062
Epoch: 10 | train_loss: 0.4608 | train_acc: 0.8542 | test_loss: 0.3572 | test_acc: 0.8864
[INFO] Saving model to: models/07_effnetb5_data_20_percent_aug_10_epochs_0.001_learning_rate.pth
------------

  0%|          | 0/10 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 1.0712 | train_acc: 0.4208 | test_loss: 1.0433 | test_acc: 0.6288
Epoch: 2 | train_loss: 1.0546 | train_acc: 0.4792 | test_loss: 1.0045 | test_acc: 0.7206
Epoch: 3 | train_loss: 1.0235 | train_acc: 0.5146 | test_loss: 0.9740 | test_acc: 0.7216
Epoch: 4 | train_loss: 1.0103 | train_acc: 0.6125 | test_loss: 0.9300 | test_acc: 0.8021
Epoch: 5 | train_loss: 0.9831 | train_acc: 0.6479 | test_loss: 0.9111 | test_acc: 0.8125
Epoch: 6 | train_loss: 0.9402 | train_acc: 0.6875 | test_loss: 0.9209 | test_acc: 0.7850
Epoch: 7 | train_loss: 0.9469 | train_acc: 0.6583 | test_loss: 0.8596 | test_acc: 0.8343
Epoch: 8 | train_loss: 0.9354 | train_acc: 0.6917 | test_loss: 0.8563 | test_acc: 0.8248
Epoch: 9 | train_loss: 0.9052 | train_acc: 0.7375 | test_loss: 0.8471 | test_acc: 0.8456
Epoch: 10 | train_loss: 0.8907 | train_acc: 0.7333 | test_loss: 0.8328 | test_acc: 0.8153
[INFO] Saving model to: models/07_effnetb5_data_20_percent_aug_10_epochs_0.0001_learning_rate.pth
-----------

## Exercise 3. Scale up the dataset to turn FoodVision Mini into FoodVision Big using the entire [Food101 dataset from `torchvision.models`](https://pytorch.org/vision/stable/generated/torchvision.datasets.Food101.html#torchvision.datasets.Food101)
    
* You could take the best performing model from your various experiments or even the EffNetB2 feature extractor we created in this notebook and see how it goes fitting for 5 epochs on all of Food101.
* If you try more than one model, it would be good to have the model's results tracked.
* If you load the Food101 dataset from `torchvision.models`, you'll have to create PyTorch DataLoaders to use it in training.
* **Note:** Due to the larger amount of data in Food101 compared to our pizza, steak, sushi dataset, this model will take longer to train.

### Preapare data

In [None]:
# TODO: your code

import torchvision

# Create a transform to normalize data distribution to be inline with ImageNet
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], # values per colour channel [red, green, blue]
                                 std=[0.229, 0.224, 0.225])

# Create a transform pipeline
simple_transform = transforms.Compose([
                                       transforms.Resize((224, 224)),
                                       transforms.ToTensor(), # get image values between 0 & 1
                                       normalize
])

train_data = torchvision.datasets.Food101(root='data_food', split='train', transform=simple_transform, download=True)
test_data = torchvision.datasets.Food101(root='data_food', split='test', transform=simple_transform, download=True)


Downloading https://data.vision.ee.ethz.ch/cvl/food-101.tar.gz to data_food/food-101.tar.gz


100%|██████████| 4996278331/4996278331 [05:04<00:00, 16399565.26it/s]


Extracting data_food/food-101.tar.gz to data_food


In [None]:
len(train_data), len(test_data)

(75750, 25250)

In [None]:
import os
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

BATCH_SIZE = 32
NUM_WORKERS = os.cpu_count()

train_dataloader_food = DataLoader(train_data, batch_size=BATCH_SIZE, shuffle=True, num_workers=NUM_WORKERS, pin_memory=True)
test_dataloader_food = DataLoader(test_data, batch_size=BATCH_SIZE, shuffle=False, num_workers=NUM_WORKERS, pin_memory=True)
NUM_WORKERS

2

In [None]:
device

'cuda'

### Prepare model
The most efficient NN was EffNetb2.

In [None]:
model_weight = torchvision.models.EfficientNet_B2_Weights.IMAGENET1K_V1
big_model = torchvision.models.efficientnet_b2(weights=model_weight).to(device)

In [None]:
for param in big_model.features.parameters():
  param.requires_grad = False


big_model.classifier = nn.Sequential(
    nn.Dropout(p=0.2),
    nn.Linear(in_features=1408, out_features=101)
).to(device)

### Train big model


In [None]:
%%time
epochs = 5
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params=big_model.parameters(), lr=0.001)

train(model=big_model,
      train_dataloader=train_dataloader_food,
      test_dataloader=test_dataloader_food,
      optimizer=optimizer,
      loss_fn=loss_fn,
      epochs=epochs,
      device=device,
      writer=create_writer(experiment_name='food_data',
                            model_name='big_model',
                            extra=f"{epochs}_epochs"))