### 07. PyTorch Experiment Tracking

What is `experiment tracking`?
- It is an integral part of machine learning.
- It helps you figure out what works and what doesn't.

Why `track experiments`?

- As the number of experiments grows, tracking keeps the results of each run organized (for example by date), so that runs can still be compared later.

Different ways to `track` machine learning `experiments`

| **Method** | **Setup** | **Pros** | **Cons** | **Cost** |
| ----- | ----- | ----- | ----- | ----- |
| `Python dictionaries`, `CSV files`, `print outs` | None | Easy to set up, runs in pure Python | Hard to keep track of large numbers of experiments | Free |
| [TensorBoard](https://www.tensorflow.org/tensorboard/get_started) | Minimal, install [`tensorboard`](https://pypi.org/project/tensorboard/) | Extensions built into PyTorch, widely recognized and used, easily scales. | User-experience not as nice as other options. | Free |
| [Weights & Biases Experiment Tracking](https://wandb.ai/site/experiment-tracking) | Minimal, install [`wandb`](https://docs.wandb.ai/quickstart), make an account | Incredible user experience, make experiments public, tracks almost anything. | Requires external resource outside of PyTorch. | Free for personal use | 
| [MLFlow](https://mlflow.org/) | Minimal, install `mlflow` and start tracking | Fully open-source MLOps lifecycle management, many integrations. | A little harder to set up a remote tracking server than other services. | Free |

<img src="https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/07-different-places-to-track-experiments.png" alt="various places to track machine learning experiments" width=900/>
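
The first row of the table (Python dictionaries and CSV files) can be sketched with the standard library alone. The helper name `append_run` and the column names are assumptions made for this illustration, and an in-memory buffer stands in for a real file. The two rows reuse the epoch-5 test accuracies printed later in this notebook.

```python
import csv
import io

# Column names are an assumption for this sketch, not a fixed schema
FIELDNAMES = ["experiment", "model", "epochs", "test_acc"]

def append_run(buffer, run: dict) -> None:
    """Append one experiment (a plain dict) as a CSV row, writing the header if the buffer is empty."""
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    if buffer.tell() == 0:
        writer.writeheader()
    writer.writerow(run)

# An in-memory buffer stands in for a file such as open("results.csv", "a", newline="")
buffer = io.StringIO()
append_run(buffer, {"experiment": "data_10_percent", "model": "effnetb0", "epochs": 5, "test_acc": 0.8968})
append_run(buffer, {"experiment": "data_10_percent", "model": "effnetb2", "epochs": 5, "test_acc": 0.8873})

# Read the rows back and pick the run with the highest test accuracy
buffer.seek(0)
rows = list(csv.DictReader(buffer))
best = max(rows, key=lambda row: float(row["test_acc"]))
print(best["model"], best["test_acc"])
```

This works for a handful of runs, but comparing curves across many runs quickly becomes tedious, which is the drawback the table lists for this method.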

In [1]:
try:
    import torch
    import torchvision

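    # torch version should be 1.12+ (compare major and minor so that 2.x also passes)
    torch_version = tuple(int(part) for part in torch.__version__.split(".")[:2])
    assert torch_version >= (1, 12), "torch version should be 1.12+"

    # torchvision version should be 0.13+
    torchvision_version = tuple(int(part) for part in torchvision.__version__.split(".")[:2])
    assert torchvision_version >= (0, 13), "torchvision version should be 0.13+"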

    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")

except Exception:
    print("torch/torchvision versions not as required, installing newer versions...")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")


torch version: 1.12.1+cu102
torchvision version: 0.13.1+cu102




In [2]:
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn 
from torchvision import transforms

# try to get torchinfo
try: 
    from torchinfo import summary
except ImportError:
    print("Couldn't find torchinfo. Installing it...")
    !pip install -q torchinfo
    from torchinfo import summary

# import custom modules
try:
    from going_modular.going_modular import data_setup, engine
except ImportError:
    print("Couldn't find going_modular scripts..")

In [3]:
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {device}")

Device: cuda


In [4]:
# set seeds
def set_seeds(seed: int = 42):
    # set the seed for general torch operations
    torch.manual_seed(seed=seed)

    # set the seed for CUDA torch operations (ones that happen on GPUs)
    torch.cuda.manual_seed(seed=seed)

#### 1. Get data

In [5]:
import os 
from pathlib import Path

# data directory
DATA_DIR = Path('../data/')
if not DATA_DIR.is_dir():
    DATA_DIR.mkdir(parents=True, exist_ok=True)

# dataset directory
DATASET_FOLDER_NAME = "pizza_steak_sushi"
DATASET_FOLDER_PATH = DATA_DIR.joinpath(DATASET_FOLDER_NAME)
if not DATASET_FOLDER_PATH.is_dir():
    print(f"The {DATASET_FOLDER_NAME} folder doesn't exist")

# train directory
train_dir = DATASET_FOLDER_PATH.joinpath("train")

# test directory
test_dir = DATASET_FOLDER_PATH.joinpath("test")

In [6]:
# walk through the training directory
for filepaths, dirnames, filenames in os.walk(train_dir):
    if len(filenames) > 0:
        print(f"There are  {len(filenames)} images in {filepaths}")

There are  78 images in ../data/pizza_steak_sushi/train/pizza
There are  75 images in ../data/pizza_steak_sushi/train/steak
There are  72 images in ../data/pizza_steak_sushi/train/sushi


In [7]:
# walk through the testing directory
for filepaths, dirnames, filenames in os.walk(test_dir):
    if len(filenames) > 0:
        print(f"There are  {len(filenames)} images in {filepaths}")

There are  25 images in ../data/pizza_steak_sushi/test/pizza
There are  19 images in ../data/pizza_steak_sushi/test/steak
There are  31 images in ../data/pizza_steak_sushi/test/sushi


#### 2. Create Datasets and DataLoaders

##### 2.1 Create DataLoaders using manually created transforms

In [8]:
# setup ImageNet normalization 
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# create transforms manually
manual_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    normalize
])

print(f"Manually created transforms: {manual_transforms}")

# create data loaders
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir, 
    test_dir=test_dir, 
    transform=manual_transforms, # use manually created transforms
    batch_size=32
)

train_dataloader, test_dataloader, class_names

Manually created transforms: Compose(
    Resize(size=(224, 224), interpolation=bilinear, max_size=None, antialias=None)
    ToTensor()
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
)


(<torch.utils.data.dataloader.DataLoader at 0x7f2d8c3f48e0>,
 <torch.utils.data.dataloader.DataLoader at 0x7f2cf69130d0>,
 ['pizza', 'steak', 'sushi'])
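
As a small illustration of what `transforms.Normalize` does to each value, the sketch below applies the same `(value - mean) / std` formula to single RGB pixels in plain Python. The helper name `normalize_pixel` is hypothetical and only exists for this example; the real transform works on whole image tensors.

```python
# ImageNet channel statistics, as used in the transforms above
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel whose values are already in [0, 1]."""
    return [(value - mean) / std for value, mean, std in zip(rgb, MEAN, STD)]

# A pixel equal to the channel means maps to zero in every channel
print(normalize_pixel([0.485, 0.456, 0.406]))

# A pure white pixel (1.0 after ToTensor) ends up above zero in every channel
print(normalize_pixel([1.0, 1.0, 1.0]))
```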

##### 2.2 Create DataLoaders using automatically created transforms

#### 3. Getting a pretrained model, freezing the base layers and changing the classifier head


In [9]:
# the pretrained weights for EfficientNet_B0
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT 

# set up the model with the pretrained weights and send it to the target device
model = torchvision.models.efficientnet_b0(weights=weights).to(device)

# view the output of the model
model

EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActivat

In [10]:
# freeze all base layers by setting requires_grad attribute to False
for param in model.features.parameters():
    param.requires_grad = False

# set the seeds, since the new classifier layer (torch.nn.Linear) is created with random weights
set_seeds()

# update the classifier head to suit our problem
model.classifier = torch.nn.Sequential(
    nn.Dropout(p=0.2, inplace=True),
    nn.Linear(in_features=1280, out_features=len(class_names), bias=True)
)

# send to device
model = model.to(device)

#### 4. Train model and track results


In [11]:
# define loss and optimizer
loss_fn = nn.CrossEntropyLoss()

# optimizer
optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)

In [12]:
# track results with `SummaryWriter`
from torch.utils.tensorboard import SummaryWriter
from datetime import datetime

# create a writer that saves to a dated log directory
writer = SummaryWriter(log_dir=f"../tensorboards/tadac/{datetime.now().strftime('%Y-%m-%d')}/example_01")



In [13]:
from typing import Dict, List
from tqdm.auto import tqdm

from going_modular.going_modular.engine import train_step, test_step

# add writer parameter to train()
def train(model: torch.nn.Module, 
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int, 
          device: torch.device, 
          writer: torch.utils.tensorboard.writer.SummaryWriter) -> Dict[str, List]:
    results = {
        "train_loss": [],
        "train_acc": [],
        "test_loss": [],
        "test_acc": [],
    }

    # loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                           dataloader=train_dataloader,
                                           loss_fn=loss_fn,
                                           optimizer=optimizer, 
                                           device=device)
        
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)
        
        # print out what's happening
        print(
            f"Epoch: {epoch + 1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}")
        
        # update results dictionary 
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

        # add loss results to SummaryWriter
        writer.add_scalars(main_tag="Loss",
                           tag_scalar_dict={"train_loss": train_loss, "test_loss": test_loss},
                           global_step=epoch)
        
        # add accuracy results to SummaryWriter
        writer.add_scalars(main_tag="Accuracy",
                           tag_scalar_dict={"train_acc": train_acc, "test_acc": test_acc},
                           global_step=epoch)
        
    # close the writer
    writer.close()
        
    # return the filled results at the end of the epochs
    return results

In [14]:
# train model
set_seeds()
results = train(model=model,
                train_dataloader=train_dataloader,
                test_dataloader=test_dataloader,
                optimizer=optimizer,
                loss_fn=loss_fn,
                epochs=5,
                device=device,
                writer=writer)

 20%|██        | 1/5 [00:01<00:06,  1.54s/it]

Epoch: 1 | train_loss: 1.0895 | train_acc: 0.4414 | test_loss: 0.9202 | test_acc: 0.5085


 40%|████      | 2/5 [00:02<00:03,  1.33s/it]

Epoch: 2 | train_loss: 0.8682 | train_acc: 0.7734 | test_loss: 0.8022 | test_acc: 0.7434


 60%|██████    | 3/5 [00:04<00:02,  1.34s/it]

Epoch: 3 | train_loss: 0.7771 | train_acc: 0.7812 | test_loss: 0.7399 | test_acc: 0.7737


 80%|████████  | 4/5 [00:05<00:01,  1.48s/it]

Epoch: 4 | train_loss: 0.7249 | train_acc: 0.7422 | test_loss: 0.6472 | test_acc: 0.8864


100%|██████████| 5/5 [00:06<00:00,  1.39s/it]

Epoch: 5 | train_loss: 0.6445 | train_acc: 0.7812 | test_loss: 0.6244 | test_acc: 0.8968





#### 5. View our model's results in TensorBoard

In [15]:
# check out the model results
results

{'train_loss': [1.0894909799098969,
  0.8682032749056816,
  0.7770947962999344,
  0.7248531728982925,
  0.6445005983114243],
 'train_acc': [0.44140625, 0.7734375, 0.78125, 0.7421875, 0.78125],
 'test_loss': [0.9202313423156738,
  0.8021665811538696,
  0.7398908138275146,
  0.6471782922744751,
  0.6243552962938944],
 'test_acc': [0.5085227272727273,
  0.743371212121212,
  0.7736742424242425,
  0.8863636363636364,
  0.8967803030303031]}
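
The returned dictionary can also be inspected directly, without TensorBoard. The sketch below is plain Python, with the test accuracies copied from the output above. It finds the best epoch and the change between consecutive epochs.

```python
# Test accuracies copied from the results dictionary above
test_acc = [0.5085227272727273,
            0.743371212121212,
            0.7736742424242425,
            0.8863636363636364,
            0.8967803030303031]

# Epochs are reported 1-based, so add 1 to the list index
best_index = max(range(len(test_acc)), key=lambda i: test_acc[i])
best_epoch = best_index + 1
print(f"Best test accuracy {test_acc[best_index]:.4f} at epoch {best_epoch}")

# Per-epoch change shows where the improvements come from
deltas = [later - earlier for earlier, later in zip(test_acc, test_acc[1:])]
print([round(delta, 4) for delta in deltas])
```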

#### 6. Create a helper function to build `SummaryWriter()` instances

In [16]:
def create_writer(experiment_name: str,
                  model_name: str,
                  extra: str = None) -> torch.utils.tensorboard.writer.SummaryWriter:
    """Create a SummaryWriter saving to ../tensorboards/tadac/<date>/<experiment_name>/<model_name>/<extra>."""
    from datetime import datetime
    import os

    # use the current date (YYYY-MM-DD) so runs from the same day are grouped together
    timestamp = datetime.now().strftime("%Y-%m-%d")

    if extra:
        log_dir = os.path.join("../tensorboards/tadac/", timestamp, experiment_name, model_name, extra)
    else:
        log_dir = os.path.join("../tensorboards/tadac/", timestamp, experiment_name, model_name)

    print(f"Created SummaryWriter, saving to: {log_dir}")

    return SummaryWriter(log_dir=log_dir)

In [17]:
# create an example writer
example_writer = create_writer(experiment_name="data_10_percent",
                               model_name="efficientnetb0",
                               extra="5_epochs")

Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_10_percent/efficientnetb0/5_epochs
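
TensorBoard treats each subdirectory that contains event files as a separate run, so the directory layout decides how experiments are grouped. The path logic can be sketched on its own, without creating a writer. `build_log_dir` and its `day` parameter are hypothetical names for this illustration; `day` is passed explicitly only so the result is reproducible.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Optional

def build_log_dir(root: str, experiment_name: str, model_name: str,
                  extra: Optional[str] = None, day: Optional[date] = None) -> PurePosixPath:
    """Hypothetical helper: compose root/YYYY-MM-DD/experiment_name/model_name[/extra]."""
    day = day or date.today()
    parts = [day.strftime("%Y-%m-%d"), experiment_name, model_name]
    if extra:
        parts.append(extra)
    return PurePosixPath(root, *parts)

log_dir = build_log_dir("../tensorboards/tadac", "data_10_percent", "efficientnetb0",
                        extra="5_epochs", day=date(2023, 9, 19))
print(log_dir)
```

With the date fixed to 2023-09-19, this prints the same path as the example writer above.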


##### 6.1 Update the `train()` function to include a `writer` parameter

In [18]:
from typing import Dict, List
from tqdm.auto import tqdm

from going_modular.going_modular.engine import train_step, test_step

# add writer parameter to train()
def train(model: torch.nn.Module, 
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int, 
          device: torch.device, 
          writer: torch.utils.tensorboard.writer.SummaryWriter) -> Dict[str, List]:
    results = {
        "train_loss": [],
        "train_acc": [],
        "test_loss": [],
        "test_acc": [],
    }

    # loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                           dataloader=train_dataloader,
                                           loss_fn=loss_fn,
                                           optimizer=optimizer, 
                                           device=device)
        
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)
        
        # print out what's happening
        print(
            f"Epoch: {epoch + 1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}")
        
        # update results dictionary 
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

        # log to TensorBoard only if a writer was provided
        if writer:

            # add loss results to SummaryWriter
            writer.add_scalars(main_tag="Loss",
                               tag_scalar_dict={"train_loss": train_loss, "test_loss": test_loss},
                               global_step=epoch)

            # add accuracy results to SummaryWriter
            writer.add_scalars(main_tag="Accuracy",
                               tag_scalar_dict={"train_acc": train_acc, "test_acc": test_acc},
                               global_step=epoch)

    # close the writer once, after the final epoch
    if writer:
        writer.close()

    # return the filled results at the end of the epochs
    return results

#### 7. Setting up a series of modelling experiments

##### 7.1 What kind of experiments should you run?

##### 7.2 What experiments are we going to run?

| Experiment number | Training Dataset | Model (pretrained on ImageNet) | Number of epochs |
| ----- | ----- | ----- | ----- |
| 1 | Pizza, Steak, Sushi 10% | EfficientNetB0 | 5 |
| 2 | Pizza, Steak, Sushi 10% | EfficientNetB2 | 5 |
| 3 | Pizza, Steak, Sushi 10% | EfficientNetB0 | 10 |
| 4 | Pizza, Steak, Sushi 10% | EfficientNetB2 | 10 |
| 5 | Pizza, Steak, Sushi 20% | EfficientNetB0 | 5 |
| 6 | Pizza, Steak, Sushi 20% | EfficientNetB2 | 5 |
| 7 | Pizza, Steak, Sushi 20% | EfficientNetB0 | 10 |
| 8 | Pizza, Steak, Sushi 20% | EfficientNetB2 | 10 |
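
The eight rows are every combination of two datasets, two epoch counts and two models. A minimal sketch with `itertools.product` reproduces the grid in the same order as the table; the dictionary keys are chosen only for this illustration.

```python
from itertools import product

datasets = ["Pizza, Steak, Sushi 10%", "Pizza, Steak, Sushi 20%"]
epoch_options = [5, 10]
model_names = ["EfficientNetB0", "EfficientNetB2"]

# The last iterable varies fastest, matching the row order of the table
experiments = [
    {"number": number, "dataset": dataset, "model": model, "epochs": epochs}
    for number, (dataset, epochs, model) in enumerate(
        product(datasets, epoch_options, model_names), start=1)
]

for experiment in experiments:
    print(experiment)
```

Adding a third model or epoch count only means adding one element to a list, which is why a grid like this grows quickly and benefits from tracking.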

##### 7.3 Download different datasets

In [19]:
from helper_functions import download_data

# download 10 percent and 20 percent training data (if necessary)
data_10_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                                     destination="pizza_steak_sushi",
                                     data_path=Path("../data"))

data_20_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip",
                                     destination="pizza_steak_sushi_20_percent",
                                     data_path=Path("../data"))

[INFO] ../data/pizza_steak_sushi directory exists, skipping download.
[INFO] ../data/pizza_steak_sushi_20_percent directory exists, skipping download.


In [20]:
# setup training directory paths
train_dir_10_percent = data_10_percent_path / "train"
train_dir_20_percent = data_20_percent_path / "train"

# setup testing directory paths
test_dir = data_10_percent_path / "test" # the test data is the same in both datasets

# check the directories
print(f"Training directory 10%: {train_dir_10_percent}")
print(f"Training directory 20%: {train_dir_20_percent}")
print(f"Testing directory: {test_dir}")

Training directory 10%: ../data/pizza_steak_sushi/train
Training directory 20%: ../data/pizza_steak_sushi_20_percent/train
Testing directory: ../data/pizza_steak_sushi/test


##### 7.4 Transform Datasets and create DataLoaders

In [21]:
from torchvision import transforms

# create a transform to normalize the data distribution to be in line with ImageNet
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

# compose transforms into a pipeline
simple_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(), # turn images into tensors with values between 0 and 1
    normalize
])

In [22]:
# batch size
batch_size = 32

# create 10% training and testing dataloaders
train_dataloader_10_percent, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir_10_percent,
    test_dir=test_dir,
    transform=simple_transform,
    batch_size=batch_size
)

# create 20% training and testing dataloaders
train_dataloader_20_percent, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir_20_percent,
    test_dir=test_dir,
    transform=simple_transform,
    batch_size=batch_size
)

# print out data
print(f"len(train_dataloader_10_percent): {len(train_dataloader_10_percent)}")
print(f"len(train_dataloader_20_percent): {len(train_dataloader_20_percent)}")
print(f"len(test_dataloader): {len(test_dataloader)}")
print(f"class names: {class_names}")

len(train_dataloader_10_percent): 8
len(train_dataloader_20_percent): 15
len(test_dataloader): 3
class names: ['pizza', 'steak', 'sushi']
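
The DataLoader lengths follow from the image counts and the batch size. The sketch below assumes the last, smaller batch is kept (the `DataLoader` default of `drop_last=False`) and that `create_dataloaders` loads every image in the folder; `num_batches` is a hypothetical helper for this illustration.

```python
import math

def num_batches(num_samples: int, batch_size: int) -> int:
    """Number of batches when the last, smaller batch is kept (assumes drop_last=False)."""
    return math.ceil(num_samples / batch_size)

# Image counts taken from the directory walk earlier in the notebook
train_images_10_percent = 78 + 75 + 72  # pizza + steak + sushi = 225
test_images = 25 + 19 + 31              # pizza + steak + sushi = 75

print(num_batches(train_images_10_percent, 32))  # 8, matching len(train_dataloader_10_percent)
print(num_batches(test_images, 32))              # 3, matching len(test_dataloader)
```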


##### 7.5 Create feature extractor models

In [23]:
import torchvision
from torchinfo import summary

# create an instance of EffNetB2
effnetb2_weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT

effnetb2 = torchvision.models.efficientnet_b2(weights=effnetb2_weights)

effnetb2

EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActivat

In [24]:
# get the number of in_features of the EfficientNetB2 classifier
print(f"Number of in_features to final layer of EfficientNetB2: {len(effnetb2.classifier.state_dict()['1.weight'][0])}")

Number of in_features to final layer of EfficientNetB2: 1408
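
The `in_features` value determines how many parameters the replacement classifier head has, and with the base layers frozen those are the only trainable parameters. A linear layer with bias holds `in_features * out_features` weights plus one bias per output. `linear_param_count` is a hypothetical helper for this arithmetic.

```python
def linear_param_count(in_features: int, out_features: int, bias: bool = True) -> int:
    """Weights (in_features * out_features) plus one bias per output when bias=True."""
    return in_features * out_features + (out_features if bias else 0)

num_classes = 3  # pizza, steak, sushi

print(linear_param_count(1280, num_classes))  # EfficientNetB0 head
print(linear_param_count(1408, num_classes))  # EfficientNetB2 head
```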


In [25]:
import torchvision
from torch import nn 

# create an EffNetB0 feature extractor
def create_effnetb0(class_names: list):

    # get the base model with the pretrained weights and send to target device
    weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT

    model = torchvision.models.efficientnet_b0(weights=weights).to(device)

    # freeze the base model layers
    for param in model.features.parameters():
        param.requires_grad = False

    # set the seeds
    set_seeds()

    # change the classifier head
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.2),
        nn.Linear(in_features=1280, out_features=len(class_names))
    ).to(device)

    # give the model a name
    model.name = "effnetb0"

    print(f"Created a new {model.name} model")
    return model

# create an EffNetB2 feature extractor
def create_effnetb2(class_names: list):

    # get the base model with the pretrained weights and send to target device
    weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT

    model = torchvision.models.efficientnet_b2(weights=weights).to(device)

    # freeze the base model layers
    for param in model.features.parameters():
        param.requires_grad = False

    # set the seeds
    set_seeds()

    # change the classifier head
    model.classifier = nn.Sequential(
        nn.Dropout(p=0.3),
        nn.Linear(in_features=1408, out_features=len(class_names))
    ).to(device)

    # give the model a name
    model.name = "effnetb2"

    print(f"Created a new {model.name} model")
    return model

In [26]:
# create effnetb0
effnetb0 = create_effnetb0(class_names=class_names)

summary(model=effnetb0,
        input_size=(32, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"])

Created a new effnetb0 model


Layer (type:depth-idx)                                  Input Shape               Output Shape              Param #                   Trainable
EfficientNet                                            [32, 3, 224, 224]         [32, 3]                   --                        Partial
├─Sequential: 1-1                                       [32, 3, 224, 224]         [32, 1280, 7, 7]          --                        False
│    └─Conv2dNormActivation: 2-1                        [32, 3, 224, 224]         [32, 32, 112, 112]        --                        False
│    │    └─Conv2d: 3-1                                 [32, 3, 224, 224]         [32, 32, 112, 112]        (864)                     False
│    │    └─BatchNorm2d: 3-2                            [32, 32, 112, 112]        [32, 32, 112, 112]        (64)                      False
│    │    └─SiLU: 3-3                                   [32, 32, 112, 112]        [32, 32, 112, 112]        --                        --
│    └─Sequential

In [27]:
# create effnetb2
effnetb2 = create_effnetb2(class_names=class_names)

effnetb2
# summary(model=effnetb2,
#         input_size=(32, 3, 224, 224),
#         col_names=["input_size", "output_size", "num_params", "trainable"])

Created a new effnetb2 model


EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActivat

##### 7.6 Create experiments and set up training code

In [28]:
# create epoch list
num_epochs = [5, 10]

# create models list 
models = ["effnetb0", "effnetb2"]

# create dataloaders dictionary for various dataloaders
train_dataloaders = {
    "data_10_percent": train_dataloader_10_percent,
    "data_20_percent": train_dataloader_20_percent,
}

In [38]:
# !pip install torch_tb_profiler (PyTorch 1.8.1+)
# https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool/


In [46]:
# select the model
print(f"Creating a new `effnetb2` model...")
model = create_effnetb2(class_names=class_names)

# create a new loss and optimizer 
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)

# loop through each number of epochs
# note: the same model and optimizer are reused, so the 10-epoch run continues training from the 5-epoch run
for epochs in num_epochs:

    # note: the schedule only advances when profiler.step() is called,
    # and train() never calls it, so the schedule stays at its first step
    with torch.profiler.profile(
        schedule=torch.profiler.schedule(
            wait=2,
            warmup=2,
            active=6,
            repeat=1),
        on_trace_ready=torch.profiler.tensorboard_trace_handler("../logs/tadac/effnetb2_profiler"),
        with_stack=True
    ) as profiler:

        # train
        train(model=model,
              train_dataloader=train_dataloader_10_percent,
              test_dataloader=test_dataloader,
              optimizer=optimizer,
              loss_fn=loss_fn,
              epochs=epochs,
              device=device,
              writer=create_writer(experiment_name="train_dataloader_10_percent",
                                   model_name="effnetb2_profiler",
                                   extra=f"{epochs}_epochs"))

Creating a new `effnetb2` model...
Created a new effnetb2 model
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/train_dataloader_10_percent/effnetb2_profiler/5_epochs


 20%|██        | 1/5 [00:02<00:08,  2.24s/it]

Epoch: 1 | train_loss: 1.0928 | train_acc: 0.3711 | test_loss: 0.9557 | test_acc: 0.6610


 40%|████      | 2/5 [00:03<00:05,  1.87s/it]

Epoch: 2 | train_loss: 0.9248 | train_acc: 0.6445 | test_loss: 0.8711 | test_acc: 0.8144


 60%|██████    | 3/5 [00:05<00:03,  1.72s/it]

Epoch: 3 | train_loss: 0.8086 | train_acc: 0.7656 | test_loss: 0.7511 | test_acc: 0.9176


 80%|████████  | 4/5 [00:06<00:01,  1.61s/it]

Epoch: 4 | train_loss: 0.7191 | train_acc: 0.8867 | test_loss: 0.7150 | test_acc: 0.9081


100%|██████████| 5/5 [00:08<00:00,  1.66s/it]


Epoch: 5 | train_loss: 0.6850 | train_acc: 0.7695 | test_loss: 0.7076 | test_acc: 0.8873
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/train_dataloader_10_percent/effnetb2_profiler/10_epochs


 10%|█         | 1/10 [00:01<00:13,  1.49s/it]

Epoch: 1 | train_loss: 0.6111 | train_acc: 0.7812 | test_loss: 0.6325 | test_acc: 0.9280


 20%|██        | 2/10 [00:03<00:12,  1.58s/it]

Epoch: 2 | train_loss: 0.6126 | train_acc: 0.8008 | test_loss: 0.6404 | test_acc: 0.8769


 30%|███       | 3/10 [00:04<00:11,  1.66s/it]

Epoch: 3 | train_loss: 0.5202 | train_acc: 0.9336 | test_loss: 0.6200 | test_acc: 0.8977


 40%|████      | 4/10 [00:06<00:09,  1.60s/it]

Epoch: 4 | train_loss: 0.5426 | train_acc: 0.8008 | test_loss: 0.6227 | test_acc: 0.8466


 50%|█████     | 5/10 [00:07<00:07,  1.57s/it]

Epoch: 5 | train_loss: 0.4909 | train_acc: 0.8125 | test_loss: 0.5871 | test_acc: 0.8873


 60%|██████    | 6/10 [00:09<00:06,  1.52s/it]

Epoch: 6 | train_loss: 0.5430 | train_acc: 0.8125 | test_loss: 0.5473 | test_acc: 0.8873


 70%|███████   | 7/10 [00:10<00:04,  1.49s/it]

Epoch: 7 | train_loss: 0.4405 | train_acc: 0.8164 | test_loss: 0.4957 | test_acc: 0.9176


 80%|████████  | 8/10 [00:12<00:02,  1.46s/it]

Epoch: 8 | train_loss: 0.4350 | train_acc: 0.8086 | test_loss: 0.5119 | test_acc: 0.8665


 90%|█████████ | 9/10 [00:13<00:01,  1.45s/it]

Epoch: 9 | train_loss: 0.4251 | train_acc: 0.8125 | test_loss: 0.4654 | test_acc: 0.9176


100%|██████████| 10/10 [00:15<00:00,  1.51s/it]

Epoch: 10 | train_loss: 0.5012 | train_acc: 0.7969 | test_loss: 0.5081 | test_acc: 0.8977
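
To make the `wait=2, warmup=2, active=6, repeat=1` schedule used in the profiler cell above easier to read, here is a simplified plain-Python model of its phases. It is a sketch, not the library's implementation: the real `torch.profiler.schedule` returns `ProfilerAction` values and treats the last active step specially, and it moves forward one step per `profiler.step()` call. The function name and the phase labels are assumptions for this illustration.

```python
def schedule_phase(step: int, wait: int, warmup: int, active: int, repeat: int) -> str:
    """Simplified model of the schedule: returns "wait", "warmup", "active" or "done"."""
    cycle_length = wait + warmup + active
    # repeat=0 is treated as "cycle forever" in this sketch
    if repeat > 0 and step >= cycle_length * repeat:
        return "done"
    position = step % cycle_length
    if position < wait:
        return "wait"
    if position < wait + warmup:
        return "warmup"
    return "active"

phases = [schedule_phase(step, wait=2, warmup=2, active=6, repeat=1) for step in range(12)]
print(phases)
```

Under these settings only steps 4 through 9 are recorded, so a training loop needs to call `profiler.step()` at least ten times before a complete trace is available.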





In [None]:
!tensorboard --logdir="../tensorboards/tadac/2023-09-19/train_dataloader_10_percent/effnetb2_profiler"

In [29]:
%%time
from going_modular.going_modular.utils import save_model
from datetime import datetime

# set seed
set_seeds(seed=42)

# keep track of experiment numbers
experiment_number = 0


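# loop through each model name and create a new model based on the name
# note: the model is created once per architecture, so its weights carry over between the
# dataloader/epoch experiments below; create it inside the innermost loop for independent runs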
for model_name in models:

    # select the model
    if model_name == "effnetb0":
        print(f"Creating a new `effnetb0` model...")
        model = create_effnetb0(class_names=class_names)
    else:
        print(f"Creating a new `effnetb2` model...")
        model = create_effnetb2(class_names=class_names)

    # loop through each DataLoader
    for dataloader_name, train_dataloader in train_dataloaders.items():

        # loop through each number of epochs
        for epochs in num_epochs:

            # print experiment information
            experiment_number += 1
            print(f"Experiment number: {experiment_number}")
            print(f"Model: {model_name}")
            print(f"Dataloader: {dataloader_name}")
            print(f"NUmber of epochs: {epochs}")

            # create a new loss and optimizer 
            loss_fn = nn.CrossEntropyLoss()
            optimizer = torch.optim.Adam(params=model.parameters(), lr=0.001)


            # train 
            train(model=model, 
                    train_dataloader= train_dataloader, 
                    test_dataloader= test_dataloader, 
                    optimizer= optimizer, 
                    loss_fn= loss_fn, 
                    epochs= epochs, 
                    device= device, 
                    writer= create_writer(experiment_name=dataloader_name, 
                                            model_name= model_name, 
                                            extra=f"{epochs}_epochs"))

            # save model
            save_filepath = f"07_{model_name}_{dataloader_name}_{epochs}_epochs.pth"
            save_model(model=model, 
                        target_dir=f"../models/tadac/{datetime.now().strftime('%Y-%m-%d')}/07_tadac_following", 
                        model_name=save_filepath)

                        
    

Creating a new `effnetb0` model...
Created a new effnetb0 model
Experiment number: 1
Model: effnetb0
Dataloader: data_10_percent
NUmber of epochs: 5
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_10_percent/effnetb0/5_epochs


 20%|██        | 1/5 [00:01<00:05,  1.31s/it]

Epoch: 1 | train_loss: 1.0528 | train_acc: 0.4961 | test_loss: 0.9220 | test_acc: 0.4678


 40%|████      | 2/5 [00:02<00:03,  1.28s/it]

Epoch: 2 | train_loss: 0.8747 | train_acc: 0.6992 | test_loss: 0.8139 | test_acc: 0.6203


 60%|██████    | 3/5 [00:03<00:02,  1.24s/it]

Epoch: 3 | train_loss: 0.8100 | train_acc: 0.6445 | test_loss: 0.7176 | test_acc: 0.8258


 80%|████████  | 4/5 [00:05<00:01,  1.24s/it]

Epoch: 4 | train_loss: 0.7098 | train_acc: 0.7578 | test_loss: 0.5898 | test_acc: 0.8864


100%|██████████| 5/5 [00:06<00:00,  1.27s/it]


Epoch: 5 | train_loss: 0.5980 | train_acc: 0.9141 | test_loss: 0.5677 | test_acc: 0.8864
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb0_data_10_percent_5_epochs.pth
Experiment number: 2
Model: effnetb0
Dataloader: data_10_percent
NUmber of epochs: 10
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_10_percent/effnetb0/10_epochs


 10%|█         | 1/10 [00:01<00:11,  1.24s/it]

Epoch: 1 | train_loss: 0.6044 | train_acc: 0.7969 | test_loss: 0.5470 | test_acc: 0.9280


 20%|██        | 2/10 [00:02<00:10,  1.29s/it]

Epoch: 2 | train_loss: 0.5104 | train_acc: 0.8164 | test_loss: 0.5085 | test_acc: 0.9072


 30%|███       | 3/10 [00:03<00:08,  1.26s/it]

Epoch: 3 | train_loss: 0.4462 | train_acc: 0.9531 | test_loss: 0.4842 | test_acc: 0.8968


 40%|████      | 4/10 [00:04<00:07,  1.23s/it]

Epoch: 4 | train_loss: 0.5266 | train_acc: 0.7891 | test_loss: 0.4805 | test_acc: 0.8873


 50%|█████     | 5/10 [00:06<00:06,  1.23s/it]

Epoch: 5 | train_loss: 0.4460 | train_acc: 0.8164 | test_loss: 0.4394 | test_acc: 0.9072


 60%|██████    | 6/10 [00:07<00:04,  1.23s/it]

Epoch: 6 | train_loss: 0.3794 | train_acc: 0.9648 | test_loss: 0.4362 | test_acc: 0.8968


 70%|███████   | 7/10 [00:08<00:03,  1.22s/it]

Epoch: 7 | train_loss: 0.3750 | train_acc: 0.9297 | test_loss: 0.4362 | test_acc: 0.8968


 80%|████████  | 8/10 [00:09<00:02,  1.21s/it]

Epoch: 8 | train_loss: 0.4673 | train_acc: 0.8320 | test_loss: 0.3978 | test_acc: 0.9072


 90%|█████████ | 9/10 [00:11<00:01,  1.23s/it]

Epoch: 9 | train_loss: 0.4680 | train_acc: 0.8281 | test_loss: 0.4072 | test_acc: 0.9280


100%|██████████| 10/10 [00:12<00:00,  1.23s/it]


Epoch: 10 | train_loss: 0.3736 | train_acc: 0.8281 | test_loss: 0.3618 | test_acc: 0.9072
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb0_data_10_percent_10_epochs.pth
Experiment number: 3
Model: effnetb0
Dataloader: data_20_percent
NUmber of epochs: 5
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_20_percent/effnetb0/5_epochs


 20%|██        | 1/5 [00:01<00:06,  1.58s/it]

Epoch: 1 | train_loss: 0.4170 | train_acc: 0.8521 | test_loss: 0.3197 | test_acc: 0.9384


 40%|████      | 2/5 [00:03<00:04,  1.52s/it]

Epoch: 2 | train_loss: 0.3225 | train_acc: 0.8854 | test_loss: 0.3112 | test_acc: 0.9489


 60%|██████    | 3/5 [00:04<00:03,  1.51s/it]

Epoch: 3 | train_loss: 0.3675 | train_acc: 0.8812 | test_loss: 0.2937 | test_acc: 0.9593


 80%|████████  | 4/5 [00:06<00:01,  1.52s/it]

Epoch: 4 | train_loss: 0.3388 | train_acc: 0.9042 | test_loss: 0.2557 | test_acc: 0.9176


100%|██████████| 5/5 [00:07<00:00,  1.53s/it]


Epoch: 5 | train_loss: 0.2850 | train_acc: 0.9354 | test_loss: 0.2690 | test_acc: 0.9593
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb0_data_20_percent_5_epochs.pth
Experiment number: 4
Model: effnetb0
Dataloader: data_20_percent
NUmber of epochs: 10
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_20_percent/effnetb0/10_epochs


 10%|█         | 1/10 [00:01<00:13,  1.52s/it]

Epoch: 1 | train_loss: 0.3317 | train_acc: 0.8750 | test_loss: 0.2568 | test_acc: 0.9384


 20%|██        | 2/10 [00:03<00:12,  1.53s/it]

Epoch: 2 | train_loss: 0.3402 | train_acc: 0.8604 | test_loss: 0.2894 | test_acc: 0.9489


 30%|███       | 3/10 [00:04<00:10,  1.54s/it]

Epoch: 3 | train_loss: 0.3049 | train_acc: 0.9021 | test_loss: 0.2446 | test_acc: 0.9384


 40%|████      | 4/10 [00:06<00:09,  1.53s/it]

Epoch: 4 | train_loss: 0.2699 | train_acc: 0.8979 | test_loss: 0.2238 | test_acc: 0.9384


 50%|█████     | 5/10 [00:07<00:07,  1.54s/it]

Epoch: 5 | train_loss: 0.2427 | train_acc: 0.9354 | test_loss: 0.2470 | test_acc: 0.9489


 60%|██████    | 6/10 [00:09<00:06,  1.54s/it]

Epoch: 6 | train_loss: 0.2724 | train_acc: 0.9208 | test_loss: 0.2166 | test_acc: 0.9489


 70%|███████   | 7/10 [00:10<00:04,  1.54s/it]

Epoch: 7 | train_loss: 0.3088 | train_acc: 0.9042 | test_loss: 0.2316 | test_acc: 0.9384


 80%|████████  | 8/10 [00:12<00:03,  1.54s/it]

Epoch: 8 | train_loss: 0.2154 | train_acc: 0.9375 | test_loss: 0.2079 | test_acc: 0.9489


 90%|█████████ | 9/10 [00:13<00:01,  1.56s/it]

Epoch: 9 | train_loss: 0.2336 | train_acc: 0.9479 | test_loss: 0.2310 | test_acc: 0.9593


100%|██████████| 10/10 [00:15<00:00,  1.55s/it]

Epoch: 10 | train_loss: 0.2860 | train_acc: 0.9313 | test_loss: 0.2123 | test_acc: 0.9384
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb0_data_20_percent_10_epochs.pth
Creating a new `effnetb2` model...





Created a new effnetb2 model
Experiment number: 5
Model: effnetb2
Dataloader: data_10_percent
NUmber of epochs: 5
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_10_percent/effnetb2/5_epochs


 20%|██        | 1/5 [00:01<00:05,  1.41s/it]

Epoch: 1 | train_loss: 1.0928 | train_acc: 0.3711 | test_loss: 0.9557 | test_acc: 0.6610


 40%|████      | 2/5 [00:02<00:04,  1.36s/it]

Epoch: 2 | train_loss: 0.9248 | train_acc: 0.6445 | test_loss: 0.8711 | test_acc: 0.8144


 60%|██████    | 3/5 [00:04<00:02,  1.34s/it]

Epoch: 3 | train_loss: 0.8086 | train_acc: 0.7656 | test_loss: 0.7511 | test_acc: 0.9176


 80%|████████  | 4/5 [00:05<00:01,  1.35s/it]

Epoch: 4 | train_loss: 0.7191 | train_acc: 0.8867 | test_loss: 0.7150 | test_acc: 0.9081


100%|██████████| 5/5 [00:06<00:00,  1.35s/it]


Epoch: 5 | train_loss: 0.6850 | train_acc: 0.7695 | test_loss: 0.7076 | test_acc: 0.8873
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb2_data_10_percent_5_epochs.pth
Experiment number: 6
Model: effnetb2
Dataloader: data_10_percent
NUmber of epochs: 10
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_10_percent/effnetb2/10_epochs


 10%|█         | 1/10 [00:01<00:11,  1.32s/it]

Epoch: 1 | train_loss: 0.5847 | train_acc: 0.9062 | test_loss: 0.6331 | test_acc: 0.8873


 20%|██        | 2/10 [00:02<00:10,  1.31s/it]

Epoch: 2 | train_loss: 0.5926 | train_acc: 0.7969 | test_loss: 0.6287 | test_acc: 0.8769


 30%|███       | 3/10 [00:04<00:09,  1.40s/it]

Epoch: 3 | train_loss: 0.5152 | train_acc: 0.8164 | test_loss: 0.5921 | test_acc: 0.8873


 40%|████      | 4/10 [00:05<00:08,  1.46s/it]

Epoch: 4 | train_loss: 0.5061 | train_acc: 0.7969 | test_loss: 0.5833 | test_acc: 0.8570


 50%|█████     | 5/10 [00:07<00:07,  1.44s/it]

Epoch: 5 | train_loss: 0.4593 | train_acc: 0.8086 | test_loss: 0.5523 | test_acc: 0.8873


 60%|██████    | 6/10 [00:08<00:05,  1.41s/it]

Epoch: 6 | train_loss: 0.5171 | train_acc: 0.8086 | test_loss: 0.5154 | test_acc: 0.8873


 70%|███████   | 7/10 [00:09<00:04,  1.39s/it]

Epoch: 7 | train_loss: 0.4124 | train_acc: 0.8203 | test_loss: 0.4684 | test_acc: 0.9176


 80%|████████  | 8/10 [00:11<00:02,  1.37s/it]

Epoch: 8 | train_loss: 0.4219 | train_acc: 0.8203 | test_loss: 0.4880 | test_acc: 0.8665


 90%|█████████ | 9/10 [00:12<00:01,  1.36s/it]

Epoch: 9 | train_loss: 0.3984 | train_acc: 0.8320 | test_loss: 0.4421 | test_acc: 0.9176


100%|██████████| 10/10 [00:13<00:00,  1.37s/it]


Epoch: 10 | train_loss: 0.4753 | train_acc: 0.8047 | test_loss: 0.4836 | test_acc: 0.8977
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb2_data_10_percent_10_epochs.pth
Experiment number: 7
Model: effnetb2
Dataloader: data_20_percent
NUmber of epochs: 5
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_20_percent/effnetb2/5_epochs


 20%|██        | 1/5 [00:01<00:07,  1.76s/it]

Epoch: 1 | train_loss: 0.3722 | train_acc: 0.9167 | test_loss: 0.3971 | test_acc: 0.9489


 40%|████      | 2/5 [00:03<00:05,  1.77s/it]

Epoch: 2 | train_loss: 0.3793 | train_acc: 0.8646 | test_loss: 0.4062 | test_acc: 0.8674


 60%|██████    | 3/5 [00:05<00:03,  1.84s/it]

Epoch: 3 | train_loss: 0.3032 | train_acc: 0.9167 | test_loss: 0.3700 | test_acc: 0.8873


 80%|████████  | 4/5 [00:07<00:01,  1.92s/it]

Epoch: 4 | train_loss: 0.3160 | train_acc: 0.8854 | test_loss: 0.3350 | test_acc: 0.9176


100%|██████████| 5/5 [00:09<00:00,  1.88s/it]


Epoch: 5 | train_loss: 0.3369 | train_acc: 0.8792 | test_loss: 0.3748 | test_acc: 0.8769
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb2_data_20_percent_5_epochs.pth
Experiment number: 8
Model: effnetb2
Dataloader: data_20_percent
NUmber of epochs: 10
Created SummaryWriter, saving to: ../tensorboards/tadac/2023-09-19/data_20_percent/effnetb2/10_epochs


 10%|█         | 1/10 [00:01<00:17,  1.90s/it]

Epoch: 1 | train_loss: 0.2664 | train_acc: 0.9125 | test_loss: 0.3290 | test_acc: 0.8873


 20%|██        | 2/10 [00:03<00:15,  1.93s/it]

Epoch: 2 | train_loss: 0.3286 | train_acc: 0.8583 | test_loss: 0.3480 | test_acc: 0.8873


 30%|███       | 3/10 [00:05<00:13,  1.98s/it]

Epoch: 3 | train_loss: 0.2099 | train_acc: 0.9437 | test_loss: 0.3545 | test_acc: 0.8873


 40%|████      | 4/10 [00:07<00:11,  1.89s/it]

Epoch: 4 | train_loss: 0.3049 | train_acc: 0.9083 | test_loss: 0.3441 | test_acc: 0.8873


 50%|█████     | 5/10 [00:09<00:09,  1.84s/it]

Epoch: 5 | train_loss: 0.2121 | train_acc: 0.9500 | test_loss: 0.3877 | test_acc: 0.8570


 60%|██████    | 6/10 [00:11<00:07,  1.79s/it]

Epoch: 6 | train_loss: 0.2120 | train_acc: 0.9604 | test_loss: 0.2931 | test_acc: 0.8977


 70%|███████   | 7/10 [00:12<00:05,  1.77s/it]

Epoch: 7 | train_loss: 0.2097 | train_acc: 0.9437 | test_loss: 0.3310 | test_acc: 0.8977


 80%|████████  | 8/10 [00:14<00:03,  1.78s/it]

Epoch: 8 | train_loss: 0.1953 | train_acc: 0.9563 | test_loss: 0.3104 | test_acc: 0.8665


 90%|█████████ | 9/10 [00:16<00:01,  1.76s/it]

Epoch: 9 | train_loss: 0.1941 | train_acc: 0.9479 | test_loss: 0.3326 | test_acc: 0.8665


100%|██████████| 10/10 [00:18<00:00,  1.80s/it]

Epoch: 10 | train_loss: 0.2073 | train_acc: 0.9437 | test_loss: 0.3477 | test_acc: 0.8665
[INFO] Saving model to: ../models/tadac/2023-09-19/07_tadac_following/07_effnetb2_data_20_percent_10_epochs.pth
CPU times: user 27.3 s, sys: 35.9 s, total: 1min 3s
Wall time: 1min 30s
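
The nested loops above run every combination of model, DataLoader and epoch count. A minimal sketch of the same grid built with `itertools.product`, using the values that appear in the log (two models, two DataLoaders, two epoch settings). `experiment_grid` is a hypothetical helper written for this illustration, not part of `going_modular`.

```python
from itertools import product

# Values taken from the log above; the names mirror the training cell.
models = ["effnetb0", "effnetb2"]
dataloader_names = ["data_10_percent", "data_20_percent"]
num_epochs = [5, 10]


def experiment_grid(models, dataloader_names, num_epochs):
    """Return one dict per (model, dataloader, epochs) combination, in loop order."""
    grid = []
    combos = product(models, dataloader_names, num_epochs)
    for number, (model_name, dataloader_name, epochs) in enumerate(combos, start=1):
        grid.append({
            "experiment_number": number,
            "model_name": model_name,
            "dataloader_name": dataloader_name,
            "epochs": epochs,
            # same filename pattern as the save step in the training cell
            "save_filename": f"07_{model_name}_{dataloader_name}_{epochs}_epochs.pth",
        })
    return grid


experiments = experiment_grid(models, dataloader_names, num_epochs)
for experiment in experiments:
    print(experiment["experiment_number"], experiment["save_filename"])
```

Listing the grid before training is a cheap way to check how many runs will be launched and what the checkpoints will be called.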





##### 8. View experiments in TensorBoard
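
- Each run above wrote its logs to `../tensorboards/tadac/2023-09-19/<dataloader>/<model>/<epochs>_epochs`.
- Pointing TensorBoard at the shared parent directory should show all eight runs side by side.

A small sketch that builds the command for a given date. `tensorboard_command` is a hypothetical helper; `--logdir` is the same flag used in the earlier `!tensorboard` cell.

```python
import shlex
from pathlib import PurePosixPath


def tensorboard_command(log_root, date):
    """Build the CLI call that opens every run logged on `date` (YYYY-MM-DD)."""
    logdir = PurePosixPath(log_root) / date
    return ["tensorboard", "--logdir", str(logdir)]


command = tensorboard_command("../tensorboards/tadac", "2023-09-19")
print(shlex.join(command))
```

In a notebook the printed command can be run with a leading `!`, as in the profiler cell earlier, or the list can be passed to `subprocess.run`.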

##### 9. Load in the best model and make predictions with it
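
Which run counts as best depends on the metric. A sketch that ranks the eight runs by their final-epoch test values, copied by hand from the training log above. `final_results` and `rank_runs` are names made up for this illustration.

```python
# Final-epoch (test_loss, test_acc) per run, copied from the training log above.
final_results = {
    "effnetb0_data_10_percent_5_epochs": (0.5677, 0.8864),
    "effnetb0_data_10_percent_10_epochs": (0.3618, 0.9072),
    "effnetb0_data_20_percent_5_epochs": (0.2690, 0.9593),
    "effnetb0_data_20_percent_10_epochs": (0.2123, 0.9384),
    "effnetb2_data_10_percent_5_epochs": (0.7076, 0.8873),
    "effnetb2_data_10_percent_10_epochs": (0.4836, 0.8977),
    "effnetb2_data_20_percent_5_epochs": (0.3748, 0.8769),
    "effnetb2_data_20_percent_10_epochs": (0.3477, 0.8665),
}


def rank_runs(results, metric="test_loss"):
    """Sort run names best-first: lowest test_loss or highest test_acc."""
    index = 0 if metric == "test_loss" else 1
    higher_is_better = metric == "test_acc"
    return sorted(results, key=lambda name: results[name][index], reverse=higher_is_better)


for metric in ("test_loss", "test_acc"):
    print(metric, rank_runs(final_results, metric)[:3])
```

A single final-epoch value is noisy, and the training loop above reuses each model across runs, so treat this ranking as a rough guide. The next cell loads the `effnetb2`, `data_20_percent`, 10-epoch checkpoint; change `best_model_path` to load a different run.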

In [33]:
# setup the best model filepath
best_model_path = f"../models/tadac/{datetime.now().strftime('%Y-%m-%d')}/07_tadac_following/07_effnetb2_data_20_percent_10_epochs.pth"

# instantiate a new instance of EffNetB2
best_model = create_effnetb2(class_names=class_names)

# load the saved best model state_dict()
best_model.load_state_dict(torch.load(best_model_path))

Created a new effnetb2 model


<All keys matched successfully>
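
The path above embeds today's date, so it only resolves on the day the model was saved. A sketch, assuming the `<root>/<YYYY-MM-DD>/07_tadac_following/<file>` layout seen in the log, that looks through the dated directories and returns the newest match. `latest_checkpoint` is a hypothetical helper, and the demo runs on a temporary directory rather than `../models/tadac`.

```python
import tempfile
from pathlib import Path


def latest_checkpoint(models_root, filename, subdir="07_tadac_following"):
    """Return the newest <models_root>/<YYYY-MM-DD>/<subdir>/<filename>, or None."""
    # Assumes every first-level directory under models_root is named by ISO date.
    candidates = sorted(Path(models_root).glob(f"*/{subdir}/{filename}"))
    # ISO dates sort chronologically as plain strings, so the last match is the newest.
    return candidates[-1] if candidates else None


# Demo on a throwaway directory tree standing in for ../models/tadac
with tempfile.TemporaryDirectory() as tmp:
    name = "07_effnetb2_data_20_percent_10_epochs.pth"
    for date in ("2023-09-18", "2023-09-19"):
        run_dir = Path(tmp) / date / "07_tadac_following"
        run_dir.mkdir(parents=True)
        (run_dir / name).touch()

    found = latest_checkpoint(tmp, name)
    found_date = found.parent.parent.name
    missing = latest_checkpoint(tmp, "does_not_exist.pth")
    print(found_date)
```

Hard-coding the date string is the simpler alternative when only one known run needs to be reloaded.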

In [34]:
from pathlib import Path

# get the model size in bytes then convert to whole megabytes (floor division truncates)
model_size = Path(best_model_path).stat().st_size // (1024 * 1024)
print(f"Model size: {model_size} MB")  # 1 MB = 1024 KB = 1024 * 1024 B

Model size: 29 MB
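
The floor division above drops the fractional part, which matters when comparing models of similar size. A sketch that reports fractional megabytes instead. `file_size_mb` is a hypothetical helper, and the demo file is a dummy blob of zeros, not a real checkpoint.

```python
import tempfile
from pathlib import Path


def file_size_mb(path):
    """Return the file size in megabytes (1 MB = 1024 * 1024 bytes) as a float."""
    return Path(path).stat().st_size / (1024 * 1024)


with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "demo.pth"
    # write exactly 1.5 MB of zero bytes to stand in for a saved model
    demo_file.write_bytes(b"\0" * (3 * 1024 * 1024 // 2))
    size_mb = file_size_mb(demo_file)
    print(f"Model size: {size_mb:.2f} MB")
```

With floor division the same file would be reported as 1 MB.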


In [None]:
# visualize predictions on random test images

import random

from going_modular.going_modular.predictions import pred_and_plot_image

# get 3 random images 
num_images = 3
test_image_paths = list(Path(data_20_percent_path / "test").glob("*/*.jpg"))

print(f"len(test_image_paths): {len(test_image_paths)}")

random_image_paths = random.sample(population=test_image_paths, k=num_images)

print(f"Random image paths: {random_image_paths}")

# predict and plot them
for random_image_path in random_image_paths:
    pred_and_plot_image(model=best_model,
                        image_path=random_image_path,
                        class_names=class_names,
                        image_size=(224, 224))

##### 9.1 Predict on a custom image with the best model

In [None]:
# path to a custom test image
test_image_path = "../data/steak_test.jpg"

# predict on the custom image with the best model
pred_and_plot_image(model=best_model,
                    image_path=test_image_path,
                    class_names=class_names)

#### References

1. https://pytorch.org/blog/introducing-torchvision-new-multi-weight-support-api/
2. https://pytorch.org/tutorials/intermediate/tensorboard_profiler_tutorial.html
3. https://medium.com/mlearning-ai/profiling-a-training-task-with-pytorch-profiler-and-viewing-it-on-tensorboard-2cb7e0fef30e
4. https://pytorch.org/blog/introducing-pytorch-profiler-the-new-and-improved-performance-tool/
5. https://code.visualstudio.com/docs/datascience/pytorch-support