# 09. PyTorch Model Deployment

What is model deployment?

Machine learning model deployment is the act of making your machine learning models available to someone or something else.

## 0. Get setup

In [None]:
# For this notebook to run with updated APIs, we need torch 1.12+ and torchvision 0.13+
try:
    import torch
    import torchvision
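    # Compare (major, minor) tuples so that 2.x releases also pass the check
    torch_version = tuple(int(part) for part in torch.__version__.split(".")[:2])
    torchvision_version = tuple(int(part) for part in torchvision.__version__.split(".")[:2])
    assert torch_version >= (1, 12), "torch version should be 1.12+"
    assert torchvision_version >= (0, 13), "torchvision version should be 0.13+"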
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
except (ImportError, AssertionError):
    print(f"[INFO] torch/torchvision versions not as required, installing nightly versions.")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")

[INFO] torch/torchvision versions not as required, installing nightly versions.
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu113
Collecting nvidia-cuda-nvrtc-cu12==12.1.105 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (23.7 MB)
Collecting nvidia-cuda-runtime-cu12==12.1.105 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (823 kB)
Collecting nvidia-cuda-cupti-cu12==12.1.105 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.1.105-py3-none-manylinux1_x86_64.whl (14.1 MB)
Collecting nvidia-

In [None]:
# Continue with regular imports
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
    from torchinfo import summary
except ImportError:
    print("[INFO] Couldn't find torchinfo... installing it.")
    !pip install -q torchinfo
    from torchinfo import summary

# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
    from helper_functions import download_data, set_seeds, plot_loss_curves
except ImportError:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular or helper_functions scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !mv pytorch-deep-learning/helper_functions.py . # get the helper_functions.py script
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine
    from helper_functions import download_data, set_seeds, plot_loss_curves

[INFO] Couldn't find torchinfo... installing it.
[INFO] Couldn't find going_modular or helper_functions scripts... downloading them from GitHub.
Cloning into 'pytorch-deep-learning'...
remote: Enumerating objects: 4056, done.
remote: Counting objects: 100% (1234/1234), done.
remote: Compressing objects: 100% (110/110), done.
remote: Total 4056 (delta 1141), reused 1124 (delta 1124), pack-reused 2822
Receiving objects: 100% (4056/4056), 649.94 MiB | 31.30 MiB/s, done.
Resolving deltas: 100% (2386/2386), done.
Updating files: 100% (248/248), done.


In [None]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

## 1. Getting data

The dataset we're going to use for deploying a FoodVision Mini model is the pizza, steak, sushi 20% dataset (the pizza, steak and sushi classes from Food101, using a random 20% of the samples).

In [None]:
# Download pizza, steak, sushi images from GitHub
data_20_percent_path = download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip",
                                     destination="pizza_steak_sushi_20_percent")

data_20_percent_path

[INFO] Did not find data/pizza_steak_sushi_20_percent directory, creating one...
[INFO] Downloading pizza_steak_sushi_20_percent.zip from https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi_20_percent.zip...
[INFO] Unzipping pizza_steak_sushi_20_percent.zip data...


PosixPath('data/pizza_steak_sushi_20_percent')

In [None]:
train_dir = data_20_percent_path / "train"
test_dir = data_20_percent_path / "test"

train_dir, test_dir

(PosixPath('data/pizza_steak_sushi_20_percent/train'),
 PosixPath('data/pizza_steak_sushi_20_percent/test'))

## 2. FoodVision Mini model deployment experiment outline

### 3 questions
1. What is my ideal machine learning model deployment scenario?
2. Where is my model going to go?
3. How is my model going to function?

**FoodVision Mini ideal use case:** A model that performs well and fast.

* Performs well: 95%+ accuracy
* Fast: as close to real-time (or faster) as possible (30 FPS+, which works out to roughly 30 ms of latency per prediction)
  * Latency = time for a prediction to take place
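
The frame-rate and latency versions of the speed goal are linked by simple arithmetic: N predictions per second leaves 1/N seconds for each prediction. A minimal sketch of that conversion (the helper names are made up for illustration and are not part of the notebook):

```python
def latency_budget_ms(fps: float) -> float:
    """Return the time available per prediction, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def achieved_fps(seconds_per_prediction: float) -> float:
    """Return how many predictions per second a given per-prediction time allows."""
    return 1.0 / seconds_per_prediction


print(f"30 FPS leaves about {latency_budget_ms(30):.1f} ms per prediction")
print(f"0.03 s per prediction is about {achieved_fps(0.03):.1f} FPS")
```

The second helper is the one to reach for later, when an average prediction time in seconds needs to be compared against the 30 FPS target.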

To try and achieve these goals we are going to build 2 model experiments:

1. EffNetB2 feature extractor (just like in 07. PyTorch Experiment Tracking)
2. ViT feature extractor (just like in 08. PyTorch Paper Replicating)

## 3. Creating an EffNetB2 feature extractor

Feature extractor = a term for a transfer learning model that has its base layers frozen and its output layers (or head/classifier) customized to fit a specific problem.
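
To make the definition concrete, here is a small sketch of the freeze-then-replace-the-head pattern on a toy network. `TinyNet` is a hypothetical stand-in for a pretrained model (it is not EffNetB2 and has no pretrained weights); only the mechanics carry over.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pretrained network: a base plus a classifier head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
        self.classifier = nn.Linear(16, 10)

    def forward(self, x):
        return self.classifier(self.features(x))


model = TinyNet()

# Freeze every parameter that exists right now (base layers and the old head)
for param in model.parameters():
    param.requires_grad = False

# Replace the head; newly created layers are trainable by default
model.classifier = nn.Linear(in_features=16, out_features=3)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in model.parameters() if not p.requires_grad)
print(f"Trainable parameters: {trainable} | Frozen parameters: {frozen}")
```

Because the new head is created after the freezing loop, it is the only part of the model an optimizer will update.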

In [None]:
import torchvision

# 1. Setup pretrained EffNetB2 weights
effnetb2_weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT # DEFAULT = best available weights

# 2. Get EffNetB2 transforms
effnetb2_transforms = effnetb2_weights.transforms()

# 3. Setup a pretrained model instance
effnetb2 = torchvision.models.efficientnet_b2(weights=effnetb2_weights) # Could also use weights="DEFAULT"

# 4. Freeze the base layers in the model (this stops all existing layers from training)
for param in effnetb2.parameters():
  param.requires_grad = False



Downloading: "https://download.pytorch.org/models/efficientnet_b2_rwightman-c35c1473.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b2_rwightman-c35c1473.pth
100%|██████████| 35.2M/35.2M [00:00<00:00, 76.4MB/s]


In [None]:
effnetb2

EfficientNet(
  (features): Sequential(
    (0): Conv2dNormActivation(
      (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): SiLU(inplace=True)
    )
    (1): Sequential(
      (0): MBConv(
        (block): Sequential(
          (0): Conv2dNormActivation(
            (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
            (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
            (2): SiLU(inplace=True)
          )
          (1): SqueezeExcitation(
            (avgpool): AdaptiveAvgPool2d(output_size=1)
            (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
            (fc2): Conv2d(8, 32, kernel_size=(1, 1), stride=(1, 1))
            (activation): SiLU(inplace=True)
            (scale_activation): Sigmoid()
          )
          (2): Conv2dNormActivat

In [None]:
from torchinfo import summary

# Print EffNetB2 model summary (uncomment for full output)
summary(effnetb2,
        input_size=(1, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 1000]            --                   False
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1408, 7, 7]      --                   False
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 32, 112, 112]    --                   False
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 32, 112, 112]    (864)                False
│    │    └─BatchNorm2d (1)                                  [1, 32, 112, 112]    [1, 32, 112, 112]    (64)                 False
│    │    └─SiLU (2)                                         [1, 32, 112, 112]    [1, 32, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 32, 112, 112]    [1, 16,

In [None]:
# Set the seeds for reproducibility
set_seeds()

effnetb2.classifier = nn.Sequential(
    nn.Dropout(p=0.3, inplace=True),
    nn.Linear(in_features=1408, out_features=3, bias=True)
)

In [None]:
from torchinfo import summary

# Print EffNetB2 model summary (uncomment for full output)
summary(effnetb2,
        input_size=(1, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 3]               --                   Partial
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1408, 7, 7]      --                   False
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 32, 112, 112]    --                   False
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 32, 112, 112]    (864)                False
│    │    └─BatchNorm2d (1)                                  [1, 32, 112, 112]    [1, 32, 112, 112]    (64)                 False
│    │    └─SiLU (2)                                         [1, 32, 112, 112]    [1, 32, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 32, 112, 112]    [1, 1

### 3.1 Create a function to make an EffNetB2 feature extractor

In [None]:
def create_effnetb2_model(num_classes:int=3, # Default output classes: [pizza, steak, sushi]
                          seed:int=42):
  # 1, 2, 3. Create EffNetB2 pretrained weights, transforms and model
  weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT

  transforms = weights.transforms()

  model = torchvision.models.efficientnet_b2(weights=weights)

  # 4. Freeze all the base layers
  for param in model.parameters():
    param.requires_grad = False

  # 5. Change the classifier head, with a random seed for reproducibility
  torch.manual_seed(seed)
  model.classifier = nn.Sequential(
      nn.Dropout(p=0.3, inplace=True),
      nn.Linear(in_features=1408, out_features=num_classes)
  )

  return model, transforms

In [None]:
effnetb2, effnetb2_transforms = create_effnetb2_model()

In [None]:
effnetb2_transforms, effnetb2

(ImageClassification(
     crop_size=[288]
     resize_size=[288]
     mean=[0.485, 0.456, 0.406]
     std=[0.229, 0.224, 0.225]
     interpolation=InterpolationMode.BICUBIC
 ),
 EfficientNet(
   (features): Sequential(
     (0): Conv2dNormActivation(
       (0): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
       (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
       (2): SiLU(inplace=True)
     )
     (1): Sequential(
       (0): MBConv(
         (block): Sequential(
           (0): Conv2dNormActivation(
             (0): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False)
             (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
             (2): SiLU(inplace=True)
           )
           (1): SqueezeExcitation(
             (avgpool): AdaptiveAvgPool2d(output_size=1)
             (fc1): Conv2d(32, 8, kernel_size=(1, 1), stride=(1, 1))
   

In [None]:
from torchinfo import summary

# Print EffNetB2 model summary (uncomment for full output)
summary(effnetb2,
        input_size=(1, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 3]               --                   Partial
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1408, 7, 7]      --                   False
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 32, 112, 112]    --                   False
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 32, 112, 112]    (864)                False
│    │    └─BatchNorm2d (1)                                  [1, 32, 112, 112]    [1, 32, 112, 112]    (64)                 False
│    │    └─SiLU (2)                                         [1, 32, 112, 112]    [1, 32, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 32, 112, 112]    [1, 1

### 3.2 Creating DataLoaders for EffNetB2

In [None]:
from going_modular.going_modular import data_setup

train_dataloader_effnetb2, test_dataloader_effnetb2, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                                                 test_dir=test_dir,
                                                                                                 transform=effnetb2_transforms,
                                                                                                 batch_size=32)
train_dataloader_effnetb2, test_dataloader_effnetb2, class_names

(<torch.utils.data.dataloader.DataLoader at 0x7a46b4ac1b70>,
 <torch.utils.data.dataloader.DataLoader at 0x7a46b4ac1b40>,
 ['pizza', 'steak', 'sushi'])

### 3.3 Training EffNetB2 feature extractor

In [None]:
from going_modular.going_modular import engine

# Loss function
loss_fn = torch.nn.CrossEntropyLoss()

# Optimizer
optimizer = torch.optim.Adam(params=effnetb2.parameters(),
                             lr=1e-3)

# Train the model with the training function from engine.py
set_seeds()
effnetb2_results = engine.train(model=effnetb2,
                                train_dataloader=train_dataloader_effnetb2,
                                test_dataloader=test_dataloader_effnetb2,
                                optimizer=optimizer,
                                loss_fn=loss_fn,
                                epochs=10,
                                device=device)

  0%|          | 0/10 [00:00<?, ?it/s]

KeyboardInterrupt: 

### 3.4 Plot the loss curves

In [None]:
from helper_functions import plot_loss_curves

plot_loss_curves(effnetb2_results)

### 3.5 Saving EffNetB2 feature extractor to file


In [None]:
from going_modular.going_modular import utils

# Save the model
utils.save_model(model=effnetb2,
                 target_dir="models",
                 model_name="09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth")

### 3.6 Inspecting the size of our EffNetB2 feature extractor

Why would it be important to consider the size of a saved model?

If we're deploying our model to be used in a mobile app or on a website, there may be limited compute resources.

So if our model file is too large, we may not be able to store or run it on our target device.


In [None]:
from pathlib import Path

# Get the model size in bytes then convert to megabytes
pretrained_effnetb2_model_size = Path("models/09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth").stat().st_size / (1024*1024)
print(f"Pretrained EffNetB2 feature extractor model size: {round(pretrained_effnetb2_model_size, 2)} MB")
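
One detail worth knowing when reading these numbers: dividing by `1024*1024` strictly gives mebibytes (MiB), while a megabyte (MB) is `1000*1000` bytes. The two differ by about 5%, which rarely matters here but can explain small mismatches with file sizes reported elsewhere. A sketch with a hypothetical helper, `file_size_mb`, run on a throwaway file rather than a real checkpoint:

```python
import tempfile
from pathlib import Path


def file_size_mb(path, binary: bool = True) -> float:
    """Return a file's size in MiB (binary=True, 1024**2 bytes) or MB (binary=False, 1000**2 bytes)."""
    size_in_bytes = Path(path).stat().st_size
    divisor = 1024 ** 2 if binary else 1000 ** 2
    return size_in_bytes / divisor


# Demonstrate on a temporary 3 MiB file (a stand-in for a saved model)
with tempfile.TemporaryDirectory() as tmp_dir:
    dummy_file = Path(tmp_dir) / "dummy_model.pth"
    dummy_file.write_bytes(b"\0" * (3 * 1024 * 1024))
    size_mib = file_size_mb(dummy_file)
    size_mb = file_size_mb(dummy_file, binary=False)

print(f"{size_mib:.2f} MiB | {size_mb:.2f} MB")
```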

### 3.7 Collecting EffNetB2 feature extractor stats

In [None]:
# Count number of parameters in EffNetB2
effnetb2_total_params = sum(torch.numel(param) for param in effnetb2.parameters())
effnetb2_total_params


In [None]:
# Create a dictionary with effnetb2 statistics
effnetb2_stats = {"test_loss": effnetb2_results["test_loss"][-1],
                  "test_acc": effnetb2_results["test_acc"][-1],
                  "number_of_params": effnetb2_total_params,
                  "model_size_(MB)": pretrained_effnetb2_model_size}
effnetb2_stats

## 4. Creating a ViT feature extractor

We're up to our second modeling experiment: repeating the steps we used for EffNetB2, but this time with a ViT feature extractor.

In [None]:
# Check out the vision transformer heads layer
vit = torchvision.models.vit_b_16(weights="DEFAULT")
vit.heads

In [None]:
def create_vit_model(num_classes:int=3,
                     seed:int=42):
  # Create ViT_B_16 pretrained weights, transforms and model
  weights = torchvision.models.ViT_B_16_Weights.DEFAULT
  transforms = weights.transforms()
  model = torchvision.models.vit_b_16(weights=weights)

  # Freeze all of the base layers
  for param in model.parameters():
    param.requires_grad = False

  # Change the classifier head to suit our needs
  torch.manual_seed(seed)
  model.heads = nn.Sequential(
      nn.Linear(in_features=768, out_features=num_classes)
  )

  return model, transforms


In [None]:
vit, vit_transforms = create_vit_model()
vit_transforms

In [None]:
from torchinfo import summary

# Print ViT model summary (uncomment for full output)
summary(vit,
        input_size=(1, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

### 4.1 Creating dataloaders

In [None]:

train_dataloader_vit, test_dataloader_vit, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                                       test_dir=test_dir,
                                                                                       transform=vit_transforms,
                                                                                       batch_size=32)

len(train_dataloader_vit), len(test_dataloader_vit), class_names

### 4.2 Training ViT feature extractor

In [None]:
from going_modular.going_modular import engine

# Setup optimizer
optimizer = torch.optim.Adam(params=vit.parameters(),
                             lr=0.001)

# Setup loss function
loss_fn = torch.nn.CrossEntropyLoss()

# Train ViT feature extractor with seeds set for reproducibility
set_seeds()

vit_results = engine.train(model=vit,
                           train_dataloader=train_dataloader_vit,
                           test_dataloader=test_dataloader_vit,
                           optimizer=optimizer,
                           loss_fn=loss_fn,
                           epochs=10,
                           device=device)

### 4.3 Plot the loss curves of ViT feature extractor

In [None]:
from helper_functions import plot_loss_curves

plot_loss_curves(vit_results)

### 4.4 Saving ViT feature extractor

In [None]:
# Save model
from going_modular.going_modular import utils

utils.save_model(model=vit,
                 target_dir="models",
                 model_name="09_pretrained_vit_feature_extractor_pizza_steak_sushi_20_percent.pth")

In [None]:
from pathlib import Path

# Get the model size in bytes then convert to megabytes
pretrained_vit_model_size = Path("models/09_pretrained_vit_feature_extractor_pizza_steak_sushi_20_percent.pth").stat().st_size / (1024 * 1024)
print(f"Pretrained ViT feature extractor model size: {round(pretrained_vit_model_size, 2)} MB")

In [None]:
vit_total_params = sum(torch.numel(param) for param in vit.parameters())
vit_total_params

In [None]:
# Create a dictionary with ViT statistics
vit_stats = {"test_loss": vit_results["test_loss"][-1],
             "test_acc": vit_results["test_acc"][-1],
             "number_of_params": vit_total_params,
             "model_size_(MB)": pretrained_vit_model_size}
vit_stats

## 5. Making predictions with our trained models and timing them

Our goal:
1. Performs well (95%+ test accuracy)
2. Fast (30+ FPS)

To test criterion two:
1. Loop through the test images.
2. Time how long each model takes to make a prediction on each image.

Let's work towards making a function called `pred_and_store()` to do so.

First we will need a list of test image paths.

In [None]:
from pathlib import Path

# Get all test data Paths
test_data_paths = list(Path(test_dir).glob("*/*.jpg"))
test_data_paths[:5]

### 5.1 Creating a function to make predictions across the test dataset

1. Create a function that takes a list of paths, a trained PyTorch model, a series of transforms, a list of target class names and a target device.
2. Create an empty list (we can return a full list of all predictions later).
3. Loop through the target input paths (the rest of the steps take place inside the loop).
4. Create an empty dictionary for each sample (prediction statistics will go in here).
5. Get the sample path and ground truth class from the filepath.
6. Start the prediction timer.
7. Open the image using `PIL.Image.open(path)`.
8. Transform the image so it's usable with the given model.
9. Prepare the model for inference by sending it to the target device and turning on eval mode.
10. Turn on torch inference mode, pass the transformed image to the model, perform the forward pass and calculate the prediction probability and predicted class.
11. Add the prediction probability and predicted class to the dictionary from step 4.
12. End the prediction timer started in step 6 and add the time to the prediction dictionary.
13. See if the predicted class matches the ground truth class.
14. Append the updated prediction dictionary to the list of predictions created in step 2.
15. Return the list of prediction dictionaries.
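
Step 5 works because the dataset is laid out as `<split>/<class_name>/<image>.jpg`, so the ground truth label is simply the name of the folder an image sits in. A small sketch with a made-up file path (the file name is hypothetical and does not need to exist on disk):

```python
from pathlib import Path

# Hypothetical path following the <split>/<class_name>/<image>.jpg layout
example_path = Path("data/pizza_steak_sushi_20_percent/test/sushi/12345.jpg")

# The parent folder's name is the ground truth class
class_name = example_path.parent.stem
print(class_name)
```

`path.parent.name` would give the same result here; it is the safer choice if a class folder name could ever contain a dot.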

In [None]:
import pathlib
import torch

from PIL import Image
from timeit import default_timer as timer
from tqdm.auto import tqdm
from typing import List, Dict

# 1. Create a function that takes a list of paths, a trained PyTorch model, a series of transforms, a list of target class names and a target device
def pred_and_store(paths: List[pathlib.Path],
                   model: torch.nn.Module,
                   transform: torchvision.transforms,
                   class_names: List[str],
                   device: str = "cuda" if torch.cuda.is_available() else "cpu") -> List[Dict]:
  # 2. Create an empty list (we can return a full list of all predictions later).
  pred_list = []

  # 3. Loop through the target input paths (the rest of the steps will take place inside of the loop).
  for path in tqdm(paths):

    # 4. Create an empty dictionary for each sample (predictions statistics will go in here)
    pred_dict = {}

    # 5. Get the sample path and ground truth class from the filepath.
    pred_dict["image_path"] = path
    class_name = path.parent.stem
    pred_dict["class_name"] = class_name

    # 6. Start the prediction timer.
    start_time = timer()

    # 7. Open the image using PIL.Image.open(path).
    img = Image.open(path)

    # 8. Transform the image to be usable with a given model (also add a batch dimension, and send it to the target device)
    transformed_image = transform(img).unsqueeze(0).to(device)

    # 9. Prepare the model for inference by sending it to the target device and turning on eval mode
    model = model.to(device)
    model.eval()

    # 10. Turn on torch inference mode, pass the transformed image to the model, perform the forward pass and calculate the pred prob + pred class.
    with torch.inference_mode():
      pred_logit = model(transformed_image)
      pred_prob = torch.softmax(pred_logit, dim=1) # Turn logits into prediction probabilities
      pred_label = torch.argmax(pred_prob, dim=1) # Turn prediction probabilities into a prediction label
      pred_class = class_names[pred_label.cpu()] # Hardcode the prediction class to be on the CPU (Python variables live on the CPU)

      # 11. Add the pred prob + pred class to empty dictionary from step 4.
      pred_dict["pred_prob"] = round(pred_prob.unsqueeze(0).max().cpu().item(), 4)
      pred_dict["pred_class"] = pred_class

      # 12. End the prediction timer started in step 6 and add the time to the prediction dictionary.
      end_time = timer()
      pred_dict["time_for_pred"] = round(end_time - start_time, 4)

    # 13. See if the predicted class matches the ground truth class
    pred_dict["correct"] = class_name == pred_class

    # 14. Append the updated prediction dictionary to the empty list of predictions we created in step 2
    pred_list.append(pred_dict)

  # 15. Return the list of prediction dictionaries
  return pred_list
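
Since `pred_and_store()` returns a plain list of dictionaries, summary numbers can be computed without any extra libraries. The sketch below uses made-up records in the same shape (the values are invented purely for illustration, not real results) to get accuracy, average prediction time and the implied frames per second:

```python
# Made-up records shaped like the dictionaries pred_and_store() returns
example_preds = [
    {"class_name": "pizza", "pred_class": "pizza", "time_for_pred": 0.10, "correct": True},
    {"class_name": "steak", "pred_class": "sushi", "time_for_pred": 0.12, "correct": False},
    {"class_name": "sushi", "pred_class": "sushi", "time_for_pred": 0.08, "correct": True},
    {"class_name": "pizza", "pred_class": "pizza", "time_for_pred": 0.10, "correct": True},
]

# Booleans count as 0 or 1, so summing them counts the correct predictions
accuracy = sum(pred["correct"] for pred in example_preds) / len(example_preds)

# Average time per prediction and the frame rate it implies
avg_time = sum(pred["time_for_pred"] for pred in example_preds) / len(example_preds)
fps = 1 / avg_time

print(f"Accuracy: {accuracy:.2%} | Avg time per pred: {avg_time:.4f} s | ~{fps:.1f} FPS")
```

Keep in mind that the timer in `pred_and_store()` also covers opening and transforming the image, so the measured time is for the whole per-image pipeline rather than the forward pass alone.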


### 5.2 Making and timing predictions with EffNetB2

Let's test our `pred_and_store()` function

Two things to note:
1. Device - we're going to hardcode our predictions to happen on the CPU (because you won't always be sure of having a GPU when you deploy your model).
2. Transforms - we want to make sure each model is predicting on images that have been prepared with the appropriate transforms (e.g. EffNetB2 with `effnetb2_transforms`).

In [None]:
# Make predictions on test dataset with EffNetB2
effnetB2_test_pred_dict = pred_and_store(paths=test_data_paths,
                                         model=effnetb2,
                                         transform=effnetb2_transforms,
                                         class_names=class_names,
                                         device="cpu") # Hardcode predictions to happen on CPU

In [None]:
effnetB2_test_pred_dict[:2]

In [None]:
# Turn the test_pred_dicts into a DataFrame
import pandas as pd
effnetb2_test_pred_df = pd.DataFrame(effnetB2_test_pred_dict)
effnetb2_test_pred_df.head()

In [None]:
effnetb2_test_pred_df["correct"].value_counts()

In [None]:
effnetb2_average_time_per_pred = round(effnetb2_test_pred_df.time_for_pred.mean(), 4)
effnetb2_average_time_per_pred

In [None]:
effnetb2_stats["time_per_pred_cpu"] = effnetb2_average_time_per_pred
effnetb2_stats

### 5.3 Making and timing predictions with ViT

In [None]:
# Make predictions on test dataset with ViT
vit_test_pred_dict = pred_and_store(paths=test_data_paths,
                                    model=vit,
                                    transform=vit_transforms,
                                    class_names=class_names,
                                    device="cpu") # Hardcode predictions to happen on CPU

In [None]:
vit_test_pred_dict[:2]

In [None]:
# Turn vit_test_pred_dicts into a dataframe
import pandas as pd
vit_test_pred_df = pd.DataFrame(vit_test_pred_dict)
vit_test_pred_df.head()

In [None]:
vit_test_pred_df["correct"].value_counts()

In [None]:
# Calculate average time per prediction for the ViT model
vit_average_time_per_pred = round(vit_test_pred_df.time_for_pred.mean(), 4)
vit_average_time_per_pred

In [None]:
# Add average time per prediction to ViT stats
vit_stats["time_per_pred_cpu"] = vit_average_time_per_pred
vit_stats

## 6. Comparing model results, prediction times and size

In [None]:
# Turn stat dictionaries into a Dataframe
df = pd.DataFrame([effnetb2_stats, vit_stats])

# Add column for model names
df["Model"] = ["EffNetB2", "ViT"]

# convert accuracy to percentages
df["test_acc"] = round(df.test_acc * 100, 2)

df

Which model is better?

* `test_loss` (lower is better) - ViT
* `test_acc` (higher is better) - ViT
* `number_of_params` (generally lower is better) - EffNetB2; a model with more parameters generally takes longer to compute
  * Sometimes models with more parameters can still perform fast
* `model_size_(MB)` (generally lower is better for our use case of deploying to a mobile device) - EffNetB2
* `time_per_pred_cpu` (lower is better, highly dependent on the hardware you're running on) - EffNetB2

Both models fail to achieve our goal of 30+ FPS. However, we could always try EffNetB2 and see how it goes.

In [None]:
# Compare ViT to EffNetB2 across different characteristics
pd.DataFrame(df.set_index("Model").loc["ViT"] / df.set_index("Model").loc["EffNetB2"],
             columns=["ViT to EffNetB2 ratios"]).T

### 6.1 Visualizing the speed vs performance tradeoff

So we've compared our EffNetB2 and ViT feature extractor models, now let's visualize the comparison with a speed vs. performance plot.

We can do so with matplotlib:
1. Create a scatter plot from the comparison DataFrame to compare EffNetB2 and ViT across test accuracy and prediction time.
2. Add titles and labels to make our plot look nice.
3. Annotate the samples on the scatter plot so we know what's going on.
4. Create a legend based on the model sizes (`model_size_(MB)`).

In [None]:
# 1. Create a plot from model comparison DataFrame
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(12, 8))
scatter = ax.scatter(data=df,
                     x="time_per_pred_cpu",
                     y="test_acc",
                     c=["blue", "orange"],
                     s="model_size_(MB)")

# 2. Add titles and labels to make our plot look good
ax.set_title("FoodVision Mini Inference Speed vs Performance", fontsize=18)
ax.set_xlabel("Prediction time per image (seconds)", fontsize=14)
ax.set_ylabel("Test accuracy (%)", fontsize=14)
ax.tick_params(axis="both", labelsize=12)
ax.grid(True)

# 3. Annotate the samples on the scatter plot so we know what's going on
for index, row in df.iterrows():
  ax.annotate(text=row["Model"],
              xy=(row["time_per_pred_cpu"] + 0.0006, row["test_acc"] + 0.04),
              size=12)

# 4. Create a legend based on the model sizes (model_size_(MB))
handles, labels = scatter.legend_elements(prop="sizes", alpha=0.5)
model_size_legend = ax.legend(handles,
                              labels,
                              loc="lower right",
                              title="Model_size_(MB)",
                              fontsize=12)

# Save the figure
plt.savefig("09-foodvision-mini-inference-speed-vs-performance.png")

## 7. Bringing FoodVision Mini to life by creating a Gradio demo

We've chosen to deploy EffNetB2 as it fulfills our criteria the best.

What is Gradio?

> Gradio is the fastest way to demo your machine learning model with a friendly web interface so that anyone can use it, anywhere!

In [None]:
try:
  import gradio as gr
  print("found gradio, importing it...")
except ImportError:
  print("Couldn't find gradio, installing it...")
  !pip -q install gradio
  import gradio as gr

In [None]:
gr.__version__

### 7.1 Gradio overview

Gradio helps you create machine learning demos.

Why create a demo?

So other people can try our models and we can test them in the real world.

Deployment is as important as training.

The overall premise of Gradio is to map inputs -> function/model -> outputs
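
In practice, the function in the middle is just a Python callable whose return values line up with the output components. For an image classifier, a common shape is a dictionary of `{label: probability}` (which a label-style output can display) plus a number for the time taken. The sketch below imitates that contract in pure Python; `fake_predict` and its hardcoded logits are stand-ins for a real model and are assumptions made for illustration only.

```python
import math
from timeit import default_timer as timer
from typing import Dict, List, Tuple

class_names = ["pizza", "steak", "sushi"]


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(value - largest) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def fake_predict(logits: List[float]) -> Tuple[Dict[str, float], float]:
    """Stand-in for a model: maps an input to ({label: probability}, seconds taken)."""
    start_time = timer()
    probs = softmax(logits)
    labels_and_probs = {name: prob for name, prob in zip(class_names, probs)}
    pred_time = round(timer() - start_time, 4)
    return labels_and_probs, pred_time


pred_dict, pred_time = fake_predict([2.0, 0.5, 0.1])
print(pred_dict, pred_time)
```

Swapping the hardcoded logits for a real model's forward pass leaves the surrounding structure (timer, softmax, label dictionary) unchanged.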

### 7.2 Creating a function to map our inputs and outputs (what we are feeding to the gradio interface class)


In [None]:
# Put our model on the CPU
effnetb2 = effnetb2.to("cpu") # gradio will run on cpu

# Check the device
next(iter(effnetb2.parameters())).device

Let's create a function called `predict()` to go from:

```
images of food -> ML model (EffNetB2) -> outputs (food label)
```

In [None]:
from typing import Tuple, Dict

def predict(img) -> Tuple[Dict, float]:
  # Start a timer
  start_time = timer()

  # Transform the input image for use with EffNetB2
  img = effnetb2_transforms(img).unsqueeze(0) # unsqueeze = add batch dimension on the 0th dimension

  # Put model into eval mode, make prediction
  effnetb2.eval()
  with torch.inference_mode():
    # Pass transformed image through the model and turn the prediction logits into probabilities
    pred_probs = torch.softmax(effnetb2(img), dim=1)

  # Create a prediction label and prediction probability dictionary
  pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}

  # Calculate the pred time
  end_time = timer()
  pred_time = round(end_time - start_time, 4)

  return pred_labels_and_probs, pred_time
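
To see the shape of what `predict()` hands to Gradio without loading EffNetB2, here's a minimal sketch in plain Python. The `softmax()` and `toy_predict()` helpers and the example logits are made up for illustration (they're not part of the notebook); the real function gets its logits from `effnetb2(img)`.

```python
import math
from timeit import default_timer as timer
from typing import Dict, List, Tuple

# Hypothetical stand-in for the real class names list
class_names = ["pizza", "steak", "sushi"]

def softmax(logits: List[float]) -> List[float]:
    """Turn raw logits into probabilities that sum to 1."""
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def toy_predict(logits: List[float]) -> Tuple[Dict[str, float], float]:
    """Mimic the return format of predict(): ({label: probability}, seconds)."""
    start_time = timer()
    probs = softmax(logits)
    labels_and_probs = {class_names[i]: probs[i] for i in range(len(class_names))}
    pred_time = round(timer() - start_time, 4)
    return labels_and_probs, pred_time

pred_dict, pred_time = toy_predict([2.0, 0.5, -1.0])
print(pred_dict)  # the largest logit ("pizza") gets the highest probability
print(pred_time)
```

The dictionary of label -> probability is the format `gr.Label` expects, and the float is what `gr.Number` displays.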

In [None]:
import random
from PIL import Image

# Get a list of all test image filepaths
test_data_paths = list(Path(test_dir).glob("*/*.jpg"))

# Randomly select a test image path
random_image_path = random.sample(test_data_paths, k=1)[0] # [0] gets the path itself

# Open the target image
image = Image.open(random_image_path)
print(f"[INFO] Predicting on image at path: {random_image_path}\n")

# Predict on the target image and print out the outputs
pred_dict, pred_time = predict(image)
print(pred_dict)
print(pred_time)

### 7.3 Creating a list of example images

In [None]:
# Create a list of example inputs to our gradio demo
example_list = [[str(filepath)] for filepath in random.sample(test_data_paths, k=3)]

### 7.4 Building a Gradio interface

Let's use `gr.Interface` to go from:

```
input: image -> transform -> predict with effnetb2 -> output: pred, pred prob, time
```

In [None]:
import gradio as gr

# Create title, description, and article

title = "FoodVision Mini🍕🥩🍥"
description = "An [EfficientNetB2 feature extractor](https://pytorch.org/vision/main/models/generated/torchvision.models.efficientnet_b2.html#torchvision.models.efficientnet_b2) computer vision model to classify images as pizza, steak, or sushi"
article = "Created at [09. Pytorch Model Deployment](https://www.learnpytorch.io/09_pytorch_model_deployment/#1-getting-data)"

# Create the gradio demo
demo = gr.Interface(fn=predict, # Maps our inputs to outputs
                    inputs=gr.Image(type="pil"),
                    outputs=[gr.Label(num_top_classes=3, label="predictions"),
                             gr.Number(label="Prediction time (s)")],
                    examples=example_list,
                    title=title,
                    description=description,
                    article=article)

# Launch the demo
demo.launch(debug=False, # print errors locally?
            share=True) # generate a publicly shareable URL


## 8. Turning our FoodVision Mini Gradio Demo into a deployable app

Our Gradio demos from Google Colab are fantastic, but their share links expire after 72 hours.

To fix this, we're going to prepare our app files so we can host them on Hugging Face Spaces: https://huggingface.co/docs/hub/spaces

### 8.1 What is Hugging Face Spaces?

Hugging Face Spaces offer a simple way to host ML demo apps directly on your profile or your organization’s profile. This allows you to create your ML portfolio, showcase your projects at conferences or to stakeholders, and work collaboratively with other people in the ML ecosystem.

If GitHub is a place to show your coding ability, Hugging Face Spaces is a place to show your machine learning ability (through sharing ML demos that you've built).

### 8.2 Deployed gradio app structure

Let's start to put all of our app files into a single directory:

```
Colab -> folder with all gradio files -> upload app files to hugging face spaces -> deploy
```

By the end our file structure will look like:
```
demos/
└── foodvision_mini/
    ├── 09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth
    ├── app.py
    ├── examples/
    │   ├── example_1.jpg
    │   ├── example_2.jpg
    │   └── example_3.jpg
    ├── model.py
    └── requirements.txt
```

Why use this structure?

Because it's one of the simplest we could start with.
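
Before uploading, it can help to check that a demo folder actually matches the layout above. The following is only a sketch: the `missing_app_files()` helper and the `REQUIRED_FILES` list are hypothetical names introduced here, and the check runs on a throwaway temporary directory rather than the real `demos/` folder.

```python
import tempfile
from pathlib import Path
from typing import List

# Files the structure above expects (the model file is matched by its .pth suffix)
REQUIRED_FILES = ["app.py", "model.py", "requirements.txt"]

def missing_app_files(app_dir: Path) -> List[str]:
    """Return the expected entries that are missing from app_dir."""
    missing = [name for name in REQUIRED_FILES if not (app_dir / name).is_file()]
    if not list(app_dir.glob("*.pth")):
        missing.append("*.pth (saved model weights)")
    examples_dir = app_dir / "examples"
    if not examples_dir.is_dir() or not any(examples_dir.iterdir()):
        missing.append("examples/ (at least one image)")
    return missing

# Demonstrate on a deliberately incomplete throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    app_dir = Path(tmp) / "foodvision_mini"
    (app_dir / "examples").mkdir(parents=True)
    (app_dir / "app.py").touch()
    (app_dir / "model.py").touch()
    incomplete = missing_app_files(app_dir)

print(incomplete)
```

An empty list means every expected piece is in place.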

### 8.3 Creating a `demos` folder to store our FoodVision Mini app files

In [None]:
import shutil
from pathlib import Path

# Create FoodVision Mini demo path
foodvision_mini_demo_path = Path("demos/foodvision_mini/")

# Remove files that might already exist, then create a fresh directory
if foodvision_mini_demo_path.exists():
  shutil.rmtree(foodvision_mini_demo_path)
foodvision_mini_demo_path.mkdir(parents=True,
                                exist_ok=True)

!ls demos/foodvision_mini/

### 8.4 Creating a folder of example images to use with our foodvision mini demo

What we want:
* 3 images in an `examples/` directory
* Images should be from the test set

In [None]:
import shutil
from pathlib import Path

# Create an example directory
foodvision_mini_examples_path = foodvision_mini_demo_path / "examples"
foodvision_mini_examples_path.mkdir(parents=True,
                                    exist_ok=True)

# Collect three random test dataset image paths
foodvision_mini_examples = [Path('data/pizza_steak_sushi_20_percent/test/sushi/592799.jpg'),
                            Path('data/pizza_steak_sushi_20_percent/test/steak/3622237.jpg'),
                            Path('data/pizza_steak_sushi_20_percent/test/pizza/2582289.jpg')]

# Copy the three images to the examples directory
for example in foodvision_mini_examples:
  destination = foodvision_mini_examples_path / example.name
  print(f"[INFO] Copying {example} to {destination}")
  shutil.copy2(src=example,
               dst=destination)

Gradio takes examples as a list of lists, so let's verify that we can get a list of lists from our `examples/` directory.

In [None]:
example_list

In [None]:
import os

# Get example file paths in a list of lists
example_list = [["examples/" + example] for example in os.listdir(foodvision_mini_examples_path)]
example_list
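
One thing to watch with `os.listdir()` is that it returns entries in arbitrary order and also picks up hidden entries such as `.ipynb_checkpoints`. Here's a hedged sketch of a slightly stricter version; `build_example_list()` and `IMAGE_SUFFIXES` are hypothetical names (not part of the notebook), and the demo runs on a temporary directory.

```python
import os
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumption: the image types we expect

def build_example_list(examples_dir, prefix="examples/"):
    """Return [[path], [path], ...] with one inner list per example image."""
    filenames = sorted(os.listdir(examples_dir))  # sort for a stable order
    return [[prefix + name] for name in filenames
            if Path(name).suffix.lower() in IMAGE_SUFFIXES]

with tempfile.TemporaryDirectory() as tmp:
    for name in ["b.jpg", "a.jpg", ".ipynb_checkpoints"]:
        Path(tmp, name).touch()
    demo_examples = build_example_list(tmp)

print(demo_examples)  # [['examples/a.jpg'], ['examples/b.jpg']]
```

Each inner list holds one value per Gradio input component, which is why a single image input gets single-item lists.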

### 8.5 Moving our trained EffNetB2 model to our FoodVision Mini demo directory

In [None]:
import shutil

# Create a source path for our target model
effnetb2_foodvision_mini_model_path = "models/09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth"

# Create a destination path for our target model
effnetb2_foodvision_mini_model_destination = foodvision_mini_demo_path / effnetb2_foodvision_mini_model_path.split("/")[1]

# Try to move the model file
try:
  print(f"[INFO] attempting to move {effnetb2_foodvision_mini_model_path} to {effnetb2_foodvision_mini_model_destination}")

  # Move the model
  shutil.move(src=effnetb2_foodvision_mini_model_path,
              dst=effnetb2_foodvision_mini_model_destination)

  print("[INFO] Model move complete.")

# If the model has already been moved, check if it exists
except FileNotFoundError:
  print(f"[INFO] No model found at {effnetb2_foodvision_mini_model_path}, perhaps it's already been moved?")
  print(f"[INFO] Model exists at {effnetb2_foodvision_mini_model_destination}: {effnetb2_foodvision_mini_model_destination.exists()}")



### 8.6 Turning our EffNetB2 model into a Python script (`model.py`)

We have a saved `.pth` model `state_dict` and want to load it into a model instance.

Let's move our `create_effnetb2_model()` function to a script so we can reuse it.

In [None]:
%%writefile demos/foodvision_mini/model.py

import torch
import torchvision

from torch import nn

def create_effnetb2_model(num_classes:int=3, # Default output classes: [pizza, steak, sushi]
                          seed:int=42):
  # 1, 2, 3. Create EffNetB2 pretrained weights, transforms and model
  weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT

  transforms = weights.transforms()

  model = torchvision.models.efficientnet_b2(weights=weights)

  # 4. Freeze all the base layers
  for param in model.parameters():
    param.requires_grad = False

  # 5. Change the classifier head with a random seed for reproducibility
  torch.manual_seed(seed)
  model.classifier = nn.Sequential(
      nn.Dropout(p=0.3, inplace=True),
      nn.Linear(in_features=1408, out_features=num_classes)
  )

  return model, transforms

In [None]:
from demos.foodvision_mini import model

effnetb2_model, effnetb2_transforms_import = model.create_effnetb2_model()
effnetb2_model

### 8.7 Turning our Foodvision Mini Gradio app into a Python script (`app.py`)

The `app.py` file will have four major parts:
1. Imports and class names setup
2. Model and transforms preparation
3. Predict function (`predict()`)
4. Gradio app - our Gradio interface + launch command

In [None]:
%%writefile demos/foodvision_mini/app.py
### 1. Imports and class names setup ###
import gradio as gr
import os
import torch


from model import create_effnetb2_model
from timeit import default_timer as timer
from typing import Tuple, Dict

# Setup class names
class_names = ['pizza', 'steak', 'sushi']

### 2. Model and transforms preparation ###
effnetb2, effnetb2_transforms = create_effnetb2_model(
    num_classes=len(class_names)
)

# Load the saved weights (the .pth file sits next to app.py, so use a path relative to this directory)
effnetb2.load_state_dict(torch.load(
    f="09_pretrained_effnetb2_feature_extractor_pizza_steak_sushi_20_percent.pth",
    map_location=torch.device("cpu") # load the model to the CPU
))

### 3. Predict function ###

def predict(img) -> Tuple[Dict, float]:
  # Start a timer
  start_time = timer()

  # Transform the input image for use with EffNetB2
  img = effnetb2_transforms(img).unsqueeze(0) # unsqueeze = add batch dimension on the 0th dimension

  # Put model into eval mode, make prediction
  effnetb2.eval()
  with torch.inference_mode():
    # Pass transformed image through the model and turn the prediction logits into probabilities
    pred_probs = torch.softmax(effnetb2(img), dim=1)

  # Create a prediction label and prediction probability dictionary
  pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}

  # Calculate the pred time
  end_time = timer()
  pred_time = round(end_time - start_time, 4)

  return pred_labels_and_probs, pred_time

### 4. Gradio app ###

# Create title, description, and article

title = "FoodVision Mini🍕🥩🍥"
description = "An [EfficientNetB2 feature extractor](https://pytorch.org/vision/main/models/generated/torchvision.models.efficientnet_b2.html#torchvision.models.efficientnet_b2) computer vision model to classify images as pizza, steak, or sushi"
article = "Created at [09. Pytorch Model Deployment](https://www.learnpytorch.io/09_pytorch_model_deployment/#1-getting-data)"

# Create example list
example_list = [["examples/" + example] for example in os.listdir("examples")]

# Create the gradio demo
demo = gr.Interface(fn=predict, # Maps our inputs to outputs
                    inputs=gr.Image(type="pil"),
                    outputs=[gr.Label(num_top_classes=3, label="predictions"),
                             gr.Number(label="Prediction time (s)")],
                    examples=example_list,
                    title=title,
                    description=description,
                    article=article)

# Launch the demo
demo.launch(debug=False, # print errors locally?
            share=True) # generate a publicly shareable URL


### 8.8 Creating a requirements file for FoodVision Mini (`requirements.txt`)

The requirements file will tell our Hugging Face Space what software dependencies our app requires.

The three main ones are:
* `torch`
* `torchvision`
* `gradio`

In [None]:
%%writefile demos/foodvision_mini/requirements.txt
torch==2.2.1
torchvision==0.17.1
gradio==4.22.0

In [None]:
torch.__version__

In [None]:
torchvision.__version__

In [None]:
gr.__version__
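
Rather than copying the versions by hand, the pins could be generated from the current environment. This is a rough sketch using the standard library's `importlib.metadata`; the `pin_requirements()` helper and its `get_version` parameter are hypothetical names introduced here so the function can also be tried with a stand-in lookup.

```python
from importlib import metadata
from typing import Callable, List

def pin_requirements(packages: List[str],
                     get_version: Callable[[str], str] = metadata.version) -> List[str]:
    """Return 'name==version' lines for installed packages, skipping missing ones."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={get_version(name)}")
        except metadata.PackageNotFoundError:
            print(f"[INFO] {name} is not installed, skipping")
    return lines

# Against the live environment this pins whatever is currently installed
print("\n".join(pin_requirements(["torch", "torchvision", "gradio"])))

# With a stand-in version lookup (values copied from the requirements file above)
fake_versions = {"torch": "2.2.1", "torchvision": "0.17.1", "gradio": "4.22.0"}
pinned = pin_requirements(list(fake_versions), get_version=fake_versions.__getitem__)
print(pinned)
```

If a reported version carries a local suffix such as `+cu121`, it will likely need trimming before it goes into `requirements.txt`.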

## 9. Deploying our FoodVision Mini app to Hugging Face Spaces

There are two main options for uploading to a Hugging Face Space (also called a Hugging Face Repository, similar to a git repository):

* Uploading via the Hugging Face web interface (easiest).
* Uploading via the command line or terminal.
  * Bonus: You can also use the `huggingface_hub` library to interact with Hugging Face; this would be a good extension to the above two options.


### 9.1 Downloading our FoodVision Mini app files

We want to download our `foodvision_mini` demo app so we can upload it to Hugging Face Spaces.

In [None]:
!ls demos/foodvision_mini/examples

In [None]:
# Change into the foodvision_mini directory and then zip it from the inside

!cd demos/foodvision_mini && zip -r ../foodvision_mini.zip * -x "*.pyc" "*.ipynb" "*__pycache__*" "*ipynb_checkpoint*" # '-x' means exclude
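
If the `zip` command isn't available (for example outside Colab), roughly the same archive can be built with Python's `zipfile` module. This is a sketch under that assumption: `zip_app_dir()` and `EXCLUDE_PATTERNS` are hypothetical names, and the demo zips a temporary directory rather than the real app folder.

```python
import tempfile
import zipfile
from fnmatch import fnmatch
from pathlib import Path

# Same exclusion patterns as the zip command above
EXCLUDE_PATTERNS = ["*.pyc", "*.ipynb", "*__pycache__*", "*ipynb_checkpoint*"]

def zip_app_dir(app_dir: Path, zip_path: Path) -> list:
    """Zip app_dir from the inside, skipping excluded files; return archived names."""
    archived = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(app_dir.rglob("*")):
            relative = path.relative_to(app_dir).as_posix()
            if path.is_file() and not any(fnmatch(relative, p) for p in EXCLUDE_PATTERNS):
                zf.write(path, arcname=relative)  # store paths relative to the app folder
                archived.append(relative)
    return archived

with tempfile.TemporaryDirectory() as tmp:
    app_dir = Path(tmp) / "foodvision_mini"
    (app_dir / "__pycache__").mkdir(parents=True)
    (app_dir / "app.py").touch()
    (app_dir / "__pycache__" / "model.cpython-310.pyc").touch()
    archived = zip_app_dir(app_dir, Path(tmp) / "foodvision_mini.zip")

print(archived)  # ['app.py']
```

Storing names relative to the app folder keeps `app.py` at the top level of the archive, matching the "zip it from the inside" approach.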

In [None]:
!pwd # print working directory

In [None]:
# Download
try:
  from google.colab import files
  files.download("demos/foodvision_mini.zip")
except ImportError:
  print("Not running in Google Colab, can't use google.colab.files.download(), please download foodvision_mini.zip manually.")

### 9.2 Running our Gradio demo app locally
For instructions on running the app locally, see: https://www.learnpytorch.io/09_pytorch_model_deployment/#92-running-our-foodvision-mini-demo-locally

### 9.3 Uploading our FoodVision Mini Gradio demo to Hugging Face Spaces

## 10. Creating FoodVision Big!

FoodVision Mini works well with 3 classes (pizza, steak, sushi).

So all of our experimenting is paying off.

Let's step things up a notch and make FoodVision Big, using all of the Food101 classes.

In [None]:
# Create Food101 model and transforms
effnetb2_food101, effnetb2_transforms = create_effnetb2_model(num_classes=101)

In [None]:
from torchinfo import summary

# Print EffNetB2 model summary (uncomment for full output)
summary(effnetb2_food101,
        input_size=(1, 3, 224, 224),
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"])

In [None]:
effnetb2_transforms

Since we are working with a larger dataset, we may want to introduce some data augmentation techniques:
* This is because with larger datasets and larger models, overfitting becomes more of a problem.
* Because we are working with a large number of classes, let's use TrivialAugment as our data augmentation technique.

In [None]:
# Create training data transforms
food_101_trasnforms = torchvision.transforms.Compose([
    torchvision.transforms.TrivialAugmentWide(),
    effnetb2_transforms
])

food_101_trasnforms

In [None]:
# Testing data transforms, we don't want to augment our testing data
effnetb2_transforms

### 10.2 Getting data for FoodVision Big

In [None]:
from torchvision import datasets

# Setup data directory
from pathlib import Path
data_dir = Path("data")

# Get the training data (~750 images x 101 classes)
train_data = datasets.Food101(root=data_dir,
                              split="train",
                              transform=food_101_trasnforms, # apply data augmentation to training data
                              download=True)

# Get the testing data (~250 images x 101 classes)
test_data = datasets.Food101(root=data_dir,
                             split="test",
                             transform=effnetb2_transforms, # don't augment the testing data, only apply the EffNetB2 transforms
                             download=True)

print(train_data)
print(test_data)

In [None]:
(750 * 101) + (250 * 101)

In [None]:
# Get Food101 class names
food101_class_names = train_data.classes

# View the first 10
food101_class_names[:10]

### 10.3 Creating a subset of the Food101 dataset for faster experimenting

Why create a subset?

We want our first few experiments to run as quickly as possible.

We know FoodVision Mini works pretty well, but this is the first time we've scaled up to 101 classes.

To do so, let's make a subset of 20% of the data from the Food101 dataset (training and test).

Our short-term goal: to beat the original Food101 paper's result of 56.40% accuracy on the test dataset (see paper: https://data.vision.ee.ethz.ch/cvl/datasets_extra/food-101/static/bossard_eccv14_food-101.pdf).

We want to beat this result using modern deep learning techniques and only 20% of the data.

In [None]:
from torch.utils.data import random_split

def split_dataset(dataset:torch.utils.data.Dataset,
                  split_size:float=0.2,
                  seed:int=42):
  # Create split lengths based on original dataset length
  length_1 = int(len(dataset) * split_size)
  length_2 = len(dataset) - length_1

  # Print out info
  print(f"[INFO] Splitting dataset of length {len(dataset)} into splits of size: {length_1} and {length_2}")

  # Create splits with given random seeds
  random_split_1, random_split_2 = random_split(dataset,
                                                lengths=[length_1, length_2],
                                                generator=torch.manual_seed(seed))

  return random_split_1, random_split_2

In [None]:
# Create training 20% split Food101
train_data_food101_20_percent, _ = split_dataset(dataset=train_data,
                                                 split_size=0.2)

test_data_food101_20_percent, _ = split_dataset(dataset=test_data,
                                                 split_size=0.2)

In [None]:
len(train_data_food101_20_percent), len(test_data_food101_20_percent)

In [None]:
import os
import torch

BATCH_SIZE = 32

# Create Food101 20 percent training Dataloader
train_dataloader_food101_20_percent = torch.utils.data.DataLoader(dataset=train_data_food101_20_percent,
                                                                  batch_size=BATCH_SIZE,
                                                                  shuffle=True,
                                                                  num_workers=os.cpu_count())

# Create Food101 20 percent testing dataloader
test_dataloader_food101_20_percent = torch.utils.data.DataLoader(dataset=test_data_food101_20_percent,
                                                                  batch_size=BATCH_SIZE,
                                                                  shuffle=False,
                                                                  num_workers=os.cpu_count())

len(train_dataloader_food101_20_percent), len(test_dataloader_food101_20_percent)
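
As a sanity check on the numbers the cells above should report, the subset and batch counts can be worked out by hand. This sketch assumes Food101 has exactly 750 training and 250 testing images per class (the notebook only says "~750" and "~250"); `subset_and_batches()` is a hypothetical helper, not part of the notebook.

```python
import math

NUM_CLASSES = 101
TRAIN_PER_CLASS, TEST_PER_CLASS = 750, 250  # assumption: exact Food101 per-class counts
SPLIT_SIZE = 0.2
BATCH_SIZE = 32

def subset_and_batches(total: int, split_size: float, batch_size: int):
    """Return (subset length, number of batches) for a split of `total` samples."""
    subset_len = int(total * split_size)  # same rounding as split_dataset()
    # DataLoader keeps the last partial batch by default (drop_last=False), so round up
    num_batches = math.ceil(subset_len / batch_size)
    return subset_len, num_batches

train_len, train_batches = subset_and_batches(NUM_CLASSES * TRAIN_PER_CLASS, SPLIT_SIZE, BATCH_SIZE)
test_len, test_batches = subset_and_batches(NUM_CLASSES * TEST_PER_CLASS, SPLIT_SIZE, BATCH_SIZE)
print(train_len, train_batches)  # 15150 474
print(test_len, test_batches)    # 5050 158
```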

### 10.5 Training Foodvision Big!!!

Things for training:
* 5 epochs
* Optimizer: `torch.optim.Adam(lr=1e-3)`
* Loss function: `torch.nn.CrossEntropyLoss(label_smoothing=0.1)`

Why use label smoothing?

Label smoothing helps to prevent overfitting (it's a regularization technique) by discouraging the model from becoming overconfident in a single class.

Example prediction probabilities without label smoothing and 5 classes:

```
[0.00, 0.00, 0.99, 0.01, 0.00]
```

Example prediction probabilities with label smoothing and 5 classes:

```
[0.01, 0.01, 0.96, 0.01, 0.01]
```
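
The vectors above illustrate how confident a model's predictions end up; the training target itself changes too. As a worked example, assuming the usual formulation (a mix of the one-hot target and a uniform distribution over the classes), the smoothed target for 5 classes and a smoothing value of 0.1 can be computed like this. `smooth_one_hot()` is a hypothetical helper written for illustration.

```python
from typing import List

def smooth_one_hot(true_class: int, num_classes: int, smoothing: float) -> List[float]:
    """Spread `smoothing` uniformly over all classes and give the rest to the true class."""
    uniform_share = smoothing / num_classes
    return [(1.0 - smoothing) * (1.0 if i == true_class else 0.0) + uniform_share
            for i in range(num_classes)]

hard_target = smooth_one_hot(true_class=2, num_classes=5, smoothing=0.0)
soft_target = smooth_one_hot(true_class=2, num_classes=5, smoothing=0.1)
print(hard_target)                         # [0.0, 0.0, 1.0, 0.0, 0.0]
print([round(p, 2) for p in soft_target])  # [0.02, 0.02, 0.92, 0.02, 0.02]
```

Because the target never reaches 1.0 for a single class, the loss keeps penalizing extreme confidence.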


In [None]:
from going_modular.going_modular import engine

# Setup the optimizer
optimizer = torch.optim.Adam(params=effnetb2_food101.parameters(),
                             lr=1e-3)

# Setup the loss function
loss_fn = torch.nn.CrossEntropyLoss(label_smoothing=0.1)

# Want to beat the original Food101 paper's result of 56.4% on the test dataset with 20% of the data
set_seeds()

# Train the model (the results dictionary is needed for the loss curves in the next section)
effnetb2_food101_results = engine.train(model=effnetb2_food101,
                                        train_dataloader=train_dataloader_food101_20_percent,
                                        test_dataloader=test_dataloader_food101_20_percent,
                                        optimizer=optimizer,
                                        loss_fn=loss_fn,
                                        epochs=5,
                                        device=device)

### 10.6 Inspecting loss curves of FoodVision Big model

In [None]:
from helper_functions import plot_loss_curves

plot_loss_curves(effnetb2_food101_results)

### 10.7 Save and load Foodvision Big Model

In [None]:
from going_modular.going_modular import utils

# Create model path
effnetb2_food101_model_path = "09_pretrained_effnetb2_feature_extractor_20_percent.pth"

# Save the FoodVision Big model
utils.save_model(model=effnetb2_food101,
                 target_dir="models/",
                 model_name=effnetb2_food101_model_path)

In [None]:
# Create a Food101 compatible EffNetB2 instance
loaded_effnetb2_food101, effnetb2_transforms = create_effnetb2_model(num_classes=101)

# Load the saved model's state_dict()
loaded_effnetb2_food101.load_state_dict(torch.load("models/09_pretrained_effnetb2_feature_extractor_20_percent.pth"))

### 10.8 Checking the FoodVision Big model size

In [None]:
# Get the model size in bytes then convert to megabytes
pretrained_effnetb2_food101_model_size = Path("models", effnetb2_food101_model_path).stat().st_size // (1024 * 1024)
print(f"Pretrained effnetb2 feature extractor Food101 model size: {pretrained_effnetb2_food101_model_size} MB")
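
Note that `//` floors the result to whole megabytes, which hides detail for small models. Here's a small sketch of a decimal alternative; `file_size_mb()` is a hypothetical helper, and the file it measures is a placeholder written to a temporary directory rather than a real model.

```python
import tempfile
from pathlib import Path

def file_size_mb(path: Path) -> float:
    """Return the file size in megabytes (1 MB = 1024 * 1024 bytes), rounded to 2 places."""
    return round(path.stat().st_size / (1024 * 1024), 2)

with tempfile.TemporaryDirectory() as tmp:
    fake_model = Path(tmp) / "fake_model.pth"
    fake_model.write_bytes(b"\0" * (1536 * 1024))  # 1.5 MB of placeholder bytes
    size_mb = file_size_mb(fake_model)
    floored_mb = fake_model.stat().st_size // (1024 * 1024)

print(size_mb, floored_mb)  # 1.5 1
```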

## 11. Turning our Foodvision Big model into a deployable app

Why deploy a model?

Deploying a model allows you to see how your model performs in the real world (the ultimate test set).

Let's create an outline for our FoodVision Big app:

```
demos/
  foodvision_big/
    09_pretrained_effnetb2_feature_extractor_20_percent.pth
    app.py
    class_names.txt
    examples/
      example_1.jpg
    model.py
    requirements.txt
```


In [None]:
from pathlib import Path

# Create Foodvision Big demo path
foodvision_big_demo_path = Path("demos/foodvision_big/")

# Make Foodvision Big demo directory
foodvision_big_demo_path.mkdir(parents=True,
                               exist_ok=True)

# Make Foodvision Big demo examples directory
(foodvision_big_demo_path / "examples").mkdir(parents=True,
                                              exist_ok=True)


In [None]:
!ls demos/foodvision_big # shows what sub directories are in there

### 11.1 Downloading an example image and moving it to the `examples` directory

In [None]:
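image_path = "data/food-101/images/apple_pie/1043283.jpg"
!cp {image_path} demos/foodvision_big/examples/ # copy the example image into the examples directory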

In [None]:
!mv models/09_pretrained_effnetb2_feature_extractor_20_percent.pth demos/foodvision_big # moves the model to the foodvision_big directory

### 11.2 Saving Food101 class names to file (`class_names.txt`)

Let's save all of the Food101 class names to a `.txt` file so we can import them and use them in our app.

In [None]:
food101_class_names = train_data.classes

In [None]:
food101_class_names[:10]

In [None]:
# Create a path to Food101 class names
foodvision_big_class_names_path = foodvision_big_demo_path / "class_names.txt"

foodvision_big_class_names_path

In [None]:
# Write Food101 class names to text file
with open(foodvision_big_class_names_path, "w") as f:
  print(f"[INFO] Saving Food101 class names to {foodvision_big_class_names_path}")
  f.write("\n".join(food101_class_names)) # New line per class name

In [None]:
# Open Food101 class names file and read each line into a list
with open(foodvision_big_class_names_path, "r") as f:
  food101_class_names_loaded = [food.strip() for food in f.readlines()]

food101_class_names_loaded[:5]

### 11.3 Turning our Foodvision Big model into a python script (`model.py`)


In [None]:
%%writefile demos/foodvision_big/model.py

import torch
import torchvision

from torch import nn

def create_effnetb2_model(num_classes:int=101, # Default output classes: all 101 Food101 classes
                          seed:int=42):
  # 1, 2, 3. Create EffNetB2 pretrained weights, transforms and model
  weights = torchvision.models.EfficientNet_B2_Weights.DEFAULT

  transforms = weights.transforms()

  model = torchvision.models.efficientnet_b2(weights=weights)

  # 4. Freeze all the base layers
  for param in model.parameters():
    param.requires_grad = False

  # 5. Change the classifier head with a random seed for reproducibility
  torch.manual_seed(seed)
  model.classifier = nn.Sequential(
      nn.Dropout(p=0.3, inplace=True),
      nn.Linear(in_features=1408, out_features=num_classes)
  )

  return model, transforms

### 11.4 Turning our Foodvision Big Gradio app into a Python script (`app.py`)

The `app.py` file will have four major parts:
1. Imports and class names setup - for class names, we'll need to import from `class_names.txt`
2. Model and transforms preparation - we'll need to make sure our model is suitable for FoodVision Big
3. Predict function (`predict()`) - this can stay the same as the original `predict()`
4. Gradio app - our Gradio interface + launch command - this will change slightly from FoodVision Mini to reflect the FoodVision Big updates

In [None]:
%%writefile demos/foodvision_big/app.py

### 1. Imports and class names setup ###
import gradio as gr
import os
import torch

from model import create_effnetb2_model
from timeit import default_timer as timer
from typing import Tuple, Dict

# Set up the class names
with open("class_names.txt", "r") as f:
  class_names = [food.strip() for food in f.readlines()]

### 2. Model and transforms preparation ###
# Create model and transforms
effnetb2, effnetb2_transforms = create_effnetb2_model(num_classes=101)

# Load the saved weights
effnetb2.load_state_dict(
    torch.load(f="09_pretrained_effnetb2_feature_extractor_20_percent.pth",
               map_location=torch.device("cpu")) # load it to the CPU
)

### 3. Predict function ###


def predict(img) -> Tuple[Dict, float]:
  # Start a timer
  start_time = timer()

  # Transform the input image for use with EffNetB2
  img = effnetb2_transforms(img).unsqueeze(0) # unsqueeze = add batch dimension on the 0th dimension

  # Put model into eval mode, make prediction
  effnetb2.eval()
  with torch.inference_mode():
    # Pass transformed image through the model and turn the prediction logits into probabilities
    pred_probs = torch.softmax(effnetb2(img), dim=1)

  # Create a prediction label and prediction probability dictionary
  pred_labels_and_probs = {class_names[i]: float(pred_probs[0][i]) for i in range(len(class_names))}

  # Calculate the pred time
  end_time = timer()
  pred_time = round(end_time - start_time, 4)

  return pred_labels_and_probs, pred_time

### 4. Gradio App ###
# Create title, description, and article

title = "FoodVision BIG 🍔👁️💪🏽"
description = "An [EfficientNetB2 feature extractor](https://pytorch.org/vision/main/models/generated/torchvision.models.efficientnet_b2.html#torchvision.models.efficientnet_b2) computer vision model to classify images (101 classes of food from the Food101 dataset)"
article = "Created at [09. Pytorch Model Deployment](https://www.learnpytorch.io/09_pytorch_model_deployment/#11-turning-our-foodvision-big-model-into-a-deployable-app)"

# Create example list
example_list = [["examples/" + example] for example in os.listdir("examples")]

# Create the gradio demo
demo = gr.Interface(fn=predict, # Maps our inputs to outputs
                    inputs=gr.Image(type="pil"),
                    outputs=[gr.Label(num_top_classes=5, label="predictions"),
                             gr.Number(label="Prediction time (s)")],
                    examples=example_list,
                    title=title,
                    description=description,
                    article=article)

# Launch the demo
demo.launch(debug=False, # print errors locally?
            share=True) # generate a publicly shareable URL

### 11.5 Creating a requirements file for Foodvision Big (`requirements.txt`)

In [None]:
%%writefile demos/foodvision_big/requirements.txt
torch==2.2.1
torchvision==0.17.1
gradio==4.22.0


### 11.6 Downloading our FoodVision Big app files

In [None]:
# Change into the foodvision_big directory and then zip it from the inside

!cd demos/foodvision_big && zip -r ../foodvision_big.zip * -x "*.pyc" "*.ipynb" "*__pycache__*" "*ipynb_checkpoint*" # '-x' means exclude

In [None]:
# Download
try:
  from google.colab import files
  files.download("demos/foodvision_big.zip")
except ImportError:
  print("Not running in Google Colab, can't use google.colab.files.download(), please download foodvision_big.zip manually.")

### 11.7 Deploying our FoodVision Big model app to Hugging Face Spaces

Let's bring FoodVision Big to life by deploying it to the world:
https://huggingface.co/spaces/burhanji1/foodvision_big_1

### Exercises and Extra curriculum

### Exercise 1. Make and time predictions with both feature extractor models on the test dataset using the GPU (device="cuda").

### Exercise 2. The ViT feature extractor seems to have more learning capacity (due to more parameters) than EffNetB2, how does it go on the larger 20% split of the entire Food101 dataset?

Train a ViT feature extractor on the 20% Food101 dataset for 5 epochs, just like we did with EffNetB2 in section 10 (Creating FoodVision Big).