## Git and GitHub
Define access credentials and repo details.

The following environment variables (or Colab secret keys) are assumed to be configured:
* GITHUB_USERNAME - GitHub username
* GITHUB_USER_EMAIL - email address configured for the GitHub user
* GITHUB_TOKEN - GitHub personal access token with the relevant privileges (the `repo` scope is enough)
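
The lookup described above (a Colab secret first, an environment variable as a fallback) could be sketched roughly as follows. `get_secret` is a hypothetical helper introduced only for illustration; it is not part of `google.colab`, and it assumes `userdata.get` is reachable only inside a Colab runtime.

```python
import os


def get_secret(name: str) -> str:
    """Return a credential from Colab secrets, falling back to the environment.

    Hypothetical helper for illustration; not part of google.colab.
    """
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
    except ImportError:
        userdata = None

    value = None
    if userdata is not None:
        try:
            value = userdata.get(name)
        except Exception:
            # Secret missing or notebook access not granted: try the environment.
            value = None

    if not value:
        value = os.environ.get(name)
    if not value:
        raise KeyError(f"{name} is not set as a Colab secret or environment variable")
    return value
```

With this in place, `get_secret("GITHUB_TOKEN")` would behave the same inside and outside Colab, and it fails early with a clear message when a credential is missing.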


In [1]:
from google.colab import userdata

github_user_email = userdata.get('GITHUB_USER_EMAIL')

In [3]:
token = userdata.get('GITHUB_TOKEN')
username = userdata.get('GITHUB_USERNAME')
repo = "mini-food-pic-classifier"

In [17]:
!git config --global user.name {username}
!git config --global user.email {github_user_email}
# !git config --global user.password ""

Get the latest code from the project repo.

Change the repo name and path as needed.

To use a new repo, create it on GitHub before cloning.

In [4]:
!git clone https://{token}@github.com/{username}/{repo}

Cloning into 'mini-food-pic-classifier'...
remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 4 (delta 0), reused 0 (delta 0), pack-reused 0 (from 0)
Receiving objects: 100% (4/4), done.
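
Because the clone command places the token inside the URL, the token can leak whenever that URL is printed or logged. The following is a minimal sketch of building the same URL in Python and masking the token before display. `build_clone_url` and `mask_token` are hypothetical names, and the token and username are placeholder values.

```python
from urllib.parse import quote


def build_clone_url(token: str, username: str, repo: str) -> str:
    """Build an HTTPS clone URL that embeds a personal access token."""
    # Percent-encode each part so unusual characters cannot break the URL.
    parts = [quote(p, safe="") for p in (token, username, repo)]
    return "https://{}@github.com/{}/{}".format(*parts)


def mask_token(text: str, token: str) -> str:
    """Replace the token (raw or percent-encoded) with asterisks before logging."""
    if not token:
        return text
    for secret in {token, quote(token, safe="")}:
        text = text.replace(secret, "***")
    return text


# Placeholder values, not real credentials.
url = build_clone_url("example-token", "example-user", "mini-food-pic-classifier")
print(mask_token(url, "example-token"))
# https://***@github.com/example-user/mini-food-pic-classifier
```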


From this point on, the working directory is the repo folder.

In [5]:
%cd {repo}

/content/mini-food-pic-classifier


In [6]:
# project_name = "mini-food-pic-classifier"
project_path = "/content/" + repo

The repo is currently assumed not to contain the latest version of this notebook, so copy it from Google Drive:
* Connect to the drive
* Copy the latest notebook from the drive to its place in the local git repo
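
The two steps above could also be done from Python rather than the shell. This is a sketch under assumptions: `copy_notebook` is a hypothetical helper, the drive is assumed to be mounted at `/content/drive`, and the source path in the comment is a placeholder.

```python
import shutil
from pathlib import Path


def copy_notebook(source: str, target_dir: str) -> Path:
    """Copy a notebook file into target_dir, creating the folder if needed."""
    src = Path(source)
    if not src.is_file():
        raise FileNotFoundError(f"Notebook not found: {src}")
    dst_dir = Path(target_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves the file's modification time.
    return Path(shutil.copy2(src, dst_dir / src.name))


# In Colab, mount the drive first (this opens an authorization prompt):
# from google.colab import drive
# drive.mount("/content/drive")
# copy_notebook("/content/drive/MyDrive/<path to notebook>.ipynb", "./notebooks")
```

One reason to prefer this form is that `Path` handles spaces in folder names directly, so the path does not need shell quoting.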

In [7]:
import os

# Create the notebooks folder inside the repo if it does not exist yet
load_notebooks_path = os.path.join(project_path, "notebooks")
os.makedirs(load_notebooks_path, exist_ok=True)

In [39]:
notebook_path = "\"/content/drive/MyDrive/Colab Notebooks/youtube/Learn PyTorch for deep learning in a day/05_pytorch_going_modular/pytorch_going_modular_cell_mode.ipynb\""

In [8]:
!cp  {notebook_path} ./notebooks/

In [9]:
!git status

On branch main
Your branch is up to date with 'origin/main'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	notebooks/

nothing added to commit but untracked files present (use "git add" to track)


## 0. Running a notebook in cell mode

This notebook is intended to run normally, one cell at a time.

## 1. Get data
We're going to start by downloading the `pizza_steak_sushi` dataset, which contains images of pizza, steak and sushi.

In [29]:
import os
import zipfile

from pathlib import Path

import requests

# Setup path to data folder
data_path = Path(os.path.join(project_path, "data/"))
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it...
if image_path.is_dir():
    print(f"{image_path} directory exists.")
else:
    print(f"Did not find {image_path} directory, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)

    # Download pizza, steak, sushi data
    zip_path = data_path / "pizza_steak_sushi.zip"
    print("Downloading pizza, steak, sushi data...")
    response = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
    response.raise_for_status()  # stop here if the download failed
    with open(zip_path, "wb") as f:
        f.write(response.content)

    # Unzip pizza, steak, sushi data
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...")
        zip_ref.extractall(image_path)

    # Remove zip file
    os.remove(zip_path)

Did not find /content/mini-food-pic-classifier/data/pizza_steak_sushi directory, creating one...
Downloading pizza, steak, sushi data...
Unzipping pizza, steak, sushi data...


In [30]:
# Setup train and testing paths
train_dir = image_path / "train"
test_dir = image_path / "test"

train_dir, test_dir

(PosixPath('/content/mini-food-pic-classifier/data/pizza_steak_sushi/train'),
 PosixPath('/content/mini-food-pic-classifier/data/pizza_steak_sushi/test'))
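
Before building datasets, it can help to confirm that the archive was extracted into the expected `split/class/image` layout. The sketch below counts image files per class folder. `count_images` is a hypothetical helper, and the list of file suffixes is an assumption about the image format.

```python
from pathlib import Path


def count_images(root: str, suffixes=(".jpg", ".jpeg", ".png")) -> dict:
    """Return {"split/class": number_of_images} for a root/split/class/image layout."""
    counts = {}
    for class_dir in sorted(Path(root).glob("*/*")):
        if class_dir.is_dir():
            n_images = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in suffixes
            )
            counts[f"{class_dir.parent.name}/{class_dir.name}"] = n_images
    return counts


# Usage in this notebook (image_path is defined in the cell above):
# count_images(image_path)
```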

### Update Git

In [10]:
!git checkout -b cell_mode_get_data

Switched to a new branch 'cell_mode_get_data'


In [19]:
!git add notebooks
!git commit -m "Get data in cell mode"

[cell_mode_get_data 06fef79] Get data in cell mode
 1 file changed, 1 insertion(+)
 create mode 100644 notebooks/pytorch_going_modular_cell_mode.ipynb


In [20]:
!git status

On branch cell_mode_get_data
nothing to commit, working tree clean


In [22]:
!git checkout main
!git merge cell_mode_get_data
!git push origin main

Everything up-to-date
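
The same branch, commit, merge and push sequence is repeated after each section, so it could be collected in one place. The sketch below only builds and prints the commands. `feature_branch_commands` is a hypothetical helper, and `main` is assumed to be the default branch, as in the output above.

```python
import shlex


def feature_branch_commands(branch: str, paths: list, message: str, main: str = "main") -> list:
    """Return the git commands for: new branch, commit, merge into main, push."""
    return [
        ["git", "checkout", "-b", branch],
        ["git", "add", *paths],
        ["git", "commit", "-m", message],
        ["git", "checkout", main],
        ["git", "merge", branch],
        ["git", "push", "origin", main],
    ]


commands = feature_branch_commands("cell_mode_get_data", ["notebooks"], "Get data in cell mode")
for cmd in commands:
    # To execute instead of print: subprocess.run(cmd, check=True)
    print(shlex.join(cmd))
```

Passing each command as a list avoids shell-quoting problems in commit messages and paths that contain spaces.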


## 2. Create Datasets and DataLoaders
Let's turn our data into PyTorch `Dataset`s and `DataLoader`s and inspect a few useful attributes, such as `classes` and the dataset lengths.

In [31]:
from torchvision import datasets, transforms

# Create simple transform
data_transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor(),
])

# Use ImageFolder to create dataset(s)
train_data = datasets.ImageFolder(root=train_dir, # target folder of images
                                  transform=data_transform, # transforms to perform on data (images)
                                  target_transform=None) # transforms to perform on labels (if necessary)

test_data = datasets.ImageFolder(root=test_dir,
                                 transform=data_transform)

print(f"Train data:\n{train_data}\nTest data:\n{test_data}")

Train data:
Dataset ImageFolder
    Number of datapoints: 225
    Root location: /content/mini-food-pic-classifier/data/pizza_steak_sushi/train
    StandardTransform
Transform: Compose(
               Resize(size=(64, 64), interpolation=bilinear, max_size=None, antialias=True)
               ToTensor()
           )
Test data:
Dataset ImageFolder
    Number of datapoints: 75
    Root location: /content/mini-food-pic-classifier/data/pizza_steak_sushi/test
    StandardTransform
Transform: Compose(
               Resize(size=(64, 64), interpolation=bilinear, max_size=None, antialias=True)
               ToTensor()
           )


In [32]:
# Get class names as a list
class_names = train_data.classes
class_names

['pizza', 'steak', 'sushi']

In [33]:
# Can also get class names as a dict
class_dict = train_data.class_to_idx
class_dict

{'pizza': 0, 'steak': 1, 'sushi': 2}
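
The class names and indices above come from the subfolder names of the dataset root, sorted alphabetically. The sketch below approximates that behavior in plain Python. It is an illustration rather than torchvision's actual code, and this `find_classes` is a local function written for this example.

```python
import os


def find_classes(directory: str):
    """Approximate how ImageFolder derives class names and indices from subfolders."""
    classes = sorted(entry.name for entry in os.scandir(directory) if entry.is_dir())
    if not classes:
        raise FileNotFoundError(f"No class folders found in {directory}")
    class_to_idx = {name: idx for idx, name in enumerate(classes)}
    return classes, class_to_idx


# Usage in this notebook (train_dir is defined earlier):
# find_classes(train_dir)
```

Because the order is alphabetical, adding or renaming a class folder can shift the indices of the other classes.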

In [34]:
# Check the lengths
len(train_data), len(test_data)

(225, 75)

In [35]:
# Turn train and test Datasets into DataLoaders
from torch.utils.data import DataLoader

train_dataloader = DataLoader(dataset=train_data,
                              batch_size=1, # how many samples per batch?
                              num_workers=1, # how many subprocesses to use for data loading? (higher = more)
                              shuffle=True) # shuffle the data?

test_dataloader = DataLoader(dataset=test_data,
                             batch_size=1,
                             num_workers=1,
                             shuffle=False) # don't usually need to shuffle testing data

train_dataloader, test_dataloader

(<torch.utils.data.dataloader.DataLoader at 0x7ba6fb41b190>,
 <torch.utils.data.dataloader.DataLoader at 0x7ba6fb41b100>)

In [36]:
# Check out single image size/shape
img, label = next(iter(train_dataloader))

# Batch size will now be 1, try changing the batch_size parameter above and see what happens
print(f"Image shape: {img.shape} -> [batch_size, color_channels, height, width]")
print(f"Label shape: {label.shape}")

Image shape: torch.Size([1, 3, 64, 64]) -> [batch_size, color_channels, height, width]
Label shape: torch.Size([1])
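
Changing `batch_size` changes the first dimension of each batch and the number of batches per epoch. As a rough sketch, the batch count for a dataset of known length can be computed as below. `num_batches` is a hypothetical helper, and the sample counts are the dataset lengths printed earlier.

```python
import math


def num_batches(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Number of batches per epoch for a dataset with num_samples items."""
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    if drop_last:
        # The final incomplete batch is discarded.
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)


for bs in (1, 32):
    print(f"batch_size={bs}: train={num_batches(225, bs)}, test={num_batches(75, bs)}")
# batch_size=1: train=225, test=75
# batch_size=32: train=8, test=3
```

With `batch_size=32`, a full image batch would have shape `[32, 3, 64, 64]`, and the last training batch would hold only the single remaining image.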


### Update Git

In [41]:
!cp {notebook_path} ./notebooks/
!git checkout -b create_datasets_and_dataloaders

M	notebooks/pytorch_going_modular_cell_mode.ipynb
Already on 'create_datasets_and_dataloaders'


In [42]:
!git add notebooks/pytorch_going_modular_cell_mode.ipynb
!git commit -m "Create Datasets and DataLoaders"

[create_datasets_and_dataloaders bec1031] Create Datasets and DataLoaders
 1 file changed, 1 insertion(+), 1 deletion(-)
 rewrite notebooks/pytorch_going_modular_cell_mode.ipynb (75%)
Everything up-to-date


In [44]:
!git checkout main
!git merge create_datasets_and_dataloaders
!git push origin main

Switched to branch 'main'
Your branch is up to date with 'origin/main'.
Updating 4a19b62..bec1031
Fast-forward
 notebooks/pytorch_going_modular_cell_mode.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Enumerating objects: 11, done.
Counting objects: 100% (11/11), done.
Delta compression using up to 2 threads
Compressing objects: 100% (8/8), done.
Writing objects: 100% (8/8), 3.20 KiB | 1.60 MiB/s, done.
Total 8 (delta 4), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (4/4), completed with 2 local objects.
To https://github.com/guyfuchs/mini-food-pic-classifier
   4a19b62..bec1031  main -> main


## 3. Making a model (TinyVGG)
We're going to use the TinyVGG model from the CNN Explainer website.


In [45]:
import torch

from torch import nn

class TinyVGG(nn.Module):
    """Creates the TinyVGG architecture.

    Replicates the TinyVGG architecture from the CNN explainer website in PyTorch.
    See the original architecture here: https://poloclub.github.io/cnn-explainer/

    Args:
        input_shape: An integer indicating number of input channels.
        hidden_units: An integer indicating number of hidden units between layers.
        output_shape: An integer indicating number of output units.
    """
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int) -> None:
        super().__init__()
        self.conv_block_1 = nn.Sequential(
          nn.Conv2d(in_channels=input_shape,
                    out_channels=hidden_units,
                    kernel_size=3,
                    stride=1,
                    padding=0),
          nn.ReLU(),
          nn.Conv2d(in_channels=hidden_units,
                    out_channels=hidden_units,
                    kernel_size=3,
                    stride=1,
                    padding=0),
          nn.ReLU(),
          nn.MaxPool2d(kernel_size=2,
                        stride=2)
        )
        self.conv_block_2 = nn.Sequential(
          nn.Conv2d(hidden_units, hidden_units, kernel_size=3, padding=0),
          nn.ReLU(),
          nn.Conv2d(hidden_units, hidden_units, kernel_size=3, padding=0),
          nn.ReLU(),
          nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
          nn.Flatten(),
          # Where did this in_features shape come from?
          # Each layer changes the shape of the input: with 64x64 images, the two
          # conv blocks reduce height and width 64 -> 62 -> 60 -> 30 -> 28 -> 26 -> 13.
          nn.Linear(in_features=hidden_units*13*13,
                    out_features=output_shape)
        )

    def forward(self, x: torch.Tensor):
        x = self.conv_block_1(x)
        x = self.conv_block_2(x)
        x = self.classifier(x)
        return x
        # return self.classifier(self.conv_block_2(self.conv_block_1(x))) # <- leverage the benefits of operator fusion
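
The `hidden_units*13*13` value in the classifier can be checked with the standard output-size formula for convolution and pooling layers, `floor((size + 2*padding - kernel_size) / stride) + 1`. The sketch below applies it to the layers defined above. `conv_out` and `tinyvgg_flat_features` are hypothetical helpers, and a 64x64 input (the size set by the `Resize` transform) is assumed.

```python
def conv_out(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Output height/width of a Conv2d or MaxPool2d layer (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def tinyvgg_flat_features(image_size: int = 64, hidden_units: int = 10) -> int:
    """Number of features reaching the Linear layer for a square input image."""
    size = image_size
    for _ in range(2):  # two conv blocks with identical layer settings
        size = conv_out(size, kernel_size=3)            # first 3x3 conv, no padding
        size = conv_out(size, kernel_size=3)            # second 3x3 conv, no padding
        size = conv_out(size, kernel_size=2, stride=2)  # 2x2 max pool
    return hidden_units * size * size


print(tinyvgg_flat_features())  # 1690
```

If the input resolution in the transform changes, the `13*13` factor has to change with it; for example, a 128x128 input would give 29x29 feature maps under the same layers.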

Now let's create an instance of TinyVGG and put it on the target device.

> Note: If you're using Google Colab and you'd like to use a GPU (recommended), you can enable one via Runtime -> Change runtime type -> Hardware accelerator -> GPU.

In [46]:
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Instantiate an instance of the model
torch.manual_seed(42)
model_0 = TinyVGG(input_shape=3, # number of color channels (3 for RGB)
                  hidden_units=10,
                  output_shape=len(train_data.classes)).to(device)
model_0

TinyVGG(
  (conv_block_1): Sequential(
    (0): Conv2d(3, 10, kernel_size=(3, 3), stride=(1, 1))
    (1): ReLU()
    (2): Conv2d(10, 10, kernel_size=(3, 3), stride=(1, 1))
    (3): ReLU()
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (conv_block_2): Sequential(
    (0): Conv2d(10, 10, kernel_size=(3, 3), stride=(1, 1))
    (1): ReLU()
    (2): Conv2d(10, 10, kernel_size=(3, 3), stride=(1, 1))
    (3): ReLU()
    (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (classifier): Sequential(
    (0): Flatten(start_dim=1, end_dim=-1)
    (1): Linear(in_features=1690, out_features=3, bias=True)
  )
)


Let's check out our model by doing a dummy forward pass.

In [47]:
# 1. Get a batch of images and labels from the DataLoader
img_batch, label_batch = next(iter(train_dataloader))

# 2. Get a single image from the batch and unsqueeze the image so its shape fits the model
img_single, label_single = img_batch[0].unsqueeze(dim=0), label_batch[0]
print(f"Single image shape: {img_single.shape}\n")

# 3. Perform a forward pass on a single image
model_0.eval()
with torch.inference_mode():
    pred = model_0(img_single.to(device))

# 4. Print out what's happening and convert model logits -> pred probs -> pred label
print(f"Output logits:\n{pred}\n")
print(f"Output prediction probabilities:\n{torch.softmax(pred, dim=1)}\n")
print(f"Output prediction label:\n{torch.argmax(torch.softmax(pred, dim=1), dim=1)}\n")
print(f"Actual label:\n{label_single}")

Single image shape: torch.Size([1, 3, 64, 64])

Output logits:
tensor([[ 0.0208, -0.0020,  0.0095]])

Output prediction probabilities:
tensor([[0.3371, 0.3295, 0.3333]])

Output prediction label:
tensor([0])

Actual label:
0
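
The conversion from logits to probabilities to a label can be reproduced by hand. The sketch below applies a plain-Python softmax and argmax to the logits printed above. The logits are copied from that output after rounding, so the results agree only to the displayed precision.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.0208, -0.0020, 0.0095]  # rounded logits from the output above
probs = softmax(logits)
pred_label = max(range(len(probs)), key=probs.__getitem__)  # argmax

print([round(p, 4) for p in probs])  # [0.3371, 0.3295, 0.3333]
print(pred_label)                    # 0
```

Softmax preserves the ordering of the logits, so taking the argmax of the raw logits gives the same label. The near-uniform probabilities are what an untrained model would be expected to produce.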


### Update Git

In [49]:
!cp {notebook_path} ./notebooks/
!git checkout -b add_model

Switched to a new branch 'add_model'


In [None]:
!git add notebooks/pytorch_going_modular_cell_mode.ipynb
!git commit -m "add model to notebook"



In [None]:
!git checkout main
!git merge add_model
!git push origin main

