## 1. Turn the code to get the data (from section 1. Get Data) into a Python script, such as `get_data.py`.

* When you run the script using `python get_data.py` it should check if the data already exists and skip downloading if it does.
* If the data download is successful, you should be able to access the `pizza_steak_sushi` images from the `data` directory.
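
The "skip if the data already exists" check from the bullets above can be sketched in isolation. This is a minimal illustration, not the course's reference solution: `data_is_present` is a hypothetical helper name, and it runs against a temporary directory rather than the real `data` folder. It also assumes a slightly stricter test than a plain `is_dir()` call, treating an empty directory (for example, one left behind by a failed download) as missing.

```python
import tempfile
from pathlib import Path


def data_is_present(image_path: Path) -> bool:
    """Hypothetical helper: True only if the directory exists and is not empty."""
    return image_path.is_dir() and any(image_path.iterdir())


# Use a throwaway directory so the sketch never touches real data.
with tempfile.TemporaryDirectory() as tmp:
    image_path = Path(tmp) / "pizza_steak_sushi"

    before = data_is_present(image_path)        # nothing has been created yet
    (image_path / "train").mkdir(parents=True)  # simulate an extracted dataset
    after = data_is_present(image_path)         # directory now has content

    print(before, after)  # False True
```

A download script could call a check like this first and only fetch the archive when it returns `False`.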

In [1]:
%%writefile get_data.py

import os
import zipfile

from pathlib import Path
import requests

data_path = Path("data/")
image_path = data_path / "pizza_sushi_steak"

if image_path.is_dir():
    print(f"Directory: {image_path} exists.")
else:
    print(f"Directory: {image_path} not found, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)
    
    with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
        # Announce the download before it starts, not after it finishes
        print("Downloading pizza, steak, sushi data...")
        request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
        f.write(request.content)
        print("Done.")
    
    with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...")
        zip_ref.extractall(image_path)
        print("Done.")
    os.remove(data_path / "pizza_steak_sushi.zip")

Overwriting get_data.py


## 2. Use [Python's `argparse` module](https://docs.python.org/3/library/argparse.html) to pass custom hyperparameter values to `train.py` for training.
* Add an argument flag for using a different:
  * Training/testing directory
  * Learning rate
  * Batch size
  * Number of epochs to train for
  * Number of hidden units in the TinyVGG model
* Keep the default value of each of the above arguments as it already is (as in notebook 05).
* For example, you should be able to run something similar to the following line to train a TinyVGG model with a learning rate of 0.003 and a batch size of 64 for 20 epochs: `python train.py --learning_rate 0.003 --batch_size 64 --num_epochs 20`.
* **Note:** Since `train.py` leverages the other scripts we created in section 05, such as `model_builder.py`, `utils.py` and `engine.py`, you'll have to make sure they're available to use too. You can find these in the [`going_modular` folder on the course GitHub](https://github.com/mrdbourke/pytorch-deep-learning/tree/main/going_modular/going_modular).
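
As a rough sketch of how the flags behave, the snippet below builds a small parser with three of the arguments and parses a hand-written argument list instead of the real command line. `build_parser` is a hypothetical helper name used only for illustration, and the default values are assumed to match those in the `train.py` cell of this notebook.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper that declares a few of the training flags."""
    parser = argparse.ArgumentParser(description="Get some hyperparameters.")
    parser.add_argument("--learning_rate", default=1e-3, type=float)
    parser.add_argument("--batch_size", default=32, type=int)
    parser.add_argument("--num_epochs", default=10, type=int)
    return parser


# Passing a list to parse_args() simulates what would be typed after `python train.py`.
args = build_parser().parse_args(["--learning_rate", "0.003", "--batch_size", "64"])

# Flags given on the command line override the defaults; omitted flags keep them.
print(args.learning_rate, args.batch_size, args.num_epochs)  # 0.003 64 10
```

Because `type=float` and `type=int` are set, the string values from the command line arrive already converted.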

In [2]:
%%writefile data_setup.py

import os
from torchvision import datasets
from torchvision.transforms import v2
from torch.utils.data import DataLoader

NUM_WORKERS = os.cpu_count()

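def create_dataloaders(
        train_dir: str,
        test_dir: str,
        transform: v2.Compose,
        batch_size: int,
        num_workers: int = NUM_WORKERS
):
    """Create training and testing DataLoaders from image folders.

    Returns a tuple of (train_dataloader, test_dataloader, class_names).
    """
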
    train_data = datasets.ImageFolder(
        train_dir,
        transform,
    )
    
    test_data = datasets.ImageFolder(
        test_dir,
        transform
    )
    
    class_names = train_data.classes
    
    train_dataloader = DataLoader(
        train_data, batch_size, shuffle=True, num_workers=num_workers, pin_memory=True
    )
    
    test_dataloader = DataLoader(
        test_data, batch_size, shuffle=False, num_workers=num_workers, pin_memory=True
    )
    
    return train_dataloader, test_dataloader, class_names

Overwriting data_setup.py


In [3]:
%%writefile engine.py

import torch
from tqdm.auto import tqdm
from typing import Dict, List, Tuple

def train_step(
        model: torch.nn.Module,
        dataloader: torch.utils.data.DataLoader,
        loss_func: torch.nn.Module,
        optimizer: torch.optim.Optimizer,
        device: torch.device
) -> Tuple[float, float]:
    
    model.train()
    train_loss, train_acc = 0, 0
    correct_samples, total_samples = 0, 0
    
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        
        train_logits = model(X)
        loss = loss_func(train_logits, y)
        train_loss += loss.item()
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        train_probs = torch.softmax(train_logits, dim=1)
        train_preds = torch.argmax(train_probs, dim=1)
        correct_samples += (train_preds == y).sum().item()
        total_samples += len(y)
    
    train_loss = train_loss / len(dataloader)
    train_acc = correct_samples / total_samples
    
    return train_loss, train_acc

def test_step(
        model: torch.nn.Module,
        dataloader: torch.utils.data.DataLoader,
        loss_func: torch.nn.Module,
        device: torch.device
) -> Tuple[float, float]:
    
    model.eval()
    
    test_loss, test_acc = 0, 0
    correct_samples, total_samples = 0, 0
    
    with torch.inference_mode():
        for batch, (X, y) in enumerate(dataloader):
            X, y = X.to(device), y.to(device)
            
            test_logits = model(X)
            loss = loss_func(test_logits, y)
            test_loss += loss.item()
            
            test_preds = torch.argmax(test_logits, dim=1)
            correct_samples += (y == test_preds).sum().item()
            total_samples += len(y)
        
        test_loss = test_loss / len(dataloader)
        test_acc = correct_samples / total_samples
    
    return test_loss, test_acc

def train(
        model: torch.nn.Module,
        train_dataloader: torch.utils.data.DataLoader,
        test_dataloader: torch.utils.data.DataLoader,
        loss_func: torch.nn.Module,
        optimizer: torch.optim.Optimizer,
        epochs: int,
        device: torch.device
) -> Dict[str, List]:
    
    results = {
        "train_loss": [],
        "train_acc": [],
        "test_loss": [],
        "test_acc": [],
    }
    
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model, train_dataloader, loss_func, optimizer, device)
        test_loss, test_acc = test_step(model, test_dataloader, loss_func, device)
        
        print(
            f"Epochs: {epoch + 1} |"
            f"Train_loss: {train_loss:.4f} |"
            f"Train_acc: {train_acc:.2f} |"
            f"Test_loss: {test_loss:.4f} |"
            f"Test_acc: {test_acc:.2f} |"
        )
        
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)
    
    return results

Overwriting engine.py


In [4]:
%%writefile model_builder.py

import torch
from torch import nn

class TinyVGG(nn.Module):
    def __init__(self, input_shape: int, hidden_unit: int, output_shape: int) -> None:
        super().__init__()
        
        self.block_1 = nn.Sequential(
            nn.Conv2d(input_shape, hidden_unit, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.Conv2d(hidden_unit, hidden_unit, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        
        self.block_2 = nn.Sequential(
            nn.Conv2d(hidden_unit, hidden_unit, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.Conv2d(hidden_unit, hidden_unit, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(hidden_unit * 13 * 13, output_shape)
        )
    
    def forward(self, x: torch.Tensor):
        y = self.classifier(self.block_2(self.block_1(x)))
        return y

Overwriting model_builder.py
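
The `hidden_unit * 13 * 13` input size of the classifier layer depends on the image size fed to the model. The sketch below checks that arithmetic in plain Python, assuming 64x64 inputs (the size set by `v2.Resize` in `train.py`) and the standard output-size formulas for a convolution and a max-pool with dilation 1. `conv_out` and `pool_out` are hypothetical helper names, not part of PyTorch.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 1, padding: int = 0) -> int:
    """Spatial size after a convolution (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, kernel: int = 2, stride: int = 2) -> int:
    """Spatial size after max pooling (no padding, dilation 1)."""
    return (size - kernel) // stride + 1


size = 64  # assumed input height/width
for _ in range(2):  # block_1, then block_2
    size = conv_out(conv_out(size))  # two 3x3 convolutions, each trims 2 pixels
    size = pool_out(size)            # 2x2 max pool halves the size

print(size)  # 13
```

With a different input resolution, the `13 * 13` factor in `nn.Linear` would need to change to match.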


In [5]:
%%writefile utils.py

import torch
from pathlib import Path

def save_model(
        model: torch.nn.Module,
        target_dir: str,
        model_name: str
):
    target_dir_path = Path(target_dir)
    target_dir_path.mkdir(parents=True, exist_ok=True)
    
    assert model_name.endswith(".pt") or model_name.endswith(".pth"), "model_name should end with '.pt' or '.pth'"
    
    model_save_path = target_dir_path / model_name
    print(f"[INFO] Saving model to: {model_save_path}")
    torch.save(model.state_dict(), f=model_save_path)

Overwriting utils.py


In [8]:
%%writefile train.py

import os
import argparse
import torch
from torchvision.transforms import v2
import data_setup, engine, model_builder, utils

parser = argparse.ArgumentParser(description="Get some hyperparameters.")

parser.add_argument("--num_epochs", default=10, type=int, help="the number of epochs to train for")
parser.add_argument("--batch_size", default=32, type=int, help="the number of samples per batch")
parser.add_argument("--hidden_units", default=10, type=int, help="the number of hidden units in hidden layers")
parser.add_argument("--learning_rate", default=1e-3, type=float, help="learning rate to use for model")
parser.add_argument("--train_dir", default="data/pizza_sushi_steak/train", type=str, help="directory file path to training data in standard image classification format")
parser.add_argument("--test_dir", default="data/pizza_sushi_steak/test", type=str, help="directory file path to testing data in standard image classification format")

args = parser.parse_args()

NUM_EPOCHS = args.num_epochs
BATCH_SIZE = args.batch_size
HIDDEN_UNITS = args.hidden_units
LEARNING_RATE = args.learning_rate
print(f"[INFO] Training a model for {NUM_EPOCHS} epochs with batch size {BATCH_SIZE} using {HIDDEN_UNITS} hidden units and a learning rate of {LEARNING_RATE}")

train_dir = args.train_dir
test_dir = args.test_dir
print(f"[INFO] Training data file: {train_dir}")
print(f"[INFO] Testing data file: {test_dir}")

device = "cuda" if torch.cuda.is_available() else "cpu"

data_transform = v2.Compose([
    v2.Resize(size=(64, 64)),
    v2.ToImage(),
    v2.ToDtype(torch.float32, scale=True)
])

if __name__ == '__main__':
    train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir, test_dir, data_transform, BATCH_SIZE)

    model = model_builder.TinyVGG(input_shape=3, hidden_unit=HIDDEN_UNITS, output_shape=len(class_names)).to(device)
    
    loss_fn = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    
    engine.train(model, train_dataloader, test_dataloader, loss_fn, optimizer, NUM_EPOCHS, device)
    
    utils.save_model(model, "models", "05_going_modular_script_mode_tinyvgg_model.pth")

Overwriting train.py


In [9]:
!python train.py --num_epochs 5 --batch_size 128 --hidden_units 128 --learning_rate 3e-4

[INFO] Training a model for 5 epochs with batch size 128 using 128 hidden units and a learning rate of 0.0003
[INFO] Training data file: data/pizza_sushi_steak/train
[INFO] Testing data file: data/pizza_sushi_steak/test
Epochs: 1 |Train_loss: 1.1072 |Train_acc: 0.35 |Test_loss: 1.0988 |Test_acc: 0.33 |
Epochs: 2 |Train_loss: 1.0923 |Train_acc: 0.35 |Test_loss: 1.0820 |Test_acc: 0.33 |
Epochs: 3 |Train_loss: 1.0811 |Train_acc: 0.37 |Test_loss: 1.0669 |Test_acc: 0.41 |
Epochs: 4 |Train_loss: 1.0603 |Train_acc: 0.48 |Test_loss: 1.0425 |Test_acc: 0.49 |
Epochs: 5 |Train_loss: 1.0168 |Train_acc: 0.52 |Test_loss: 1.0178 |Test_acc: 0.44 |
[INFO] Saving model to: models\05_going_modular_script_mode_tinyvgg_model.pth



  0%|          | 0/5 [00:00<?, ?it/s]
 20%|██        | 1/5 [00:00<00:03,  1.06it/s]
 40%|████      | 2/5 [00:01<00:02,  1.23it/s]
 60%|██████    | 3/5 [00:02<00:01,  1.33it/s]
 80%|████████  | 4/5 [00:03<00:00,  1.39it/s]
100%|██████████| 5/5 [00:03<00:00,  1.41it/s]
100%|██████████| 5/5 [00:03<00:00,  1.35it/s]
