# Lab 3: Complete Pipeline

In this final lab, we'll complete our modular PyTorch project by creating:
- **`utils.py`**: Utility functions (model saving)
- **`train.py`**: Main training script that ties everything together

By the end of this lab, you'll be able to train a model with a single command:
```bash
python train.py
```

## Install Dependencies

In [None]:
!pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
!pip install matplotlib requests tqdm

## Import Libraries

In [None]:
import os
import requests
import zipfile
from pathlib import Path

import torch
from torch import nn
from torchvision import transforms

print(f"PyTorch version: {torch.__version__}")

## 1. Setup: Create All Previous Scripts

Let's create all the scripts from previous labs in one place.

In [None]:
going_modular_path = Path("going_modular")
going_modular_path.mkdir(parents=True, exist_ok=True)
print(f"Created directory: {going_modular_path}")

### Create data_setup.py

In [None]:
%%writefile going_modular/data_setup.py

import os

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

NUM_WORKERS = os.cpu_count()

def create_dataloaders(
    train_dir: str, 
    test_dir: str, 
    transform: transforms.Compose, 
    batch_size: int, 
    num_workers: int = NUM_WORKERS
):
    train_data = datasets.ImageFolder(train_dir, transform=transform)
    test_data = datasets.ImageFolder(test_dir, transform=transform)
    class_names = train_data.classes

    train_dataloader = DataLoader(
        train_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )
    test_dataloader = DataLoader(
        test_data,
        batch_size=batch_size,
        shuffle=False,
        num_workers=num_workers,
        pin_memory=True,
    )

    return train_dataloader, test_dataloader, class_names

### Create model_builder.py

In [None]:
%%writefile going_modular/model_builder.py
import torch
from torch import nn

class TinyVGG(nn.Module):
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int) -> None:
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(input_shape, hidden_units, 3, 1, 0),  
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, 3, 1, 0),
            nn.ReLU(),
            nn.MaxPool2d(2, 2)
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, 3, padding=0),
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, 3, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(hidden_units*13*13, output_shape)
        )
    
    def forward(self, x: torch.Tensor):
        return self.classifier(self.conv_block_2(self.conv_block_1(x)))
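
The `13*13` in the classifier's `nn.Linear` is the spatial size of the feature map that reaches `nn.Flatten()`. The sketch below traces that size in plain Python, assuming the 64×64 input produced by the `Resize((64, 64))` transform used later in this lab. The helper names `conv_out`, `pool_out`, and `tinyvgg_feature_size` are made up for illustration and are not part of PyTorch.

```python
def conv_out(size: int, kernel: int = 3, stride: int = 1, padding: int = 0) -> int:
    """Output height/width of a square convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size: int, kernel: int = 2) -> int:
    """Output height/width of max pooling whose stride equals its kernel size."""
    return (size - kernel) // kernel + 1


def tinyvgg_feature_size(image_size: int) -> int:
    """Trace one spatial dimension through both TinyVGG conv blocks."""
    size = image_size
    for _ in range(2):                   # conv_block_1, then conv_block_2
        size = conv_out(conv_out(size))  # two 3x3 convolutions, padding 0
        size = pool_out(size)            # one 2x2 max pool
    return size


side = tinyvgg_feature_size(64)
print(f"Feature map reaching Flatten: {side}x{side}")
```

Changing the `Resize` size changes this number: a 128×128 input, for instance, would need `hidden_units*29*29` in the `nn.Linear` layer.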

### Create engine.py

In [None]:
%%writefile going_modular/engine.py
import torch
from tqdm.auto import tqdm
from typing import Dict, List, Tuple

def train_step(
    model: torch.nn.Module, 
    dataloader: torch.utils.data.DataLoader, 
    loss_fn: torch.nn.Module, 
    optimizer: torch.optim.Optimizer,
    device: torch.device
) -> Tuple[float, float]:
    model.train()
    train_loss, train_acc = 0, 0
    
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        y_pred = model(X)
        loss = loss_fn(y_pred, y)
        train_loss += loss.item() 
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        y_pred_class = torch.argmax(torch.softmax(y_pred, dim=1), dim=1)
        train_acc += (y_pred_class == y).sum().item() / len(y_pred)

    train_loss = train_loss / len(dataloader)
    train_acc = train_acc / len(dataloader)
    return train_loss, train_acc

def test_step(
    model: torch.nn.Module, 
    dataloader: torch.utils.data.DataLoader, 
    loss_fn: torch.nn.Module,
    device: torch.device
) -> Tuple[float, float]:

    model.eval() 
    test_loss, test_acc = 0, 0
    
    with torch.inference_mode():
        for batch, (X, y) in enumerate(dataloader):
            X, y = X.to(device), y.to(device)
            test_pred_logits = model(X)
            loss = loss_fn(test_pred_logits, y)
            test_loss += loss.item()
            test_pred_labels = test_pred_logits.argmax(dim=1)
            test_acc += (test_pred_labels == y).sum().item() / len(test_pred_labels)
            
    test_loss = test_loss / len(dataloader)
    test_acc = test_acc / len(dataloader)
    return test_loss, test_acc

def train(
    model: torch.nn.Module, 
    train_dataloader: torch.utils.data.DataLoader, 
    test_dataloader: torch.utils.data.DataLoader, 
    optimizer: torch.optim.Optimizer,
    loss_fn: torch.nn.Module,
    epochs: int,
    device: torch.device
) -> Dict[str, List]:

    results = {
        "train_loss": [],
        "train_acc": [],
        "test_loss": [],
        "test_acc": []
    }
    
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(
            model=model,
            dataloader=train_dataloader,
            loss_fn=loss_fn,
            optimizer=optimizer,
            device=device
        )
        test_loss, test_acc = test_step(
            model=model,
            dataloader=test_dataloader,
            loss_fn=loss_fn,
            device=device
        )
        
        print(
            f"Epoch: {epoch+1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}"
        )

        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

    return results
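
One detail of `train_step()` and `test_step()` is worth knowing: they average the per-batch accuracies, so a small final batch carries the same weight as a full one. The pure-Python sketch below uses made-up batch sizes and correct-prediction counts (not results from this dataset) to show how that average can differ from the fraction of correctly classified samples.

```python
# (correct predictions, batch size) for three hypothetical batches;
# the last batch holds a single leftover sample.
batches = [(24, 32), (24, 32), (1, 1)]

# What engine.py computes: the mean of the per-batch accuracies.
mean_of_batch_acc = sum(correct / size for correct, size in batches) / len(batches)

# Sample-weighted accuracy: total correct over total samples.
total_correct = sum(correct for correct, _ in batches)
total_samples = sum(size for _, size in batches)
overall_acc = total_correct / total_samples

print(f"mean of batch accuracies: {mean_of_batch_acc:.4f}")
print(f"overall accuracy:         {overall_acc:.4f}")
```

The gap shrinks as the batches become closer in size, and it disappears when every batch has the same number of samples.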

## 2. Create `utils.py`

Utility functions collect small operations that several scripts need but that don't belong to data loading, model definition, or training. The one we need here saves a trained model to disk.

Let's create a `save_model()` function that:
- Creates the target directory if it doesn't exist
- Validates the model filename
- Saves the model's state_dict

The function saves a PyTorch model to a target directory and takes the following arguments:
```
    Args:
        model: A target PyTorch model to save.
        target_dir: A directory for saving the model to.
        model_name: A filename for the saved model. Should include
            either ".pth" or ".pt" as the file extension.
```
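
The filename check from the list above can be sketched on its own. `save_model()` below uses `assert`, which Python skips when it is run with the `-O` flag; the variant here raises `ValueError` instead. The helper name `check_model_name` is hypothetical and is not part of the lab's code.

```python
def check_model_name(model_name: str) -> str:
    """Return model_name unchanged, or raise ValueError for a wrong extension."""
    # str.endswith accepts a tuple, so both extensions are tested in one call.
    if not model_name.endswith((".pt", ".pth")):
        raise ValueError(
            f"model_name should end with '.pt' or '.pth', got {model_name!r}"
        )
    return model_name


print(check_model_name("tinyvgg_model.pth"))
```

Either style works for a tutorial; an explicit exception is the safer choice if the function may run in an optimized interpreter.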

In [None]:
def save_model(
    model: torch.nn.Module,
    target_dir: str,
    model_name: str
):
    target_dir_path = Path(target_dir)
    target_dir_path.mkdir(parents=True, exist_ok=True)
    
    # Create model save path
    assert model_name.endswith(".pth") or model_name.endswith(".pt"), \
        "model_name should end with '.pt' or '.pth'"
    model_save_path = target_dir_path / model_name

    # Save the model state_dict()
    print(f"[INFO] Saving model to: {model_save_path}")
    torch.save(obj=model.state_dict(), f=model_save_path)

### Test the `save_model()` Function

In [None]:
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

if image_path.is_dir():
    print(f"{image_path} directory exists.")
else:
    print(f"Did not find {image_path} directory, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)
    
    print("Downloading pizza, steak, sushi data...")
    request = requests.get("https://raw.githubusercontent.com/poridhioss/Introduction-to-Deep-Learning-with-Pytorch-Resources/main/Going-module/pizza_steak_sushi.zip")
    request.raise_for_status()  # fail early instead of saving an error page as the zip
    with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
        f.write(request.content)

    with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...")
        zip_ref.extractall(image_path)

    os.remove(data_path / "pizza_steak_sushi.zip")
    print("Download complete!")

In [None]:
from going_modular import data_setup, model_builder, engine

device = "cuda" if torch.cuda.is_available() else "cpu"

data_transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor()
])

train_dir = image_path / "train"
test_dir = image_path / "test"

train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=data_transform,
    batch_size=32,
    num_workers=0
)

torch.manual_seed(42)
model = model_builder.TinyVGG(
    input_shape=3,
    hidden_units=10,
    output_shape=len(class_names)
).to(device)

print(f"Created model with {len(class_names)} output classes")

In [None]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

results = engine.train(
    model=model,
    train_dataloader=train_dataloader,
    test_dataloader=test_dataloader,
    optimizer=optimizer,
    loss_fn=loss_fn,
    epochs=5,
    device=device
)

In [None]:
save_model(
    model=model,
    target_dir="models",
    model_name="test_tinyvgg_model.pth"
)

print(f"\nFiles in models directory: {os.listdir('models')}")
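
A saved `state_dict` is only useful if it can be loaded again. The sketch below shows the round trip on a small stand-in `nn.Linear` model written to a temporary directory, so it runs without the dataset. For the lab's model you would instead build a `TinyVGG` with the same constructor arguments and point `torch.load()` at `models/test_tinyvgg_model.pth`. The stand-in model and the file name `demo.pth` are assumptions for illustration.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn

torch.manual_seed(0)
original = nn.Linear(4, 2)  # stand-in for a trained model

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = Path(tmp_dir) / "demo.pth"
    torch.save(obj=original.state_dict(), f=save_path)

    # Rebuild the same architecture, then copy the saved weights into it.
    restored = nn.Linear(4, 2)
    state_dict = torch.load(save_path, map_location="cpu")
    restored.load_state_dict(state_dict)

restored.eval()  # switch to evaluation mode before making predictions

weights_match = torch.equal(original.weight, restored.weight)
print(f"Weights match after reload: {weights_match}")
```

Because only the weights are stored, the loading code has to recreate the model with the same layer shapes before calling `load_state_dict()`.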

### Save `utils.py`

In [None]:
%%writefile going_modular/utils.py

import torch
from pathlib import Path

def save_model(
    model: torch.nn.Module,
    target_dir: str,
    model_name: str
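) -> None:
    """Saves a PyTorch model's state_dict to a target directory.

    Args:
        model: A target PyTorch model to save.
        target_dir: A directory for saving the model to.
        model_name: A filename for the saved model. Should include
            either ".pth" or ".pt" as the file extension.
    """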
    target_dir_path = Path(target_dir)
    target_dir_path.mkdir(parents=True, exist_ok=True)
    
    assert model_name.endswith(".pth") or model_name.endswith(".pt"), \
        "model_name should end with '.pt' or '.pth'"
    model_save_path = target_dir_path / model_name

    print(f"[INFO] Saving model to: {model_save_path}")
    torch.save(obj=model.state_dict(), f=model_save_path)

## 3. Create `train.py`

The `train.py` script is the main entry point for training. It:
1. Imports all necessary modules
2. Sets up hyperparameters
3. Prepares data
4. Creates the model
5. Trains the model
6. Saves the trained model

This allows you to train a model with just: `python train.py`

In [None]:
%%writefile going_modular/train.py

import torch
import data_setup, engine, model_builder, utils

from torchvision import transforms

NUM_EPOCHS = 5
BATCH_SIZE = 32
HIDDEN_UNITS = 10
LEARNING_RATE = 0.001

train_dir = "data/pizza_steak_sushi/train"
test_dir = "data/pizza_steak_sushi/test"

device = "cuda" if torch.cuda.is_available() else "cpu"

data_transform = transforms.Compose([
    transforms.Resize((64, 64)),
    transforms.ToTensor()
])

train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=data_transform,
    batch_size=BATCH_SIZE,
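    # 0 loads data in the main process; values > 0 need an
    # `if __name__ == "__main__":` guard on Windows/macOS.
    num_workers=0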
)

model = model_builder.TinyVGG(
    input_shape=3,
    hidden_units=HIDDEN_UNITS,
    output_shape=len(class_names)
).to(device)

loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

engine.train(
    model=model,
    train_dataloader=train_dataloader,
    test_dataloader=test_dataloader,
    loss_fn=loss_fn,
    optimizer=optimizer,
    epochs=NUM_EPOCHS,
    device=device
)

utils.save_model(
    model=model,
    target_dir="models",
    model_name="05_going_modular_tinyvgg_model.pth"
)

## 4. Test Running `train.py`

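Let's test running our training script from the command line! The script resolves `data/pizza_steak_sushi/` and `models/` relative to the current working directory, so we first copy the dataset into `going_modular/` and then run the script from inside that directory.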

In [None]:
import shutil

gm_data_path = going_modular_path / "data" / "pizza_steak_sushi"
if not gm_data_path.exists():
    shutil.copytree(image_path, gm_data_path)
    print(f"Copied data to {gm_data_path}")
else:
    print(f"Data already exists at {gm_data_path}")

In [None]:
!cd going_modular && python train.py

## 5. Check Final Directory Structure

In [None]:
print("Files in going_modular/:")
for item in sorted(os.listdir("going_modular")):
    item_path = going_modular_path / item
    if item_path.is_file():
        print(f"  - {item}")
    elif item_path.is_dir():
        print(f"  - {item}/")

In [None]:
models_path = going_modular_path / "models"
if models_path.exists():
    print("\nSaved models:")
    for model_file in os.listdir(models_path):
        file_size = os.path.getsize(models_path / model_file) / 1024  # KB
        print(f"  - {model_file} ({file_size:.1f} KB)")
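
As a rough sanity check on the file sizes printed above, you can estimate how much space the weights alone need. This sketch assumes the defaults used in this lab (3 input channels, 10 hidden units, a 13×13 feature map), 3 output classes for pizza, steak, and sushi, and 4 bytes per parameter (float32). The helper names are made up for illustration. The actual file will be somewhat larger because it also stores parameter names and serialization metadata.

```python
def conv_params(in_channels: int, out_channels: int, kernel: int = 3) -> int:
    """Weights plus biases of a square Conv2d layer."""
    return in_channels * out_channels * kernel * kernel + out_channels


def linear_params(in_features: int, out_features: int) -> int:
    """Weights plus biases of a Linear layer."""
    return in_features * out_features + out_features


hidden_units = 10
num_classes = 3  # assumption: pizza, steak, sushi

total_params = (
    conv_params(3, hidden_units)                           # first conv layer
    + 3 * conv_params(hidden_units, hidden_units)          # remaining three conv layers
    + linear_params(hidden_units * 13 * 13, num_classes)   # classifier
)
approx_kb = total_params * 4 / 1024  # 4 bytes per float32 parameter

print(f"{total_params} parameters, roughly {approx_kb:.1f} KB of weights")
```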

## 6. Bonus: Create `train.py` with Command-Line Arguments

For more flexibility, you can create a version of `train.py` that accepts command-line arguments using Python's `argparse` module.
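
Before writing the full script, it can help to see `argparse` on its own. `parse_args()` accepts an explicit list of strings, so a parser can be tried out in a notebook cell without launching a training run. The sketch below defines only two of the flags, and the values passed in are arbitrary.

```python
import argparse

parser = argparse.ArgumentParser(description="Toy parser with two training flags.")
parser.add_argument("--epochs", type=int, default=5, help="Number of epochs")
parser.add_argument("--lr", type=float, default=0.001, help="Learning rate")

# Passing a list stands in for what would normally come from sys.argv[1:].
args = parser.parse_args(["--epochs", "3"])

print(f"epochs: {args.epochs}")  # supplied in the list, converted to int
print(f"lr:     {args.lr}")      # not supplied, so the default is used
```

Flags that are left out fall back to their `default` values, which is why the full script can still be run with no arguments at all.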

In [None]:
%%writefile going_modular/train_with_args.py

import argparse
import torch
import data_setup, engine, model_builder, utils

from torchvision import transforms

def main():
    # Setup argument parser
    parser = argparse.ArgumentParser(description="Train a TinyVGG model on image classification data.")
    
    parser.add_argument("--epochs", type=int, default=5, help="Number of epochs to train for (default: 5)")
    parser.add_argument("--batch_size", type=int, default=32, help="Batch size (default: 32)")
    parser.add_argument("--hidden_units", type=int, default=10, help="Number of hidden units (default: 10)")
    parser.add_argument("--lr", type=float, default=0.001, help="Learning rate (default: 0.001)")
    parser.add_argument("--train_dir", type=str, default="data/pizza_steak_sushi/train", help="Path to training data")
    parser.add_argument("--test_dir", type=str, default="data/pizza_steak_sushi/test", help="Path to test data")
    
    args = parser.parse_args()
    
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Using device: {device}")
    
    print("\nHyperparameters:")
    print(f"  Epochs: {args.epochs}")
    print(f"  Batch size: {args.batch_size}")
    print(f"  Hidden units: {args.hidden_units}")
    print(f"  Learning rate: {args.lr}")
    print(f"  Train dir: {args.train_dir}")
    print(f"  Test dir: {args.test_dir}\n")
    
    data_transform = transforms.Compose([
        transforms.Resize((64, 64)),
        transforms.ToTensor()
    ])
    
    train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
        train_dir=args.train_dir,
        test_dir=args.test_dir,
        transform=data_transform,
        batch_size=args.batch_size,
        num_workers=0
    )
    
    model = model_builder.TinyVGG(
        input_shape=3,
        hidden_units=args.hidden_units,
        output_shape=len(class_names)
    ).to(device)
    
    loss_fn = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
    
    engine.train(
        model=model,
        train_dataloader=train_dataloader,
        test_dataloader=test_dataloader,
        loss_fn=loss_fn,
        optimizer=optimizer,
        epochs=args.epochs,
        device=device
    )
    
    utils.save_model(
        model=model,
        target_dir="models",
        model_name="tinyvgg_model.pth"
    )

if __name__ == "__main__":
    main()

In [None]:
!cd going_modular && python train_with_args.py --epochs 3 --batch_size 16 --lr 0.0005

## Summary

Congratulations! You've completed the "Going Modular with PyTorch" series!

We created a complete modular PyTorch project:

```
going_modular/
├── data_setup.py       # DataLoader creation
├── model_builder.py    # TinyVGG model
├── engine.py           # Training/testing functions
├── utils.py            # Utility functions (save_model)
├── train.py            # Main training script
├── train_with_args.py  # Training script with CLI arguments
├── data/               # Dataset
│   └── pizza_steak_sushi/
└── models/             # Saved models
    ├── 05_going_modular_tinyvgg_model.pth
    └── tinyvgg_model.pth
```

### Key Takeaways:

1. **Modular code is reusable** - Write once, use many times
2. **Scripts enable automation** - Train models from the command line
3. **argparse adds flexibility** - Customize hyperparameters without code changes
4. **Good structure scales** - Same patterns work for larger projects

### Next Steps:

- Add data augmentation to `data_setup.py`
- Create a `predict.py` script for inference
- Add model checkpointing to save best models
- Implement learning rate scheduling

## Exercises

1. **Create a `predict.py` script** that loads a trained model and makes predictions on a single image.

2. **Add a `plot_loss_curves()` function** to `utils.py` that visualizes training and test loss/accuracy.

3. **Implement model checkpointing** - Save the model whenever test accuracy improves.

4. **Add data augmentation** - Modify `data_setup.py` to support different transforms for training vs testing.