# 05. PyTorch Going Modular Exercises

## Resources

These exercises/solutions are based on section "05. PyTorch Going Modular" of the Learn PyTorch for Deep Learning course by Zero to Mastery: https://www.learnpytorch.io/05_pytorch_going_modular/

## 1. Turn the code to get the data into a Python script

When you run the script with `python get_data.py`, it should check whether the data already exists and skip the download if it does.
If the data download is successful, you should be able to access the `pizza_steak_sushi` images from the data directory.

In [41]:
%%writefile src/get_data.py
"""
Contains function to download data
"""

import os
import requests
import zipfile
from pathlib import Path


def get_data(
    data_dir_str: str = "data/",
    image_path_str: str = "pizza_steak_sushi",
    data_url_str: str = "https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
    file_name_str: str = "pizza_steak_sushi.zip"
) -> None:
    """Downloads data from GitHub

    Args:
        data_dir_str (str, optional): Path to the data directory.
            Defaults to "data/".
        image_path_str (str, optional): Name of the folder where the data
            will be stored. Defaults to "pizza_steak_sushi".
        data_url_str (str, optional): URL from which the data will be
            downloaded. Defaults to the pizza_steak_sushi.zip archive in
            the mrdbourke/pytorch-deep-learning GitHub repository.
        file_name_str (str, optional): Name of the downloaded file.
            Defaults to "pizza_steak_sushi.zip".
    """
    
    # Setup path to data folder
    data_dir = Path(data_dir_str)
    image_path = data_dir / image_path_str
    
    # Check if data folder exists
    if image_path.exists():
        print(f"{image_path} exists...")
    else:
        print(f"{image_path} does not exist, creating...")
        image_path.mkdir(parents=True, exist_ok=True)
    
    # Check if data is already downloaded
    if len(list(image_path.glob("*/*/*"))) == 0:
        
        # Download data (fail early on an HTTP error instead of saving a bad zip)
        print(f"Downloading {file_name_str}...")
        response = requests.get(data_url_str)
        response.raise_for_status()
        with open(data_dir / file_name_str, "wb") as f:
            f.write(response.content)
        
        # Unzip data
        with zipfile.ZipFile(data_dir / file_name_str, "r") as z:
            print(f"Extracting {file_name_str}...")
            z.extractall(image_path)
            
        # Remove zip file
        print(f"Deleting {file_name_str}...")
        os.remove(data_dir / file_name_str)
    else:
        print(f"Data in {image_path} already exists, skipping downloading and unzipping...")
        
    print("Finished getting data...")
    
if __name__ == "__main__":
    get_data()

Overwriting src/get_data.py


In [None]:
# Example running of get_data.py
!python src/get_data.py

data/pizza_steak_sushi does not exist, creating...
Downloading pizza_steak_sushi.zip...
Extracting pizza_steak_sushi.zip...
Deleting pizza_steak_sushi.zip...
Finished getting data...
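
One way to confirm that the images are accessible after the download is to count the files per split and class. The snippet below is a minimal sketch, not part of the original solution: the helper name `count_images` is hypothetical, and it assumes the archive extracts to a `split/class/image.jpg` layout (consistent with the `*/*/*` glob used in `get_data.py`) with `.jpg` files.

```python
from collections import Counter
from pathlib import Path


def count_images(image_path: Path) -> Counter:
    """Counts .jpg files per (split, class) pair under image_path.

    Assumes a split/class/image.jpg layout, e.g. train/pizza/123.jpg.
    """
    counts = Counter()
    for image_file in image_path.glob("*/*/*.jpg"):
        split_name = image_file.parent.parent.name  # e.g. "train" or "test"
        class_name = image_file.parent.name  # e.g. "pizza"
        counts[(split_name, class_name)] += 1
    return counts


# Prints one line per split/class pair; prints nothing if the folder is empty
# or has not been downloaded yet.
for (split_name, class_name), n in sorted(
    count_images(Path("data/pizza_steak_sushi")).items()
):
    print(f"{split_name}/{class_name}: {n} images")
```

If nothing is printed, the data folder is missing or does not follow the assumed layout.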


## 2. Create Datasets and DataLoaders

## 3. Create a model (TinyVGG)

## 4. Turn training and testing functions into scripts

## 5. Turn saving functions into scripts

## 6. Train, evaluate and save the model

## 7. Use the `argparse` module to send custom hyperparameter values to `train.py` for training procedures

Add an argument flag for using a different:
- Training/testing directory
- Learning rate
- Batch size
- Number of epochs to train for
- Number of hidden units in the TinyVGG model

Keep the default value of each of the above arguments as it already is.

For example, you should be able to run something similar to the following line to train a TinyVGG model with a learning rate of 0.003 and a batch size of 64 for 20 epochs: `python train.py --learning_rate 0.003 --batch_size 64 --num_epochs 20`
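
The flags listed above could be declared with `argparse` roughly as follows. This is a minimal sketch under stated assumptions: the function name `build_parser` is hypothetical, and every default shown is a placeholder, since the exercise asks you to keep whatever defaults your own `train.py` already uses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Builds a parser for the training hyperparameters.

    All default values below are placeholder assumptions; replace them with
    the values already hard-coded in your train.py.
    """
    parser = argparse.ArgumentParser(description="Train a TinyVGG model.")
    parser.add_argument("--train_dir", type=str,
                        default="data/pizza_steak_sushi/train",
                        help="Directory with training images")
    parser.add_argument("--test_dir", type=str,
                        default="data/pizza_steak_sushi/test",
                        help="Directory with testing images")
    parser.add_argument("--learning_rate", type=float, default=0.001,
                        help="Optimizer learning rate")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="Number of samples per batch")
    parser.add_argument("--num_epochs", type=int, default=5,
                        help="Number of epochs to train for")
    parser.add_argument("--hidden_units", type=int, default=10,
                        help="Number of hidden units in TinyVGG")
    return parser


# Demo with an explicit argument list, equivalent to running:
#   python train.py --learning_rate 0.003 --batch_size 64 --num_epochs 20
args = build_parser().parse_args(
    ["--learning_rate", "0.003", "--batch_size", "64", "--num_epochs", "20"]
)
print(args.learning_rate, args.batch_size, args.num_epochs, args.hidden_units)
```

Inside `train.py` itself you would call `parse_args()` with no arguments so that the values are read from the command line, then pass `args.learning_rate`, `args.batch_size` and so on to the existing training code.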

## 8. Create a Python script to predict on a target image given a file path with a saved model

- For example, you should be able to run the command `python predict.py some_image.jpeg` and have a trained PyTorch model predict on the image and return its prediction.
- You may also have to write code to load in a trained model.
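
A possible shape for such a script is sketched below. Several things here are assumptions rather than part of the exercise: the `model_builder` module and its `TinyVGG(input_shape, hidden_units, output_shape)` constructor, the `models/tinyvgg.pth` path, the 64x64 input size, the `--model_path` and `--hidden_units` flags, and the class order in `CLASS_NAMES`. It also assumes the model was saved as a `state_dict`. Adjust each of these to match your own training setup.

```python
import argparse
from pathlib import Path

# Assumption: same class order as the one used during training
CLASS_NAMES = ["pizza", "steak", "sushi"]


def parse_args(argv=None) -> argparse.Namespace:
    """Parses the image path plus (hypothetical) model options."""
    parser = argparse.ArgumentParser(description="Predict on a single image.")
    parser.add_argument("image", type=Path, help="Path to the target image")
    parser.add_argument("--model_path", type=Path,
                        default=Path("models/tinyvgg.pth"),  # hypothetical path
                        help="Path to the saved state_dict")
    parser.add_argument("--hidden_units", type=int, default=10,
                        help="Must match the value used for training")
    return parser.parse_args(argv)


def predict(image: Path, model_path: Path, hidden_units: int) -> str:
    """Loads a saved TinyVGG and returns the predicted class name."""
    # Heavy imports are kept local so argument parsing works without them
    import torch
    import torchvision
    from torchvision import transforms

    import model_builder  # hypothetical module that defines TinyVGG

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Recreate the architecture, then load the trained weights into it
    model = model_builder.TinyVGG(input_shape=3,
                                  hidden_units=hidden_units,
                                  output_shape=len(CLASS_NAMES))
    model.load_state_dict(torch.load(model_path, map_location=device))
    model.to(device)
    model.eval()

    # Read as RGB, scale pixel values to [0, 1], resize to the assumed size
    img = torchvision.io.read_image(
        str(image), mode=torchvision.io.ImageReadMode.RGB
    ).float() / 255.0
    img = transforms.Resize((64, 64), antialias=True)(img)

    # Add a batch dimension and take the highest-scoring class
    with torch.inference_mode():
        logits = model(img.unsqueeze(0).to(device))
    return CLASS_NAMES[logits.argmax(dim=1).item()]


def main(argv=None) -> None:
    args = parse_args(argv)
    print(f"Predicted class: {predict(args.image, args.model_path, args.hidden_units)}")
```

In a real `predict.py`, the file would end with the usual `if __name__ == "__main__": main()` guard so that `python predict.py some_image.jpeg` triggers the prediction.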