# 05. Making PyTorch Modular
Ref - https://www.learnpytorch.io/05_pytorch_going_modular/

## What is going modular?  
Going modular involves turning notebook code (from a Jupyter Notebook or Google Colab notebook) into a series of different Python scripts that offer similar functionality.

For example, we could turn our notebook code from a series of cells into the following Python files:

*data_setup.py* - a file to prepare and download data if needed.  
*engine.py* - a file containing various training functions.  
*model_builder.py or model.py* - a file to create a PyTorch model.  
*train.py* - a file to leverage all other files and train a target PyTorch model.   
*utils.py* - a file dedicated to helpful utility functions.

For example, you might be instructed to run code like the following in a terminal/command line to train a model:

<code>python train.py --model MODEL_NAME --batch_size BATCH_SIZE --lr LEARNING_RATE --num_epochs NUM_EPOCHS</code>

![image.png](https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/05-python-train-command-line-annotated.png)
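
As a rough sketch of how train.py could read the flags shown in the command above, Python's built-in argparse module can map each flag to a typed value. The flag names come from the example command; the default values and help strings below are assumptions made for illustration only.

```python
import argparse


def parse_args(argv=None):
    """Parses the training flags used in the example command.

    Args:
        argv: Optional list of argument strings. When None, argparse reads
            the real command line (sys.argv[1:]).

    Returns:
        An argparse.Namespace with model, batch_size, lr and num_epochs.
    """
    parser = argparse.ArgumentParser(description="Train a PyTorch model.")
    # The defaults below are assumed values, not taken from the course.
    parser.add_argument("--model", type=str, default="tinyvgg",
                        help="Name of the model to train.")
    parser.add_argument("--batch_size", type=int, default=32,
                        help="Number of samples per batch.")
    parser.add_argument("--lr", type=float, default=0.001,
                        help="Learning rate for the optimizer.")
    parser.add_argument("--num_epochs", type=int, default=5,
                        help="Number of passes over the training data.")
    return parser.parse_args(argv)


# An explicit list is parsed here so the sketch runs anywhere; a real train.py
# would call parse_args() with no arguments to read the actual command line.
args = parse_args(["--model", "tinyvgg", "--batch_size", "64",
                   "--lr", "0.01", "--num_epochs", "10"])
print(args.model, args.batch_size, args.lr, args.num_epochs)
```

Giving every flag a `type` means the values arrive as an int or float rather than as strings, so they can be passed straight to the training code.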

There are two notebooks for this section:

1. Cell mode - this notebook is run as a traditional Jupyter Notebook/Google Colab notebook and is a condensed version of notebook 04.
2. Script mode - this notebook is the same as number 1 but with added functionality to turn each of the major sections into Python scripts, such as data_setup.py and train.py.

## Things to note
*Docstrings* - Writing reproducible and understandable code is important, so each function/class we put into a script follows Google's Python docstring style.  
*Imports at the top of scripts* - Each Python script we create can be treated as a small program of its own, so every script imports the modules it needs at the top, for example:

<code>
# Import modules required for train.py<br>
import os<br>
import torch<br>
import data_setup, engine, model_builder, utils<br>
from torchvision import transforms
</code>
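
To make the docstring convention concrete, here is a minimal sketch of a Google-style docstring. The function `count_files` is a hypothetical helper invented for this illustration and is not one of the course scripts; only the docstring layout (summary line, `Args:`, `Returns:`, `Raises:`) is the point.

```python
from pathlib import Path


def count_files(directory: str, suffix: str = ".jpg") -> int:
    """Counts the files under a directory that end with a given suffix.

    Args:
        directory: Path to the folder to search, including its subfolders.
        suffix: File extension to match, for example ".jpg".

    Returns:
        The number of matching files found.

    Raises:
        NotADirectoryError: If directory does not exist or is not a folder.

    Example usage:
        num_images = count_files("data/pizza_steak_sushi/train")
    """
    root = Path(directory)
    if not root.is_dir():
        raise NotADirectoryError(f"{directory} is not a directory")
    # rglob walks the folder recursively; keep only regular files.
    return sum(1 for path in root.rglob(f"*{suffix}") if path.is_file())
```

Once the function lives in a script, `help(count_files)` prints this docstring, which is handy when the code is imported from somewhere other than the notebook it was written in.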

## 1. Getting Data

    import os
    import zipfile
    from pathlib import Path

    import requests

    # Setup path to data folder
    data_path = Path("data/")
    image_path = data_path / "pizza_steak_sushi"

    # If the image folder doesn't exist, download it and prepare it...
    if image_path.is_dir():
        print(f"{image_path} directory exists.")
    else:
        print(f"Did not find {image_path} directory, creating one...")
        image_path.mkdir(parents=True, exist_ok=True)

        # Download pizza, steak, sushi data
        with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
            request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
            print("Downloading pizza, steak, sushi data...")
            f.write(request.content)

        # Unzip pizza, steak, sushi data
        with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
            print("Unzipping pizza, steak, sushi data...")
            zip_ref.extractall(image_path)

        # Remove zip file
        os.remove(data_path / "pizza_steak_sushi.zip")

Output:

    Did not find data\pizza_steak_sushi directory, creating one...
    Downloading pizza, steak, sushi data...
    Unzipping pizza, steak, sushi data...


## 2. Create Datasets and DataLoaders (data_setup.py)
## 3. Making a model (model_builder.py)
## 4. Creating train_step() and test_step() functions and train() to combine them (engine.py)
## 5. Creating a function to save the model (utils.py)
## 6. Train, evaluate and save the model (train.py)

To create train.py we'll go through the following steps:

1. Import the various dependencies, namely torch, os, torchvision.transforms and all of the scripts from the going_modular directory, data_setup, engine, model_builder, utils.
2. *Note:* Since train.py will be inside the going_modular directory, we can import the other modules via import ... rather than from going_modular import ....
3. Set up various hyperparameters such as batch size, number of epochs, learning rate and number of hidden units (these could be set in the future via Python's argparse).
4. Set up the training and test directories.
5. Set up device-agnostic code.
6. Create the necessary data transforms.
7. Create the DataLoaders using data_setup.py.
8. Create the model using model_builder.py.
9. Set up the loss function and optimizer.
10. Train the model using engine.py.
11. Save the model using utils.py.
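
The note in step 2 relies on a detail of how Python runs scripts: the directory containing the script being executed is placed at the front of `sys.path`, so sibling files can be imported by their bare names. The sketch below demonstrates this with a throwaway going_modular folder; the stand-in utils.py and train.py it writes, and the `describe()` helper, are hypothetical and only exist to show the import working.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_demo() -> str:
    """Builds a temporary going_modular/ folder and runs its train.py.

    Returns:
        The text printed by the stand-in train.py.
    """
    with tempfile.TemporaryDirectory() as tmp:
        package_dir = Path(tmp) / "going_modular"
        package_dir.mkdir()

        # Stand-in utils.py with a single hypothetical helper.
        (package_dir / "utils.py").write_text(
            "def describe():\n"
            "    return 'utils imported as a sibling module'\n"
        )

        # Stand-in train.py: a plain `import utils` works because the
        # script's own directory is the first entry on sys.path.
        (package_dir / "train.py").write_text(
            "import utils\n"
            "print(utils.describe())\n"
        )

        # Run train.py as a script, the same way it is run from a terminal.
        result = subprocess.run(
            [sys.executable, str(package_dir / "train.py")],
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_demo())
```

The same files imported from a notebook sitting one level above going_modular would need `from going_modular import utils` instead, because in that case the notebook's directory, not the script's, is on the import path.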