<a href="https://colab.research.google.com/github/DietzscheNostoevsky/Learning_Pytorch/blob/main/05_Pytorch_Going_Modular.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# PyTorch Going Modular

Going modular involves turning notebook code (from a Jupyter Notebook or Google Colab notebook) into a series of different Python scripts that offer similar functionality.

For example, we could turn our notebook code from a series of cells into the following Python files:

* `data_setup.py` - a file to prepare and download data if needed.  
* `engine.py` - a file containing various training functions.  
* `model_builder.py` or `model.py` - a file to create a PyTorch model.  
* `train.py` - a file to leverage all other files and train a target PyTorch model.
* `utils.py` - a file dedicated to helpful utility functions.
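
To make the idea concrete, here is a minimal sketch of moving code out of a notebook cell and into an importable script. The module name `demo_utils` and the helper `count_lines` are hypothetical, chosen only for illustration, and the file is written to a temporary folder so the example leaves nothing behind. In a notebook, the file would more typically be created with the `%%writefile` cell magic.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Code that would otherwise live in a notebook cell (hypothetical helper).
MODULE_SOURCE = '''
def count_lines(text):
    """Return the number of lines in a string."""
    return len(text.splitlines())
'''

# Save the code as a script file in a temporary folder.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "demo_utils.py").write_text(MODULE_SOURCE)

# Make the folder importable, then import the script like any other module.
sys.path.insert(0, str(module_dir))
demo_utils = importlib.import_module("demo_utils")

print(demo_utils.count_lines("a\nb\nc"))
```

Once the code lives in a file, any notebook or script can reuse it with a plain `import`, which is the pattern the files listed above rely on.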


Scripts like these are a step towards **production code**: code that runs to offer a service to someone or something.

# Getting Data

### Smaller Dataset

In [1]:
import os
import zipfile

from pathlib import Path

import requests

# Setup path to data folder
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it...
if image_path.is_dir():
    print(f"{image_path} directory exists.")
else:
    print(f"Did not find {image_path} directory, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)

    # Download pizza, steak, sushi data
    zip_path = data_path / "pizza_steak_sushi.zip"
    print("Downloading pizza, steak, sushi data...")
    request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
    request.raise_for_status()  # Stop early if the download failed
    with open(zip_path, "wb") as f:
        f.write(request.content)

    # Unzip pizza, steak, sushi data
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...")
        zip_ref.extractall(image_path)

    # Remove zip file
    os.remove(zip_path)

Did not find data/pizza_steak_sushi directory, creating one...
Downloading pizza, steak, sushi data...
Unzipping pizza, steak, sushi data...
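
When this cell is moved into a script such as `data_setup.py`, the download logic is easier to reuse if it is wrapped in a function. The following is a minimal sketch under stated assumptions, not the notebook's own code: the function name `download_and_extract` and its `overwrite` parameter are hypothetical, and it uses the standard library's `urllib.request` in place of `requests` so that it has no third-party dependencies.

```python
import shutil
import tempfile
import urllib.request
import zipfile
from pathlib import Path


def download_and_extract(url: str, target_dir: Path, overwrite: bool = False) -> Path:
    """Download a zip archive from `url` and extract it into `target_dir`.

    The download is skipped when `target_dir` already contains files,
    unless `overwrite` is True. (Hypothetical helper for illustration.)
    """
    target_dir = Path(target_dir)
    if target_dir.is_dir() and any(target_dir.iterdir()) and not overwrite:
        print(f"{target_dir} already has data, skipping download.")
        return target_dir

    target_dir.mkdir(parents=True, exist_ok=True)

    # Download into a temporary folder so the zip file is cleaned up
    # automatically, even if the download or extraction fails.
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = Path(tmp) / "archive.zip"
        print(f"Downloading {url}...")
        with urllib.request.urlopen(url) as response, open(zip_path, "wb") as f:
            shutil.copyfileobj(response, f)

        print(f"Unzipping into {target_dir}...")
        with zipfile.ZipFile(zip_path, "r") as zip_ref:
            zip_ref.extractall(target_dir)

    return target_dir
```

With this in place, the cell above would reduce to a single call: `download_and_extract("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip", Path("data/pizza_steak_sushi"))`.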


### Larger Dataset

In [3]:
# Imports
import os
import zipfile
from pathlib import Path
import requests


# Mount the GDrive
from google.colab import drive
drive.mount('/content/drive')

# Path to the data folder. The trailing slash only signals to readers that
# this is a directory; pathlib drops it, so Path("data/") == Path("data").
data_path = Path("data/")

image_path_full = data_path / "pizza_steak_sushi_full"

if image_path_full.is_dir():
    print(f"{image_path_full} directory exists.")
else:
    print(f"Did not find {image_path_full} directory, creating one...")
    image_path_full.mkdir(parents=True, exist_ok=True)

# Zip archive with the full dataset, stored on the mounted Google Drive
zip_data = "/content/drive/Othercomputers/My MacBook Air/GitHub/-Machine_Learning/Learning_Pytorch/pizza_steak_sushi_100_percent.zip"

with zipfile.ZipFile(zip_data, "r") as zip_ref:
    print("Unzipping pizza, steak, sushi data...")
    zip_ref.extractall(image_path_full)

# Setup Dirs
train_dir_full = image_path_full / "train"
test_dir_full = image_path_full / "test"

# Optional: inspect the extracted folders
# walk_through_dir(image_path_full)


Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
data/pizza_steak_sushi_full directory exists.
Unzipping pizza, steak, sushi data...
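
The cell above refers to a `walk_through_dir` helper in a commented-out call, but the helper is not defined in this section. A minimal sketch of what it could look like follows, assuming its job is to report how many subdirectories and files each folder contains. The return value is an addition made here for convenience and is not taken from the source.

```python
import os
from pathlib import Path


def walk_through_dir(dir_path):
    """Print and return (folder, n_subdirs, n_files) for every folder under dir_path.

    Assumed behavior for illustration; the original helper is not shown here.
    """
    summary = []
    for dirpath, dirnames, filenames in os.walk(dir_path):
        print(f"There are {len(dirnames)} directories and {len(filenames)} files in '{dirpath}'.")
        summary.append((Path(dirpath), len(dirnames), len(filenames)))
    return summary
```

Calling `walk_through_dir(image_path_full)` after unzipping is a quick way to check that the `train` and `test` folders contain the expected class subfolders.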
