<a href="https://colab.research.google.com/github/gauthiermartin/pytorch-deep-learning-course/blob/main/07_pytorch_experiment_tracking.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 07. PyTorch Experiment Tracking

Machine learning is very experimental.

**Experiment tracking** helps you figure out which experiments are worth pursuing.

In this notebook we will track experiments programmatically.


- Book Version - https://www.learnpytorch.io/07_pytorch_experiment_tracking/
- Made with ML - https://madewithml.com/courses/mlops/experiment-tracking/
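
As a rough illustration of what tracking experiments programmatically can mean, the sketch below records each run's settings and results in a list of dictionaries and then picks the best run. The `ExperimentLog` class, its method names, and the example numbers are hypothetical, made up for this illustration; they are not part of PyTorch or of the code used in this notebook.

```python
from dataclasses import dataclass, field


@dataclass
class ExperimentLog:
    """Hypothetical in-memory tracker: one dict per experiment run."""
    runs: list = field(default_factory=list)

    def log(self, name: str, params: dict, metrics: dict) -> None:
        # Store the settings and results of a single run together
        self.runs.append({"name": name, "params": params, "metrics": metrics})

    def best(self, metric: str, higher_is_better: bool = True) -> dict:
        # Pick the run with the best value for the chosen metric
        chooser = max if higher_is_better else min
        return chooser(self.runs, key=lambda run: run["metrics"][metric])


# Illustrative numbers only, not real results
tracker = ExperimentLog()
tracker.log("run_a", {"epochs": 5, "lr": 0.001}, {"test_acc": 0.81})
tracker.log("run_b", {"epochs": 10, "lr": 0.001}, {"test_acc": 0.86})

best_run = tracker.best("test_acc")
print(best_run["name"])
```

Keeping parameters and metrics side by side in one record is the core idea; dedicated tracking tools add persistence and visualization on top of it.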

In [1]:
import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)

2.0.1+cu118
0.15.2+cu118


In [2]:
# Continue with regular imports
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
    from torchinfo import summary
except ImportError:
    print("[INFO] Couldn't find torchinfo... installing it.")
    !pip install -q torchinfo
    from torchinfo import summary

# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
except ImportError:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine

[INFO] Couldn't find torchinfo... installing it.
[INFO] Couldn't find going_modular scripts... downloading them from GitHub.
Cloning into 'pytorch-deep-learning'...
remote: Enumerating objects: 4028, done.
remote: Counting objects: 100% (1216/1216), done.
remote: Compressing objects: 100% (215/215), done.
remote: Total 4028 (delta 1065), reused 1096 (delta 998), pack-reused 2812
Receiving objects: 100% (4028/4028), 651.38 MiB | 23.49 MiB/s, done.
Resolving deltas: 100% (2358/2358), done.
Updating files: 100% (248/248), done.


In [3]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'
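
The `device` string is typically used to place tensors and models on the GPU when one is available. A minimal sketch, assuming only core `torch` calls; the tensor `x` is a made-up example and not part of the notebook:

```python
import torch

# Same device-agnostic setup as in the cell above
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.ones(2, 3)        # tensors are created on the CPU by default
x_on_device = x.to(device)  # moved to the GPU if available, otherwise stays on the CPU

print(x_on_device.device.type)
```

Writing code against `device` instead of a hard-coded `"cuda"` lets the same notebook run on machines with or without a GPU.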

In [4]:
# Set seeds
def set_seeds(seed: int = 42):
    """Sets random seeds for torch operations.

    Args:
        seed (int, optional): Random seed to set. Defaults to 42.
    """
    # Set the seed for general torch operations
    torch.manual_seed(seed)
    # Set the seed for CUDA torch operations (ones that happen on the GPU)
    torch.cuda.manual_seed(seed)

In [5]:
set_seeds()
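
To see what seeding buys us, the sketch below draws random tensors before and after re-seeding. It redefines `set_seeds` so the block runs on its own; the variable names `first`, `second` and `third` are illustrative only.

```python
import torch


def set_seeds(seed: int = 42):
    # Same idea as the function above: seed the CPU and CUDA generators
    torch.manual_seed(seed)
    torch.cuda.manual_seed(seed)  # silently ignored when CUDA is unavailable


set_seeds(42)
first = torch.rand(3)

set_seeds(42)
second = torch.rand(3)  # re-seeded, so this repeats the first draw

third = torch.rand(3)   # drawn without re-seeding, so the sequence continues

print(torch.equal(first, second))
print(torch.equal(first, third))
```

Re-seeding before each experiment helps make runs comparable, since differences in results are then less likely to come from random initialization alone.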


# 1. Get Data

We want pizza, steak and sushi images so we can run experiments building a FoodMiniVision model on them.

In [11]:
import os
import zipfile
import requests

from pathlib import Path

def download_data(
    source: str,
    destination: str,
    remove_source: bool = True
) -> Path:
  """
  Downloads a zipped dataset from source and unzips to destination.

  Args:
      source (str): A link to a zipped file containing data.
      destination (str): A target directory to unzip data to.
      remove_source (bool): Whether to remove the source after downloading and extracting.

  Returns:
      pathlib.Path to downloaded data.

  Example usage:
      download_data(source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
                    destination="pizza_steak_sushi")
  """

  # Setup path to data folder
  data_path = Path("data/")
  image_path = data_path / destination

  # If the image folder already exists, skip the download;
  # otherwise create it, then download and unzip the data
  if image_path.is_dir():
    print(f"[INFO] {destination} folder already exists. skipping download")
  else:
    print(f"[INFO] Creating {destination} folder.")
    image_path.mkdir(parents=True, exist_ok=True)

    # Download target data
    target_file = Path(source).name

    with open(data_path / target_file, "wb") as f:
      print(f"[INFO] Downloading {target_file}...")
      response = requests.get(source)
      response.raise_for_status()  # Stop early on a failed download
      f.write(response.content)

    # Unzip target file
    with zipfile.ZipFile(data_path / target_file, "r") as zip_ref:
      print(f"[INFO] Unzipping {target_file} data...")
      zip_ref.extractall(image_path)

    # Remove source file
    if remove_source:
      print(f"[INFO] Removing {target_file}...")
      os.remove(data_path / target_file)

  return image_path


In [13]:
image_path = download_data(
    source="https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip",
    destination="pizza_steak_sushi"
)
image_path

[INFO] pizza_steak_sushi folder already exists. skipping download


PosixPath('data/pizza_steak_sushi')
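
After the download, it can help to check what actually landed on disk. The sketch below counts image files per split and class with `pathlib`. It assumes the extracted folder follows a `train/<class>/*.jpg` and `test/<class>/*.jpg` layout, which this section does not state. To stay runnable without the dataset, it builds a tiny stand-in tree of empty files in a temporary directory. `count_images` is a hypothetical helper, not part of `going_modular`.

```python
import tempfile
from collections import Counter
from pathlib import Path


def count_images(root: Path, pattern: str = "*/*/*.jpg") -> Counter:
    """Hypothetical helper: count files per 'split/class' folder under root."""
    counts = Counter()
    for image_file in root.glob(pattern):
        # parent is the class folder, parent.parent is the split folder
        key = f"{image_file.parent.parent.name}/{image_file.parent.name}"
        counts[key] += 1
    return counts


# Build a small stand-in directory tree (assumed layout, empty placeholder files)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_root = Path(tmp_dir) / "pizza_steak_sushi"
    for split, n_files in [("train", 3), ("test", 1)]:
        for class_name in ["pizza", "steak", "sushi"]:
            class_dir = demo_root / split / class_name
            class_dir.mkdir(parents=True)
            for i in range(n_files):
                (class_dir / f"{i}.jpg").touch()

    counts = count_images(demo_root)

for key in sorted(counts):
    print(key, counts[key])
```

With the real data, `count_images(image_path)` would give the same kind of summary, provided the layout assumption holds.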