<a href="https://colab.research.google.com/github/reedhodges/pytorch-loop-integrals/blob/main/pytorch_loop_integrals.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Using PyTorch to classify loop integrals by the type of their divergence

Before starting, run all the cells in the 'Preliminaries' section to make sure you've downloaded all the necessary files.

In [1]:
import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)

device = "cuda" if torch.cuda.is_available() else "cpu"

print(f"Device: {device}")

2.3.0.dev20240314
0.18.0.dev20240314
Device: cpu


#### Optional: using GPU-accelerated PyTorch on Apple Silicon Macs

If you have an Apple Silicon Mac, you can use GPU-accelerated PyTorch, but you need to install the Preview (Nightly) version of PyTorch.  You can set up a Python virtual environment with the following in the command line:

```zsh
python -m venv pytorch-nightly
```

Activate the virtual environment with:

```zsh
source pytorch-nightly/bin/activate
```

Then install the Preview (Nightly) version of PyTorch, followed by whatever other packages are necessary.

```zsh
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cpu
```

Make sure the Jupyter kernel is set to use this virtual environment.  If you have successfully installed the Nightly build, the PyTorch version should have `dev` in it.

In [1]:
import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)

2.3.0.dev20240314
0.18.0.dev20240314


If everything works, the following should output a tensor with a single one that is stored on the `mps` device.

In [2]:
if torch.backends.mps.is_available():
    mps_device = torch.device("mps")
    x = torch.ones(1, device=mps_device)
    print (x)
else:
    print ("MPS device not found.")

tensor([1.], device='mps:0')


Let's set the device to "mps" for our training later.

In [2]:
import torch

device = "mps" if torch.backends.mps.is_available() else print("MPS device not found.")
print(f"Device: {device}")

Device: mps


### Preliminaries

In [3]:
import requests
import zipfile
from pathlib import Path

path_to_data = Path("data/")

if path_to_data.is_dir():
    print(f"[INFO] Directory {path_to_data} already exists, skipping download.")
else:
    print(f"[INFO] Creating {path_to_data} directory...")
    path_to_data.mkdir(parents=True, exist_ok=True)
    with open("integrand_data.zip", "wb") as f:
        url = "https://github.com/reedhodges/pytorch-loop-integrals/raw/main/integrand_data.zip"
        response = requests.get(url)
        print(f"[INFO] Downloading zip from {url}...")
        f.write(response.content)

    with zipfile.ZipFile("integrand_data.zip", "r") as zip_ref:
        print(f"[INFO] Extracting zip...")
        zip_ref.extractall()

files_to_download = [
    {"path": "engine.py", "url": "https://raw.githubusercontent.com/reedhodges/pytorch-loop-integrals/main/engine.py"},
    {"path": "utils.py", "url": "https://raw.githubusercontent.com/reedhodges/pytorch-loop-integrals/main/utils.py"}
]

for file_info in files_to_download:
    file_path = file_info["path"]
    file_url = file_info["url"]
    if not Path(file_path).is_file():
        print(f"[INFO] Downloading {file_path}...")
        with open(file_path, "wb") as f:
            response = requests.get(file_url)
            f.write(response.content)
    else:
        print(f"[INFO] File {file_path} already exists, skipping download.")

print(f"[INFO] Done!")

[INFO] Directory data already exists, skipping download.
[INFO] File engine.py already exists, skipping download.
[INFO] File utils.py already exists, skipping download.
[INFO] Done!


### Set up data and model

In [4]:
from utils import fix_error_with_weights_download
from torchvision.models import efficientnet_b0, EfficientNet_B0_Weights

image_path_list = list(path_to_data.glob("*/*/*.png"))

# this is a hack to fix the weights download issue
fix_error_with_weights_download()

effnetb0_weights = EfficientNet_B0_Weights.DEFAULT
effnetb0_model = efficientnet_b0(weights=effnetb0_weights).to(device)

effnetb0_transform = effnetb0_weights.transforms()

for param in effnetb0_model.features.parameters():
    param.requires_grad = False

In [5]:
from torchvision.models import ViT_B_16_Weights, vit_b_16

image_path_list = list(path_to_data.glob("*/*/*.png"))

vitb16_weights = ViT_B_16_Weights.DEFAULT
vitb16_model = vit_b_16(weights=vitb16_weights).to(device)

vitb16_transform = vitb16_weights.transforms()

for param in vitb16_model.parameters():
    param.requires_grad = False

Downloading: "https://download.pytorch.org/models/vit_b_16-c867db91.pth" to /Users/reedhodges/.cache/torch/hub/checkpoints/vit_b_16-c867db91.pth
100%|██████████| 330M/330M [00:19<00:00, 18.1MB/s] 


### Train

To use experiment tracking, you'll need a [Weights & Biases account](https://wandb.ai/site), which is free for personal use.  

In [6]:
import wandb

wandb.login()

Failed to detect the name of this notebook, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable to enable code saving.
[34m[1mwandb[0m: Currently logged in as: [33mreedhodges[0m. Use [1m`wandb login --relogin`[0m to force relogin


True

In [7]:
import os
from torch import nn, optim
from engine import create_data_loaders, train
from utils import set_seeds

BATCH_SIZE = 32
NUM_WORKERS = os.cpu_count()
NUM_EPOCHS = 5

train_loader, test_loader, class_names, class_dict = create_data_loaders(path_to_data, batch_size=BATCH_SIZE, num_workers=NUM_WORKERS, train_transform=effnetb0_transform, test_transform=effnetb0_transform)

effnetb0_model.classifier = nn.Sequential(
    nn.Dropout(p=0.2, inplace=True),
    nn.Linear(in_features=1280,
              out_features=len(class_names)).to(device)
)

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(effnetb0_model.parameters(), lr=0.001)

wandb.init(project="pytorch-loop-integrals",
           config={"model": "EfficientNet-B0",
                   "batch_size": BATCH_SIZE,
                   "num_workers": NUM_WORKERS,
                   "num_epochs": NUM_EPOCHS,
                   "Loss Function": str(loss_fn),
                   "Optimizer": str(optimizer)})

set_seeds()
train(device, effnetb0_model, train_loader, test_loader, loss_fn, optimizer, epochs=NUM_EPOCHS)

 20%|██        | 1/5 [02:09<08:37, 129.36s/it]


------------------------------

Epoch 1
Train Loss: 0.7107, Train Accuracy: 72.75
Test Loss:  0.7233, Test Accuracy:  74.00


 40%|████      | 2/5 [03:34<05:10, 103.47s/it]


------------------------------

Epoch 2
Train Loss: 0.5444, Train Accuracy: 78.62
Test Loss:  0.5899, Test Accuracy:  73.50


 60%|██████    | 3/5 [04:59<03:10, 95.09s/it] 


------------------------------

Epoch 3
Train Loss: 0.5565, Train Accuracy: 77.12
Test Loss:  0.6565, Test Accuracy:  74.00


 80%|████████  | 4/5 [06:25<01:31, 91.31s/it]


------------------------------

Epoch 4
Train Loss: 0.5075, Train Accuracy: 79.25
Test Loss:  0.4256, Test Accuracy:  82.00


100%|██████████| 5/5 [07:50<00:00, 94.07s/it]


------------------------------

Epoch 5
Train Loss: 0.5158, Train Accuracy: 80.12
Test Loss:  0.4398, Test Accuracy:  80.50





In [8]:
import os
from torch import nn, optim
from engine import create_data_loaders, train
from utils import set_seeds

BATCH_SIZE = 32
NUM_WORKERS = os.cpu_count()
NUM_EPOCHS = 5

train_loader, test_loader, class_names, class_dict = create_data_loaders(path_to_data, batch_size=BATCH_SIZE, num_workers=NUM_WORKERS, train_transform=vitb16_transform, test_transform=vitb16_transform)

vitb16_model.heads = nn.Linear(in_features=768,
                               out_features=len(class_names)
                               ).to(device)

loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(vitb16_model.parameters(), lr=0.001)

wandb.init(project="pytorch-loop-integrals",
           config={"model": "ViT-B_16",
                   "batch_size": BATCH_SIZE,
                   "num_workers": NUM_WORKERS,
                   "num_epochs": NUM_EPOCHS,
                   "Loss Function": str(loss_fn),
                   "Optimizer": str(optimizer)})
 
set_seeds()
train(device, effnetb0_model, train_loader, test_loader, loss_fn, optimizer, epochs=NUM_EPOCHS)



0,1
Test Accuracy,▁▁▁█▇
Test Loss,█▅▆▁▁
Train Accuracy,▁▇▅▇█
Train Loss,█▂▃▁▁

0,1
Test Accuracy,80.5
Test Loss,0.43976
Train Accuracy,80.125
Train Loss,0.51585


 20%|██        | 1/5 [01:25<05:42, 85.64s/it]


------------------------------

Epoch 1
Train Loss: 0.4840, Train Accuracy: 82.50
Test Loss:  0.4441, Test Accuracy:  80.00


 40%|████      | 2/5 [02:52<04:19, 86.38s/it]


------------------------------

Epoch 2
Train Loss: 0.4584, Train Accuracy: 82.38
Test Loss:  0.4446, Test Accuracy:  80.50


 60%|██████    | 3/5 [04:18<02:52, 86.37s/it]


------------------------------

Epoch 3
Train Loss: 0.4980, Train Accuracy: 80.38
Test Loss:  0.4454, Test Accuracy:  81.00


 80%|████████  | 4/5 [05:44<01:26, 86.11s/it]


------------------------------

Epoch 4
Train Loss: 0.4766, Train Accuracy: 81.38
Test Loss:  0.4473, Test Accuracy:  80.00


100%|██████████| 5/5 [07:10<00:00, 86.08s/it]


------------------------------

Epoch 5
Train Loss: 0.4993, Train Accuracy: 80.62
Test Loss:  0.4436, Test Accuracy:  80.50



