Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.
- Author: Sebastian Raschka
- GitHub Repository: https://github.com/rasbt/deeplearning-models

In [1]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch

Sebastian Raschka 

CPython 3.7.3
IPython 7.9.0

torch 1.7.0


- Runs on CPU or GPU (if available)

# Model Zoo -- Reproducible Results with Deterministic Behavior and Runtime Benchmark

In this notebook, we benchmark the performance impact of setting PyTorch to deterministic behavior. In general, there are two aspects to reproducible results in PyTorch:
1. Setting a random seed
2. Setting cuDNN and PyTorch algorithmic behavior to deterministic

For more details, please see https://pytorch.org/docs/stable/notes/randomness.html

### 1. Setting a random seed

I recommend calling a function like the following one before creating the dataset loaders and initializing the model. This ensures that the data is shuffled in the same order and the model gets the same initial random weights each time you rerun this notebook:

In [2]:
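def set_all_seeds(seed):
    # Environment variable used by PyTorch Lightning
    os.environ["PL_GLOBAL_SEED"] = str(seed)
    random.seed(seed)                 # Python's built-in RNG
    np.random.seed(seed)              # NumPy's global RNG
    torch.manual_seed(seed)           # PyTorch RNG
    torch.cuda.manual_seed_all(seed)  # RNGs of all CUDA devices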
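
To illustrate what seeding achieves, here is a minimal sketch that uses only Python's built-in `random` module. The helper `shuffled_indices` is a hypothetical name introduced for this illustration and is not part of the notebook's helper files. The same idea carries over to the NumPy and PyTorch generators seeded in `set_all_seeds`.

```python
import random


def shuffled_indices(seed, n=10):
    """Seed Python's global RNG, then return a shuffled list of n indices."""
    random.seed(seed)
    indices = list(range(n))
    random.shuffle(indices)
    return indices


first = shuffled_indices(seed=1)
second = shuffled_indices(seed=1)

# Re-seeding with the same value replays the same sequence of random numbers,
# so both shuffles produce the same order.
print(first == second)
```

Without the `random.seed(seed)` call, the two shuffles would generally differ, which is the situation we want to avoid for dataset shuffling and weight initialization.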

### 2. Setting cuDNN and PyTorch algorithmic behavior to deterministic

Similar to the `set_all_seeds` function above, I recommend setting the behavior of PyTorch and cuDNN to deterministic (this is particularly relevant when using GPUs). We can also define a function for that:

In [3]:
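def set_deterministic():
    if torch.cuda.is_available():
        # Disable the cuDNN autotuner, which may pick different algorithms between runs
        torch.backends.cudnn.benchmark = False
        # Restrict cuDNN to deterministic algorithms
        torch.backends.cudnn.deterministic = True
    # PyTorch 1.7 API; newer releases name this torch.use_deterministic_algorithms
    torch.set_deterministic(True)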
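
This notebook was run with PyTorch 1.7.0, where the switch is called `torch.set_deterministic`. Newer PyTorch releases expose it as `torch.use_deterministic_algorithms` instead. The following is a minimal sketch of a version-tolerant helper. The name `enable_deterministic_algorithms` and the choice to pass the `torch` module in as an argument are assumptions made for this illustration, not part of the original notebook.

```python
def enable_deterministic_algorithms(torch_module):
    """Enable deterministic algorithms with whichever API torch_module exposes.

    Returns the name of the function that was called, or None if neither
    API is available in the given module.
    """
    # Newer PyTorch releases
    if hasattr(torch_module, "use_deterministic_algorithms"):
        torch_module.use_deterministic_algorithms(True)
        return "use_deterministic_algorithms"
    # PyTorch 1.7, as used in this notebook
    if hasattr(torch_module, "set_deterministic"):
        torch_module.set_deterministic(True)
        return "set_deterministic"
    return None


# Intended usage (requires PyTorch to be installed):
#   import torch
#   enable_deterministic_algorithms(torch)
```

Depending on the CUDA version, some operations may need additional configuration to run deterministically; the randomness notes linked above describe those cases.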

# 1) Setup

After setting up the general configuration in this section, the following two sections train a ResNet-101 model without and with deterministic behavior to get a sense of how the deterministic options affect runtime.
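
The training helper used below prints the elapsed time itself. If you want to time an arbitrary function call in the same unit, a sketch along the following lines may be useful. The helper name `time_run` is hypothetical and is not part of the helper files imported in this notebook.

```python
import time


def time_run(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed wall-clock time in minutes)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_min = (time.perf_counter() - start) / 60
    return result, elapsed_min


# Small stand-in workload; in practice fn would be a training call.
total, minutes = time_run(sum, range(1_000_000))
print(f"Result: {total} | Time elapsed: {minutes:.4f} min")
```

Note that this measures wall-clock time for a single call, so background load on the machine can shift the numbers between runs.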

In [4]:
import os
import numpy as np
import torch
import random

In [5]:
##########################
### SETTINGS
##########################

# Device
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print('Device:', DEVICE)

# Data settings
num_classes = 10

# Hyperparameters
random_seed = 1
learning_rate = 0.01
batch_size = 128
num_epochs = 50

Device: cuda:0


In [6]:
import sys

sys.path.insert(0, "..") # to include ../helper_evaluate.py etc.

from helper_evaluate import compute_accuracy
from helper_data import get_dataloaders_cifar10
from helper_train import train_classifier_simple_v1

# 2) Run without Deterministic Behavior

Before we enable deterministic behavior, we train a ResNet-101 with otherwise identical settings as a baseline for comparison. Note that setting random seeds doesn't affect the timing results.

In [7]:
### Set random seed ###
set_all_seeds(random_seed)

In [8]:
##########################
### Dataset
##########################

train_loader, valid_loader, test_loader = get_dataloaders_cifar10(
    batch_size, 
    num_workers=0, 
    validation_fraction=0.1)

Files already downloaded and verified


In [9]:
##########################
### Model
##########################

from deterministic_benchmark_utils import resnet101

model = resnet101(num_classes, grayscale=False)

model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

In [10]:
train_classifier_simple_v1(num_epochs=num_epochs, model=model, 
                           optimizer=optimizer, device=DEVICE, 
                           train_loader=train_loader, valid_loader=valid_loader, 
                           logging_interval=200)

Epoch: 001/050 | Batch 0000/0352 | Loss: 2.6711
Epoch: 001/050 | Batch 0200/0352 | Loss: 2.9076
Epoch: 001/050 | Train Acc.: 10.373% |  Loss: 3.180
Epoch: 001/050 | Validation Acc.: 10.120% |  Loss: 3.161
Time elapsed: 1.21 min
Epoch: 002/050 | Batch 0000/0352 | Loss: 2.1174
Epoch: 002/050 | Batch 0200/0352 | Loss: 1.9204
Epoch: 002/050 | Train Acc.: 26.449% |  Loss: 1.886
Epoch: 002/050 | Validation Acc.: 26.340% |  Loss: 1.872
Time elapsed: 2.47 min
Epoch: 003/050 | Batch 0000/0352 | Loss: 1.7529
Epoch: 003/050 | Batch 0200/0352 | Loss: 1.7582
Epoch: 003/050 | Train Acc.: 35.082% |  Loss: 1.731
Epoch: 003/050 | Validation Acc.: 34.200% |  Loss: 1.733
Time elapsed: 3.72 min
Epoch: 004/050 | Batch 0000/0352 | Loss: 1.6620
Epoch: 004/050 | Batch 0200/0352 | Loss: 1.6138
Epoch: 004/050 | Train Acc.: 44.818% |  Loss: 1.538
Epoch: 004/050 | Validation Acc.: 43.380% |  Loss: 1.562
Time elapsed: 4.98 min
Epoch: 005/050 | Batch 0000/0352 | Loss: 1.4111
Epoch: 005/050 | Batch 0200/0352 | Loss:

# 3) Run with Deterministic Behavior

In this section, we set the deterministic behavior via the `set_deterministic()` function defined at the top of this notebook and compare how it affects the runtime speed of the ResNet-101 model. (Note that setting random seeds doesn't affect the timing results.)

In [11]:
set_deterministic()

In [12]:
### Set random seed ###
set_all_seeds(random_seed)

In [13]:
##########################
### Dataset
##########################

train_loader, valid_loader, test_loader = get_dataloaders_cifar10(
    batch_size, 
    num_workers=0, 
    validation_fraction=0.1)

Files already downloaded and verified


In [14]:
##########################
### Model
##########################

from deterministic_benchmark_utils import resnet101

model = resnet101(num_classes, grayscale=False)

model = model.to(DEVICE)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

In [15]:
train_classifier_simple_v1(num_epochs=num_epochs, model=model, 
                           optimizer=optimizer, device=DEVICE, 
                           train_loader=train_loader, valid_loader=valid_loader, 
                           logging_interval=200)

Epoch: 001/050 | Batch 0000/0352 | Loss: 2.6711
Epoch: 001/050 | Batch 0200/0352 | Loss: 3.3140
Epoch: 001/050 | Train Acc.: 24.200% |  Loss: 2.016
Epoch: 001/050 | Validation Acc.: 23.920% |  Loss: 1.993
Time elapsed: 1.25 min
Epoch: 002/050 | Batch 0000/0352 | Loss: 1.8132
Epoch: 002/050 | Batch 0200/0352 | Loss: 1.6629
Epoch: 002/050 | Train Acc.: 36.464% |  Loss: 1.664
Epoch: 002/050 | Validation Acc.: 35.900% |  Loss: 1.637
Time elapsed: 2.49 min
Epoch: 003/050 | Batch 0000/0352 | Loss: 1.5070
Epoch: 003/050 | Batch 0200/0352 | Loss: 1.4028
Epoch: 003/050 | Train Acc.: 44.527% |  Loss: 1.565
Epoch: 003/050 | Validation Acc.: 43.400% |  Loss: 1.571
Time elapsed: 3.74 min
Epoch: 004/050 | Batch 0000/0352 | Loss: 1.3491
Epoch: 004/050 | Batch 0200/0352 | Loss: 1.4037
Epoch: 004/050 | Train Acc.: 51.798% |  Loss: 1.318
Epoch: 004/050 | Validation Acc.: 50.020% |  Loss: 1.361
Time elapsed: 4.98 min
Epoch: 005/050 | Batch 0000/0352 | Loss: 1.2558
Epoch: 005/050 | Batch 0200/0352 | Loss:

# 4) Result

In this particular case, enabling deterministic behavior does not noticeably affect runtime: in the logs shown above, both runs finish the fourth epoch after 4.98 min.
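
To make the comparison more concrete, the cumulative "Time elapsed" values logged above can be converted into per-epoch durations. The following is a minimal sketch that covers only the first four epochs of each run, since those are the ones with complete log lines above. The helper `per_epoch_minutes` is a hypothetical name introduced for this illustration.

```python
import re

# "Time elapsed" lines copied from the two training logs above (epochs 1-4)
nondeterministic_log = """
Time elapsed: 1.21 min
Time elapsed: 2.47 min
Time elapsed: 3.72 min
Time elapsed: 4.98 min
"""

deterministic_log = """
Time elapsed: 1.25 min
Time elapsed: 2.49 min
Time elapsed: 3.74 min
Time elapsed: 4.98 min
"""


def per_epoch_minutes(log_text):
    """Convert cumulative elapsed times into per-epoch durations (minutes)."""
    cumulative = [float(m) for m in re.findall(r"Time elapsed: ([\d.]+) min", log_text)]
    previous = [0.0] + cumulative[:-1]
    return [round(now - before, 2) for now, before in zip(cumulative, previous)]


print(per_epoch_minutes(nondeterministic_log))  # [1.21, 1.26, 1.25, 1.26]
print(per_epoch_minutes(deterministic_log))     # [1.25, 1.24, 1.25, 1.24]
```

Over these four epochs, each epoch takes roughly 1.2 to 1.3 minutes in both settings.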