<a href="https://colab.research.google.com/github/andrei-radulescu-banu/stat453-deep-learning-ss21/blob/main/L12/code/adam-experiment4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

STAT 453: Deep Learning (Spring 2021)  
Instructor: Sebastian Raschka (sraschka@wisc.edu)  

Course website: http://pages.stat.wisc.edu/~sraschka/teaching/stat453-ss2021/  
GitHub repository: https://github.com/rasbt/stat453-deep-learning-ss21

---

In [1]:
!pip install watermark
!pip install colab-env --upgrade
import colab_env
colab_env.envvar_handler.add_env("CUBLAS_WORKSPACE_CONFIG", ":4096:8", overwrite=True)

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting watermark
  Downloading watermark-2.3.1-py2.py3-none-any.whl (7.2 kB)
Collecting jedi>=0.10
  Downloading jedi-0.18.1-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi, watermark
Successfully installed jedi-0.18.1 watermark-2.3.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting colab-env
  Downloading colab-env-0.2.0.tar.gz (4.7 kB)
Collecting python-dotenv<1.0,>=0.10.0
  Downloading python_dotenv-0.21.0-py3-none-any.whl (18 kB)
Building wheels for collected packages: colab-env
  Building wheel for colab-env (setup.py) ... done
  Created wheel for colab-env: filename=colab_env-0.2.0-py3-none-any.whl size=3838 sha256=a74a205da6d73415f25deaa63fe5185390b9183d8471e27bd7a305704231f2cf
  Stored in directory: /root/.cache/pip/wheels/bb/ca

In [2]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch

Author: Sebastian Raschka

Python implementation: CPython
Python version       : 3.7.15
IPython version      : 7.9.0

torch: 1.12.1+cu113



# MLP with Dropout 

## Imports

In [3]:
!git clone https://github.com/andrei-radulescu-banu/stat453-deep-learning-ss21.git

Cloning into 'stat453-deep-learning-ss21'...
remote: Enumerating objects: 1133, done.
remote: Counting objects: 100% (83/83), done.
remote: Compressing objects: 100% (70/70), done.
remote: Total 1133 (delta 36), reused 4 (delta 4), pack-reused 1050
Receiving objects: 100% (1133/1133), 114.93 MiB | 20.94 MiB/s, done.
Resolving deltas: 100% (140/140), done.
Checking out files: 100% (927/927), done.


In [4]:
!ls /content/stat453-deep-learning-ss21/L12/code

adabelief.ipynb		adamW.ipynb	      helper_plotting.py
adam-experiment1.ipynb	batchsize-1024.ipynb  helper_train.py
adam-experiment2.ipynb	batchsize-64.ipynb    scheduler.ipynb
adam-experiment3.ipynb	helper_dataset.py     sgd-scheduler-momentum.ipynb
adam.ipynb		helper_evaluation.py


In [5]:
import sys, os
sys.path.append("/content/stat453-deep-learning-ss21/L12/code")

In [6]:
import torch
import numpy as np
import matplotlib.pyplot as plt

In [7]:
# From local helper files
from helper_evaluation import set_all_seeds, set_deterministic
from helper_train import train_model
from helper_plotting import plot_training_loss, plot_accuracy, show_examples
from helper_dataset import get_dataloaders_mnist

## Settings and Dataset

In [8]:
##########################
### SETTINGS
##########################

RANDOM_SEED = 123
BATCH_SIZE = 256
NUM_HIDDEN_1 = 75
NUM_HIDDEN_2 = 45
NUM_EPOCHS = 100
DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

In [9]:
set_all_seeds(RANDOM_SEED)
set_deterministic()
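
The bodies of `set_all_seeds` and `set_deterministic` live in `helper_evaluation.py` and are not shown in this notebook. The block below is only a minimal sketch of what helpers of this kind typically do; the names `seed_everything` and `request_determinism` are hypothetical and are not the course's functions. It also suggests why `CUBLAS_WORKSPACE_CONFIG` is set in the first cell: PyTorch's deterministic mode requires that variable for cuBLAS operations on recent CUDA versions.

```python
import os
import random


def seed_everything(seed):
    """Seed every random number generator that is available (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


def request_determinism():
    """Ask PyTorch to prefer deterministic kernels (hypothetical helper)."""
    # Needed by deterministic cuBLAS; the value matches the one set in the first cell.
    os.environ.setdefault("CUBLAS_WORKSPACE_CONFIG", ":4096:8")
    try:
        import torch
    except ImportError:
        return
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
    torch.use_deterministic_algorithms(True)


# Re-seeding with the same value reproduces the same random draws.
seed_everything(123)
first_draw = random.random()
seed_everything(123)
second_draw = random.random()
print(first_draw == second_draw)
```

Seeding fixes the random draws (weight initialization, shuffling, dropout masks), while the determinism switches address GPU kernels whose results can otherwise vary from run to run.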

In [10]:
##########################
### MNIST DATASET
##########################

train_loader, valid_loader, test_loader = get_dataloaders_mnist(
    batch_size=BATCH_SIZE,
    validation_fraction=0.1)

# Checking the dataset
for images, labels in train_loader:  
    print('Image batch dimensions:', images.shape)
    print('Image label dimensions:', labels.shape)
    print('Class labels of 10 examples:', labels[:10])
    break

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting data/MNIST/raw/train-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/train-labels-idx1-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting data/MNIST/raw/t10k-images-idx3-ubyte.gz to data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting data/MNIST/raw/t10k-labels-idx1-ubyte.gz to data/MNIST/raw

Image batch dimensions: torch.Size([256, 1, 28, 28])
Image label dimensions: torch.Size([256])
Class labels of 10 examples: tensor([4, 5, 8, 9, 9, 4, 9, 9, 3, 9])
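
The training log further down counts batches as `.../0210`. The arithmetic below is a sketch of where that number plausibly comes from. It assumes that the validation split is carved out of the 60,000 MNIST training images and that the training loader drops the final incomplete batch; both are inferences from the settings and the log, since `helper_dataset.py` is not shown here.

```python
import math

MNIST_TRAIN_SIZE = 60_000    # size of the standard MNIST training split
VALIDATION_FRACTION = 0.1    # value passed to get_dataloaders_mnist above
BATCH_SIZE = 256             # value from the settings cell

num_valid = round(VALIDATION_FRACTION * MNIST_TRAIN_SIZE)
num_train = MNIST_TRAIN_SIZE - num_valid

full_batches = num_train // BATCH_SIZE            # incomplete last batch dropped
all_batches = math.ceil(num_train / BATCH_SIZE)   # incomplete last batch kept
leftover = num_train - full_batches * BATCH_SIZE  # samples in the incomplete batch

print(num_train, full_batches, all_batches, leftover)
# prints: 54000 210 211 240
```

Under these assumptions, each epoch performs 210 parameter updates and skips 240 training images, which differ from epoch to epoch if the loader shuffles.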


## Model

In [11]:
class MultilayerPerceptron(torch.nn.Module):

    def __init__(self, num_features, num_classes, drop_proba, 
                 num_hidden_1, num_hidden_2):
        super().__init__()
        
        self.my_network = torch.nn.Sequential(
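            # 1st hidden layer
            torch.nn.Flatten(),
            # bias is omitted because BatchNorm1d adds its own learnable shift
            torch.nn.Linear(num_features, num_hidden_1, bias=False),
            torch.nn.BatchNorm1d(num_hidden_1),
            torch.nn.ReLU(),
            torch.nn.Dropout(drop_proba),  # dropout probability set by the caller
            # 2nd hidden layer
            torch.nn.Linear(num_hidden_1, num_hidden_2, bias=False),
            torch.nn.BatchNorm1d(num_hidden_2),
            torch.nn.ReLU(),
            torch.nn.Dropout(0.3),  # fixed rate, independent of drop_proba
            # output layer (returns logits, so it keeps its bias and has no activation)
            torch.nn.Linear(num_hidden_2, num_classes)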
        )
           
    def forward(self, x):
        logits = self.my_network(x)
        return logits
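
Both hidden `Linear` layers above use `bias=False` and are followed by `BatchNorm1d`. The plain-Python sketch below illustrates the usual reason: batch normalization subtracts the per-feature batch mean, so a constant bias added beforehand cancels out. The function `batch_normalize` is a hypothetical stand-in for the normalization step only (no learnable scale or shift) and is not PyTorch's implementation.

```python
import math


def batch_normalize(values, eps=1e-5):
    """Normalize one feature over a batch to zero mean and unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + eps) for v in values]


pre_activations = [0.5, -1.2, 3.0, 0.7]   # one feature across a batch of 4
bias = 2.5                                 # a constant a Linear bias would add

without_bias = batch_normalize(pre_activations)
with_bias = batch_normalize([v + bias for v in pre_activations])

# The two results agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(without_bias, with_bias))
print(max_diff < 1e-9)
```

Because the bias has no effect on the normalized output, dropping it saves parameters without changing what the layer can represent.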

In [None]:
torch.manual_seed(RANDOM_SEED)
model = MultilayerPerceptron(num_features=28*28,
                             num_hidden_1=NUM_HIDDEN_1,
                             num_hidden_2=NUM_HIDDEN_2,
                             drop_proba=0.5,
                             num_classes=10)
model = model.to(DEVICE)

optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.5) # Use SGD instead of Adam

minibatch_loss_list, train_acc_list, valid_acc_list = train_model(
    model=model,
    num_epochs=NUM_EPOCHS,
    train_loader=train_loader,
    valid_loader=valid_loader,
    test_loader=test_loader,
    optimizer=optimizer,
    device=DEVICE,
    logging_interval=100)

plot_training_loss(minibatch_loss_list=minibatch_loss_list,
                   num_epochs=NUM_EPOCHS,
                   iter_per_epoch=len(train_loader),
                   results_dir=None,
                   averaging_iterations=20)
plt.show()

plot_accuracy(train_acc_list=train_acc_list,
              valid_acc_list=valid_acc_list,
              results_dir=None)
plt.ylim([80, 100])
plt.show()
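
The cell above swaps Adam for `torch.optim.SGD` with `lr=0.005` and `momentum=0.5`. The sketch below shows the momentum update rule that PyTorch documents for SGD (velocity = momentum × velocity + gradient, then parameter = parameter − lr × velocity), applied by hand to a toy one-dimensional loss `w**2`. The function name `sgd_momentum_step` and the toy loss are illustrative assumptions, and dampening, weight decay, and Nesterov momentum are left out.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.005, momentum=0.5):
    """One SGD-with-momentum update for a single scalar parameter."""
    velocity = momentum * velocity + grad   # accumulate past gradients
    param = param - lr * velocity           # step along the accumulated direction
    return param, velocity


w, v = 1.0, 0.0          # start at w = 1 with no accumulated velocity
history = []
for _ in range(3):
    grad = 2.0 * w       # gradient of the toy loss w**2
    w, v = sgd_momentum_step(w, grad, v)
    history.append(w)

print(history)
```

With momentum 0.5, each step adds half of the previous velocity to the current gradient, so updates are larger than in plain SGD when successive gradients point the same way.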

Epoch: 001/100 | Batch 0000/0210 | Loss: 2.3859
Epoch: 001/100 | Batch 0100/0210 | Loss: 1.6672
Epoch: 001/100 | Batch 0200/0210 | Loss: 1.3961
Epoch: 001/100 | Train: 80.52% | Validation: 84.38%
Time elapsed: 0.21 min
Epoch: 002/100 | Batch 0000/0210 | Loss: 1.3419
Epoch: 002/100 | Batch 0100/0210 | Loss: 1.0870
Epoch: 002/100 | Batch 0200/0210 | Loss: 1.0335
Epoch: 002/100 | Train: 87.08% | Validation: 89.83%
Time elapsed: 0.37 min
Epoch: 003/100 | Batch 0000/0210 | Loss: 1.0379
Epoch: 003/100 | Batch 0100/0210 | Loss: 0.9212
Epoch: 003/100 | Batch 0200/0210 | Loss: 0.8476
Epoch: 003/100 | Train: 89.20% | Validation: 91.47%
Time elapsed: 0.53 min
Epoch: 004/100 | Batch 0000/0210 | Loss: 0.7911
Epoch: 004/100 | Batch 0100/0210 | Loss: 0.8304
Epoch: 004/100 | Batch 0200/0210 | Loss: 0.7501
Epoch: 004/100 | Train: 90.15% | Validation: 92.33%
Time elapsed: 0.70 min
Epoch: 005/100 | Batch 0000/0210 | Loss: 0.6264
Epoch: 005/100 | Batch 0100/0210 | Loss: 0.6947
Epoch: 005/100 | Batch 0200/