# Optimization and Initialization 

In [1]:
!conda info 



     active environment : DL-torch-arm64
    active env location : /Applications/anaconda3/envs/DL-torch-arm64
            shell level : 2
       user config file : /Users/antoineherbet/.condarc
 populated config files : /Users/antoineherbet/.condarc
          conda version : 24.7.1
    conda-build version : 24.5.1
         python version : 3.12.2.final.0
                 solver : libmamba (default)
       virtual packages : __archspec=1=m1
                          __conda=24.7.1=0
                          __osx=14.5=0
                          __unix=0=0
       base environment : /Applications/anaconda3  (writable)
      conda av data dir : /Applications/anaconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/osx-arm64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://repo.anaconda.com/pkgs/main/osx-arm64
                          https://repo.anaconda.com/pkgs/main/n

In [2]:
import torch
from torch import mps

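# Fall back to the CPU so that `device` is always defined;
# it is overridden below if an accelerator is available.
device = torch.device("cpu")

# If Metal Performance Shader (mps) is not available tell me: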
if not torch.backends.mps.is_available():
    if not torch.backends.mps.is_built():
        print("MPS not available because the current PyTorch install was not "
              "built with MPS enabled.")
    else:
        print("MPS not available because the current MacOS version is not 12.3+ "
              "and/or you do not have an MPS-enabled device on this machine.")

# If mps is available, use it as the device.
else:
    device = torch.device("mps")

# If cuda (nvidia gpu) is not available tell me:
if not torch.cuda.is_available():
    print("Cuda not available because the current PyTorch install was not "
          "built with Cuda enabled.")

# If cuda is available, use it as the device.
else:
    device = torch.device("cuda")

# Is mps available?

mps_avail = torch.backends.mps.is_available() 
print(f"Is Metal Performance Shader (mps) available? {mps_avail}")

# Is cuda available?

cuda_avail = torch.cuda.is_available() 
print(f"Is Cuda available? {cuda_avail}")

Cuda not available because the current PyTorch install was not built with Cuda enabled.
Is Metal Performance Shader (mps) available? True
Is Cuda available? False


In this tutorial, we will review techniques for the optimization and initialization of neural networks. Increasing the depth of a neural network brings various challenges. Most importantly, we need a stable gradient flow through the network; otherwise, we may encounter vanishing or exploding gradients. This is why we will take a closer look at two concepts: initialization and optimization.
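To make the notion of vanishing and exploding gradients concrete, here is a minimal sketch. It is not part of the original notebook, and it assumes a purely linear stack of layers without activation functions, which is a simplification. The helper name first_layer_grad_norm and the values chosen for depth, width, and batch size are illustrative assumptions.

```python
import torch


def first_layer_grad_norm(weight_std, depth=10, width=64, seed=0):
    """Gradient norm at the first layer of a deep linear stack (illustrative helper)."""
    gen = torch.Generator().manual_seed(seed)
    # One square weight matrix per layer, drawn from N(0, weight_std^2).
    weights = [
        (torch.randn(width, width, generator=gen, dtype=torch.float64) * weight_std)
        .requires_grad_()
        for _ in range(depth)
    ]
    h = torch.randn(16, width, generator=gen, dtype=torch.float64)
    # Forward pass: repeated matrix multiplication, no non-linearity (assumption).
    for w in weights:
        h = h @ w
    loss = h.pow(2).mean()
    loss.backward()
    return weights[0].grad.norm().item()


# Too small, variance-preserving (1 / sqrt(width)), and too large weight scales.
for std in (0.01, 64 ** -0.5, 1.0):
    print(f"weight std {std:.4f}: first-layer grad norm {first_layer_grad_norm(std):.3e}")
```

In this simplified setting, each layer rescales the signal by roughly weight_std * sqrt(width), so the first-layer gradient shrinks or grows exponentially with depth unless that factor is close to 1.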

In the first half of the notebook, we will review different initialization techniques, going step by step from the simplest initialization to methods that are used today in very deep networks. In the second half, we focus on optimization and compare the optimizers SGD, SGD with Momentum, and Adam.

Let's start with importing our standard libraries:

In [4]:
## Standard libraries
import os
import json
import math
import numpy as np
import copy

## Imports for plotting
import matplotlib.pyplot as plt
from matplotlib import cm
import matplotlib_inline.backend_inline
%matplotlib inline
# The IPython.display version of set_matplotlib_formats is deprecated;
# use the one provided by matplotlib_inline instead.
matplotlib_inline.backend_inline.set_matplotlib_formats('svg', 'pdf')
import seaborn as sns
sns.set()

## Progress bar
from tqdm.notebook import tqdm

## PyTorch
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as data
import torch.optim as optim

We will use the same set_seed function as in Notebook 2, as well as the path variables DATASET_PATH and CHECKPOINT_PATH. Adjust the paths if necessary.