# Testing Regularization on MNIST with LeNet

Here I use LeNet models trained on MNIST with each of the regularization techniques I have implemented: No Regularization, L1, L2, Elastic Net, Soft SVB, Hard SVB, Jacobian, Jacobian Determinant, Dropout, Confidence Penalty, Label Smoothing, Noise Injection to Inputs and Noise Injection to Weights. On these models, I apply all of the visualization techniques I have implemented: Training and Test Loss Curves, Weight Distributions (for L1, L2 and Elastic Net), Feature Map Visualizations, Uncertainty Estimates, t-SNE and PCA of Activations, Saliency Maps and Occlusion Sensitivity. Because there are so many combinations, this notebook is quite messy, but it works as a reference for seeing how any specific regularization method affects the different visualizations.

The MNIST data is preprocessed by normalizing with mean 0.1307 and standard deviation 0.3081. The batch size is 100. The model is optimized using SGD with momentum 0.9 and standard cross-entropy loss. Model parameters are initialized using Glorot initialization (see Glorot & Bengio, 2010), except for SVB regularization, which uses orthogonal initialization. The learning rate starts at 0.1 and is reduced to 0.01 and 0.001 one third and two thirds of the way into training, respectively. The models are trained for 100 epochs.
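The preprocessing and learning-rate schedule described above can be sketched as follows. This is only an illustration, not the code in `data_generators` or `model_classes` (which is not shown here): the helper names `normalize` and `learning_rate` are hypothetical, 0.3081 is treated as the standard deviation, and the learning-rate drops are assumed to happen at the 0-indexed epochs `n_epochs // 3` and `2 * n_epochs // 3`.

```python
def normalize(pixel, mean=0.1307, std=0.3081):
    """Standardize a pixel intensity in [0, 1] with the MNIST statistics."""
    return (pixel - mean) / std


def learning_rate(epoch, n_epochs=100, base_lr=0.1):
    """Step schedule: base_lr, then a tenth of it, then a hundredth of it.

    Assumes 0-indexed epochs and drops at 1/3 and 2/3 of training.
    """
    if epoch < n_epochs // 3:
        return base_lr
    if epoch < 2 * n_epochs // 3:
        return base_lr / 10
    return base_lr / 100


# Print the learning rate at the start, at each assumed drop, and at the end.
for epoch in (0, 33, 66, 99):
    print(epoch, learning_rate(epoch))
```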

### Imports and Model Loading

In [None]:
import jupyter_black
import pickle
import torch
from torchsummary import summary

from data_generators import data_loader_MNIST
from model_classes import LeNet
from tools import ModelInfo
from plotting_tools import (
    plot_results,
    plot_reg_results,
    plot_weight_distributions,
    plot_activation_maps,
    plot_predicted_probabilities,
    plot_activations_pca,
    plot_activations_tsne,
    plot_saliency_maps,
    plot_occlusion_sensitivity,
)

jupyter_black.load()

In [None]:
# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

In [None]:
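# Loading MNIST dataset
in_channels = 1
train_loader, test_loader = data_loader_MNIST()
summary_model = LeNet(in_channels=in_channels).to(device)
summary(summary_model, (in_channels, 28, 28))

# Hyperparameters referenced by the training cells below
# (values taken from the setup description at the top of the notebook)
lr = 0.1
momentum = 0.9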

In [None]:
# Load models
model_names = [
    "model_no_reg",
    "model_l1",
    "model_l2",
    "model_l1_l2",
    "model_svb",
    "model_soft_svb",
    "model_jacobi_reg",
    "model_jacobi_det_reg",
    "model_dropout",
    "model_conf_penalty",
    "model_label_smoothing",
    "model_noise_inject_inputs",
    "model_noise_inject_weights",
]
models = {name: ModelInfo(name) for name in model_names}

## No regularization

### Training

In [None]:
# model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
# n_epochs = 10
# losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
#    train_loader, test_loader, model, n_epochs
# )
model = torch.load("./Trained_models/model_no_reg.pt", map_location=torch.device("cpu"))
with open("./Trained_models/model_no_reg_data.pkl", "rb") as f:
    data = pickle.load(f)
    losses = data["losses"]
    reg_losses = data["reg_losses"]
    epochs = data["epochs"]
    weights = data["weights"]
    train_accuracies = data["train_accuracies"]
    test_accuracies = data["test_accuracies"]

### Visualization

#### Plot of Losses and Accuracies

The most straightforward way to visualize the effect of regularization is by plotting the training and validation loss over time. If regularization is working correctly, we should observe a decrease in the gap between training and validation loss, indicating a reduction in overfitting.
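To put a number on the gap mentioned above, a small helper like the one below could be used. It is a sketch with a hypothetical function name, and it works on accuracies rather than losses because the saved data contains `train_accuracies` and `test_accuracies`. The example values are made up purely for illustration.

```python
def generalization_gap(train_accuracies, test_accuracies):
    """Per-epoch difference between training and test accuracy."""
    return [train - test for train, test in zip(train_accuracies, test_accuracies)]


# Made-up accuracies for three epochs, for illustration only.
example_train = [0.90, 0.97, 0.99]
example_test = [0.89, 0.95, 0.96]
gaps = generalization_gap(example_train, example_test)
print([round(gap, 2) for gap in gaps])
```

With the data loaded above, the call would be `generalization_gap(train_accuracies, test_accuracies)`, assuming both are per-epoch lists of equal length. A gap that grows over the epochs suggests overfitting.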

In [None]:
plot_results(
    epochs, losses, train_accuracies, test_accuracies, title="No regularization"
)

#### Plot of Weight Distributions

For L1, L2, and Elastic Net regularization, it is useful to visualize the distribution of the weights in the model. L1 regularization should drive many weights to zero (or very close to it), while L2 regularization typically shrinks all weights toward smaller magnitudes without making them exactly zero.
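The difference between the two penalties can be illustrated on synthetic weights. The sketch below uses NumPy and the textbook proximal updates for each penalty; all function names are hypothetical. The notebook's `train` function may apply the penalties differently (for example by adding them to the loss), in which case L1 weights would cluster near zero rather than land exactly on it.

```python
import numpy as np


def near_zero_fraction(weights, tol=1e-3):
    """Fraction of weights whose magnitude is below `tol`."""
    return float(np.mean(np.abs(np.asarray(weights)) < tol))


def l1_proximal_step(weights, threshold):
    """Soft-thresholding: shrink magnitudes by `threshold`, clipping at zero."""
    return np.sign(weights) * np.maximum(np.abs(weights) - threshold, 0.0)


def l2_shrink_step(weights, lmbd):
    """Minimizer of 0.5*(v - w)^2 + 0.5*lmbd*v^2, i.e. a uniform rescaling."""
    return weights / (1.0 + lmbd)


rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=10_000)  # synthetic stand-in for a weight tensor

l1_weights = l1_proximal_step(weights, threshold=0.05)
l2_weights = l2_shrink_step(weights, lmbd=0.5)
print("near-zero fraction after L1 step:", near_zero_fraction(l1_weights))
print("near-zero fraction after L2 step:", near_zero_fraction(l2_weights))
```

The L1 step sets every weight with magnitude below the threshold to exactly zero, while the L2 step only rescales, so its near-zero fraction stays small.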

In [None]:
plot_weight_distributions(model, title="Weight Distributions with No Regularization")

#### Plots of Activation Maps

Plots activation maps for each filter in each convolutional layer for a random image taken from the loader passed to the function. To see whether the model builds good representations and picks up on interesting features with its filters, use train_loader; to see how well it generalizes, use test_loader. The maps in the first convolutional layer look almost identical to the input image, while those in the second layer start to show clearer features.

For convolutional neural networks (CNNs) in particular, visualizing the feature maps (the activations of the convolutional layers) can provide insight into what features the network is learning and how regularization affects them. For instance, too much L1/L2 regularization might result in overly simplistic feature maps, while too little might result in feature maps that are overly complex or noisy.
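One common way to obtain such activation maps in PyTorch is with forward hooks. The sketch below is not the implementation of `plot_activation_maps` (which lives in `plotting_tools` and is not shown here). The function name `collect_conv_activations` and the toy network are hypothetical stand-ins.

```python
import torch
from torch import nn


def collect_conv_activations(model, image):
    """Run one image of shape (C, H, W) through `model`; return Conv2d outputs by name."""
    activations = {}

    def make_hook(name):
        def hook(_module, _inputs, output):
            activations[name] = output.detach()

        return hook

    handles = [
        module.register_forward_hook(make_hook(name))
        for name, module in model.named_modules()
        if isinstance(module, nn.Conv2d)
    ]
    model.eval()
    with torch.no_grad():
        model(image.unsqueeze(0))  # add a batch dimension
    for handle in handles:
        handle.remove()  # leave the model without lingering hooks
    return activations


# Toy stand-in for the convolutional part of a LeNet-style network.
toy_cnn = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(6, 16, kernel_size=5),
    nn.ReLU(),
)
maps = collect_conv_activations(toy_cnn, torch.rand(1, 28, 28))
for name, activation in maps.items():
    print(name, tuple(activation.shape))
```

Each returned tensor has one channel per filter, so every channel can be drawn as its own greyscale image.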

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

We take the PCA of the activations (outputs) of each layer to see how they are distributed. The activations live in a high-dimensional space, so we project them onto the first two principal components for plotting. Different classes should ideally form distinct clusters, and overfitting may manifest as overly complex boundaries between classes.
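The projection step itself can be sketched with NumPy alone. This is an illustration rather than the code behind `plot_activations_pca`; the function name `pca_project` is hypothetical, and the activations are random numbers standing in for one layer's outputs (the feature count of 84 is arbitrary).

```python
import numpy as np


def pca_project(activations, n_components=2):
    """Project rows of `activations` (samples x features) onto the top principal components."""
    centered = activations - activations.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
fake_activations = rng.normal(size=(200, 84))  # synthetic stand-in for one layer's outputs
projected = pca_project(fake_activations)
print(projected.shape)
```

Each row of `projected` is one sample in the 2D plane, ready to be scatter-plotted and coloured by class label.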

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

t-SNE is a nonlinear alternative to PCA for reducing the activations of a particular layer to 2 or 3 dimensions for plotting. As with PCA, different classes should ideally form distinct clusters, and overfitting may manifest as overly complex boundaries between classes.

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

A saliency map is a simple, yet effective method for understanding which parts of the image contribute most significantly to a neural network's decision. It is created by calculating the gradient of the output category with respect to the input image. This gradient is then visualized as a heatmap overlaying the original image, with high-gradient regions indicating important areas for the model's decision. The intuition behind this is that the gradient measures how much a small change in each pixel's intensity would affect the final prediction. So, large gradient values suggest important pixels.
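The gradient computation described above can be sketched in a few lines of PyTorch. This is a minimal illustration, not the implementation of `plot_saliency_maps`. The function name `saliency_map` is hypothetical, and a toy linear classifier stands in for the trained LeNet.

```python
import torch
from torch import nn


def saliency_map(model, image, target):
    """Absolute gradient of the `target` class score w.r.t. an image of shape (C, H, W)."""
    model.eval()
    image = image.clone().detach().requires_grad_(True)
    score = model(image.unsqueeze(0))[0, target]
    (grad,) = torch.autograd.grad(score, image)
    # Collapse the channel dimension so the result can be drawn as an (H, W) heatmap.
    return grad.abs().max(dim=0).values


toy_model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # stand-in classifier
heatmap = saliency_map(toy_model, torch.rand(1, 28, 28), target=3)
print(heatmap.shape)
```

For a linear model the saliency map is simply the absolute value of the weights of the chosen class, which makes this toy case easy to check by hand.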

In [None]:
plot_saliency_maps(model, train_loader, num_images=12)

#### Plots of Occlusion Sensitivity

Occlusion sensitivity is a method that involves systematically occluding different parts of the input image with a grey square (or other "occluder") and monitoring the effect on the classifier's output. The output is then visualized as a heatmap showing how much the classifier's confidence decreased when each region was occluded, highlighting the regions of the input image that matter most for the model's decision. Here, the warmer the colour, the more important the region is for the classification. A score of 1 means that blocking the area did not change the classification, while a score of 0 means that blocking it removed the prediction entirely.
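The sliding-occluder procedure can be sketched independently of any particular model. The code below is an illustration, not the implementation of `plot_occlusion_sensitivity`. The function name, the `predict` callable and the toy classifier are hypothetical, and the heatmap here stores the drop in confidence for each occluder position, which may differ from the score convention used in the plots.

```python
import numpy as np


def occlusion_sensitivity(predict, image, target, occluder_size=8, stride=4, fill=0.5):
    """Heatmap of confidence drops for class `target` as a square occluder slides over `image`.

    `predict` is a hypothetical callable mapping an (H, W) array to class probabilities.
    """
    height, width = image.shape
    base = predict(image)[target]
    rows = range(0, height - occluder_size + 1, stride)
    cols = range(0, width - occluder_size + 1, stride)
    heatmap = np.zeros((len(rows), len(cols)))
    for i, r in enumerate(rows):
        for j, c in enumerate(cols):
            occluded = image.copy()
            occluded[r : r + occluder_size, c : c + occluder_size] = fill
            heatmap[i, j] = base - predict(occluded)[target]
    return heatmap


def toy_predict(image):
    """Toy two-class 'classifier': class 1 confidence is the mean of the centre patch."""
    p = float(image[10:18, 10:18].mean())
    return np.array([1.0 - p, p])


toy_image = np.zeros((28, 28))
toy_image[10:18, 10:18] = 1.0  # a bright square in the centre
drops = occlusion_sensitivity(toy_predict, toy_image, target=1, fill=0.0)
print(drops.shape)
```

Occluder positions that overlap the bright centre square produce the largest drops, while positions in the corners leave the toy prediction unchanged.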

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## L1 Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs, l1=True, l1_lmbd=0.001
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="L1 regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(model, title="Weight Distributions with L1 Regularization")

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## L2 Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs, l2=True, l2_lmbd=0.01
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="L2 regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(model, title="Weight Distributions with L2 Regularization")

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Elastic Net Regularization (L1 and L2)

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader,
    test_loader,
    model,
    n_epochs,
    l1=True,
    l1_lmbd=0.001,
    l2=True,
    l2_lmbd=0.001,
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="L1 and L2 regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Elastic Net Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Soft SVB Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs, soft_svb=True, soft_svb_lmbd=0.01
)
print(len(epochs))

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="Soft SVB regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Soft SVB Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Hard SVB Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs, hard_svb=True, hard_svb_lmbd=0.00001
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="Hard SVB regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Hard SVB Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Jacobian Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 3
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs, jacobi_reg=True, jacobi_reg_lmbd=1
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="Jacobi regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Jacobi Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Jacobian Determinant Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 3
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader,
    test_loader,
    model,
    n_epochs,
    jacobi_det_reg=True,
    jacobi_det_reg_lmbd=0.001,
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="Jacobi Determinant Regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Jacobi Determinant Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Dropout Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels, dropout_rate=0.2).to(
    device
)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_results(
    epochs, losses, train_accuracies, test_accuracies, title="Dropout Regularization"
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Dropout Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Confidence Penalty Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 3
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader,
    test_loader,
    model,
    n_epochs,
    conf_penalty=True,
    conf_penalty_lmbd=0.03,
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_reg_results(
    epochs,
    losses,
    reg_losses,
    train_accuracies,
    test_accuracies,
    title="Confidence Penalty Regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Confidence Penalty Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Label Smoothing Regularization

### Training

In [None]:
model = LeNet(lr=lr, momentum=momentum, in_channels=in_channels).to(device)
n_epochs = 3
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader,
    test_loader,
    model,
    n_epochs,
    label_smoothing=True,
    label_smoothing_lmbd=0.000001,
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_results(
    epochs,
    losses,
    train_accuracies,
    test_accuracies,
    title="Label Smoothing Regularization",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Label Smoothing Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Noise Injection (to inputs)

### Training

In [None]:
model = LeNet(
    lr=lr,
    momentum=momentum,
    in_channels=in_channels,
    noise_inject_input=True,
    noise_stddev=0.05,
).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_results(
    epochs,
    losses,
    train_accuracies,
    test_accuracies,
    title="Noise Injection (to inputs)",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Noise Injection (to inputs) Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)

## Noise Injection (to weights of first convolutional layer)

### Training

In [None]:
model = LeNet(
    lr=lr,
    momentum=momentum,
    in_channels=in_channels,
    noise_inject_weights=True,
    noise_stddev=0.03,
).to(device)
n_epochs = 5
losses, reg_losses, epochs, weights, train_accuracies, test_accuracies = train(
    train_loader, test_loader, model, n_epochs
)

### Visualization

#### Plot of Losses and Accuracies

In [None]:
plot_results(
    epochs,
    losses,
    train_accuracies,
    test_accuracies,
    title="Noise Injection (to weights of first conv layer)",
)

#### Plot of Weight Distributions

In [None]:
plot_weight_distributions(
    model, title="Weight Distributions with Noise Injection (to weights) Regularization"
)

#### Plots of Activation Maps

In [None]:
plot_activation_maps(model, train_loader, num_images=1)

#### Plot of Predicted Probabilities

In [None]:
plot_predicted_probabilities(model, train_loader, num_batches=10)

In [None]:
plot_predicted_probabilities(model, test_loader, num_batches=10)

#### Plot of PCA of Activations

In [None]:
plot_activations_pca(model, train_loader, device)

#### Plot of t-SNE of Activations

In [None]:
plot_activations_tsne(model, train_loader, device)

#### Plots of Saliency Maps

In [None]:
plot_saliency_maps(model, train_loader, num_images=9)

#### Plots of Occlusion Sensitivity

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=8, stride=4)

In [None]:
plot_occlusion_sensitivity(model, train_loader, num_images=3, occluder_size=4, stride=2)
