<a href="https://colab.research.google.com/github/kleeresearch/Python/blob/master/cvps22_19_mlp_weight_regularization.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **CVPS22 // Addressing Overfitting in MLPs**

*November 1, 2022*

This notebook explores weight regularization as a method for controlling overfitting in a multi-layer perceptron (MLP).

---

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import imageio as iio
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import ConfusionMatrixDisplay, classification_report

In [None]:
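# -- "seaborn-dark" was renamed "seaborn-v0_8-dark" in matplotlib 3.6
style = "seaborn-v0_8-dark" if "seaborn-v0_8-dark" in plt.style.available else "seaborn-dark"
plt.style.use(style)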
plt.rcParams["figure.figsize"] = (10, 7)
plt.rcParams["image.interpolation"] = "nearest"
plt.rcParams["image.cmap"] = "gist_gray"

---

Load the hand-written digits data:

In [None]:
# -- load in the hand-written digits data set
fname = "/content/drive/MyDrive/cvps22/data/examples/digits.png"
digits = np.asarray(iio.imread(fname)) / 255.

# -- get a list of individual numbers (note they are 20x20 pixels)
nums = digits.reshape(50, 20, 100, 20).transpose(0, 2, 1, 3).reshape(5000, 20, 20)

# -- create features array [NOTE THE .copy()]
nimg = nums.shape[0]
nrow = nums.shape[1]
ncol = nums.shape[2]
feat = nums.reshape(nimg, nrow * ncol).copy()

# -- set the target
# -- (the sheet is ordered by digit: 500 examples each of 0 through 9)
targ = np.repeat(np.arange(10), 500)
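
The `reshape`/`transpose`/`reshape` chain above is easier to follow on a toy example. Here is a minimal sketch, assuming a made-up sheet of 2 x 3 tiles that are each 2 x 2 pixels (the names `sheet` and `tiles` are illustrative, not part of the digits data):

```python
import numpy as np

# -- build a toy "sheet" of 2 x 3 tiles, each tile 2 x 2 pixels;
#    every pixel in tile k is set to the value k
tile_rows, tile_cols, th, tw = 2, 3, 2, 2
sheet = np.zeros((tile_rows * th, tile_cols * tw), dtype=int)
for r in range(tile_rows):
    for c in range(tile_cols):
        sheet[r * th:(r + 1) * th, c * tw:(c + 1) * tw] = r * tile_cols + c

# -- split into tiles:
#    (tile row, pixel row, tile col, pixel col)
#      -> (tile row, tile col, pixel row, pixel col)
#      -> (tile index, pixel row, pixel col)
tiles = sheet.reshape(tile_rows, th, tile_cols, tw).transpose(0, 2, 1, 3).reshape(-1, th, tw)

print(tiles.shape)      # (6, 2, 2)
print(tiles[:, 0, 0])   # [0 1 2 3 4 5]
```

The transpose is what brings the two tile axes next to each other, so that the final reshape can merge them into a single image index.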

In [None]:
# -- create a training/testing sample
feat_tr, feat_te, targ_tr, targ_te = train_test_split(feat, targ, test_size=0.2, random_state=302)

print("number of training examples : {0}".format(targ_tr.size))
print("number of testing examples  : {0}".format(targ_te.size))

Let's train a **Multi-layer Perceptron classifier**:

In [None]:
# -- instantiate an MLP classifier
mlp = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500)

# -- train it
mlp.fit(feat_tr, targ_tr)

# -- predict
pred_tr = mlp.predict(feat_tr)
pred_te = mlp.predict(feat_te)

# -- print accuracy
acc_tr = accuracy_score(targ_tr, pred_tr)
acc_te = accuracy_score(targ_te, pred_te)

print("training accuracy : {0}".format(acc_tr))
print("testing accuracy : {0}".format(acc_te))

# -- evaluate performance metrics
ConfusionMatrixDisplay.from_estimator(mlp, feat_te, targ_te)
print(classification_report(targ_te, pred_te))

In [None]:
# -- plot the loss function
fig, ax = plt.subplots()
ax.plot(mlp.loss_curve_)
ax.set_xlabel("iteration")
ax.set_ylabel("loss")
ax.set_yscale("log")
fig.show()

# -- visualize the weights
ww = mlp.coefs_[0].reshape(20, 20, 10)

fig, ax = plt.subplots(2, 5, figsize=[15, 5])
for ii in range(10):
  ax[ii // 5, ii % 5].imshow(np.abs(ww[:, :, ii]))

In [None]:
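# -- instantiate an MLP classifier with more neurons
# -- (100 hidden neurons assumed, to match the 10x10 weight grid plotted below)
mlp = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500)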

# -- train it
mlp.fit(feat_tr, targ_tr)

# -- predict
pred_tr = mlp.predict(feat_tr)
pred_te = mlp.predict(feat_te)

# -- print accuracy
acc_tr = accuracy_score(targ_tr, pred_tr)
acc_te = accuracy_score(targ_te, pred_te)

print("training accuracy : {0}".format(acc_tr))
print("testing accuracy : {0}".format(acc_te))
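
One way to see why a wider hidden layer is more flexible is to count its free parameters. This is a rough sketch, assuming 400 input pixels, 10 output classes, and a 100-neuron hidden layer for the larger model (as the 10 x 10 weight grid suggests); `mlp_param_count` is a hypothetical helper, not a scikit-learn function:

```python
def mlp_param_count(n_inputs, hidden_layer_sizes, n_outputs):
    """Count the weights and biases in a fully connected MLP."""
    sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    # -- each layer has (n_in * n_out) weights plus n_out biases
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

print(mlp_param_count(400, (10,), 10))    # 4120
print(mlp_param_count(400, (100,), 10))   # 41110
```

With 4000 training examples, the larger model has roughly ten times more parameters than examples to constrain them.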

In [None]:
# -- visualize the weights
ww = mlp.coefs_[0].reshape(20, 20, 100)

fig, ax = plt.subplots(10, 10, figsize=[15, 10])
for ii in range(100):
  ax[ii // 10, ii % 10].imshow(ww[:, :, ii], cmap="coolwarm", clim=[-0.5, 0.5])

This model is clearly overfit.  Let's use weight regularization (an L2 penalty on the weights, set by `alpha` in `MLPClassifier`) to reduce the model flexibility:

In [None]:
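# -- instantiate an MLP classifier with weight regularization
# -- (alpha sets the L2 penalty strength; 1.0 is an assumed, illustrative
# --  value, as is the 100-neuron hidden layer)
mlp = MLPClassifier(hidden_layer_sizes=(100,), alpha=1.0, max_iter=500)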

# -- train it
mlp.fit(feat_tr, targ_tr)

# -- predict
pred_tr = mlp.predict(feat_tr)
pred_te = mlp.predict(feat_te)

# -- print accuracy
acc_tr = accuracy_score(targ_tr, pred_tr)
acc_te = accuracy_score(targ_te, pred_te)

print("training accuracy : {0}".format(acc_tr))
print("testing accuracy : {0}".format(acc_te))
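
To make the penalty concrete, here is a minimal sketch of an L2 weight penalty. It assumes the term added to the loss is `0.5 * alpha * (sum of squared weights) / n_samples`, which follows the scikit-learn description of `alpha` as an L2 term divided by the sample size. `l2_penalty` is a hypothetical helper and the weight values are made up:

```python
import numpy as np

def l2_penalty(coefs, alpha, n_samples):
    """L2 weight penalty: 0.5 * alpha * (sum of squared weights) / n_samples."""
    sq_sum = sum(float(np.sum(w ** 2)) for w in coefs)
    return 0.5 * alpha * sq_sum / n_samples

# -- toy weight matrices (hypothetical values, not from a trained model)
coefs = [np.full((4, 3), 0.5), np.full((3, 2), -1.0)]

print(l2_penalty(coefs, alpha=0.5, n_samples=4))   # 0.5625
```

Because the penalty grows with the square of each weight, minimizing the total loss pushes the optimizer toward smaller weights, and a larger `alpha` pushes harder.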

In [None]:
# -- visualize the weights
ww = mlp.coefs_[0].reshape(20, 20, 100)

fig, ax = plt.subplots(10, 10, figsize=[15, 10])
for ii in range(100):
  ax[ii // 10, ii % 10].imshow(ww[:, :, ii], cmap="coolwarm", clim=[-0.15, 0.15])

In [None]:
# -- plot the loss function
fig, ax = plt.subplots()
ax.plot(mlp.loss_curve_)
ax.set_xlabel("iteration")
ax.set_ylabel("loss")
ax.set_yscale("log")
fig.show()
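
A natural follow-up is to scan several values of `alpha` and compare training and testing accuracy. This is a sketch under stated assumptions: it uses scikit-learn's built-in 8x8 digits from `load_digits` as a stand-in so that it runs without the course data file, and the `alpha` grid, hidden layer size, and `max_iter` are illustrative choices rather than tuned values.

```python
import warnings
from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# -- stand-in data: scikit-learn's 8x8 digits (NOT the 20x20 digits above)
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values in this data set run from 0 to 16
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=302)

# -- illustrative grid of regularization strengths (assumed values)
alphas = [1e-4, 1e-2, 1.0]
results = []
for alpha in alphas:
    clf = MLPClassifier(hidden_layer_sizes=(100,), alpha=alpha, max_iter=300,
                        random_state=0)
    # -- silence the warning raised if max_iter is reached before convergence
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", category=ConvergenceWarning)
        clf.fit(X_tr, y_tr)
    results.append((alpha, clf.score(X_tr, y_tr), clf.score(X_te, y_te)))

# -- print training and testing accuracy for each alpha
for alpha, acc_tr, acc_te in results:
    print("alpha = {0:<8} train : {1:.3f}  test : {2:.3f}".format(alpha, acc_tr, acc_te))
```

The gap between the training and testing columns is the quantity to watch: regularization is doing its job when that gap narrows without the testing accuracy dropping.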