## **Notebook PC#02**
## Regularized Linear Classifiers for the MNIST and CIFAR-10 datasets. High-level commands.
**Professor:** Fernando J. Von Zuben <br>
**Student:** <br>
**Student:**


In [1]:
from matplotlib import pyplot as plt
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist

(Xp, y), (Xtp, yt) = mnist.load_data()
Xa = Xp.reshape(Xp.shape[0], 784)
Xta = Xtp.reshape(Xtp.shape[0], 784)
X = Xa / 255.0
Xt = Xta / 255.0

print("Shape of X: ".ljust(10),  X.shape)
print("Shape of y: ".ljust(10),  y.shape)
print("Shape of Xt: ".ljust(10),  Xt.shape)
print("Shape of yt: ".ljust(10),  yt.shape)


Shape of X:  (60000, 784)
Shape of y:  (60000,)
Shape of Xt:  (10000, 784)
Shape of yt:  (10000,)


In [2]:
# Dataset balancing analysis
from tensorflow.keras.utils import to_categorical
import plotly.express as px

yy = to_categorical(y)
yyt = to_categorical(yt)
sum_along_columns1 = np.sum(yy, axis = 0)
print(sum_along_columns1)
sum_along_columns2 = np.sum(yyt, axis = 0)
print(sum_along_columns2)
fig1 = px.bar(x=range(10),y=sum_along_columns1, width=600, height=400,
              labels=dict(x="Class", y="Number of examples"), title="<b>Number of examples per class in the training dataset</b>")
fig1.update_layout(xaxis = dict(tickmode = 'linear', dtick = 1), title_x=0.5)
fig1.show()
fig2 = px.bar(x=range(10),y=sum_along_columns2, width=600, height=400,
              labels=dict(x="Class", y="Number of examples"), title="<b>Number of examples per class in the test dataset</b>")
fig2.update_layout(xaxis = dict(tickmode = 'linear', dtick = 1), title_x=0.5)
fig2.show()

[5923. 6742. 5958. 6131. 5842. 5421. 5918. 6265. 5851. 5949.]
[ 980. 1135. 1032. 1010.  982.  892.  958. 1028.  974. 1009.]
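
The per-class counts printed above come from summing one-hot vectors. As a minimal sketch of an equivalent count that skips the one-hot step, `np.bincount` can be applied to the integer labels directly. The `labels` array below is toy data standing in for `y` (an assumption for illustration, not the MNIST labels).

```python
import numpy as np

# Toy integer labels standing in for `y` (assumption: class ids in 0..9)
labels = np.array([0, 1, 1, 2, 2, 2, 9])

# minlength=10 guarantees one entry per class, even for classes with no examples
counts = np.bincount(labels, minlength=10)
print(counts)
```

With the real data, `np.bincount(y, minlength=10)` should give the same numbers as the column sums of `to_categorical(y)`, as integers rather than floats.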


## **A classifier with ridge regularization**


In [4]:
from numpy import mean
from numpy import std
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import RidgeClassifier

model = RidgeClassifier()
# Define the model evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=1)
# Evaluate the model and collect the scores
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# Report the model performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))
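
The scikit-learn documentation describes `RidgeClassifier` as converting the targets to {-1, 1} and then treating classification as a ridge regression task (multi-output regression in the multiclass case). The sketch below illustrates that idea in plain NumPy with the closed-form ridge solution. It is not scikit-learn's actual solver; the function names `fit_ridge_classifier` and `predict_ridge_classifier` and the toy data are hypothetical and exist only for illustration.

```python
import numpy as np

def fit_ridge_classifier(X, y, n_classes, alpha=1.0):
    """Closed-form ridge regression on +/-1 one-vs-rest targets (illustrative only)."""
    n_samples, n_features = X.shape
    # One column per class: +1 for the example's own class, -1 elsewhere
    T = -np.ones((n_samples, n_classes))
    T[np.arange(n_samples), y] = 1.0
    # Center inputs and targets so the intercept is not penalized
    x_mean, t_mean = X.mean(axis=0), T.mean(axis=0)
    Xc, Tc = X - x_mean, T - t_mean
    # Solve (Xc^T Xc + alpha * I) W = Xc^T Tc
    W = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ Tc)
    b = t_mean - x_mean @ W
    return W, b

def predict_ridge_classifier(X, W, b):
    # The predicted class is the one with the largest regression output
    return np.argmax(X @ W + b, axis=1)

# Toy data (assumption): three well-separated Gaussian blobs in 2D
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
y_toy = np.repeat(np.arange(3), 50)
X_toy = centers[y_toy] + 0.5 * rng.normal(size=(150, 2))

W, b = fit_ridge_classifier(X_toy, y_toy, n_classes=3, alpha=1.0)
accuracy = np.mean(predict_ridge_classifier(X_toy, W, b) == y_toy)
print(f"Training accuracy on the toy data: {accuracy:.3f}")
```

Larger values of `alpha` shrink the columns of `W` toward zero, which is the regularization effect being cross-validated in the cell above.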

In [None]:
from sklearn.metrics import confusion_matrix
import pandas as pd

model.fit(X,y)
print(f"Performance on the test data: {model.score(Xt,yt)} \n")
cf1 = confusion_matrix(yt, model.predict(Xt))  # signature is (y_true, y_pred): rows are true labels
confusion_matrix_df = pd.DataFrame(cf1)
confusion_matrix_df


In [None]:
cf2 = confusion_matrix(yt, model.predict(Xt), normalize='true')  # each row (true class) sums to 1
confusion_matrix_df = pd.DataFrame(cf2)
confusion_matrix_df
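
In scikit-learn's `confusion_matrix`, rows are indexed by the first argument (documented as `y_true`) and columns by the second (`y_pred`); `normalize='true'` divides each row by its sum, so the diagonal holds the per-class recall. The NumPy-only sketch below reproduces that computation on a toy example. The helper names `confusion_counts` and `normalize_rows` are hypothetical.

```python
import numpy as np

def confusion_counts(y_true, y_pred, n_classes):
    """Entry (i, j) counts the examples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)
    return cm

def normalize_rows(cm):
    """Divide each row by its sum; rows with no examples are left as zeros."""
    row_sums = cm.sum(axis=1, keepdims=True)
    out = np.zeros(cm.shape, dtype=float)
    return np.divide(cm, row_sums, out=out, where=row_sums != 0)

# Toy labels and predictions (assumption, for illustration only)
y_true = np.array([0, 0, 0, 1, 1, 2])
y_pred = np.array([0, 0, 1, 1, 1, 0])

cm = confusion_counts(y_true, y_pred, n_classes=3)
cm_norm = normalize_rows(cm)
print(cm)
print(cm_norm)
```

Swapping the two arguments transposes the count matrix, and row normalization then divides by the number of predictions per class instead of the number of true examples per class.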

## **Another classifier with ridge regularization**

In [None]:
from sklearn.linear_model import RidgeClassifierCV

c_values = [pow(2,d) for d in range(-10, 19, 2)] # Range goes up to 19 to ensure search up to 2^18
model1 = RidgeClassifierCV(alphas=c_values)
model1.fit(X,y)
print(f"Performance on the test data: {model1.score(Xt,yt)} \n")
cf3 = confusion_matrix(yt, model1.predict(Xt))  # signature is (y_true, y_pred): rows are true labels
confusion_matrix_df = pd.DataFrame(cf3)
confusion_matrix_df

In [None]:
cf4 = confusion_matrix(yt, model1.predict(Xt), normalize='true')  # each row (true class) sums to 1
confusion_matrix_df = pd.DataFrame(cf4)
confusion_matrix_df

In [None]:
print(c_values)
print(f'alpha_: {model1.alpha_}')

## **Visualization tools for unveiling the classification strategy**

In [None]:
# Only for the MNIST dataset
# weights = model.coef_.copy() # Run this command for the RidgeClassifier
weights = model1.coef_.copy() # Run this command for the RidgeClassifierCV
plt.figure(figsize=(10,5))
for i in range(10):
    plt.subplot(2, 5, i+1)
    weight = weights[i,:].reshape([28,28])
    plt.title(f'Class {i}')
    plt.imshow(weight, cmap='viridis')
    frame1 = plt.gca()
    frame1.axes.get_xaxis().set_visible(False)
    frame1.axes.get_yaxis().set_visible(False)

In [None]:
# Only for the MNIST dataset
# weights = model.coef_.copy() # Run this command for the RidgeClassifier
weights = model1.coef_.copy() # Run this command for the RidgeClassifierCV
scale = np.abs(weights).max()
plt.figure(figsize=(10,5))
for i in range(10):
    plt.subplot(2, 5, i+1)
    weight = weights[i,:].reshape([28,28])
    plt.title(f'Class {i}')
    plt.imshow(weight, cmap=plt.cm.seismic, vmin=-scale/2, vmax=scale/2, interpolation='bilinear')
    frame1 = plt.gca()
    frame1.axes.get_xaxis().set_visible(False)
    frame1.axes.get_yaxis().set_visible(False)

<font color="green">
Activity (a) <br>
What is the difference between the regularization strategy of PC01 and the regularization strategy of PC02 (cell 3 or cell 6)? <br>
</font>

Answer:

<font color="green">
Activity (b) <br>
What are the main differences between RidgeClassifier and RidgeClassifierCV? <br> </font>

Relevant links: <br>
[RidgeClassifier](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeClassifier.html) <br>
[RidgeClassifierCV](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeClassifierCV.html)
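
As a starting point for exploring the two classes linked above, the sketch below fits both on a small synthetic problem. It relies only on what the scikit-learn API exposes: `RidgeClassifier` takes a single `alpha` (default 1.0), while `RidgeClassifierCV` takes a list of `alphas` and stores the selected value in `alpha_` after fitting. The synthetic data from `make_classification` and the names `X_toy`, `y_toy`, `fixed`, and `searched` are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import RidgeClassifier, RidgeClassifierCV

# Small synthetic 3-class problem (assumption: stands in for MNIST/CIFAR-10)
X_toy, y_toy = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_classes=3, random_state=0
)

# Fixed regularization strength chosen by the user
fixed = RidgeClassifier(alpha=1.0).fit(X_toy, y_toy)

# Regularization strength selected by internal cross-validation over a grid
alphas = [2.0 ** d for d in range(-10, 19, 2)]
searched = RidgeClassifierCV(alphas=alphas).fit(X_toy, y_toy)

print(f"Fixed alpha:    {fixed.alpha}")
print(f"Selected alpha: {searched.alpha_}")
```

Comparing `fixed.score(...)` and `searched.score(...)` on held-out data is one way to check whether the search over `alphas` pays off for a given dataset.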

Answer:

## Run the same notebook below, adapted for the CIFAR-10 dataset.

In [None]:
from tensorflow.keras.datasets import cifar10  # same Keras namespace as the MNIST import

(Xp, y), (Xtp, yt) = cifar10.load_data()
Xa = Xp.reshape(Xp.shape[0], 3072)
Xta = Xtp.reshape(Xtp.shape[0], 3072)
y = y.ravel()
yt = yt.ravel()
X = Xa / 255.0
Xt = Xta / 255.0

print("Shape of X: ".ljust(10),  X.shape)
print("Shape of y: ".ljust(10),  y.shape)
print("Shape of Xt: ".ljust(10),  Xt.shape)
print("Shape of yt: ".ljust(10),  yt.shape)

In [None]:
# Dataset balancing analysis
from tensorflow.keras.utils import to_categorical
import plotly.express as px

yy = to_categorical(y)
yyt = to_categorical(yt)
sum_along_columns1 = np.sum(yy, axis = 0)
print(sum_along_columns1)
sum_along_columns2 = np.sum(yyt, axis = 0)
print(sum_along_columns2)
fig1 = px.bar(x=range(10),y=sum_along_columns1, width=600, height=400,
              labels=dict(x="Class", y="Number of examples"), title="<b>Number of examples per class in the training dataset</b>")
fig1.update_layout(xaxis = dict(tickmode = 'linear', dtick = 1), title_x=0.5)
fig1.show()
fig2 = px.bar(x=range(10),y=sum_along_columns2, width=600, height=400,
              labels=dict(x="Class", y="Number of examples"), title="<b>Number of examples per class in the test dataset</b>")
fig2.update_layout(xaxis = dict(tickmode = 'linear', dtick = 1), title_x=0.5)
fig2.show()

## **A classifier with ridge regularization**


In [None]:
from numpy import mean
from numpy import std
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.linear_model import RidgeClassifier

model = RidgeClassifier()
# Define the model evaluation procedure
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=1)
# Evaluate the model and collect the scores
n_scores = cross_val_score(model, X, y, scoring='accuracy', cv=cv, n_jobs=-1)
# Report the model performance
print('Mean Accuracy: %.3f (%.3f)' % (mean(n_scores), std(n_scores)))

In [None]:
from sklearn.metrics import confusion_matrix
import pandas as pd

model.fit(X,y)
print(f"Performance on the test data: {model.score(Xt,yt)} \n")
cf1 = confusion_matrix(yt, model.predict(Xt))  # signature is (y_true, y_pred): rows are true labels
confusion_matrix_df = pd.DataFrame(cf1)
confusion_matrix_df


In [None]:
cf2 = confusion_matrix(yt, model.predict(Xt), normalize='true')  # each row (true class) sums to 1
confusion_matrix_df = pd.DataFrame(cf2)
confusion_matrix_df

## **Another classifier with ridge regularization**

In [None]:
from sklearn.linear_model import RidgeClassifierCV

c_values = [pow(2,d) for d in range(-10, 19, 2)] # Range goes up to 19 to ensure search up to 2^18
model1 = RidgeClassifierCV(alphas=c_values)
model1.fit(X,y)
print(f"Performance on the test data: {model1.score(Xt,yt)} \n")
cf3 = confusion_matrix(yt, model1.predict(Xt))  # signature is (y_true, y_pred): rows are true labels
confusion_matrix_df = pd.DataFrame(cf3)
confusion_matrix_df

In [None]:
cf4 = confusion_matrix(yt, model1.predict(Xt), normalize='true')  # each row (true class) sums to 1
confusion_matrix_df = pd.DataFrame(cf4)
confusion_matrix_df

In [None]:
print(c_values)
print(f'alpha_: {model1.alpha_}')

## **Visualization tools for unveiling the classification strategy**

In [None]:
from matplotlib.pyplot import figure

figure(figsize=(10, 5))
weights = model1.coef_.copy()
print(weights.shape)
classes = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']
for i in range(10):
    plt.subplot(2, 5, i+1)
    weight1 = weights[i,:].reshape([32,32,3])
    weight1 *= 1.0/weight1.max()
    weight2 = np.clip(weight1, 0, 1)
    plt.title(classes[i])
    plt.imshow(weight2)
    frame1 = plt.gca()
    frame1.axes.get_xaxis().set_visible(False)
    frame1.axes.get_yaxis().set_visible(False)
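
The cell above divides each weight image by its maximum and clips to [0, 1], so every negative weight is drawn as black. One possible alternative, sketched below, rescales each class's weights with a min-max transform so that negative and positive weights both remain visible. The helper `to_displayable` and the random `fake_row` are hypothetical stand-ins for a row of `model1.coef_`.

```python
import numpy as np

def to_displayable(weight_row, shape=(32, 32, 3)):
    """Min-max rescale one row of weights to [0, 1] for use with plt.imshow."""
    w = weight_row.reshape(shape).astype(float)
    lo, hi = w.min(), w.max()
    if hi == lo:
        # Constant weights carry no contrast; return a black image
        return np.zeros(shape)
    return (w - lo) / (hi - lo)

# Random values standing in for one row of model1.coef_ (assumption)
rng = np.random.default_rng(0)
fake_row = rng.normal(size=32 * 32 * 3)

img = to_displayable(fake_row)
print(img.shape, img.min(), img.max())
```

With this scaling, mid-gray corresponds roughly to weights halfway between the minimum and the maximum, not necessarily to zero, so the two renderings answer slightly different questions.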

<font color="green">
Activity (c) <br>
Looking at the figures just above, describe the strategy adopted by the RidgeClassifierCV classifier for the CIFAR-10 dataset. <br>
</font>

Answer:

<font color="green">
Activity (d) <br>
Comparing the two datasets (MNIST and CIFAR-10), what could explain the large drop in performance observed for CIFAR-10? <br>
</font>

Answer: