
# Softmax regression classifier  

## Overview  

We use the MNIST dataset of handwritten digits to train a multiclass classifier that assigns each image to the digit (0 to 9) it represents.  

## Imports  

In [None]:
# Common imports
import numpy as np
from pprint import pprint

# to make this notebook's output stable across runs
np.random.seed(42)

# sklearn specific imports
# Data fetching  
from sklearn.datasets import fetch_openml

# Feature scaling
from sklearn.preprocessing import StandardScaler

# Pipeline utility  
from sklearn.pipeline import Pipeline

# Classifiers: logistic regression, with and without built-in cross-validation
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV

# Evaluation metrics
from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.metrics import classification_report
from sklearn.metrics import f1_score
from sklearn.metrics import make_scorer

# To plot pretty figures
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns

# global settings
mpl.rc('axes', labelsize=14)
mpl.rc('xtick', labelsize=12)
mpl.rc('ytick', labelsize=12)
mpl.rc('figure', figsize=(8,6))

## Data loading  

Load the MNIST handwritten-digit dataset from OpenML.  

In [None]:
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)
X = X.to_numpy()
y = y.to_numpy()

In [None]:
# Train-test split: the first 60,000 samples form the training set,
# the remaining 10,000 samples form the test set
x_train, x_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]

## Model building  

We build a pipeline that first standardizes the input features with `StandardScaler` and then fits a `LogisticRegression` estimator with the `multi_class` parameter set to `multinomial` and the `sag` solver.

In [None]:
pipe = Pipeline(steps=[('scaler', StandardScaler()),
                        ('logreg', LogisticRegression(multi_class='multinomial',
                                                        solver='sag'))])
pipe.fit(x_train, y_train)



Pipeline(steps=[('scaler', StandardScaler()),
                ('logreg',
                 LogisticRegression(multi_class='multinomial', solver='sag'))])
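The scaling step of the pipeline can be sketched in plain NumPy. This is only an illustration of the idea, not scikit-learn's implementation: the helper `standardize` and the toy matrix `X_toy` are hypothetical names introduced here. It assumes the population standard deviation is used and that constant features (for example, pixels that are zero in every image) are left with a scale of 1 to avoid division by zero.

```python
import numpy as np

def standardize(X_fit, X_apply):
    """Scale X_apply with the per-feature mean and std estimated on X_fit."""
    mean = X_fit.mean(axis=0)
    std = X_fit.std(axis=0)
    # Constant features have std == 0; use a scale of 1 so they map to 0
    # instead of triggering a division by zero.
    scale = np.where(std == 0.0, 1.0, std)
    return (X_apply - mean) / scale

# Toy data (hypothetical): the last column is constant, like an always-blank pixel
X_toy = np.array([[0.0, 10.0, 5.0],
                  [2.0, 20.0, 5.0],
                  [4.0, 30.0, 5.0]])
X_scaled = standardize(X_toy, X_toy)
print(X_scaled.mean(axis=0))  # per-feature means after scaling
print(X_scaled.std(axis=0))   # per-feature standard deviations after scaling
```

Placing the scaler inside the pipeline means the mean and standard deviation are estimated on the training data only and then reused when predicting on the test set.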

Training estimates the model parameters: one weight vector and one intercept per class, stored in the `coef_` and `intercept_` attributes of the fitted estimator.  

In [None]:
pipe[-1].coef_.shape

(10, 784)

In [None]:
pipe[-1].intercept_.shape

(10,)

In [None]:
pipe[-1].classes_

array(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'], dtype=object)
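To make the role of these parameters concrete, here is a minimal sketch of how a softmax regression model turns a `(10, 784)` weight matrix and a `(10,)` intercept vector into class probabilities. The random arrays `W`, `b`, and `X_demo` are stand-ins chosen for illustration (they are not the fitted values), and `softmax` is a helper defined here.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax; subtracting the row maximum keeps exp() stable."""
    Z = Z - Z.max(axis=1, keepdims=True)
    expZ = np.exp(Z)
    return expZ / expZ.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 5, 784, 10

X_demo = rng.normal(size=(n_samples, n_features))          # stand-in for scaled inputs
W = rng.normal(scale=0.01, size=(n_classes, n_features))   # stand-in for coef_
b = np.zeros(n_classes)                                    # stand-in for intercept_
classes = np.array([str(d) for d in range(n_classes)])     # mirrors classes_

# Linear scores per class, then softmax to obtain probabilities
proba = softmax(X_demo @ W.T + b)
# The predicted label is the class with the highest probability
pred = classes[proba.argmax(axis=1)]
print(proba.shape, pred)
```

With the fitted pipeline, the same computation applied to the scaled inputs, `pipe[-1].coef_`, and `pipe[-1].intercept_` should agree with `pipe.predict_proba`, assuming the multinomial formulation is in use.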

## Model evaluation  

Let's generate a classification report on the test set and display the confusion matrix.  

In [None]:
print(classification_report(y_test, pipe.predict(x_test)))

              precision    recall  f1-score   support

           0       0.95      0.98      0.97       980
           1       0.96      0.98      0.97      1135
           2       0.94      0.90      0.92      1032
           3       0.91      0.91      0.91      1010
           4       0.92      0.94      0.93       982
           5       0.91      0.87      0.89       892
           6       0.93      0.95      0.94       958
           7       0.92      0.93      0.92      1028
           8       0.88      0.88      0.88       974
           9       0.91      0.91      0.91      1009

    accuracy                           0.92     10000
   macro avg       0.92      0.92      0.92     10000
weighted avg       0.92      0.92      0.92     10000



Eight of the ten classes have an F1-score above 0.90; only digits 5 (0.89) and 8 (0.88) fall slightly below it. The overall accuracy on the test set is 0.92.  
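As a quick check on how the F1 column relates to the other two, the F1-score is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded values printed in the report; `f1_from` is a helper defined here, and small differences from the report are possible in general because the report works with unrounded values.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Rounded precision and recall taken from the report above
f1_class_5 = f1_from(0.91, 0.87)
f1_class_2 = f1_from(0.94, 0.90)
print(round(f1_class_5, 2), round(f1_class_2, 2))

# The macro average is the unweighted mean of the per-class F1-scores
per_class_f1 = [0.97, 0.97, 0.92, 0.91, 0.93, 0.89, 0.94, 0.92, 0.88, 0.91]
macro_f1 = sum(per_class_f1) / len(per_class_f1)
print(round(macro_f1, 2))
```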

In [None]:
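# Note: ConfusionMatrixDisplay.from_estimator requires scikit-learn >= 1.0;
# older versions raise the AttributeError shown below.
ConfusionMatrixDisplay.from_estimator(pipe, x_test, y_test)
plt.show()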

AttributeError: type object 'ConfusionMatrixDisplay' has no attribute 'from_estimator'
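On scikit-learn versions that lack `from_estimator`, one possible workaround is to compute the confusion matrix explicitly and pass it to the `ConfusionMatrixDisplay` constructor. The sketch below uses a small made-up label set so that it runs on its own; `plot_confusion` and the demo arrays are hypothetical names introduced for illustration.

```python
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

def plot_confusion(y_true, y_pred, labels):
    """Compute the confusion matrix and draw it without from_estimator."""
    cm = confusion_matrix(y_true, y_pred, labels=labels)
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
    disp.plot()
    return cm, disp

# Made-up labels, for illustration only
y_true_demo = np.array(['0', '1', '2', '2', '1', '0'])
y_pred_demo = np.array(['0', '2', '2', '2', '1', '0'])
cm_demo, _ = plot_confusion(y_true_demo, y_pred_demo, labels=['0', '1', '2'])
print(cm_demo)
```

In this notebook the equivalent call would be `plot_confusion(y_test, pipe.predict(x_test), pipe[-1].classes_)` followed by `plt.show()`.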

## Exercise  

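Use `LogisticRegressionCV` to select the regularization strength `C` by 5-fold cross-validation, then evaluate the resulting pipeline on the test set.  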

In [None]:
# make_scorer and f1_score are already imported above.
# Micro-averaged F1 is used as the cross-validation scoring metric.
scorer = make_scorer(f1_score, average='micro')

pipe_cv = Pipeline(steps=[('scaler', StandardScaler()),
                          ('logreg', LogisticRegressionCV(cv=5,
                                                          multi_class='multinomial',
                                                          solver='sag',
                                                          scoring=scorer,
                                                          max_iter=100,
                                                          random_state=1729))])
pipe_cv.fit(x_train, y_train)
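A note on the scoring choice: for single-label multiclass problems such as this one, micro-averaged F1 pools true positives, false positives, and false negatives over all classes, which makes it equal to plain accuracy. The sketch below checks this on a small made-up example; `micro_f1` and the demo arrays are hypothetical names used only for illustration.

```python
import numpy as np

def micro_f1(y_true, y_pred, labels):
    """Micro-averaged F1: pool TP, FP and FN over all classes first."""
    tp = fp = fn = 0
    for c in labels:
        tp += np.sum((y_pred == c) & (y_true == c))
        fp += np.sum((y_pred == c) & (y_true != c))
        fn += np.sum((y_pred != c) & (y_true == c))
    return 2 * tp / (2 * tp + fp + fn)

# Made-up labels, for illustration only
y_true_demo = np.array([0, 1, 2, 2, 1, 0, 2, 1])
y_pred_demo = np.array([0, 2, 2, 2, 1, 0, 1, 1])

accuracy = np.mean(y_true_demo == y_pred_demo)
score = micro_f1(y_true_demo, y_pred_demo, labels=[0, 1, 2])
print(accuracy, score)
```

Every misclassified sample counts once as a false positive (for the predicted class) and once as a false negative (for the true class), which is why the two numbers coincide.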



In [None]:
pipe_cv[-1].C_
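The `C_` attribute reports the regularization strength selected by cross-validation. Since no `Cs` argument was passed, the candidates come from the default grid, which to the best of our understanding of the scikit-learn documentation is `Cs=10`: ten values spaced logarithmically between 1e-4 and 1e4. A sketch of that grid, under this assumption:

```python
import numpy as np

# Assumed default candidate grid for LogisticRegressionCV (Cs=10):
# ten values on a log scale from 1e-4 to 1e4
Cs_grid = np.logspace(-4, 4, 10)
print(Cs_grid)
```

The value in `C_` should therefore be one of these candidates; smaller values of `C` correspond to stronger regularization.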

In [None]:
pipe_cv[-1].l1_ratio_

In [None]:
print(classification_report(y_test, pipe_cv.predict(x_test)))

In [None]:
ConfusionMatrixDisplay.from_estimator(pipe_cv, x_test, y_test)
plt.show()