## Training Notebook

This notebook trains a simple model that classifies handwritten digits from the MNIST dataset. The same code was used to train the model included with the templates. It is meant as a starter model that shows how to set up Serverless applications for inference, not as a state-of-the-art classifier. For a deeper understanding of how to train a good model for MNIST, we recommend the literature listed on the [MNIST website](http://yann.lecun.com/exdb/mnist/). The dataset is made available under a [Creative Commons Attribution-Share Alike 3.0](https://creativecommons.org/licenses/by-sa/3.0/) license.

In [1]:
# We'll use scikit-learn to load the dataset

! pip install -q scikit-learn==0.23.2

In [2]:
# Load the mnist dataset

from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

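# as_frame=False returns NumPy arrays (already the default in 0.23.2; newer releases may return a DataFrame)
X, y = fetch_openml('mnist_784', return_X_y=True, as_frame=False)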

# We limit training to 10000 images for faster training. Remove train_size to use all examples.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1000, train_size=10000)
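
A note on the split above: when `test_size` and `train_size` are integers, scikit-learn treats them as absolute sample counts, and samples that fall in neither subset are left out. The sketch below illustrates this on a small synthetic array. The array, the variable names, and the `random_state=0` seed are assumptions added for illustration; the notebook itself does not fix a seed, so its accuracy figures can vary slightly between runs.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 100 samples with 2 features each
X_demo = np.arange(200).reshape(100, 2)
y_demo = np.arange(100) % 10

# Integer sizes are absolute counts; the remaining 30 samples are unused
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    X_demo, y_demo, train_size=60, test_size=10, random_state=0
)

print(Xd_train.shape, Xd_test.shape)
```

Passing a fixed `random_state` makes the split repeatable, which is useful when comparing two models on the same held-out images.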

In [6]:
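# Next, let's add code for deskewing images (we will use this to improve accuracy)
# This code comes from https://fsix.github.io/mnist/Deskewing.html

import numpy as np
from scipy import ndimage


def moments(image):
    """Return the center of mass and the 2x2 covariance matrix of the pixel intensities."""
    # c0 holds the row index of every pixel, c1 the column index
    c0, c1 = np.mgrid[:image.shape[0], :image.shape[1]]
    img_sum = np.sum(image)

    # Intensity-weighted mean position (center of mass)
    m0 = np.sum(c0 * image) / img_sum
    m1 = np.sum(c1 * image) / img_sum

    # Intensity-weighted variances and covariance around the center of mass
    m00 = np.sum((c0-m0)**2 * image) / img_sum
    m11 = np.sum((c1-m1)**2 * image) / img_sum
    m01 = np.sum((c0-m0) * (c1-m1) * image) / img_sum

    mu_vector = np.array([m0,m1])
    covariance_matrix = np.array([[m00, m01],[m01, m11]])

    return mu_vector, covariance_matrix


def deskew(image):
    """Shear a 2-D image to remove its slant."""
    c, v = moments(image)
    # Skew: how far the stroke drifts along the columns per row
    alpha = v[0,1] / v[0,0]
    affine = np.array([[1,0], [alpha,1]])
    # The offset moves the center of mass to the center of the output image
    ocenter = np.array(image.shape) / 2.0
    offset = c - np.dot(affine, ocenter)

    return ndimage.affine_transform(image, affine, offset=offset)


def deskew_images(images):
    """Deskew a batch of flattened 28x28 images and return them flattened again."""
    output_images = []

    for image in images:
        output_images.append(deskew(image.reshape(28, 28)).flatten())

    return np.array(output_images)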
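
To make the skew estimate in `deskew` concrete, the sketch below recomputes `alpha` (the row/column covariance divided by the row variance) in plain Python for two tiny hand-made images. The helper name `skew_alpha` and the 5x5 test images are hypothetical and serve only as an illustration.

```python
def skew_alpha(image):
    """Return cov(row, col) / var(row), weighted by pixel intensity."""
    pixels = [(r, c, v) for r, row in enumerate(image) for c, v in enumerate(row)]
    total = sum(v for _, _, v in pixels)

    # Intensity-weighted center of mass
    m0 = sum(r * v for r, _, v in pixels) / total
    m1 = sum(c * v for _, c, v in pixels) / total

    # Row variance and row/column covariance around the center of mass
    m00 = sum((r - m0) ** 2 * v for r, _, v in pixels) / total
    m01 = sum((r - m0) * (c - m1) * v for r, c, v in pixels) / total
    return m01 / m00


# A vertical stroke has no slant; a diagonal stroke moves one column per row
vertical = [[1 if c == 2 else 0 for c in range(5)] for r in range(5)]
diagonal = [[1 if c == r else 0 for c in range(5)] for r in range(5)]

print(skew_alpha(vertical), skew_alpha(diagonal))
```

A value of 0 means no shear is needed. For a nonzero `alpha`, `deskew` shifts each row horizontally in proportion to its row index so that the stroke stands upright.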


## Scikit-learn Model Training

For this example, we will train a simple SVM classifier with scikit-learn to classify the MNIST digits. We will then save the fitted model in the `.joblib` format. The result is the same as the starter model file included with the SAM templates.

In [4]:
%%time

import sklearn
import numpy as np

from sklearn.metrics import accuracy_score
from sklearn import svm

print(f'Using scikit-learn version: {sklearn.__version__}')

# Fit the classifier on the raw training images
# Note: degree only affects kernel='poly'; with the default RBF kernel it is ignored
clf = svm.SVC(degree=5)
clf.fit(X_train, y_train)

# Measure accuracy on the held-out test set
accuracy = accuracy_score(y_test, clf.predict(X_test))

print('Test accuracy without deskewing:', accuracy)

Using scikit-learn version: 0.23.2
Test accuracy without deskewing: 0.959
CPU times: user 27.1 s, sys: 46.7 ms, total: 27.2 s
Wall time: 27.2 s
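
One detail about the cell above: in scikit-learn, `degree` is used only by the polynomial kernel, and `SVC` defaults to the RBF kernel, so `degree=5` has no effect on this model. The sketch below checks this on scikit-learn's small bundled `load_digits` dataset. That dataset is an assumption chosen so the example runs offline and quickly; it is not the MNIST data used in this notebook.

```python
import numpy as np
from sklearn import svm
from sklearn.datasets import load_digits

# Bundled 8x8 digits dataset, used here only as a quick stand-in
X_small, y_small = load_digits(return_X_y=True)
X_fit, y_fit, X_eval = X_small[:500], y_small[:500], X_small[500:600]

default_clf = svm.SVC().fit(X_fit, y_fit)
degree_clf = svm.SVC(degree=5).fit(X_fit, y_fit)

# Both models use the RBF kernel, so their predictions should be identical
same = np.array_equal(default_clf.predict(X_eval), degree_clf.predict(X_eval))
print(default_clf.kernel, degree_clf.kernel, same)
```

To actually fit a degree-5 polynomial kernel, `kernel='poly'` would have to be passed as well, and doing so would change the accuracy figures reported here.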


In [9]:
%%time

# Let's try this again with deskewing on

# Fit the classifier on the deskewed training images
clf = svm.SVC(degree=5)
clf.fit(deskew_images(X_train), y_train)

# Measure accuracy on the deskewed held-out test set
accuracy = accuracy_score(y_test, clf.predict(deskew_images(X_test)))

print('Test accuracy with deskewing:', accuracy)

Test accuracy with deskewing: 0.971
CPU times: user 22.7 s, sys: 19.9 ms, total: 22.7 s
Wall time: 22.7 s
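
With 1000 test images, the two accuracy figures differ by only a handful of images, so it helps to translate them into error counts. The sketch below does that arithmetic and adds a rough binomial standard error, under the assumption that test images can be treated as independent draws. The helper name `error_summary` is hypothetical.

```python
import math


def error_summary(accuracy, n_test):
    """Return (misclassified count, approximate standard error of the accuracy)."""
    errors = round((1 - accuracy) * n_test)
    std_err = math.sqrt(accuracy * (1 - accuracy) / n_test)
    return errors, std_err


baseline_errors, baseline_se = error_summary(0.959, 1000)
deskewed_errors, deskewed_se = error_summary(0.971, 1000)

print(f'Without deskewing: {baseline_errors} errors, accuracy +/- {baseline_se:.3f}')
print(f'With deskewing:    {deskewed_errors} errors, accuracy +/- {deskewed_se:.3f}')
```

In this run, deskewing removes 12 of the 41 errors. Since the standard error on each figure is roughly half a percentage point, a different random split can shift both numbers noticeably.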


In [19]:
import joblib

# Save the model to disk with compression to keep size low
joblib.dump(clf, 'digit_classifier.joblib', compress=3)

['digit_classifier.joblib']
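
To use the saved file for inference, it can be loaded back with `joblib.load` and called like the original classifier. The round trip below is a minimal sketch that trains a throwaway model on scikit-learn's bundled `load_digits` data and writes it to a temporary directory. The file name `demo_classifier.joblib` and the function name `roundtrip_predictions` are hypothetical. A real handler would load `digit_classifier.joblib` instead and apply `deskew_images` to its input before predicting, since the last model fitted above was trained on deskewed images.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn import svm
from sklearn.datasets import load_digits


def roundtrip_predictions():
    """Train a small model, save and reload it, and predict with both copies."""
    # Bundled 8x8 digits dataset, used here only so the sketch runs offline
    X, y = load_digits(return_X_y=True)
    clf = svm.SVC()
    clf.fit(X[:500], y[:500])

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo_classifier.joblib')  # hypothetical file name
        joblib.dump(clf, path, compress=3)
        restored = joblib.load(path)

    sample = X[500:510]
    return clf.predict(sample), restored.predict(sample)


before, after = roundtrip_predictions()
print('Predictions match after reload:', np.array_equal(before, after))
```

Files written by `joblib.dump` are generally only safe to load with the same scikit-learn version that produced them, so the inference environment should use the version pinned at the top of this notebook.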