# Handwriting Digit Recognition

By the end of this notebook, you will:

- Have a model trained on the MNIST dataset
- Have an evaluation of the model
- Have an application that uses the trained model


In [2]:
%load_ext pycodestyle_magic

## Imports and Setup parameters

The device is set to CPU because I do not have a GPU. Batch sizes are conventionally powers of 2; I started at 2 and kept adjusting the value until I found that 50 gives good accuracy. For the learning rate I started with 0.1 and got better results with 0.001. Most MNIST examples I looked at train for about 6 epochs; I obtained good results with 5.

In [17]:
# %%pycodestyle
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as transforms
import numpy as np

LEARNING_RATE = 0.001
BATCH_SIZE = 50
NUM_CATEGORIES = 10
NUM_EPOCHS = 5
DEVICE = torch.device('cpu')
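
As a rough check on these settings, the sketch below works out how many optimizer updates they imply. It assumes the standard MNIST training split of 60,000 images; `TRAIN_SET_SIZE`, `steps_per_epoch`, and `total_updates` are names introduced here for illustration only.

```python
import math

TRAIN_SET_SIZE = 60_000  # assumption: size of the standard MNIST training split
BATCH_SIZE = 50
NUM_EPOCHS = 5

# One optimizer update per batch; round up in case the last batch is partial
steps_per_epoch = math.ceil(TRAIN_SET_SIZE / BATCH_SIZE)
total_updates = steps_per_epoch * NUM_EPOCHS

print(steps_per_epoch, total_updates)
```

The 1200 steps per epoch match the `batchStep: [.../1200]` counter printed during training.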

## Load Dataset

MNIST is a well-known public dataset, so it ships with the framework. I chose PyTorch, which provides it through `torchvision.datasets`.

In [18]:
#%%pycodestyle
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))])
train_dataset = torchvision.datasets.MNIST(
    root='',
    train=True,
    transform=transform,
    download=True)
# Use the same transform as for training so test inputs are normalized identically
test_dataset = torchvision.datasets.MNIST(
    root='',
    train=False,
    transform=transform)

train_loader = torch.utils.data.DataLoader(
    dataset=train_dataset,
    batch_size=BATCH_SIZE,
    shuffle=True)
test_loader = torch.utils.data.DataLoader(
    dataset=test_dataset,
    batch_size=BATCH_SIZE,
    shuffle=False)
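
To make the effect of `Normalize((0.5,), (0.5,))` concrete, the sketch below applies the same per-channel formula, `(pixel - mean) / std`, to single pixel values in plain Python. The `normalize` helper is hypothetical and only mirrors the arithmetic; it is not part of torchvision.

```python
def normalize(pixel, mean=0.5, std=0.5):
    """Hypothetical helper mirroring the Normalize formula: (pixel - mean) / std."""
    return (pixel - mean) / std


# ToTensor() scales pixels to [0, 1]; normalization then maps them to [-1, 1]
darkest = normalize(0.0)
middle = normalize(0.5)
brightest = normalize(1.0)
print(darkest, middle, brightest)
```

Inputs seen at test and inference time need the same mapping, otherwise the model receives values in a range it was not trained on.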

## Model Definition and Training
### Definition of Convolutional Neural Network
I chose a CNN because it is the deep learning architecture most commonly used for images, and compared with other models, CNNs reach almost 99% accuracy on MNIST. The network is composed of two convolutional layers and one fully connected layer. Each convolution is followed by batch normalization to help the optimization of the network.

In [19]:
# %%pycodestyle 

class ConvNet(nn.Module):
    def __init__(self, numCategories=10):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(
                in_channels=1,
                out_channels=16,
                kernel_size=5,
                stride=1,
                padding=2),
            nn.BatchNorm2d(num_features=16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(
                in_channels=16,
                out_channels=32,
                kernel_size=5,
                stride=1,
                padding=2),
            nn.BatchNorm2d(num_features=32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.fc = nn.Linear(7*7*32, numCategories)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.reshape(out.size(0), -1)
        out = self.fc(out)
        return out


cnn_mnist = ConvNet(NUM_CATEGORIES).to(DEVICE)
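
The `7*7*32` input size of the fully connected layer follows from the layer parameters above. The sketch below traces the spatial size of a 28x28 MNIST image through both blocks using the usual output-size formula for convolution and pooling; `conv_out` is a hypothetical helper written for this check.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output size of a convolution or pooling layer along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


size = 28  # MNIST images are 28x28
for _ in range(2):
    # 5x5 convolution with padding 2 and stride 1 keeps the spatial size
    size = conv_out(size, kernel=5, stride=1, padding=2)
    # 2x2 max pooling with stride 2 halves it
    size = conv_out(size, kernel=2, stride=2)

flat_features = size * size * 32  # 32 output channels after the second block
print(size, flat_features)
```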

### Helper function for Training

In [20]:
#%%pycodestyle
def fit(model, train_loader):
    optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)
    loss_func = nn.CrossEntropyLoss()
    numSteps = len(train_loader)
    model.train()  # make sure BatchNorm layers are in training mode
    for epoch in range(NUM_EPOCHS):
        for i, (images, labels) in enumerate(train_loader):
            images = images.to(DEVICE)
            labels = labels.to(DEVICE)
            # Forward pass and loss for the current batch
            outputs = model(images)
            loss = loss_func(outputs, labels)
            # Reset gradients, backpropagate, and update the weights
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

            # Log the loss of the current batch every 100 steps
            if (i + 1) % 100 == 0:
                print(f'Epoch: [{epoch+1}:{NUM_EPOCHS}]; batchStep: [{i+1}/{numSteps}]; Loss: {loss.item()}')

### Training 

In [38]:
fit(cnn_mnist, train_loader=train_loader)
print(cnn_mnist)

Epoch: [1:5]; batchStep: [100/1200]; Loss: 0.11209528893232346
Epoch: [1:5]; batchStep: [200/1200]; Loss: 0.23784486949443817
Epoch: [1:5]; batchStep: [300/1200]; Loss: 0.04622058570384979
Epoch: [1:5]; batchStep: [400/1200]; Loss: 0.06823167204856873
Epoch: [1:5]; batchStep: [500/1200]; Loss: 0.20222480595111847
Epoch: [1:5]; batchStep: [600/1200]; Loss: 0.0715143233537674
Epoch: [1:5]; batchStep: [700/1200]; Loss: 0.054954953491687775
Epoch: [1:5]; batchStep: [800/1200]; Loss: 0.05416314676403999
Epoch: [1:5]; batchStep: [900/1200]; Loss: 0.027382677420973778
Epoch: [1:5]; batchStep: [1000/1200]; Loss: 0.03680666908621788
Epoch: [1:5]; batchStep: [1100/1200]; Loss: 0.09735927730798721
Epoch: [1:5]; batchStep: [1200/1200]; Loss: 0.0403018519282341
Epoch: [2:5]; batchStep: [100/1200]; Loss: 0.026827611029148102
Epoch: [2:5]; batchStep: [200/1200]; Loss: 0.058852873742580414
Epoch: [2:5]; batchStep: [300/1200]; Loss: 0.03133438900113106
Epoch: [2:5]; batchStep: [400/1200]; Loss: 0.11488
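
The logged values are losses of single batches, so they jump around from one line to the next. One possible way to get a smoother signal is to average them. The sketch below is only an illustration: `RunningMean` is a hypothetical helper that is not used by `fit`, and the sample values are the first four logged losses rounded to three decimals.

```python
class RunningMean:
    """Hypothetical helper: accumulates values and reports their mean."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    @property
    def mean(self):
        # Avoid dividing by zero before the first update
        return self.total / self.count if self.count else 0.0


meter = RunningMean()
for loss in [0.112, 0.238, 0.046, 0.068]:  # rounded from the log above
    meter.update(loss)

print(f'Mean loss: {meter.mean:.3f}')
```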

## Model Evaluation

In [25]:
#%%pycodestyle
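def evaluate(model):
    model.eval()  # BatchNorm uses its running statistics in eval mode
    correct = 0
    y_true = []
    y_pred = []
    with torch.no_grad():  # gradients are not needed for evaluation
        for test_imgs, test_labels in test_loader:
            test_imgs = test_imgs.to(DEVICE)
            test_labels = test_labels.to(DEVICE)
            y_true.extend(test_labels.cpu().numpy())
            outputs = model(test_imgs)
            _, predicted = torch.max(outputs, 1)
            y_pred.extend(predicted.cpu().numpy())
            correct += (predicted == test_labels).sum().item()
    # Divide by the number of test samples, which stays correct even if
    # the last batch is smaller than BATCH_SIZE
    print(f'Test accuracy: {correct / len(test_loader.dataset)}')
    return (y_true, y_pred)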


test_labels, test_predictions = evaluate(cnn_mnist)
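
The denominator of the accuracy should be the number of test samples. The product `len(test_loader) * BATCH_SIZE` equals that number only when the batch size divides the dataset evenly. The sketch below checks this arithmetic for the 10,000 MNIST test images; `batch_count` is a hypothetical helper and 64 is just an example of a batch size that does not divide evenly.

```python
import math


def batch_count(num_samples, batch_size):
    """Number of batches a loader yields when the last partial batch is kept."""
    return math.ceil(num_samples / batch_size)


NUM_TEST_SAMPLES = 10_000  # size of the MNIST test split

assumed_50 = batch_count(NUM_TEST_SAMPLES, 50) * 50
assumed_64 = batch_count(NUM_TEST_SAMPLES, 64) * 64

# With 50 the product is exact; with 64 it overcounts the samples
print(assumed_50, assumed_64)
```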



## Classification Report and Confusion Matrix

In [29]:
#%%pycodestyle
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import (
    accuracy_score, classification_report, confusion_matrix)

cf_matrix = confusion_matrix(test_labels, test_predictions)
print(classification_report(test_labels, test_predictions))

plt.figure(figsize=(10, 6))
sns.heatmap(cf_matrix, annot=True, cbar=False, cmap="YlGnBu", fmt="d")
plt.title("Confusion Matrix", fontsize=17)
plt.ylabel("True Class", fontsize=13)
plt.xlabel("Predicted Class", fontsize=13)
plt.show()

accuracy = accuracy_score(test_labels, test_predictions)
print("Accuracy   :", accuracy)
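
To show how the confusion matrix relates to the per-class numbers in the classification report, the sketch below builds a small matrix by hand and derives recall from it. The labels are toy data with three classes, not results from the model, and `confusion_counts` is a hypothetical helper.

```python
def confusion_counts(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


# Toy labels for illustration only
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

matrix = confusion_counts(y_true, y_pred, num_classes=3)
# Recall of class i: correct predictions on the diagonal over the row total
recall = [row[i] / sum(row) for i, row in enumerate(matrix)]
print(matrix, recall)
```

Off-diagonal entries show which digits get confused with each other, which the overall accuracy alone does not reveal.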



### Save the model

In [34]:
torch.save(cnn_mnist, 'cnn_trained.pt')
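
Saving the whole model object pickles the class together with the weights, so loading it later requires the same class definition to be importable. A commonly recommended alternative is to save only the `state_dict`. The sketch below illustrates that pattern under stated assumptions: a small `nn.Linear` stands in for `cnn_mnist`, and a temporary file is used instead of `cnn_trained.pt`. It is not the approach used in this notebook.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Hypothetical stand-in for the trained network
model = nn.Linear(4, 2)

# Save only the parameters, not the pickled model object
path = os.path.join(tempfile.mkdtemp(), 'weights.pt')
torch.save(model.state_dict(), path)

# To restore, build the same architecture and load the parameters into it
restored = nn.Linear(4, 2)
restored.load_state_dict(torch.load(path))
restored.eval()  # switch to inference mode before predicting

print(torch.equal(model.weight, restored.weight))
```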

## Application to do the Inference
I developed a Tkinter application that lets the user draw a handwritten digit on a canvas. The application then passes the drawing to the saved model and shows the digit the model predicts.

In [46]:
#%%pycodestyle
import torch
from tkinter import *
import tkinter as tk
from PIL import ImageGrab, Image
import numpy as np
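# Loads the fully pickled model, so the ConvNet class must be defined.
# On PyTorch releases where torch.load defaults to weights_only=True,
# weights_only=False has to be passed for this call to work.
model = torch.load('cnn_trained.pt')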


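def predict_digit(img):
    # Convert the screenshot to a 28x28 grayscale image, like MNIST samples
    img = img.resize((28, 28))
    img = img.convert('L')
    transform = transforms.ToTensor()
    # The canvas is black on white while MNIST digits are white on black,
    # so invert the pixels and then normalize them as in training
    img_valid = 1.0 - transform(img)
    img_valid = (img_valid - 0.5) / 0.5
    img_valid = img_valid.to(DEVICE)
    model.to(DEVICE)
    model.eval()
    with torch.no_grad():  # no gradients needed for inference
        output = model(img_valid.unsqueeze(0))
    _, prediction = torch.max(output, 1)
    return prediction.item()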


class App(tk.Tk):
    def __init__(self):
        tk.Tk.__init__(self)

        self.x = self.y = 0
        self.winfo_toplevel().title("Handwriting Digit Recognition")
        # Creating elements
        self.canvas = tk.Canvas(
            self,
            width=300,
            height=300,
            bg="white",
            cursor="cross")
        self.label = tk.Label(self, text="Thinking..", font=("Helvetica", 48))
        self.classify_btn = tk.Button(
            self,
            text="Recognise",
            command=self.classify_handwriting)
        self.button_clear = tk.Button(
            self,
            text="Clear",
            command=self.clear_all)

        # Grid structure
        self.canvas.grid(row=0, column=0, pady=2, sticky=W)
        self.label.grid(row=0, column=1, pady=2, padx=2)
        self.classify_btn.grid(row=1, column=1, pady=2, padx=2)
        self.button_clear.grid(row=1, column=0, pady=2)

        self.canvas.bind("<B1-Motion>", self.draw_lines)

    def clear_all(self):
        self.canvas.delete("all")

    def classify_handwriting(self):
        x, y = (self.canvas.winfo_rootx(), self.canvas.winfo_rooty())
        width, height = (self.canvas.winfo_width(), self.canvas.winfo_height())
        rect = (x, y, x+width, y+height)
        im = ImageGrab.grab(rect)

        digit = predict_digit(im)
        self.label.configure(text=str(digit))

    def draw_lines(self, event):
        self.x = event.x
        self.y = event.y
        r = 8
        self.canvas.create_oval(
            self.x - r,
            self.y - r,
            self.x + r,
            self.y + r,
            fill='black')


app = App()
mainloop()
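
The drawing on the canvas does not look like an MNIST sample out of the box: the canvas has a white background with black strokes, whereas MNIST digits are bright on a dark background, and the training data was normalized with mean 0.5 and standard deviation 0.5. The sketch below shows, for single grayscale values, the mapping that brings a canvas pixel into the range used in training. `canvas_pixel_to_model_input` is a hypothetical helper for illustration.

```python
def canvas_pixel_to_model_input(gray, mean=0.5, std=0.5):
    """Map an 8-bit canvas pixel (0 = black stroke, 255 = white background)
    to the value range used during training."""
    scaled = gray / 255.0    # same scaling as ToTensor(): [0, 255] -> [0, 1]
    inverted = 1.0 - scaled  # strokes become bright, background becomes dark
    return (inverted - mean) / std  # same formula as Normalize((0.5,), (0.5,))


background = canvas_pixel_to_model_input(255)
stroke = canvas_pixel_to_model_input(0)
print(background, stroke)
```

After this mapping the background sits at -1 and the strokes at 1, the same range as the normalized training images.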

