# Recognize Handwritten Digits


We first import all the modules needed to train the model. The Keras library ships with several datasets, MNIST among them, so we can load it directly. Calling mnist.load_data() returns the training images with their labels and the test images with their labels.

In [3]:
import keras 
from keras.datasets import mnist 
from keras.models import Sequential 
from keras.layers import Dense, Dropout, Flatten 
from keras.layers import Conv2D, MaxPooling2D 
from keras import backend as K
(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, y_train.shape) 

(60000, 28, 28) (60000,)


# Data Preprocessing

The model cannot take the image data directly, so we need to process it first. The training data has shape (60000, 28, 28). The CNN expects one more dimension for the channel, so we reshape the array to (60000, 28, 28, 1). We also convert the labels to one-hot vectors and scale the pixel values to the range [0, 1].

In [4]:
x_train = x_train.reshape(x_train.shape[0],28, 28, 1) 
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
input_shape = (28, 28, 1)
#convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, 10) 
y_test = keras.utils.to_categorical(y_test, 10)
x_train = x_train.astype('float32') 
x_test = x_test.astype('float32') 
x_train /= 255 
x_test /= 255 
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples') 

x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
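
To make the three preprocessing steps concrete, here is a minimal NumPy-only sketch on a tiny stand-in batch. The `preprocess` helper and the random "images" are hypothetical and only for illustration; the one-hot step uses `np.eye` to mimic what `keras.utils.to_categorical` produces.

```python
import numpy as np

def preprocess(images, labels, num_classes=10):
    """Mimic the reshape, scaling, and one-hot steps on uint8 images."""
    # add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1)
    x = images.reshape(images.shape[0], 28, 28, 1).astype("float32")
    # scale pixel values from [0, 255] to [0, 1]
    x /= 255
    # one-hot encode: label k becomes a row with a 1 in column k
    y = np.eye(num_classes, dtype="float32")[labels]
    return x, y

# hypothetical stand-in batch: 4 random "images" with labels 3, 0, 9, 1
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)
fake_labels = np.array([3, 0, 9, 1])

x, y = preprocess(fake_images, fake_labels)
print(x.shape, y.shape)
print(y[0])  # the row for label 3
```

Each label row contains a single 1, which is the form that the categorical cross-entropy loss expects.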


# Create the model

It's time to create the CNN model. Convolutional layers and pooling layers are the two main building blocks of a CNN. CNNs work well on image classification problems because they are well suited to grid-structured data such as images. We compile the model with the Adadelta optimizer.

In [None]:
batch_size = 128 
num_classes = 10 
epochs = 20 

model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3),activation='relu',input_shape=input_shape)) 
model.add(Conv2D(64, (3, 3),activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2))) 
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5)) 
model.add(Dense(num_classes,activation='softmax'))

model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adadelta(),metrics=['accuracy']) 

#Train the model
hist = model.fit(x_train, y_train,batch_size=batch_size,epochs=epochs,verbose=1,validation_data=(x_test, y_test))

print("The model has been trained successfully")
model.save('mnist.h5')
print("Saved the model as mnist.h5")
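
The layer sizes in the model above can be checked by hand. The sketch below is a rough calculation, not output from Keras; it assumes the Conv2D defaults (stride 1, no padding) and a 2x2 pooling stride, and the helper names are made up for illustration.

```python
def conv_out(size, kernel):
    """Output width of a stride-1 convolution with no padding."""
    return size - kernel + 1

def conv_params(kernel, in_ch, out_ch):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

size = 28
size = conv_out(size, 3)   # first Conv2D: 28 -> 26
size = conv_out(size, 3)   # second Conv2D: 26 -> 24
size = size // 2           # 2x2 max pooling: 24 -> 12
flat = size * size * 64    # Flatten: 12 * 12 * 64 features

total = (conv_params(3, 1, 32) + conv_params(3, 32, 64)
         + dense_params(flat, 256) + dense_params(256, 10))
print(size, flat, total)
```

Almost all of the parameters sit in the first Dense layer, which is one reason the dropout layers are placed around it.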

# Evaluate the model

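To evaluate how well the model works, we use the 10,000 test images in the dataset. The test data is not used to update the weights during training, so it is new data for the model. A CNN of this kind can reach around 99% accuracy on the well-balanced MNIST dataset, but the run shown below reached only about 88.8% after 20 epochs. One possible reason is the optimizer's learning rate: recent Keras releases default Adadelta to 0.001, and the Keras documentation suggests 1.0 to match the original Adadelta formulation.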

In [6]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

Test loss: 0.4207440912723541
Test accuracy: 0.8884999752044678
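
The accuracy figure is the fraction of test images whose highest-probability class matches the true label. A minimal NumPy sketch of that calculation follows; the `accuracy` helper and the toy probabilities are hypothetical, and this illustrates the idea rather than the Keras implementation.

```python
import numpy as np

def accuracy(probs, one_hot_labels):
    """Fraction of rows where the most probable class equals the true class."""
    pred = np.argmax(probs, axis=1)
    true = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(pred == true))

# hypothetical softmax outputs for 4 samples over 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
labels = np.eye(3)[[0, 1, 2, 1]]  # true classes 0, 1, 2, 1

print(accuracy(probs, labels))  # 0.75: the last sample is misclassified
```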


# Create GUI to predict digits

To build an interactive window, we create the GUI in a new file. In it, you can draw a digit on a canvas and click a button to recognize it. Tkinter is part of the Python standard library. The predict_digit() function takes the image as input and uses the trained model to predict the digit.

We then create the App class, which builds the GUI. The canvas captures mouse-drag events to draw the digit, and a button click calls predict_digit() and displays the result.

Below is the full code for our guidigit_recog.py file:

In [7]:
from keras.models import load_model
from tkinter import *
import tkinter as tk
import win32gui
from PIL import ImageGrab, Image
import numpy as np


model = load_model('mnist.h5')

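def predict_digit(img):
    # resize image to 28x28 pixels
    img = img.resize((28, 28))
    # convert RGB to grayscale
    img = img.convert('L')
    img = np.array(img)
    # MNIST digits are light on a dark background, but the canvas is
    # black ink on white, so invert the pixel values
    img = 255 - img
    # reshape to match the model input and normalize to [0, 1]
    img = img.reshape(1, 28, 28, 1)
    img = img / 255.0
    # predict the class probabilities for the single image
    res = model.predict(img)[0]
    return np.argmax(res), max(res)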

class App(tk.Tk):
    def __init__(self):
        tk.Tk.__init__(self)

        self.x = self.y = 0

        # Creating elements
        self.canvas = tk.Canvas(self, width=300, height=300, bg="white", cursor="cross")
        self.label = tk.Label(self, text="Thinking..", font=("Helvetica", 48))
        self.classify_btn = tk.Button(self, text="Recognise", command=self.classify_handwriting)
        self.button_clear = tk.Button(self, text="Clear", command=self.clear_all)

        # Grid structure
        self.canvas.grid(row=0, column=0, pady=2, sticky=tk.W)
        self.label.grid(row=0, column=1, pady=2, padx=2)
        self.classify_btn.grid(row=1, column=1, pady=2, padx=2)
        self.button_clear.grid(row=1, column=0, pady=2)

        # draw while the left mouse button is held down
        self.canvas.bind("<B1-Motion>", self.draw_lines)

    def clear_all(self):
        self.canvas.delete("all")

    def classify_handwriting(self):
        # win32gui is only available on Windows
        hwnd = self.canvas.winfo_id()  # get the handle of the canvas
        rect = win32gui.GetWindowRect(hwnd)  # get the coordinates of the canvas
        im = ImageGrab.grab(rect)  # capture that screen region

        digit, acc = predict_digit(im)
        self.label.configure(text=str(digit) + ', ' + str(int(acc * 100)) + '%')

    def draw_lines(self, event):
        self.x = event.x
        self.y = event.y
        r = 8  # brush radius in pixels
        self.canvas.create_oval(self.x - r, self.y - r, self.x + r, self.y + r, fill='black')


app = App()
app.mainloop()
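
The image preparation behind predict_digit() can be tried without a GUI or a trained model. The following is a minimal sketch under stated assumptions: it uses a hypothetical 280x280 canvas (so that simple block averaging stands in for PIL's resize), a hand-made stroke, and made-up softmax outputs. It inverts the drawing because MNIST digits are light strokes on a dark background, while the canvas uses black ink on white.

```python
import numpy as np

def canvas_to_model_input(canvas):
    """canvas: (280, 280) uint8 grayscale, black ink (0) on white (255)."""
    # shrink 280x280 to 28x28 by averaging each 10x10 block of pixels
    small = canvas.reshape(28, 10, 28, 10).mean(axis=(1, 3))
    # invert so the ink becomes bright and the background dark, as in MNIST
    small = 255.0 - small
    # scale to [0, 1] and add batch and channel axes: (1, 28, 28, 1)
    return (small / 255.0).reshape(1, 28, 28, 1).astype("float32")

# hypothetical drawing: a crude vertical stroke on a white canvas
canvas = np.full((280, 280), 255, dtype=np.uint8)
canvas[40:240, 130:150] = 0

batch = canvas_to_model_input(canvas)
print(batch.shape, batch.min(), batch.max())

# made-up softmax output, standing in for model.predict(batch)[0]
res = np.array([0.01, 0.91, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01, 0.01])
digit, conf = int(np.argmax(res)), float(res.max())
print(str(digit) + ', ' + str(int(conf * 100)) + '%')
```

The last two lines mirror how the GUI label is built: the index of the largest probability is the predicted digit, and the probability itself is shown as a percentage.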



# Conclusion

This project is beginner-friendly and suitable for newcomers to data science. We built and trained a deep learning model for digit recognition, and we built a GUI in which you draw a digit on the canvas, classify it, and see the result.