# Implementation of LeNet-5 for digit recognition

In this micro-project, we'll implement the famous LeNet-5 convolutional network for digit recognition. More information about this network can be found in the paper: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf.

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D

## Constants and helper functions

First, let's define the constants that we will use later.

In [None]:
image_width = 28
image_height = 28
num_filters = 32
max_pool_size = (2, 2) 
conv_kernel_size = (3, 3)
num_classes = 10
drop_prob = 0.5
epochs = 10
batch_size = 100

Define a function for loading the input data. The data is returned as NumPy arrays.

In [None]:
def load_data():
    train_set = pd.read_csv('data/train.csv')
    test_set = pd.read_csv('data/test.csv')

    x_train = train_set.iloc[:, 1:].values
    y_train = train_set.iloc[:, 0].values
    x_test = test_set.iloc[:,:].values
    
    return x_train, y_train, x_test

Define a function for converting an array of labels into a one-hot matrix.

In [None]:
def convert_to_one_hot(arr):
    one_hot = np.zeros((arr.size, arr.max() + 1))
    one_hot[np.arange(arr.size), arr] = 1
    
    return one_hot
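
As a quick check of what this helper produces, the sketch below encodes a small, made-up label array. It also shows an alternative, `one_hot_fixed` (a hypothetical name, not part of this notebook), that takes the number of classes explicitly. This may be safer when a label array happens not to contain the highest class, because `arr.max() + 1` would then yield too few columns.

```python
import numpy as np

def convert_to_one_hot(arr):
    # Same logic as the helper above: one row per label, one column per class
    one_hot = np.zeros((arr.size, arr.max() + 1))
    one_hot[np.arange(arr.size), arr] = 1
    return one_hot

def one_hot_fixed(arr, num_classes):
    # Hypothetical variant: the number of columns is given explicitly
    return np.eye(num_classes)[arr]

labels = np.array([0, 2, 1])          # made-up labels for illustration
encoded = convert_to_one_hot(labels)  # 3 columns, because the largest label is 2
fixed = one_hot_fixed(labels, 10)     # always 10 columns

print(encoded)
print(fixed.shape)
```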

Normalize data values from the input range (0 - 255) into the target range (0 - 1).

In [None]:
def normalize_data(data):
    data = data/data.max()
    return data
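
A small illustration of the scaling, using made-up pixel values. Note that `normalize_data` divides by the maximum found in whatever array it receives. The variant `normalize_fixed` below (a hypothetical helper, assuming 8-bit pixels with a maximum of 255) divides by a constant instead, so training and test data would always be scaled identically.

```python
import numpy as np

def normalize_fixed(data, max_value=255.0):
    # Hypothetical variant: scale by a fixed constant instead of data.max()
    return data / max_value

pixels = np.array([[0, 128, 255], [64, 32, 16]])  # made-up pixel values
scaled = normalize_fixed(pixels)

print(scaled.min(), scaled.max())
```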

Save predictions into a CSV file.

In [None]:
def save_results(preds):
    y_test = preds.astype(int)
    csv_content = pd.DataFrame({'ImageId': range(1,len(y_test)+1), 'Label': y_test})
    csv_content.to_csv('result.csv', index = False)

## Data preprocessing 

Now we have to preprocess the data to fit our CNN architecture. We want to have:
- normalized values in the range (0, 1)
- images in the shape of 28 x 28 x 1 pixels
- images padded to match the 32 x 32 input shape of LeNet-5

In [None]:
# Prepare input and output of Neural Network
x_train, y_train, x_test = load_data()
x_train = normalize_data(x_train)
x_test = normalize_data(x_test)
y_train = convert_to_one_hot(y_train)

# Change shape of input from list of values into 28 pixel X 28 pixel X 1 grayscale value
x_train = x_train.reshape(x_train.shape[0], image_height, image_width, 1)
x_test = x_test.reshape(x_test.shape[0], image_height, image_width, 1)

# Pad 2 zero pixels on each side of height and width: 28 x 28 -> 32 x 32
x_train = np.pad(x_train, ((0,0),(2,2),(2,2),(0,0)), 'constant')
x_test = np.pad(x_test, ((0,0),(2,2),(2,2),(0,0)), 'constant')
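
The reshape and padding steps can be checked in isolation. The sketch below uses a random dummy batch (not the real CSV data) to confirm that the result has the 32 x 32 x 1 shape and that the original pixels sit in the center of the zero border.

```python
import numpy as np

# Dummy batch standing in for CSV rows: 5 images, 784 values each
flat = np.random.rand(5, 28 * 28)

# Reshape to (batch, height, width, channels)
images = flat.reshape(flat.shape[0], 28, 28, 1)

# Pad only the height and width axes; batch and channel axes are left alone
padded = np.pad(images, ((0, 0), (2, 2), (2, 2), (0, 0)), 'constant')

print(images.shape)
print(padded.shape)
```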

## Model

Now we will initialize a TensorFlow session and define the architecture of each layer in sequential mode.

In [None]:
sess = tf.InteractiveSession()
model = Sequential()

#### Layer 1

Convolution

In [None]:
model.add(Convolution2D(filters = 6, kernel_size = 5, strides = 1, activation = 'relu',  input_shape = (32,32,1)))

Max Pooling

In [None]:
model.add(MaxPooling2D(pool_size = 2, strides = 2))

#### Layer 2

Convolution

In [None]:
model.add(Convolution2D(filters = 16, kernel_size = 5, strides = 1, activation = 'relu'))

Max Pooling

In [None]:
model.add(MaxPooling2D(pool_size = 2, strides = 2))

Flattening

In [None]:
model.add(Flatten())
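
To see how the spatial size shrinks through the layers defined so far, the arithmetic can be sketched in plain Python. This assumes no padding inside the convolution and pooling layers (the Keras default, `'valid'`); `conv_output_size` is a hypothetical helper introduced only for this illustration.

```python
def conv_output_size(size, kernel, stride=1):
    # Output size of a convolution or pooling window without padding
    return (size - kernel) // stride + 1

size = 32                            # padded input is 32 x 32
c1 = conv_output_size(size, 5, 1)    # first convolution, 5 x 5 kernel
p1 = conv_output_size(c1, 2, 2)      # first max pooling, 2 x 2 window
c2 = conv_output_size(p1, 5, 1)      # second convolution, 5 x 5 kernel
p2 = conv_output_size(c2, 2, 2)      # second max pooling, 2 x 2 window
flattened = p2 * p2 * 16             # 16 feature maps after the second convolution

print(c1, p1, c2, p2, flattened)     # 28 14 10 5 400
```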

#### Layer 3

Fully Connected

In [None]:
model.add(Dense(units = 120, activation = 'relu'))

#### Layer 4

Fully Connected

In [None]:
model.add(Dense(units = 84, activation = 'relu'))

#### Output layer

In [None]:
model.add(Dense(units = 10, activation = 'softmax'))
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
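
As a rough sanity check on the size of this network, the number of trainable parameters can be counted by hand. The sketch assumes every layer has a bias term (the Keras default); `conv_params` and `dense_params` are hypothetical helpers used only here.

```python
def conv_params(kernel, in_channels, filters):
    # Each filter has kernel * kernel * in_channels weights plus one bias
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    # Each unit has one weight per input plus one bias
    return (inputs + 1) * units

counts = {
    'conv1': conv_params(5, 1, 6),
    'conv2': conv_params(5, 6, 16),
    'dense1': dense_params(5 * 5 * 16, 120),
    'dense2': dense_params(120, 84),
    'output': dense_params(84, 10),
}
total = sum(counts.values())

print(counts)
print(total)  # 61706
```

If the model is built as above, this total should match the figure reported by `model.summary()`.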

#### Train model

In [None]:
model.fit(x_train, y_train, batch_size = batch_size, epochs = epochs, verbose = 1)

## Predictions

Now we will make predictions for our test set and save the results using the function save_results.

In [None]:
y_pred = model.predict(x_test)
y_pred = np.argmax(y_pred, axis = 1)

save_results(y_pred)
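
The `argmax` step turns each row of class probabilities into a single digit label. A minimal sketch with made-up softmax outputs (3 images, 4 classes instead of 10, to keep it short):

```python
import numpy as np

# Made-up softmax outputs: one row per image, one column per class
probs = np.array([
    [0.10, 0.70, 0.10, 0.10],
    [0.60, 0.20, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
])

# Index of the largest probability in each row is the predicted class
labels = np.argmax(probs, axis=1)

# Image ids start at 1, matching the range used in save_results
image_ids = np.arange(1, len(labels) + 1)

print(labels)
print(image_ids)
```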