**PACKAGES**
> 
We will first import the packages necessary to develop and run our models.

In [50]:
# Basic data manipulation 
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O 

# Data processing packages
from sklearn.model_selection import train_test_split 

# CNN packages
from tensorflow import keras


**IMPORTING DATA**
> 
We can now import our data from the CSV file.

In [51]:
# Read the training set from the CSV file
data = pd.read_csv('../input/digit-recognizer/train.csv')
X = data.iloc[:, 1:]  # Data inputs (pixels)
y = data.iloc[:, 0]   # Data outputs (labels)

**PRE-PROCESSING DATA**
> 
We will first transform the input data into image form. Each image is 28 x 28 pixels with a single grayscale channel, and we also scale the pixel values from 0-255 down to the range 0-1.

In [52]:
# Reshape each row of 784 pixels into a 28 x 28 x 1 image
# and scale the pixel values to the range [0, 1]
X = X.values.reshape(-1, 28, 28, 1) / 255.0
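
To see what the reshape and scaling do, here is a minimal sketch on synthetic data. The array `flat` and its random contents are hypothetical stand-ins for the real pixel columns, not values from the data set.

```python
import numpy as np

# Hypothetical stand-in for the pixel columns: 5 images, 784 values each (0-255)
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=(5, 784))

# -1 lets NumPy infer the number of images;
# the trailing 1 is the single grayscale channel
images = flat.reshape(-1, 28, 28, 1) / 255.0

print(images.shape)  # (5, 28, 28, 1)

# Pixel 28 of the flat row becomes row 1, column 0 of the image
print(images[0, 1, 0, 0] == flat[0, 28] / 255.0)  # True
```

The reshape fills each image row by row, so the first 28 values of a flat row become the top row of the image.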


**TRAIN TEST SPLIT**

In [53]:
# Split the data into training (70%) and test (30%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
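
The split above is random, so the exact test accuracy will change from run to run. As a sketch of one way to make it repeatable, the example below passes `random_state` and `stratify` to `train_test_split`. These two arguments are not used in this notebook, and the arrays `X_demo` and `y_demo` are hypothetical stand-in data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 samples spread evenly over 10 classes
X_demo = np.arange(100).reshape(100, 1)
y_demo = np.repeat(np.arange(10), 10)

# random_state fixes the shuffle so the split is the same on every run;
# stratify keeps the class proportions equal in both parts
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

print(X_tr.shape, X_te.shape)  # (70, 1) (30, 1)
print(np.bincount(y_te))       # 3 samples of each class in the test part
```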

**CREATE CNN MODEL**

We will create a simple CNN made up of several layers.

The first layer is a convolutional layer with 64 filters and a kernel size of (3 x 3).

We will then have a pooling layer that applies max pooling to the previous layer with a pooling window of (4 x 4).

Next we will flatten the output of the pooling layer so that it can feed the fully connected layers.

Our next layer is a fully connected layer with 64 units and "relu" as its activation function, followed by a dropout layer with a rate of 0.5.

Finally, the output layer will predict one of the 10 digit labels using the "softmax" function.

In [54]:
# Construct the model
model = keras.models.Sequential([
    keras.layers.Conv2D(filters=64, kernel_size=(3, 3), input_shape=[28, 28, 1]),
    keras.layers.MaxPooling2D(pool_size=(4, 4)),
    keras.layers.Flatten(),
    keras.layers.Dense(units=64, activation='relu'),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(units=10, activation='softmax'),
])
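
To check how the image size changes through the layers, the sketch below works out the output shapes and parameter counts by hand. It assumes the Keras defaults for the layers above (no padding, convolution stride of 1, pooling stride equal to the pool size); the helper functions are hypothetical and written only for this illustration.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after a convolution with no padding ('valid')."""
    return (size - kernel) // stride + 1


def pool_output_size(size, pool):
    """Spatial size after max pooling with stride equal to the pool size, no padding."""
    return size // pool


filters = 64
side = conv_output_size(28, 3)         # Conv2D, 3 x 3 kernel
conv_shape = (side, side, filters)
side = pool_output_size(side, 4)       # MaxPooling2D, 4 x 4 window
pool_shape = (side, side, filters)
flat_units = side * side * filters     # Flatten

# Trainable parameters: weights plus one bias per filter or output unit
conv_params = 3 * 3 * 1 * filters + filters
dense_params = flat_units * 64 + 64
output_params = 64 * 10 + 10
total_params = conv_params + dense_params + output_params

print(conv_shape, pool_shape, flat_units)  # (26, 26, 64) (6, 6, 64) 2304
print(total_params)                        # 148810
```

Calling `model.summary()` on the model above should report the same shapes, which is a quick way to confirm the layer settings.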

**COMPILE AND TRAIN THE MODEL**

We will compile the model with "sparse categorical crossentropy" as the loss and "nadam" as the optimiser.

We will then train (fit) the model on the training data defined above.

In [55]:
# Compile the model we have created
model.compile(
    loss='sparse_categorical_crossentropy',
    optimizer='nadam',
    metrics=['accuracy']
)

# Train the model we have created on the training data 
model.fit(X_train, y_train, epochs=10, batch_size=16)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f62b6380a60>
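
The loss used in the compile step takes integer labels (0 to 9) directly, so we do not need to one-hot encode `y`. As a rough sketch of what it measures, the NumPy function below averages the negative log of the probability given to the true class. It is a simplified illustration, not the Keras implementation, and the arrays are made-up values.

```python
import numpy as np


def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability assigned to the true class of each sample."""
    picked = probs[np.arange(len(y_true)), y_true]
    return float(np.mean(-np.log(picked)))


# Made-up example: 2 samples, 3 classes, integer labels
y_true = np.array([0, 2])
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.1, 0.8]])

loss = sparse_categorical_crossentropy(y_true, probs)
print(round(loss, 4))  # 0.2899
```

The loss falls towards 0 as the model puts more probability on the correct label.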

**TESTING THE MODEL**

Finally, we will evaluate the model on our test set.

In [57]:
# Evaluate our model on our set aside test set
test_loss, test_accuracy = model.evaluate(X_test, y_test)

# Print out the results
print(f"Test loss is: {test_loss}\n")
print(f"Test accuracy is: {test_accuracy}\n")

Test loss is: 0.06852838397026062

Test accuracy is: 0.9804762005805969
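
The accuracy above is the share of test images whose most probable class matches the true label. The sketch below shows that step with NumPy. The `probs` array holds made-up values; in this notebook it would come from `model.predict(X_test)`.

```python
import numpy as np

# Made-up softmax outputs for 3 images over 10 classes (each row sums to 1)
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 2] = 0.82
probs[2, 1] = 0.82

# Made-up true labels for the same 3 images
y_true = np.array([7, 2, 4])

# The predicted digit is the class with the highest probability
predicted = probs.argmax(axis=1)
accuracy = np.mean(predicted == y_true)

print(predicted)  # [7 2 1]
print(accuracy)   # 2 of 3 correct
```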

