# MNIST Baseline Convolutional Neural Network
*Anders Poirel 04-10-2019*

Data from the Kannada MNIST competition on Kaggle. Here, as in the original MNIST, the goal is to correctly classify handwritten digits, this time in the Kannada script.

In [None]:
import pandas as pd
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPooling2D
from tensorflow.keras.utils import to_categorical, normalize
import matplotlib.pyplot as plt 
import seaborn as sns

## Preparing the data

In [None]:
data = pd.read_csv('../data/raw/train.csv')

As in the simple NN example, we normalize the pixel values and one-hot encode the labels.

In [None]:
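# One-hot encode the digit labels
y_train = data['label']
y_train = to_categorical(y_train)
# Every remaining column is a pixel; .values gives a NumPy array
X_train = data.drop('label', axis = 1).values
# With default arguments, normalize rescales each row to unit L2 norm
X_train = normalize(X_train)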
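To make the two preprocessing steps concrete, here is a minimal NumPy sketch of what they compute on a toy example. It assumes the default arguments of `to_categorical` and `normalize`; with those defaults, `normalize` rescales each row to unit L2 norm rather than dividing by 255. The helper names `one_hot` and `l2_normalize_rows` are made up for illustration and are not part of Keras.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Illustrative stand-in for to_categorical on integer labels."""
    return np.eye(num_classes, dtype='float32')[labels]

def l2_normalize_rows(x):
    """Illustrative stand-in for normalize(x) with its default axis and order."""
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero rows unchanged instead of dividing by 0
    return x / norms

# Toy data: three labels and three "images" of two pixels each
toy_labels = np.array([0, 2, 1])
toy_pixels = np.array([[3.0, 4.0], [0.0, 0.0], [0.0, 255.0]])

encoded = one_hot(toy_labels, num_classes=3)
scaled = l2_normalize_rows(toy_pixels)
print(encoded)
print(scaled)
```

One consequence of this choice is that two images with the same shape but different overall brightness end up with the same pixel vector after scaling.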

Convolutional layers in Keras expect each sample as a tensor of shape (pixel_height, pixel_width, number_of_channels). In its current form, each image is a flat 1D array of 784 values, so we need to reshape it. The data description says that each image is monochrome and 28x28, so each data point becomes a 28x28x1 tensor.

In [None]:
# X_train is already a NumPy array at this point, so reshape it directly
X_train = np.reshape(X_train, (len(X_train), 28, 28, 1))

## Training the model

Here a fairly standard CNN architecture is used: three convolutional layers with 16, 32 and 64 filters, an architecture that is known to perform well on simple image classification tasks. Dropout is added to reduce overfitting, though with 60k training samples overfitting may not be much of an issue anyway.

In [None]:
model = Sequential()
model.add(Conv2D(16, (3, 3), activation = 'relu', input_shape = (28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))
model.add(Conv2D(32, (3,3), activation = 'relu'))
model.add(MaxPooling2D((2,2)))
model.add(Dropout(0.2))
model.add(Conv2D(64, (3,3), activation = 'relu'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(64, activation = 'relu'))
model.add(Dense(10, activation = 'softmax'))
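As a sanity check on the architecture, the sketch below works out by hand the feature-map sizes and parameter counts the layers above should produce. It assumes the Keras defaults used in the cell (unpadded 'valid' convolutions with stride 1, and pooling with stride equal to the pool size); the helper functions are hypothetical and exist only for this arithmetic.

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(in_channels, filters, kernel=3):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(inputs, units):
    """Weight matrix plus one bias per unit."""
    return inputs * units + units

size = 28
size = pool_out(conv_out(size))   # 28 -> 26 -> 13
size = pool_out(conv_out(size))   # 13 -> 11 -> 5
size = conv_out(size)             # 5 -> 3
flattened = size * size * 64      # 3 * 3 * 64 values enter the first Dense layer

# Dropout, pooling and Flatten layers have no trainable parameters
total_params = (conv_params(1, 16) + conv_params(16, 32) + conv_params(32, 64)
                + dense_params(flattened, 64) + dense_params(64, 10))
print(flattened, total_params)    # 576 60874
```

Calling `model.summary()` on the model above should report matching shapes and counts if these assumptions hold.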

In [None]:
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])

In [None]:
# Keep the returned History object; the plotting functions below read from it
history = model.fit(X_train, y_train,
                    validation_split = 0.2, epochs = 15)
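According to the Keras documentation, `validation_split` holds out the last fraction of the rows passed to `fit`, taken before any shuffling. The sketch below mimics that slicing in plain NumPy on a toy array; `split_like_keras` is a hypothetical helper written for illustration, not a Keras function.

```python
import numpy as np

def split_like_keras(x, y, validation_split):
    """Hold out the trailing fraction of the rows, without shuffling."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Ten toy samples whose label equals their row index
x = np.arange(10).reshape(10, 1)
y = np.arange(10)

(train_x, train_y), (val_x, val_y) = split_like_keras(x, y, 0.2)
print(len(train_x), len(val_x))  # 8 2
print(val_y)                     # the last two rows end up in the validation set
```

With 60k rows and a split of 0.2, this leaves 48,000 rows for training and 12,000 for validation. If the rows of the CSV happened to be ordered by label, shuffling them before calling `fit` would be a sensible precaution.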

### Evaluating model performance

For evaluating performance we can use the same code as in the dense neural network example.

We examine how training and validation set loss and accuracy evolve over time. Note: these plots rely on the `validation_split = 0.2` parameter passed to `model.fit` above. To train the final model on the entire dataset, remove that parameter (the validation curves will then no longer be available).

In [None]:
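sns.set()

def plot_loss(history):
    """Plot training and validation loss per epoch."""
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Validation'], loc='upper right')

def plot_acc(history):
    """Plot training and validation accuracy per epoch."""
    # Older Keras versions store the metric as 'acc', newer ones as 'accuracy'
    key = 'accuracy' if 'accuracy' in history.history else 'acc'
    plt.plot(history.history[key])
    plt.plot(history.history['val_' + key])
    plt.title('Model accuracy')
    plt.ylabel('Accuracy')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Validation'], loc='lower right')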

In [None]:
plot_loss(history)

In [None]:
plot_acc(history)

### Making predictions for Kaggle

In [None]:
submission = pd.read_csv('../data/raw/sample_submission.csv')
X_test = pd.read_csv('../data/raw/test.csv')
X_test.drop('id', axis = 1, inplace = True)
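The notebook stops before generating predictions. Below is a minimal sketch of the remaining steps, assuming that the test pixels go through the same normalization and reshape as the training data and that the predicted digit is the class with the highest softmax probability. `prepare_pixels` and `to_labels` are hypothetical helpers, and the random arrays stand in for the real test pixels and for the output of `model.predict`.

```python
import numpy as np

def prepare_pixels(flat_pixels):
    """L2-normalize each row, then reshape to (n_images, 28, 28, 1)."""
    x = np.asarray(flat_pixels, dtype='float64')
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing all-zero images by 0
    return (x / norms).reshape(-1, 28, 28, 1)

def to_labels(probabilities):
    """Pick the most probable class for each row of softmax outputs."""
    return np.argmax(probabilities, axis=1)

rng = np.random.default_rng(0)

# Stand-in for the test pixels: 5 images of 784 pixel values each
fake_pixels = rng.integers(0, 256, size=(5, 784))
batch = prepare_pixels(fake_pixels)

# Stand-in for model.predict(batch): one row of 10 class probabilities per image
fake_probs = rng.random((5, 10))
fake_probs = fake_probs / fake_probs.sum(axis=1, keepdims=True)
labels = to_labels(fake_probs)
print(batch.shape, labels.shape)
```

With the notebook's own objects this would read roughly `submission['label'] = to_labels(model.predict(prepare_pixels(X_test.values)))`, assuming the sample submission file has a `label` column.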

#### Predictions on alternate validation set

In [None]:
# Work in progress