# My first Google Colab Notebook. Also my first implementation of a CNN with Keras

In [0]:
'''The data is stored in Google Drive. It can also be downloaded directly from
Kaggle using the Kaggle API. However, in this project I have placed the training
and test data for the MNIST dataset (considered the "Hello World" of Machine
Learning) in my Google Drive, which is mounted in the notebook below.
'''


from google.colab import drive
import pandas as pd
import numpy as np

drive.mount('/content/gdrive', force_remount = True)
dataset_path = 'gdrive/My Drive/Projects/MNIST/'



Mounted at /content/gdrive


In [0]:
#Paths to the training and test CSV files
train_file = dataset_path + "train.csv"
test_file = dataset_path + "test.csv"




In [0]:
train = pd.read_csv(train_file)
test = pd.read_csv(test_file)

#The training data is split into the labels and the pixel values
Y_train = train['label']
X_train = train.drop(labels=["label"], axis=1)

#The pixel values are scaled from [0, 255] to [0, 1]
X_train = X_train/255.0
test = test/255.0

#The CNN expects images, so each row of 784 pixel values is reshaped to 28x28x1
X_train = X_train.values.reshape(-1, 28, 28, 1)
test = test.values.reshape(-1, 28, 28, 1)

#The labels are converted to one-hot encoded form
Y_train = pd.get_dummies(Y_train)
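
To make the one-hot step concrete, here is a minimal pure-Python sketch of the encoding and its inverse. The helper names `one_hot` and `decode` are hypothetical and are not part of the notebook, which relies on `pd.get_dummies` instead. The sketch assumes ten classes (digits 0 to 9).

```python
def one_hot(label, num_classes=10):
    """Return a list with a 1 at position `label` and 0 everywhere else."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in the range [0, num_classes)")
    return [1 if i == label else 0 for i in range(num_classes)]


def decode(vector):
    """Return the index of the largest entry (argmax), which inverts one_hot."""
    return max(range(len(vector)), key=lambda i: vector[i])


# Hypothetical labels, chosen only for illustration
labels = [3, 0, 9]
encoded = [one_hot(label) for label in labels]
decoded = [decode(vector) for vector in encoded]

print(encoded[0])  # the vector for the digit 3
print(decoded)     # the original labels, recovered by argmax
```

The decoding half is the same idea the notebook later applies to the softmax output with `np.argmax`.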


# All the components for building the model are imported


* **train_test_split** is imported to split the data into training and validation sets
* **Sequential** is used because we are building a sequential model, where each layer feeds into the next
* **Dense, Dropout, Flatten, Conv2D, MaxPool2D** are the layers of the CNN. **Dense** provides the **fully-connected layers**, **Dropout** helps prevent overfitting, and **Flatten** converts the output of the convolution layers into a vector for the fully-connected layers. **Conv2D** and **MaxPool2D** provide the convolution and max pooling layers
* Two optimizers, **RMSprop** and **Adam**, are imported. Only **Adam** is used in the end
* **ImageDataGenerator** is a Keras utility that provides on-the-fly **Data Augmentation**
* **ReduceLROnPlateau** is used to reduce the learning rate when the monitored metric (here the **validation accuracy**) does not improve for a certain number of epochs



In [0]:
from sklearn.model_selection import train_test_split
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPool2D
from keras.optimizers import RMSprop, Adam
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau

In [0]:
np.random.seed(2)
random_seed = 2
#10% of the training data is held out as validation data

X_train, X_val, Y_train, Y_val = train_test_split(X_train, Y_train,
                                                  test_size = 0.1,
                                                  random_state=random_seed)

# CNN is created here
* **Conv layer: filter size (5x5), 32 filters, same padding, activation relu**
* **Conv layer: filter size (5x5), 32 filters, same padding, activation relu**
* **Max pool layer with pool size (2x2)**
* **Dropout that randomly ignores 25% of the activations during training**

* **Conv layer: filter size (3x3), 64 filters, same padding, activation relu**
* **Conv layer: filter size (3x3), 64 filters, same padding, activation relu**
* **Max pool layer with pool size (2x2) and strides of (2x2)**
* **Dropout that randomly ignores 25% of the activations during training**

* After flattening we connect to a **Dense layer of 256 neurons** with relu activation, followed by a **dropout of 50%**
* Finally we connect to a **softmax** layer of **10 neurons**, as we have **10 classes**
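
The layer list above can be checked with a little shape arithmetic. The following is a rough sketch in plain Python, not Keras code: the helper names `conv_same`, `max_pool` and `dense` are hypothetical. It assumes stride-1 convolutions with same padding (so height and width are preserved) and 2x2 pooling that halves height and width.

```python
def conv_same(shape, filters, kernel):
    """Stride-1 convolution with same padding: only the channel count changes."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights + biases
    return (height, width, filters), params


def max_pool(shape, pool=2):
    """Max pooling halves height and width and has no trainable parameters."""
    height, width, channels = shape
    return (height // pool, width // pool, channels), 0


def dense(units_in, units_out):
    """Fully-connected layer: one weight per input/output pair plus biases."""
    return units_out, units_in * units_out + units_out


shape, total = (28, 28, 1), 0
for step in (lambda s: conv_same(s, 32, 5), lambda s: conv_same(s, 32, 5), max_pool,
             lambda s: conv_same(s, 64, 3), lambda s: conv_same(s, 64, 3), max_pool):
    shape, params = step(shape)
    total += params

# Flatten turns the final feature map into a single vector
flat = shape[0] * shape[1] * shape[2]
units, params = dense(flat, 256)
total += params
units, params = dense(units, 10)
total += params

print(shape, flat, total)
```

Dropout and Flatten add no trainable parameters, so they are left out of the count. Most of the parameters sit in the first dense layer, because it connects every flattened feature to 256 neurons.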


In [0]:
model = Sequential()

model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(filters=32, kernel_size=(5, 5), padding='same',
                 activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))


model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same',
                 activation='relu'))
model.add(Conv2D(filters=64, kernel_size=(3, 3), padding='same',
                 activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Dropout(0.25))


model.add(Flatten())
model.add(Dense(256, activation = "relu"))
model.add(Dropout(0.5))
model.add(Dense(10, activation = "softmax"))

In [0]:
#Adam optimizer with a learning rate of 0.001; the remaining arguments are written out explicitly
optimizer = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0,
                 amsgrad=False)

In [0]:
'''The loss that we are trying to minimize is the categorical cross entropy
loss. The metric that we are looking at is accuracy
'''
model.compile(optimizer = optimizer , loss = "categorical_crossentropy",
              metrics=["accuracy"])

In [0]:
'''We are monitoring the validation accuracy. If it does not improve for
3 epochs we reduce the Learning rate by half. The minimum learning rate is
1e-5
'''

learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc', 
                                            patience=3, 
                                            verbose=1, 
                                            factor=0.5, 
                                            min_lr=0.00001)
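
To illustrate the rule described above, here is a simplified sketch of the plateau logic in plain Python. It is not the Keras implementation: the function name `simulate_plateau_schedule` is hypothetical, the accuracy values are made up for illustration, and details such as Keras's minimum-improvement threshold and cooldown period are left out.

```python
def simulate_plateau_schedule(val_accs, lr=0.001, patience=3, factor=0.5,
                              min_lr=1e-5):
    """Return the learning rate in effect after each epoch."""
    best = float("-inf")
    wait = 0  # number of consecutive epochs without improvement
    schedule = []
    for acc in val_accs:
        if acc > best:
            best = acc
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # Halve the learning rate, but never go below the floor
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule


# Hypothetical validation accuracies, not taken from the training log
val_accs = [0.90, 0.95, 0.94, 0.94, 0.93, 0.96, 0.95, 0.95, 0.95]
schedule = simulate_plateau_schedule(val_accs)
for epoch, lr in enumerate(schedule, start=1):
    print(epoch, lr)
```

In this made-up run the rate is halved after the fifth epoch (three epochs without beating 0.95) and again after the ninth (three epochs without beating 0.96).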

In [0]:
epochs = 30
batch_size = 128

In [0]:
'''Data augmentation is set up here. The data generator is fitted with the
training data, and the images are randomly modified to generate fresh batches
in every epoch
'''


datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=10,  # randomly rotate images in the range (degrees, 0 to 180)
        zoom_range=0.1,  # randomly zoom images
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=False,  # randomly flip images horizontally (disabled)
        vertical_flip=False)  # randomly flip images vertically (disabled)


datagen.fit(X_train)
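
To get a feel for what these settings mean for a 28x28 image, the sketch below samples one set of random transformation parameters using only the standard library. It approximates how the ranges are interpreted and is not the Keras implementation. The function name `sample_augmentation` and the dictionary keys are hypothetical.

```python
import random

IMAGE_SIZE = 28  # MNIST images are 28x28 pixels


def sample_augmentation(rng, rotation_range=10, zoom_range=0.1,
                        width_shift_range=0.1, height_shift_range=0.1):
    """Sample one random transformation within the configured ranges."""
    return {
        # rotation angle in degrees, between -10 and +10
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        # zoom factors per axis, between 0.9 and 1.1
        "zoom_x": rng.uniform(1 - zoom_range, 1 + zoom_range),
        "zoom_y": rng.uniform(1 - zoom_range, 1 + zoom_range),
        # shifts are fractions of the image size, so at most 2.8 pixels here
        "shift_x_px": rng.uniform(-width_shift_range, width_shift_range) * IMAGE_SIZE,
        "shift_y_px": rng.uniform(-height_shift_range, height_shift_range) * IMAGE_SIZE,
    }


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_augmentation(rng) for _ in range(1000)]
print(samples[0])
```

Because a new set of parameters is drawn for every image in every batch, the network rarely sees exactly the same picture twice.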

In [0]:
'''The model is trained here on augmented batches of X_train and Y_train, and
validated on X_val and Y_val after each epoch. callbacks lists anything that we
want to run during training in response to the monitored values; here it is
the learning rate reduction
'''

history = model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                              epochs=epochs, validation_data=(X_val, Y_val),
                              verbose=2, steps_per_epoch=X_train.shape[0] // batch_size,
                              callbacks=[learning_rate_reduction])

Epoch 1/30
 - 7s - loss: 0.5149 - acc: 0.8334 - val_loss: 0.0822 - val_acc: 0.9743
Epoch 2/30
 - 7s - loss: 0.1543 - acc: 0.9538 - val_loss: 0.0511 - val_acc: 0.9836
Epoch 3/30
 - 7s - loss: 0.1073 - acc: 0.9685 - val_loss: 0.0476 - val_acc: 0.9865
Epoch 4/30
 - 7s - loss: 0.0897 - acc: 0.9734 - val_loss: 0.0385 - val_acc: 0.9899
Epoch 5/30
 - 7s - loss: 0.0782 - acc: 0.9763 - val_loss: 0.0399 - val_acc: 0.9889
Epoch 6/30
 - 7s - loss: 0.0703 - acc: 0.9788 - val_loss: 0.0399 - val_acc: 0.9905
Epoch 7/30
 - 7s - loss: 0.0596 - acc: 0.9825 - val_loss: 0.0386 - val_acc: 0.9902
Epoch 8/30
 - 7s - loss: 0.0585 - acc: 0.9825 - val_loss: 0.0350 - val_acc: 0.9902
Epoch 9/30
 - 7s - loss: 0.0545 - acc: 0.9844 - val_loss: 0.0335 - val_acc: 0.9921
Epoch 10/30
 - 8s - loss: 0.0554 - acc: 0.9828 - val_loss: 0.0339 - val_acc: 0.9913
Epoch 11/30
 - 7s - loss: 0.0488 - acc: 0.9852 - val_loss: 0.0316 - val_acc: 0.9910
Epoch 12/30
 - 7s - loss: 0.0499 - acc: 0.9851 - val_loss: 0.0329 - val_acc: 0.9913



In [0]:
#Predict the class probabilities for the test data
results = model.predict(test)

#Take the index of the highest probability as the predicted digit
results = np.argmax(results, axis=1)
#Convert the array to a pandas Series
results = pd.Series(results, name="Label")

In [0]:
#The submission (ImageId 1 to 28000 plus the predicted label) is built and saved as a CSV file

submission = pd.concat([pd.Series(range(1,28001),name = "ImageId"),results],axis = 1)

submission.to_csv("cnn_mnist_datagen.csv",index=False)