<a href="https://colab.research.google.com/github/grace-lees/handwritten-digit-recognition/blob/main/Final_Leeswadtrakul_Grace.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## **Machine Learning Final Project DUE: Friday May 7th 11:59pm**

**Note: Please read all the instructions carefully before starting the project.**

For your final project you will build an ML model to analyze a dataset of your choice. You are welcome to keep working on the data in your EDA project if your data is large enough (at least 1000 rows for simple models and at least 10,000 for more complex models) or you can choose from the datasets/project suggestions below.

In this project make sure that you:
- Have a large enough dataset
- Split your data into training and testing sets
- Explore your data to inform which type of model to choose (no need if you are using your EDA dataset)
- Try different models on your training dataset - then select the most promising model
- Use cross-validation to fine-tune the model’s parameters, such as alpha in lasso
- Simplify your model using regularization, pruning, dropout, etc. to avoid overfitting
- Communicate your model’s performance and make sure you compare it to a benchmark when appropriate
- Plot interesting graphs and results
- Write and publish your article on Medium
- Commit your code to your GitHub

Please ensure you handle all the preprocessing before the modeling.

Suggestions for project:
You can take a look at the resources given below for choosing a dataset for your project. 

- Traffic sign detection - https://benchmark.ini.rub.de/gtsdb_dataset.html
- Cat and dog classifier - https://www.kaggle.com/c/dogs-vs-cats/data
- Other datasets from Kaggle - https://www.kaggle.com/data/41592

## **Grading Criteria**

- Show clear exploration of the data to justify model choice
- Train multiple models and clearly articulate why you chose your final model
- Show your performance on test dataset
- Clear and concise write-up with clear well-documented figures
- Commit your code to GitHub

## **Submission Details**

This is an individual assignment. You may not work in groups. The assignment is due on Friday (05/07/2021).
- To submit your assignment, download your notebook and the dataset, zip them together, and submit the zipped file on Blackboard.
- Make sure the notebook is named in the format Final_LastName_FirstName. If you are submitting a zipped file, please name it in the same format.
- Please include the links to your blog and your GitHub repo in your notebook.
- Also include the links to your notebook, GitHub repo, and blog in the submission on Blackboard. Please ensure the TAs have the required access to your notebooks and the GitHub repo.

**Note - If the dataset is too large to be zipped and submitted on Blackboard, submit only your notebook, add your dataset to your Google Drive, and share a link to the file in your notebook.**

Github repo - 

In [5]:
import tensorflow as tf
gpus = tf.config.list_physical_devices('GPU')

In [6]:
import numpy as np # linear algebra
import pandas as pd # data processing
import matplotlib.pyplot as plot
from sklearn.model_selection import train_test_split
import seaborn as sns
import matplotlib.image as mpimg
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

In [7]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

In [9]:
# read data from kaggle
df = pd.read_csv('train.csv')
df1 = pd.read_csv('test.csv')

In [10]:
#test train split
X_train, X_test, y_train, y_test = train_test_split(df.drop('label',axis=1), df['label'], test_size=0.2, random_state=42)

In [11]:
# Data preprocessing: reshape each 784-pixel row into a 28x28 single-channel image

X_train = np.array(X_train).reshape(-1, 28, 28, 1)
np.shape(X_train)

(18817, 28, 28, 1)

In [12]:
X_test = np.array(X_test).reshape(-1, 28, 28, 1)
np.shape(X_test)

(4705, 28, 28, 1)
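
The preprocessing above turns each 784-value row into a 28×28×1 image. As a sketch on stand-in data (the array `flat` is made up, not the Kaggle file), a single reshape produces the same array as reshaping row by row, assuming the pixels are stored in row-major order:

```python
import numpy as np

# Hypothetical stand-in for the pixel columns: 5 flattened 28x28 images.
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=(5, 784))

# Row-by-row version: split into single rows, reshape each, add a channel axis.
rows = np.array_split(flat, len(flat))
for i in range(len(rows)):
    rows[i] = np.reshape(rows[i], (28, 28))
looped = np.expand_dims(rows, -1)

# Vectorized version: one reshape to (n_images, height, width, channels).
vectorized = flat.reshape(-1, 28, 28, 1)

print(looped.shape, vectorized.shape)
print(np.array_equal(looped, vectorized))
```

The `-1` lets NumPy infer the number of images from the array size, so the same line works for both the training and the test split.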

In [14]:
# Data augmentation

# Training generator: rescale pixels to [0, 1] and apply small random transforms
tgen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=10,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.01,
    zoom_range=0.2,
    horizontal_flip=False,
    fill_mode='nearest',
)
# Validation generator: same rescaling; it also augments, with a wider rotation range
vgen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=30,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.01,
    zoom_range=0.2,
    horizontal_flip=False,
    fill_mode='nearest',
)

batchSize = 128
train_generator = tgen.flow(X_train, y_train, batch_size=batchSize)

val_generator = vgen.flow(X_test, y_test, batch_size=batchSize)


In [15]:
# Model definition: CNN

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), input_shape=(28, 28, 1), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dropout(.3),
    tf.keras.layers.Dense(10, activation='softmax'),
])

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 28, 28, 64)        640       
_________________________________________________________________
batch_normalization (BatchNo (None, 28, 28, 64)        256       
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 28, 28, 64)        36928     
_________________________________________________________________
batch_normalization_1 (Batch (None, 28, 28, 64)        256       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 28, 28, 64)        36928     
_________________________________________________________________
batch_normalization_2 (Batch (None, 28, 28, 64)        256       
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 28, 28, 64)        3
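
The parameter counts in the summary rows above can be checked by hand. A small sketch, assuming 3×3 kernels with one bias per filter and four parameters per channel for batch normalization (gamma, beta, moving mean, moving variance); the helper names are made up for illustration:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias.
    return (kernel_h * kernel_w * in_channels + 1) * filters


def batchnorm_params(channels):
    # gamma, beta, moving mean and moving variance: four values per channel.
    return 4 * channels


first_conv = conv2d_params(3, 3, 1, 64)    # 1 input channel (grayscale)
later_conv = conv2d_params(3, 3, 64, 64)   # 64 input channels
bn = batchnorm_params(64)
print(first_conv, later_conv, bn)  # 640 36928 256
```

These match the `conv2d`, `conv2d_1`/`conv2d_2`, and `batch_normalization` rows shown in the summary.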

In [16]:
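class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # Stop training once validation accuracy exceeds 99%
        val_accuracy = (logs or {}).get('val_accuracy')
        if val_accuracy is not None and val_accuracy > 0.99:
            print("Reached 99% val_accuracy so cancelling training!")
            self.model.stop_training = True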


callbacks = myCallback()

In [17]:

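# The metric is logged as 'val_accuracy' because the model is compiled with metrics=['accuracy']
call_back = keras.callbacks.EarlyStopping(monitor='val_accuracy', min_delta=0, patience=5, verbose=0, restore_best_weights=True)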
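
How `patience` and `restore_best_weights` interact can be sketched in plain Python. This is a simplified illustration of the idea, not Keras's actual implementation; the function name and the validation scores are made up:

```python
def early_stopping_epoch(val_scores, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation scores.

    Higher is better. Training stops once `patience` consecutive epochs
    fail to beat the best score by more than `min_delta`.
    """
    best_score = float("-inf")
    best_epoch = -1
    wait = 0
    for epoch, score in enumerate(val_scores):
        if score > best_score + min_delta:
            # Improvement: remember this epoch and reset the counter.
            best_score, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training ends at the last epoch.
    return len(val_scores) - 1, best_epoch


# Made-up validation accuracies, not results from this notebook.
scores = [0.90, 0.95, 0.97, 0.96, 0.97, 0.96, 0.95, 0.96, 0.98]
stop_epoch, best_epoch = early_stopping_epoch(scores, patience=5)
print(stop_epoch, best_epoch)  # 7 2
```

With `patience=5` the run stops at epoch 7 and would roll back to the weights from epoch 2, so the later improvement at epoch 8 is never seen. A larger patience trades extra training time for a chance to catch such late gains.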

In [18]:
lr_list = [0.001,0.01,0.1]

In [None]:
# Learning-rate sweep, scored on the held-out validation split
# (the same model instance is reused, so weights carry over between learning rates)

for lr in lr_list:
    ad = tf.keras.optimizers.RMSprop(learning_rate=lr, rho=0.9, epsilon=1e-08)
    model.compile(optimizer=ad, metrics=['accuracy'], loss='sparse_categorical_crossentropy')
    history = model.fit(train_generator, verbose=2, epochs=10, validation_data=val_generator, callbacks=[callbacks])

Epoch 1/10
148/148 - 385s - loss: nan - accuracy: 0.5699 - val_loss: nan - val_accuracy: 0.0956
Epoch 2/10
148/148 - 380s - loss: nan - accuracy: 0.0967 - val_loss: nan - val_accuracy: 0.0956
Epoch 3/10
148/148 - 382s - loss: nan - accuracy: 0.0967 - val_loss: nan - val_accuracy: 0.0956
Epoch 4/10
148/148 - 381s - loss: nan - accuracy: 0.0967 - val_loss: nan - val_accuracy: 0.0956
Epoch 5/10
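
A `nan` loss from the first epoch is usually worth tracing back to the inputs before changing the model. One possible check, shown here on a made-up array rather than the project data, is to count missing or non-finite pixel values and labels outside 0 to 9. Whether this is the cause in the run above is not established; the helper name and the sample values are assumptions for illustration:

```python
import numpy as np


def describe_inputs(pixels, labels, n_classes=10):
    """Count input problems that commonly lead to a NaN loss."""
    pixels = np.asarray(pixels, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return {
        "non_finite_pixels": int((~np.isfinite(pixels)).sum()),
        "non_finite_labels": int((~np.isfinite(labels)).sum()),
        "labels_out_of_range": int(((labels < 0) | (labels >= n_classes)).sum()),
    }


# Hypothetical data: the last row is incomplete, as a truncated CSV might be.
pixels = np.array([[0.0, 255.0, 12.0],
                   [34.0, 0.0, 7.0],
                   [5.0, np.nan, np.nan]])
labels = np.array([3, 9, 1])
report = describe_inputs(pixels, labels)
print(report)
```

If any count is non-zero, dropping or repairing the affected rows before building the generators is a reasonable first step.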


In [None]:
#we get best result for lr=0.001
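
For comparison, a k-fold version of this learning-rate selection could look like the sketch below. `score_fn` is a hypothetical stand-in for building a fresh model, training it on the training indices, and returning validation accuracy; the toy scorer is made up purely so the example runs and says nothing about the real model:

```python
import numpy as np


def kfold_indices(n_samples, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) pairs; each sample is used for validation once."""
    order = np.random.default_rng(seed).permutation(n_samples)
    folds = np.array_split(order, n_splits)
    for i in range(n_splits):
        val_idx = folds[i]
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        yield train_idx, val_idx


def select_learning_rate(candidates, n_samples, score_fn, n_splits=5):
    """Return the candidate with the highest mean validation score, plus all means."""
    mean_scores = {}
    for lr in candidates:
        scores = [score_fn(lr, tr, va) for tr, va in kfold_indices(n_samples, n_splits)]
        mean_scores[lr] = float(np.mean(scores))
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores


# Toy scorer (an assumption for illustration only): pretends smaller rates score higher.
def toy_score(lr, train_idx, val_idx):
    return 1.0 / (1.0 + lr)


best_lr, means = select_learning_rate([0.001, 0.01, 0.1], n_samples=100, score_fn=toy_score)
print(best_lr, means)
```

Because each candidate is scored on every fold with a freshly built model, no learning rate benefits from weights trained under another one.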

In [None]:
import random

# Show a randomly chosen training image (from the first 101 samples)
n = random.randint(0, 100)
t = np.reshape(X_train[n], (28, 28))
g = plot.imshow(t, cmap='gray')

In [None]:
# Scale to [0, 1] to match the generator's rescale=1./255 before predicting
np.argmax(model.predict(np.expand_dims(X_train[n], 0) / 255.0))

In [None]:
# Model that returns the outputs of the first 20 layers, for visualizing activations
layers_outputs = [layer.output for layer in model.layers[:20]]
activation_model = tf.keras.Model(inputs=model.input, outputs=layers_outputs)

In [None]:
X_train[9].shape

In [None]:
k = random.randint(0, 100)
# Scale to [0, 1] to match the generator's rescale=1./255
img_tensor = np.expand_dims(X_train[k], 0) / 255.0
activations = activation_model.predict(img_tensor)
n = 17  # number of layers to visualize
t = np.reshape(X_train[k], (28, 28))
plot.matshow(t, cmap='gray')
for h in range(n):
    layer_activation = activations[h]
    print(layer_activation.shape)
    # Show channel 4 of this layer's activation map
    plot.matshow(layer_activation[0, :, :, 4], cmap='gray')

In [None]:
print(history.history)


In [None]:
# Epochs vs. accuracy
g = sns.lineplot(x=history.epoch, y=history.history['accuracy'], label='accuracy')
g.set(xlabel='epoch', ylabel='accuracy')
sns.lineplot(x=history.epoch, y=history.history['val_accuracy'], label='val_accuracy')


In [None]:
# Epochs vs. loss (the first two epochs are skipped)
g = sns.lineplot(x=history.epoch[2:], y=history.history['loss'][2:], label='loss')
g.set(xlabel='epoch', ylabel='loss')
sns.lineplot(x=history.epoch[2:], y=history.history['val_loss'][2:], label='val_loss')


In [None]:
# Prepare the Kaggle test set: scale to [0, 1] and reshape to (n, 28, 28, 1)
df1 = df1/255.0
X_test = np.array(df1).reshape(-1, 28, 28, 1)

# Predict class probabilities, then take the most likely digit for each image
k = model.predict(X_test)
k = np.argmax(k, axis=1)
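
`model.predict` returns one row of ten class probabilities per image, and `np.argmax(..., axis=1)` picks the most likely digit in each row. A small sketch with made-up probabilities:

```python
import numpy as np

# Made-up softmax outputs for three images over the ten digit classes.
probs = np.full((3, 10), 0.02)
probs[0, 7] = 0.82
probs[1, 0] = 0.82
probs[2, 4] = 0.82

# axis=1 takes the argmax across classes, giving one label per image.
predicted_digits = np.argmax(probs, axis=1)
print(predicted_digits)  # [7 0 4]
```

Omitting `axis=1` would instead return a single index into the flattened array, which is why the axis matters for batch predictions.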

In [None]:
sample = pd.read_csv('sample_submission.csv')
np.size(k)
sample['Label'] = k
sample.to_csv('sol.csv',index = False)