## Assessment notebook for the Machine Learning Engineer position at Startup Crafters (basic skills test)

# **PROBLEM STATEMENT**
#### Aim: Train a neural network model on the provided training data and reach the highest accuracy you can. The dataset contains 2,400 images in total, split across the training, validation and test sets.
#### The dataset is organized into three folders: training (1,440 images), validation (480 images) and testing (480 images).
#### The images are divided into six classes:
   1. Bicycle
   2. Boat
   3. Cat
   4. Motorbike
   5. People
   6. Table

# **DATA PREPROCESSING**
#### 1. Importing Libraries

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Conv2D, MaxPool2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications import InceptionV3, EfficientNetB3
from tensorflow.keras.preprocessing import image
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from sklearn.preprocessing import LabelEncoder
import matplotlib.pylab as plt
import tensorflow as tf
import pandas as pd
import numpy as np
import cv2
import os


#### 2. Data fetching


In [None]:
os.chdir("../input/multiclassification-dataset-6classes/AI test") # set the working directory to the dataset root
# One generator per split; pixel values are rescaled to [0, 1]
train = ImageDataGenerator(rescale=1/255)
val = ImageDataGenerator(rescale=1/255)
test = ImageDataGenerator(rescale=1/255)

# Paths are relative to the working directory set above; folder names are case-sensitive on Linux and must match the names on disk
train_dataset = train.flow_from_directory('Training',target_size=(300,300),batch_size=1) # training images
validation_dataset = val.flow_from_directory('validation',target_size=(300,300),batch_size=1) # validation images
testing_dataset = test.flow_from_directory('Testing',target_size=(300,300)) # testing images
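
The batch size passed to `flow_from_directory` determines how many steps make up one epoch. The sketch below works this out for the split sizes given in the problem statement; the helper name `steps_per_epoch` is only illustrative, and 32 is the default `batch_size` of `flow_from_directory`.

```python
import math

# Split sizes as stated in the problem statement.
split_sizes = {"training": 1440, "validation": 480, "testing": 480}

def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

steps_bs1 = steps_per_epoch(split_sizes["training"], 1)    # batch_size=1, as used above
steps_bs32 = steps_per_epoch(split_sizes["training"], 32)  # Keras default batch size
print(steps_bs1, steps_bs32)
```

With `batch_size=1` every epoch performs one weight update per image, which is slow on a GPU; a larger batch reduces the number of steps considerably.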


#### 3. Displaying an image with cv2

In [None]:
# Display one training image
train_labels = train_dataset.classes
classes = train_dataset.class_indices
img = train_dataset[0][0] # change the first index to see a different training image
img = np.reshape(img,(300,300,3))
img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR) # Keras loads images as RGB, OpenCV displays BGR
cv2.imshow("image_instance",img)
cv2.waitKey(0) # wait for any key press before closing the window
cv2.destroyAllWindows() # close all windows

###### In this step, the class labels are encoded as integers: Bicycle is 0, Boat is 1, Cat is 2, Motorbike is 3, People is 4 and Table is 5.
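
This encoding follows from how `flow_from_directory` assigns labels: class folders are sorted alphanumerically and numbered from 0. A minimal sketch that reproduces the mapping without loading any images, assuming the folder names match the class names listed in the problem statement:

```python
# Class names from the problem statement, deliberately given out of order.
# Assumption: the class folders on disk carry exactly these names.
class_names = ["Table", "Cat", "People", "Boat", "Motorbike", "Bicycle"]

# Same rule as Keras: sort the names, then enumerate from 0.
class_indices = {name: i for i, name in enumerate(sorted(class_names))}

# Reverse lookup, useful for turning predicted indices back into names.
index_to_class = {i: name for name, i in class_indices.items()}

print(class_indices)
print(index_to_class[3])
```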

# **MODEL DEVELOPMENT**
#### 1. Model-1 (Simplest deep CNN model)

In [None]:

#Model-1
model = Sequential()
model.add(Conv2D(128,(3,3),activation='swish',input_shape=(300,300,3)))
model.add(MaxPool2D(2,2))
model.add(Conv2D(64,(3,3),activation='swish'))
model.add(MaxPool2D(2,2))
model.add(Conv2D(32,(3,3),activation='swish'))
model.add(MaxPool2D(2,2))
model.add(Conv2D(16,(3,3),activation='swish'))
model.add(MaxPool2D(2,2))
model.add(Flatten())
model.add(Dense(264,activation='swish'))
model.add(Dense(6,activation='softmax'))
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['acc'])
epochs=20
model.summary() 

##### This model reaches ~0.30 accuracy on the validation set and 0.97 on the training set after 100 epochs; the large gap indicates overfitting.
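
The output of `model.summary()` for Model-1 can be checked by hand. The sketch below recomputes the feature-map sizes and parameter counts for the layer stack defined above, assuming the Keras defaults of `'valid'` padding and stride 1 for `Conv2D`; the helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling with 'valid' padding."""
    return size // pool

size, channels, total_params = 300, 3, 0
for filters in (128, 64, 32, 16):
    # 3x3 kernel weights per input channel, plus one bias per filter
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters

flat = size * size * channels       # length of the Flatten output
dense_params = (flat + 1) * 264     # Dense(264): weights plus biases
total_params += dense_params
total_params += (264 + 1) * 6       # Dense(6) output layer

print(size, flat, total_params)
```

Most of the parameters sit in the first dense layer, which is one place to look when reducing overfitting.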

#### 2. Model-2 (Pre-trained InceptionV3 using imagenet features)


In [None]:
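# Pre-trained InceptionV3 backbone; `classes` is dropped because it only applies when include_top=True
transfered=InceptionV3(include_top=False,weights='imagenet',input_shape=(300,300,3),pooling='avg')
model=Sequential()
model.add(InputLayer(input_shape=(300,300,3)))
model.add(transfered)
model.add(Dropout(0.1))
model.add(Dense(6,activation='softmax'))
transfered.trainable=False # freeze the ImageNet weights
optim=tf.keras.optimizers.Adam() # assumption: `optim` was never defined in the notebook, so default Adam is used here
model.compile(optimizer=optim,loss='categorical_crossentropy',metrics=['acc'])
epochs=10
model.summary()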

##### This model reaches ~0.3975 accuracy on the validation set and 0.98 on the training set after 100 epochs.

#### 3. Model-3 (Pre-trained EfficientNetB3 using imagenet features)


In [None]:
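# pooling='avg' collapses the feature map to a vector so that Dense(6) outputs one row per image
# Note: the tf.keras EfficientNet models rescale inputs internally and expect pixel values in [0, 255]
transfered=EfficientNetB3(include_top=False,weights='imagenet',input_shape=(300,300,3),pooling='avg')
model=Sequential()
model.add(InputLayer(input_shape=(300,300,3)))
model.add(transfered)
model.add(Dropout(0.2))
model.add(Dense(6,activation='softmax'))
transfered.trainable=False # freeze the ImageNet weights
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['acc'])
batch_size=10 # not used by fit(): the batch size is set in flow_from_directory
epochs=10
model.summary()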

##### This model reaches ~0.32 accuracy on the validation set and 0.70 on the training set after 100 epochs.

# **MODEL FITTING**

In [None]:
# Save the weights of the epoch with the best validation accuracy
checkpoint = ModelCheckpoint(filepath='./weights.hdf5',monitor='val_acc',verbose=1,save_best_only=True)
model_fit = model.fit(train_dataset,epochs=epochs,validation_data=validation_dataset,callbacks=[checkpoint])

# **MODEL PREDICTION**

In [None]:
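# Model prediction
images = []
# os.walk also covers the case where 'Testing' contains one subfolder per class
for root, _, files in os.walk('Testing'):
    for name in sorted(files):
        img = image.load_img(os.path.join(root, name), target_size=(300, 300))
        img = image.img_to_array(img) / 255.0 # same rescaling as the training generator
        images.append(np.expand_dims(img, axis=0))

# stack the list of images into one array for prediction
images = np.vstack(images)
# predict_classes was removed from recent TensorFlow releases; take the argmax of the softmax output instead
classes_test = np.argmax(model.predict(images, batch_size=10), axis=1)
print(classes_test)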
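
A softmax model outputs one probability per class for each image; the predicted class is the index of the largest value, which can then be mapped back to a class name. A minimal sketch of that step, using made-up probabilities for three images (illustrative values, not results from this notebook) and the label encoding described in the preprocessing section:

```python
import numpy as np

# Label encoding from the preprocessing section.
index_to_class = {0: "Bicycle", 1: "Boat", 2: "Cat", 3: "Motorbike", 4: "People", 5: "Table"}

# Made-up softmax outputs for three images, for illustration only.
probs = np.array([
    [0.05, 0.10, 0.60, 0.10, 0.10, 0.05],
    [0.70, 0.05, 0.05, 0.10, 0.05, 0.05],
    [0.10, 0.10, 0.10, 0.10, 0.10, 0.50],
])

pred_idx = np.argmax(probs, axis=1)                       # most likely class per row
pred_names = [index_to_class[int(i)] for i in pred_idx]   # indices back to names
print(pred_names)
```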

# **NOTES**
i. I tried three models on this dataset and recorded the accuracy of each; InceptionV3 performed best on the validation set.

ii. Because of the time limit, I trained for only a few epochs with a grid search over the hyperparameters; the results can likely be improved with further hyperparameter tuning and ensembling.

iii. I used my laptop GPU for training and tuning (GTX 1050 Ti, MSI GL63).
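
The grid search mentioned in note ii amounts to enumerating every combination of hyperparameter values and training once per combination. A minimal sketch of the enumeration step; the search space below is hypothetical and is not the one used for the reported results.

```python
from itertools import product

# Hypothetical search space; these values are assumptions for illustration.
grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dropout": [0.1, 0.2],
    "batch_size": [10, 32],
}

# One dict per combination, e.g. {"learning_rate": 0.01, "dropout": 0.1, "batch_size": 10}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(configs))
print(configs[0])
```

Each entry of `configs` would then be used to build, compile and fit one model, keeping the configuration with the best validation accuracy.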