## Dog Breed Classification

In this project we will solve the dog breed classification problem with three approaches: a traditional CNN, the same CNN with data augmentation, and finally transfer learning using a VGG16 model with weights pre-trained on ImageNet.

### Load Dataset Files

In [0]:
%tensorflow_version 2.x
import cv2
import numpy as np
import pandas as pd
import tensorflow as tf
from tqdm import tqdm
from zipfile import ZipFile
from google.colab import drive
from matplotlib import pyplot as plt
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import load_img, img_to_array
from tensorflow.keras.preprocessing.image import ImageDataGenerator

In [0]:
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


Now upload the dataset files shared with you to your Google Drive and set the `project_path` variable below to their location. The path shown is an example from our Google Drive; change it to match yours.

In [0]:
project_path = "/content/gdrive/My Drive/Colab Notebooks/Data/"
img_rows = 224
img_cols = 224

Run the code below to extract all the images from the train.zip file in the dataset. We will use these images, together with their labels, as the training and validation sets in later steps.

In [0]:
#with ZipFile(project_path+"train-dog-breed.zip", 'r') as z:
#  z.extractall(project_path)

Repeat the same step for test.zip

In [0]:
#with ZipFile(project_path+"test-dog-breed.zip", 'r') as z:
#  z.extractall(project_path)

Repeat the same step for sample_submission.csv.zip

In [0]:
#with ZipFile(project_path+"sample_submission-dog-breed.csv.zip", 'r') as z:
#  z.extractall(project_path)

Repeat the same step for labels.csv.zip

In [0]:
#with ZipFile(project_path+"labels-dog-breed.csv.zip", 'r') as z:
#  z.extractall(project_path)

After this process, we will have four items in Google Drive: the train folder, the test folder, labels.csv and sample_submission.csv.
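
The four extraction cells above follow the same pattern, so they could be folded into one helper. The following is a minimal sketch rather than the notebook's own code: `extract_if_missing` and its `marker` argument are hypothetical names, and the demo builds a throwaway archive in a temporary directory instead of touching Google Drive.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_if_missing(zip_path, dest_dir, marker):
    """Extract zip_path into dest_dir unless dest_dir/marker already exists.

    Returns True if extraction ran, False if it was skipped.
    """
    dest_dir = Path(dest_dir)
    if (dest_dir / marker).exists():
        return False
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest_dir)
    return True


# Demo on a throwaway archive (a stand-in for the real dataset zips).
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "labels.csv.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr("labels.csv", "id,breed\nabc,beagle\n")

    first_run = extract_if_missing(archive_path, tmp, "labels.csv")
    second_run = extract_if_missing(archive_path, tmp, "labels.csv")
    extracted_text = (Path(tmp) / "labels.csv").read_text()
```

Skipping the extraction when the marker file exists avoids unpacking thousands of images again each time the notebook is re-run.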

### Read labels.csv file using pandas

In [0]:
labels = pd.read_csv(project_path + 'labels.csv', encoding = "ISO-8859-1" )

### Print the count of each category of Dogs given in the dataset



In [0]:
labels.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10222 entries, 0 to 10221
Data columns (total 2 columns):
id       10222 non-null object
breed    10222 non-null object
dtypes: object(2)
memory usage: 159.8+ KB


In [0]:
print(labels.breed.value_counts())

scottish_deerhound      126
maltese_dog             117
afghan_hound            116
entlebucher             115
bernese_mountain_dog    114
                       ... 
brabancon_griffon        67
golden_retriever         67
komondor                 67
briard                   66
eskimo_dog               66
Name: breed, Length: 120, dtype: int64


### Encode the breed labels as integers (one-hot encoding is applied later)

In [0]:
le = preprocessing.LabelEncoder()

In [0]:
labels.breed = le.fit_transform(labels.breed)

In [0]:
labels[0:5]

| | id | breed |
|---|---|---|
| 0 | 000bec180eb18c7604dcecc8fe0dba07 | 19 |
| 1 | 001513dfcb2ffafc82cccf4d8bbaba97 | 37 |
| 2 | 001cdf01b096e06d78e9e5112d419397 | 85 |
| 3 | 00214f311d5d2247d5dfe4fe24b2303d | 15 |
| 4 | 0021f9ceb3235effd7fcde7f7538ed62 | 49 |
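
`LabelEncoder` assigns each distinct breed an integer according to the sorted order of the class names. A rough NumPy equivalent, shown on a made-up four-row sample rather than the real labels.csv, could look like this:

```python
import numpy as np

# Made-up sample; the real column has 120 distinct breeds.
breeds = np.array(["pug", "beagle", "pug", "collie"])

# np.unique returns the sorted distinct values and, with return_inverse=True,
# the index of each original entry in that sorted list.
classes, encoded = np.unique(breeds, return_inverse=True)

# Going back from integers to names is plain indexing.
decoded = classes[encoded]
```

In the notebook, `le.classes_` plays the role of `classes`, and `le.inverse_transform` reverses the mapping when breed names are needed again.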


## Preparing training dataset
1. Write code that reads every id from the labels.csv file and loads the corresponding colour image from the train folder. The task statement asks for a shape of (128, 128, 3); this notebook resizes to (224, 224, 3), as set by `img_rows` and `img_cols`. <br>
2. Create 2 variables <br> 
     a.  x_train - should hold all the dog images from the train folder <br>
     b.  y_train - the corresponding label of each dog <br>
<u>Note:</u> The ids of the dog images and their corresponding labels are available in the labels.csv file   
<u>Hint:</u> Watch the video shared on "Preparing the training dataset" if you face issues creating the training dataset

In [0]:
#!ls "./train/"

In [0]:
labels.shape

(10222, 2)

In [0]:
x_train = []
y_train = []
for ii, label in tqdm(labels.iterrows()):
  train_image = cv2.imread(project_path + 'train/{}.jpg'.format(label.id), 1)
  train_image_resize = cv2.resize(train_image, (img_rows, img_cols))
  x_train.append(train_image_resize.astype('float32') / 255)
  y_train.append(label.breed)
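
`cv2.imread` returns `None` when a file cannot be read, and it returns channels in BGR rather than RGB order, so it can help to guard the loading loop. The sketch below is NumPy-only and leaves out the resize step; `to_model_input` is a hypothetical helper name and the image is synthetic.

```python
import numpy as np


def to_model_input(image_bgr):
    """Convert a uint8 BGR image (as returned by cv2.imread) to float32 RGB in [0, 1]."""
    if image_bgr is None:
        raise ValueError("image could not be read")
    image_rgb = image_bgr[..., ::-1]  # reverse the channel axis: BGR -> RGB
    return image_rgb.astype("float32") / 255


# Synthetic 2x2 image that is pure blue in BGR order.
fake_bgr = np.zeros((2, 2, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255
converted = to_model_input(fake_bgr)
```

Without such a check, a missing file only surfaces later as an error inside `cv2.resize`. The channel order matters mainly when images are displayed with matplotlib or mixed with images from an RGB-based loader.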


The pixel values were already scaled to [0, 1] in the loop above. Convert the list of images into a single 4-dimensional array so that it can be used as input to the conv layers in the model

In [0]:
x_train_data = np.asarray(x_train)

In [0]:
y_train_data = pd.get_dummies(y_train)
y_train_data = np.asarray(y_train_data)
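
`pd.get_dummies` creates one column per distinct value present in the data, in sorted order, which gives 120 columns here because every breed occurs at least once. As an illustration of what one-hot encoding does, here is a small NumPy sketch on made-up integer labels:

```python
import numpy as np

encoded = np.array([2, 0, 2, 1])   # made-up integer labels
num_classes = 3                    # 120 in the notebook

# Row i of the identity matrix is the one-hot vector for class i.
one_hot = np.eye(num_classes, dtype="float32")[encoded]

# argmax along the class axis recovers the integer labels.
recovered = one_hot.argmax(axis=1)
```

Passing the number of classes explicitly, as in this sketch, keeps the output width fixed even if some class happens to be absent from a subset of the data.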

### Split the training and validation data from `x_train_data` and `y_train_data` obtained from above step

In [0]:
x_train_data.shape, y_train_data.shape

((10222, 224, 224, 3), (10222, 120))
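
The shape above also indicates how much memory the image array needs. The following back-of-the-envelope calculation uses only the shape shown and the size of each data type:

```python
num_images, height, width, channels = 10222, 224, 224, 3
values = num_images * height * width * channels

bytes_float32 = values * 4   # float32 uses 4 bytes per value
bytes_uint8 = values * 1     # uint8 uses 1 byte per value

gib = 1024 ** 3
print(f"float32: {bytes_float32 / gib:.2f} GiB")
print(f"uint8:   {bytes_uint8 / gib:.2f} GiB")
```

The float32 array alone comes to roughly 5.7 GiB, and the Python list `x_train` and the array built from it exist side by side for a while, so peak usage is likely higher. Keeping images as uint8 and scaling per batch is one possible way to reduce this.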

In [0]:
x_train_data, x_test_data, y_train_data, y_test_data = train_test_split(x_train_data, y_train_data, test_size=0.2, random_state=42)

In [0]:
x_train_data.shape, y_train_data.shape, x_test_data.shape, y_test_data.shape

((8177, 224, 224, 3), (8177, 120), (2045, 224, 224, 3), (2045, 120))

### Build a basic conv neural network with 2 conv layers (kernel sizes 5 and 3) and add the layers listed below for classification

1. Add a Dense layer with 256 neurons with `relu` activation

2. Add a Dense layer with 120 neurons as the final layer (as there are 120 classes in the given dataset) with `softmax` activation for classification.

In [0]:
# Clear the session before building the model
tf.keras.backend.clear_session()

# Initialize model, reshape & normalize data
model = tf.keras.models.Sequential()

#Normalize the data
model.add(tf.keras.layers.BatchNormalization())

#Add first convolutional layer
model.add(tf.keras.layers.Conv2D(32, kernel_size=(5,5), activation='relu'))

#Normalize the data
model.add(tf.keras.layers.BatchNormalization())

#Add second convolutional layer
model.add(tf.keras.layers.Conv2D(32, kernel_size=(3,3), activation='relu'))

#Flatten the output
model.add(tf.keras.layers.Flatten())

#First Dense layer
model.add(tf.keras.layers.Dense(256, activation='relu'))

#Second Dense layer
model.add(tf.keras.layers.Dense(120, activation='softmax'))
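
There is no pooling between the conv layers and `Flatten`, so the first Dense layer receives a very large input. The arithmetic below is a sketch that assumes 224x224 inputs and Keras' default `Conv2D` settings (stride 1, `valid` padding), which is what the cell above uses:

```python
def conv_valid_output(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


side = 224
side = conv_valid_output(side, 5)   # first Conv2D, 5x5 kernel
side = conv_valid_output(side, 3)   # second Conv2D, 3x3 kernel

filters = 32
flat_features = side * side * filters

# A Dense layer has one weight per (input, unit) pair plus one bias per unit.
dense_256_params = flat_features * 256 + 256

print(side, flat_features, dense_256_params)
```

The first Dense layer ends up with about 389 million parameters, which may contribute both to the memory pressure and to the difficulty of training this model on roughly 8,000 images. Adding pooling layers before `Flatten` would shrink this considerably.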

In [0]:
#Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

### Use batch_size = 128 and epochs = 10 and execute the model

In [0]:
#Fit the first model with the given dataset
model.fit(x_train_data, y_train_data, validation_data=(x_test_data, y_test_data), epochs=10, batch_size=128)

Train on 8177 samples, validate on 2045 samples
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f417000d2b0>

The model accuracy is very poor.

### Use Data Augmentation in the above model to see if the accuracy improves


In [0]:
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True)

In [0]:
datagen.fit(x_train_data)
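
`datagen.fit` is needed because `featurewise_center` and `featurewise_std_normalization` rely on statistics computed from the training data. The NumPy sketch below illustrates the idea on random stand-in images; it approximates what the generator does and is not its exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in batch: 8 random "images" of 4x4 pixels, 3 channels.
images = rng.random((8, 4, 4, 3)).astype("float32")

# Statistics are computed per channel over all images and pixels.
channel_mean = images.mean(axis=(0, 1, 2))
channel_std = images.std(axis=(0, 1, 2))

eps = 1e-6  # guards against division by zero
standardized = (images - channel_mean) / (channel_std + eps)
```

Any data fed to the model later should be standardized with the same training statistics. In Keras, `datagen.standardize(x)` applies them to an array without the random augmentations.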

### Using the above objects, create the image generators with variable names `train_generator` and `val_generator`

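Use `datagen.flow()` for both generators. This notebook uses a single `datagen` object rather than separate `train_datagen` and `val_datagen` objects, so the validation images are augmented as well.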

In [0]:
train_generator = datagen.flow(x_train_data, y_train_data)

In [0]:
val_generator = datagen.flow(x_test_data, y_test_data)

### Fit the model using fit_generator() using `train_generator` and `val_generator` from the above step with 10 epochs

`fit_generator()` is deprecated and `fit()` accepts generators directly, so `fit()` is used here instead.


In [0]:
model.fit(train_generator, validation_data=val_generator, steps_per_epoch=128, epochs=10)

Train for 128 steps, validate for 64 steps
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<tensorflow.python.keras.callbacks.History at 0x7f415c30e0f0>
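
With a generator, `steps_per_epoch` counts batches, not images. The quick check below relates the values used above to the dataset sizes, assuming the default `batch_size` of 32 in `flow()`, since the notebook does not pass one:

```python
import math

train_samples, val_samples = 8177, 2045
batch_size = 32          # default batch_size of ImageDataGenerator.flow()
steps_per_epoch = 128    # value passed to fit() above

images_per_epoch = steps_per_epoch * batch_size
steps_for_full_pass = math.ceil(train_samples / batch_size)
val_steps = math.ceil(val_samples / batch_size)

print(images_per_epoch, steps_for_full_pass, val_steps)
```

Each epoch therefore sees 4096 augmented images, about half of the training set, and a full pass would need 256 steps. The 64 validation steps agree with the "validate for 64 steps" line in the log.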

The model accuracy is still poor.

### Let's use Transfer Learning

Download the VGG weight file from here: https://github.com/MinerKasch/applied_deep_learning/blob/master/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

Use the below code to load VGG16 weights trained on ImageNet

In [0]:
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
# Instantiate the model with the pre-trained weights (no top)
base_model= VGG16(weights=(project_path+'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'),
                 include_top=False, pooling='avg')

### Add the following classification layers to the imported VGG Model
1. Flatten Layer
2. Dense layer with 1024 neurons with activation as Relu
3. Dense layer with 256 neurons with activation as Relu
4. Dense layer with 120 neurons with activation as Softmax

In [0]:
# Clear the session before building the model
tf.keras.backend.clear_session()

#Create a new model
model_new = tf.keras.models.Sequential()

#Add the base VGG model
model_new.add(base_model)

#Flatten the output
model_new.add(tf.keras.layers.Flatten())

#First Dense layer
model_new.add(tf.keras.layers.Dense(1024, activation='relu'))

model_new.add(tf.keras.layers.Dropout(0.25))

#Second Dense layer
model_new.add(tf.keras.layers.Dense(256, activation='relu'))

model_new.add(tf.keras.layers.Dropout(0.25))

#Third Dense layer
model_new.add(tf.keras.layers.Dense(120, activation='softmax'))
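
Because the base model was created with `pooling='avg'`, it already outputs a flat vector of 512 features per image, so the `Flatten` layer leaves the shape unchanged. The parameter counts of the new head can be checked by hand; `dense_params` is a hypothetical helper used only for this sketch:

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


vgg_features = 512  # channels of VGG16's last conv block after global average pooling
head = [1024, 256, 120]

counts = []
inputs = vgg_features
for units in head:
    counts.append(dense_params(inputs, units))
    inputs = units

print(counts)
```

The Dropout layers add no parameters, so the head is small compared with the 14.7 million parameters of the VGG16 base.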

### Make all the layers in the base_model (VGG16) non-trainable

In [0]:
base_model.summary()

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, None, None, 3)]   0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0     

### Freeze the base_model (VGG16) layers, leaving only the last conv layer (`block5_conv3`) trainable to improve accuracy

In [0]:
for layer in base_model.layers:
    layer.trainable = False
    if("block5_conv3" in layer.name):
        print(layer.name)
        layer.trainable = True

block5_conv3
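
The freezing loop can be tried out without loading VGG16. In the sketch below, `FakeLayer` is a hypothetical stand-in that carries only a name and a `trainable` flag, and the list covers just the last few VGG16 layer names:

```python
from dataclasses import dataclass


@dataclass
class FakeLayer:
    """Hypothetical stand-in for a Keras layer: just a name and a flag."""
    name: str
    trainable: bool = True


layers = [FakeLayer(n) for n in
          ["block4_conv3", "block4_pool", "block5_conv1",
           "block5_conv2", "block5_conv3", "block5_pool"]]

# Same intent as the cell above, written as a single assignment per layer.
# Note: this uses an exact name match, whereas the notebook uses a substring test.
for layer in layers:
    layer.trainable = layer.name == "block5_conv3"

trainable_names = [layer.name for layer in layers if layer.trainable]

# 3x3 kernels, 512 input channels, 512 filters, plus one bias per filter.
block5_conv3_params = 3 * 3 * 512 * 512 + 512
```

Only `block5_conv3` stays trainable, which leaves about 2.36 million of the base model's parameters open to fine-tuning. The `trainable` flags are set before `compile()` in this notebook, so the setting is picked up when the model is compiled.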


In [0]:
base_model.summary()

Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, None, None, 3)]   0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, None, None, 64)    1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, None, None, 64)    36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, None, None, 64)    0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, None, None, 128)   73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, None, None, 128)   147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, None, None, 128)   0     

In [0]:
model_new.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
vgg16 (Model)                (None, 512)               14714688  
_________________________________________________________________
flatten (Flatten)            (None, 512)               0         
_________________________________________________________________
dense (Dense)                (None, 1024)              525312    
_________________________________________________________________
dropout (Dropout)            (None, 1024)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 256)               262400    
_________________________________________________________________
dropout_1 (Dropout)          (None, 256)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 120)               3

### Fit and compile the model with batch_size = 128 and epochs = 10 and execute the model

Try to get training and validation accuracy to be more than 90%

In [0]:
#Compile the model
model_new.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [0]:
#Saving the best model using model checkpoint callback
model_checkpoint=tf.keras.callbacks.ModelCheckpoint('cv_project_model.h5', 
                                                    save_best_only=True, 
                                                    monitor='accuracy', 
                                                    mode='max', 
                                                    verbose=1)
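
The callback above monitors the training `accuracy`, so a checkpoint is written whenever the model fits the training data better, even if validation accuracy has stopped improving. The sketch below mimics the `save_best_only` logic in plain Python on made-up numbers; `best_so_far` is a hypothetical helper, not part of Keras.

```python
def best_so_far(history, mode="max"):
    """Return the (epoch, value) pairs at which a checkpoint would be written."""
    if mode == "max":
        better = lambda new, old: new > old
    else:
        better = lambda new, old: new < old
    best = None
    saved = []
    for epoch, value in enumerate(history, start=1):
        if best is None or better(value, best):
            best = value
            saved.append((epoch, value))
    return saved


# Made-up validation accuracies that peak and then fluctuate.
val_accuracy = [0.10, 0.25, 0.31, 0.29, 0.33, 0.30]
saved_epochs = best_so_far(val_accuracy)
```

Setting `monitor='val_accuracy'` in `ModelCheckpoint` would apply this rule to the validation metric instead, which is usually the better indicator of how the saved model generalizes.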

In [0]:
#Fit the transfer-learning model (VGG16 base + new head) using the generators
model_new.fit(train_generator, validation_data=val_generator,
              steps_per_epoch=150, epochs=50, callbacks=[model_checkpoint])

Train for 150 steps, validate for 64 steps
Epoch 1/50
Epoch 00001: accuracy improved from -inf to 0.04138, saving model to cv_project_model.h5
Epoch 2/50
Epoch 00002: accuracy improved from 0.04138 to 0.16396, saving model to cv_project_model.h5
Epoch 3/50
Epoch 00003: accuracy improved from 0.16396 to 0.30470, saving model to cv_project_model.h5
Epoch 4/50
Epoch 00004: accuracy improved from 0.30470 to 0.38600, saving model to cv_project_model.h5
Epoch 5/50
Epoch 00005: accuracy improved from 0.38600 to 0.44911, saving model to cv_project_model.h5
Epoch 6/50
Epoch 00006: accuracy improved from 0.44911 to 0.47858, saving model to cv_project_model.h5
Epoch 7/50
Epoch 00007: accuracy improved from 0.47858 to 0.52957, saving model to cv_project_model.h5
Epoch 8/50
Epoch 00008: accuracy improved from 0.52957 to 0.55354, saving model to cv_project_model.h5
Epoch 9/50
Epoch 00009: accuracy improved from 0.55354 to 0.56698, saving model to cv_

## Because of the large image data, I observed frequent crashes, insufficient RAM and multiple re-executions during model tuning.

## Due to limited computing resources, I am stopping at the model fitting step. Even at this stage, the notebook crashed at epoch 43 of 50 and I am submitting the file as is.

## The best training accuracy I got was 0.8693 and the best validation accuracy was 0.5721.

## I am skipping the section below on predicting from x_test_feature. Due to limited computing resources, I couldn't move beyond this point after multiple attempts.

### Loading the test data
Read the id column from sample_submission.csv and store it in test_img

In [0]:
submission = pd.read_csv(project_path + 'sample_submission.csv', encoding = "ISO-8859-1")
test_img = submission.id

Run the below code to load the test image files in x_test_feature

In [0]:
x_test_feature = []
for f in tqdm(test_img.values):  # f is the image id (file name without .jpg)
    # Flag 1 loads a 3-channel colour image (BGR order in OpenCV)
    test_image = cv2.imread(project_path + 'test/{}.jpg'.format(f), 1)
    test_image_resize = cv2.resize(test_image, (img_rows, img_cols))
    # Scale pixel values to [0, 1], as was done for the training images
    x_test_feature.append(test_image_resize.astype('float32') / 255)

100%|██████████| 10357/10357 [1:40:06<00:00,  2.18it/s]


The pixel values were already scaled to [0, 1] above. Convert the test data into a 4-dimensional array

In [0]:
x_test_feature = np.asarray(x_test_feature)
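
Holding all 10357 test images in one float32 array takes about 5.8 GiB at 224x224x3, which is a likely cause of the memory problems described earlier. One possible alternative is to work through the ids in chunks. The sketch below shows only the chunking; `batched` is a hypothetical helper and the ids are made up.

```python
def batched(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements each."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


image_ids = [f"img_{i}" for i in range(10)]   # made-up ids
chunks = list(batched(image_ids, 4))
```

In the notebook, each chunk of ids could be loaded, passed to `model_new.predict`, and the per-chunk results joined with `np.concatenate`, so that only one chunk of images is in memory at a time.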

In [0]:
x_test_feature.shape

In [0]:
predict = model_new.predict(x_test_feature)

In [0]:
predict[0:10]
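
Each row of `predict` holds 120 probabilities, one per breed, in the order of the encoder's classes. Turning them back into breed names could look like the sketch below, where `classes` stands in for `le.classes_` and `probs` for the model output; both are made-up values.

```python
import numpy as np

classes = np.array(["beagle", "collie", "pug"])   # stand-in for le.classes_
probs = np.array([[0.1, 0.2, 0.7],
                  [0.8, 0.1, 0.1]])               # stand-in for model output

top_index = probs.argmax(axis=1)      # most likely class per image
top_breed = classes[top_index]        # map indices back to names
top_confidence = probs.max(axis=1)    # probability of the chosen class
```

One caveat for this notebook: the model was trained on images standardized by `datagen`, while `x_test_feature` is only scaled to [0, 1]. Applying the same standardization to the test array before calling `predict` would keep the inputs consistent.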