[Reference1](https://towardsdatascience.com/build-your-first-computer-vision-project-dog-breed-classification-a622d8fc691e) <br>
[Reference2](https://medium.com/@RaghavPrabhu/understanding-of-convolutional-neural-network-cnn-deep-learning-99760835f148) <br>
[Reference3](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html) <br>
[Reference4](https://github.com/tuanchris/dog-project/blob/master/dog_app.ipynb)

You can download the dataset [here](https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/dogImages.zip).

# Prepare the data

In [1]:
import os
from google.colab import drive
drive.mount('/content/gdrive')
!pwd
os.chdir('gdrive/My Drive/Medium')
!pwd

Mounted at /content/gdrive
/content
/content/gdrive/My Drive/Medium


In [2]:
from sklearn.datasets import load_files       
from keras.utils import np_utils
import numpy as np
from glob import glob

# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets
    
# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/valid')
test_files, test_targets = load_dataset('dogImages/test')

In [3]:
train_files

array(['dogImages/train/095.Kuvasz/Kuvasz_06442.jpg',
       'dogImages/train/057.Dalmatian/Dalmatian_04054.jpg',
       'dogImages/train/088.Irish_water_spaniel/Irish_water_spaniel_06014.jpg',
       ..., 'dogImages/train/029.Border_collie/Border_collie_02069.jpg',
       'dogImages/train/046.Cavalier_king_charles_spaniel/Cavalier_king_charles_spaniel_03261.jpg',
       'dogImages/train/048.Chihuahua/Chihuahua_03416.jpg'], dtype='<U99')

In [4]:
train_targets

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]], dtype=float32)
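
Each row of `train_targets` is a one-hot vector of length 133: all zeros except a single 1 at the index of the breed. The following is a minimal sketch of that encoding using only NumPy. The `one_hot` helper and the example label values are hypothetical and only illustrate what `to_categorical` returns, and how `np.argmax` maps a row back to an integer label (as done later when test accuracy is computed).

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical stand-in for np_utils.to_categorical, for illustration only."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

labels = np.array([94, 56, 87])          # illustrative class indices
targets = one_hot(labels, 133)

print(targets.shape)                     # (3, 133)
recovered = np.argmax(targets, axis=1)   # back to integer labels
```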

In [6]:
# load list of dog names
dog_names = [item[20:-1] for item in sorted(glob("dogImages/train/*/"))]

# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % len(np.hstack([train_files, valid_files, test_files])))
print('There are %d training dog images.' % len(train_files))
print('There are %d validation dog images.' % len(valid_files))
print('There are %d test dog images.'% len(test_files))

There are 133 total dog categories.
There are 8351 total dog images.

There are 6680 training dog images.
There are 835 validation dog images.
There are 836 test dog images.
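
The slice `item[20:-1]` in the cell above works because `dogImages/train/` is 16 characters long and each folder name starts with a four-character prefix such as `095.`, while `-1` drops the trailing slash. The sketch below compares that fixed-offset slice with a path-based alternative that does not depend on the length of the parent directory. `breed_from_dir` is a hypothetical helper, not part of the original notebook.

```python
import os

def breed_from_dir(path):
    """Hypothetical helper: 'dogImages/train/095.Kuvasz/' -> 'Kuvasz'."""
    folder = os.path.basename(os.path.normpath(path))  # e.g. '095.Kuvasz'
    return folder.split(".", 1)[1]                      # drop the numeric prefix

example = "dogImages/train/095.Kuvasz/"
sliced = example[20:-1]           # the fixed-offset slice used above
parsed = breed_from_dir(example)  # same result, independent of the prefix length
```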


In [7]:
from keras.preprocessing import image                  
from tqdm import tqdm

def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

In [8]:
def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)
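
The shape changes described in the comments above can be checked without loading any image files. This is a small sketch that uses zero-filled arrays as stand-ins for decoded images; the variable names are illustrative only.

```python
import numpy as np

# Stand-ins for three decoded 224x224 RGB images (zeros instead of real pixels).
fake_images = [np.zeros((224, 224, 3), dtype="float32") for _ in range(3)]

# path_to_tensor adds a leading batch axis to each image.
batched = [np.expand_dims(img, axis=0) for img in fake_images]

# paths_to_tensor stacks the single-image batches along that axis.
stacked = np.vstack(batched)

print(batched[0].shape)  # (1, 224, 224, 3)
print(stacked.shape)     # (3, 224, 224, 3)
```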

# Pre-process data

In [9]:
from PIL import ImageFile                            
ImageFile.LOAD_TRUNCATED_IMAGES = True

# pre-process the data for Keras
train_tensors = paths_to_tensor(train_files).astype('float32')/255
valid_tensors = paths_to_tensor(valid_files).astype('float32')/255
test_tensors = paths_to_tensor(test_files).astype('float32')/255

100%|██████████| 6680/6680 [01:15<00:00, 88.11it/s]
100%|██████████| 835/835 [00:08<00:00, 95.39it/s] 
100%|██████████| 836/836 [00:08<00:00, 102.02it/s]


## There are numerous other real-world factors that can affect our model:

- Lighting conditions: different lighting changes how colors appear
- Object orientation: our dogs can have many different poses
- Picture frame: a close-up portrait is very different from a full-body shot
- Missing features: not all features of a dog are shown in a photo
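
One common way to expose a model to some of this variation is data augmentation, which this notebook does not apply. The sketch below is a NumPy-only illustration under stated assumptions: the `augment` function, the 50% flip probability, and the 0.7 to 1.3 brightness range are all hypothetical choices, and the input is assumed to be an image scaled to [0, 1].

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, rng):
    """Hypothetical augmentation: random horizontal flip plus brightness scaling.

    Expects an (H, W, 3) float array scaled to [0, 1].
    """
    if rng.random() < 0.5:
        img = img[:, ::-1, :]              # mirror left-right
    factor = rng.uniform(0.7, 1.3)         # assumed brightness range
    return np.clip(img * factor, 0.0, 1.0)

image = rng.random((224, 224, 3)).astype("float32")  # random stand-in image
augmented = augment(image, rng)
```

In Keras, the same idea is usually expressed with `ImageDataGenerator`, which is the approach described in Reference3.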

# Create a model of your own

The objective is to predict the breed of the dog in each image correctly.

## Model architecture

In [10]:
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.layers import Dropout, Flatten, Dense
from keras.models import Sequential

model = Sequential()

model.add(Conv2D(filters=16, kernel_size=2, padding='same',activation='relu',input_shape=(224,224,3)))
model.add(MaxPooling2D())
model.add(Conv2D(filters=32, kernel_size=2, padding='same',activation='relu'))
model.add(MaxPooling2D())
model.add(Conv2D(filters=64, kernel_size=2, padding='same',activation='relu'))
model.add(MaxPooling2D())
model.add(GlobalAveragePooling2D())
model.add(Dense(133,activation='softmax'))

model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              (None, 224, 224, 16)      208       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 112, 112, 16)      0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 112, 112, 32)      2080      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 56, 56, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 56, 56, 64)        8256      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 28, 28, 64)        0         
_________________________________________________________________
global_average_pooling2d (Gl (None, 64)                0
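
The parameter counts and output shapes in the summary follow from simple arithmetic, sketched below. The helper name is illustrative. The last line is a computed value, not printed output: the summary above is cut off before the final `Dense(133)` layer, and the same counting rule gives its size.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights (k * k * in_channels per filter) plus one bias per filter."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

layers = [(3, 16), (16, 32), (32, 64)]   # (input channels, filters) per Conv2D
counts = [conv2d_params(2, c_in, f) for c_in, f in layers]
print(counts)  # [208, 2080, 8256]

# Each MaxPooling2D (default 2x2 window) halves height and width.
size = 224
sizes = []
for _ in layers:
    size //= 2
    sizes.append(size)
print(sizes)  # [112, 56, 28]

# Dense(133) on the 64 pooled features: 64 weights plus 1 bias per output unit.
dense_params = (64 + 1) * 133
print(dense_params)  # 8645
```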

## Compile and train the model

In [11]:
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
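
To make the `categorical_crossentropy` loss concrete, here is a NumPy sketch of softmax followed by cross-entropy. The function names are illustrative and this is not the Keras implementation. It also shows a useful reference point: a model that guesses uniformly over 133 classes has a loss of ln(133), so the validation loss of about 4.7 in the training log is only slightly better than uniform guessing.

```python
import numpy as np

def softmax(logits):
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def categorical_crossentropy(one_hot_targets, probs, eps=1e-7):
    """Mean of -log(probability assigned to the true class)."""
    probs = np.clip(probs, eps, 1.0)
    return float(-np.mean(np.sum(one_hot_targets * np.log(probs), axis=-1)))

num_classes = 133
logits = np.zeros((4, num_classes))            # an untrained, uniform guess
targets = np.eye(num_classes)[[0, 1, 2, 3]]    # four one-hot labels

loss = categorical_crossentropy(targets, softmax(logits))
print(round(loss, 4))  # 4.8903, i.e. ln(133)
```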

In [16]:
from keras.callbacks import ModelCheckpoint

epochs = 5

checkpointer = ModelCheckpoint(filepath='weights.best.from_scratch.hdf5', 
                               verbose=1, save_best_only=True)

model.fit(train_tensors, train_targets, 
          validation_data=(valid_tensors, valid_targets),
          epochs=epochs, batch_size=20, callbacks=[checkpointer], verbose=1)

Epoch 1/5
Epoch 00001: val_loss improved from inf to 4.76818, saving model to weights.best.from_scratch.hdf5
Epoch 2/5
Epoch 00002: val_loss improved from 4.76818 to 4.74761, saving model to weights.best.from_scratch.hdf5
Epoch 3/5
Epoch 00003: val_loss improved from 4.74761 to 4.73671, saving model to weights.best.from_scratch.hdf5
Epoch 4/5
Epoch 00004: val_loss improved from 4.73671 to 4.71049, saving model to weights.best.from_scratch.hdf5
Epoch 5/5
Epoch 00005: val_loss improved from 4.71049 to 4.69433, saving model to weights.best.from_scratch.hdf5


<tensorflow.python.keras.callbacks.History at 0x7faaaef1bdd8>

In [17]:
model.load_weights('weights.best.from_scratch.hdf5')

In [18]:
# get index of predicted dog breed for each image in test set
dog_breed_predictions = [np.argmax(model.predict(np.expand_dims(tensor, axis=0))) for tensor in test_tensors]

# report test accuracy
test_accuracy = 100*np.sum(np.array(dog_breed_predictions)==np.argmax(test_targets, axis=1))/len(dog_breed_predictions)
print('Test accuracy: %.4f%%' % test_accuracy)

Test accuracy: 3.2297%
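
The accuracy calculation in the cell above can be written as a small function, shown here on toy data. The function name and toy values are illustrative. The last two lines put the reported number in context: with 133 classes, uniform random guessing would be right roughly 0.75% of the time (assuming balanced classes), and 3.2297% of the 836 test images corresponds to 27 correct predictions.

```python
import numpy as np

def accuracy_percent(predicted_indices, one_hot_targets):
    """Share of predictions matching the one-hot labels, in percent."""
    true_indices = np.argmax(one_hot_targets, axis=1)
    return 100.0 * np.mean(np.asarray(predicted_indices) == true_indices)

# Toy example: 4 samples, 3 classes, 3 of 4 predictions correct.
toy_targets = np.eye(3)[[0, 1, 2, 1]]
toy_predictions = [0, 1, 2, 0]
print(accuracy_percent(toy_predictions, toy_targets))  # 75.0

# Reference points for the 836-image, 133-class test set.
chance = 100.0 / 133             # about 0.75% for uniform random guessing
correct = round(0.032297 * 836)  # 3.2297% corresponds to 27 correct images
```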


# Create a model using transfer learning

## Obtain bottleneck features
Keras offers several pre-trained state-of-the-art architectures that you can use in minutes, including VGG-19, ResNet-50, Inception, and Xception. In this project, we will use ResNet-50, but you can try other architectures on your own.

In [20]:
import requests
url = 'https://s3-us-west-1.amazonaws.com/udacity-aind/dog-project/DogResnet50Data.npz'
r = requests.get(url)
r.raise_for_status()  # fail early if the download did not succeed

with open('DogResnet50Data.npz', 'wb') as f:
    f.write(r.content)

bottleneck_features = np.load('DogResnet50Data.npz')
train_Resnet50 = bottleneck_features['train']
valid_Resnet50 = bottleneck_features['valid']
test_Resnet50 = bottleneck_features['test']

In transfer learning, we take a pre-trained model (network + weights), remove its fully connected (FC) layers, and build our own FC network in their place. This discards the pre-trained weights of the removed layers. The remaining part of the original network is frozen, so that when training starts on the new dataset, only the newly added FC network is trained. The input to this new FC network is called the bottleneck features.
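
The freeze-and-retrain idea can be sketched in plain NumPy. Everything here is a toy assumption: a fixed random projection with ReLU stands in for the frozen pre-trained network, the data and labels are synthetic, and the head is a single softmax layer trained with plain gradient descent rather than RMSprop. The point is only that the base weights never change while the head learns from the bottleneck features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "base network": a stand-in for the pre-trained convolutional layers.
W_base = rng.normal(size=(20, 8)) / np.sqrt(20)
W_base_before = W_base.copy()

def bottleneck_features(x):
    return np.maximum(x @ W_base, 0.0)   # output of the frozen part

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels])))

# Synthetic inputs and labels for 3 toy classes.
x = rng.normal(size=(60, 20))
labels = np.argmax(x[:, :3], axis=1)
targets = np.eye(3)[labels]

features = bottleneck_features(x)   # computed once, since the base never changes
W_head = np.zeros((8, 3))           # the only trainable weights

loss_start = cross_entropy(softmax(features @ W_head), labels)
for _ in range(200):
    probs = softmax(features @ W_head)
    grad = features.T @ (probs - targets) / len(x)
    W_head -= 0.1 * grad            # update the head only
loss_end = cross_entropy(softmax(features @ W_head), labels)
```

Because the base is frozen, its outputs can be computed once and stored, which is why this notebook can download precomputed bottleneck features instead of running ResNet-50 itself.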

## Model architecture

In [21]:
Resnet50_model = Sequential()
Resnet50_model.add(GlobalAveragePooling2D(input_shape=train_Resnet50.shape[1:]))
Resnet50_model.add(Dropout(0.3))
Resnet50_model.add(Dense(1024,activation='relu'))
Resnet50_model.add(Dropout(0.4))
Resnet50_model.add(Dense(133, activation='softmax'))
Resnet50_model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
global_average_pooling2d_1 ( (None, 2048)              0         
_________________________________________________________________
dropout (Dropout)            (None, 2048)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 1024)              2098176   
_________________________________________________________________
dropout_1 (Dropout)          (None, 1024)              0         
_________________________________________________________________
dense_2 (Dense)              (None, 133)               136325    
Total params: 2,234,501
Trainable params: 2,234,501
Non-trainable params: 0
_________________________________________________________________
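
The numbers in this summary can be reproduced by hand. The sketch below counts the dense-layer parameters and shows what global average pooling does to a feature map. The helper name is illustrative, and the 7x7 spatial size of the dummy feature map is an assumption made for the example; the pooled width of 2048 matches the summary.

```python
import numpy as np

def dense_params(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return (inputs + 1) * units

hidden = dense_params(2048, 1024)   # dense_1
output = dense_params(1024, 133)    # dense_2
print(hidden, output, hidden + output)  # 2098176 136325 2234501

# Global average pooling: mean over the two spatial axes, channels kept.
features = np.ones((2, 7, 7, 2048), dtype="float32")  # assumed spatial size
pooled = features.mean(axis=(1, 2))
print(pooled.shape)  # (2, 2048)
```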


In [22]:
# Compile the model.
Resnet50_model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

In [23]:
# Train the model, saving the weights whenever validation loss improves.
checkpointer = ModelCheckpoint(filepath='weights.best.Resnet50.hdf5', 
                               verbose=1, save_best_only=True)

Resnet50_model.fit(train_Resnet50, train_targets, 
          validation_data=(valid_Resnet50, valid_targets),
          epochs=20, batch_size=20, callbacks=[checkpointer], verbose=1)

Epoch 1/20
Epoch 00001: val_loss improved from inf to 0.91071, saving model to weights.best.Resnet50.hdf5
Epoch 2/20
Epoch 00002: val_loss improved from 0.91071 to 0.84793, saving model to weights.best.Resnet50.hdf5
Epoch 3/20
Epoch 00003: val_loss improved from 0.84793 to 0.83980, saving model to weights.best.Resnet50.hdf5
Epoch 4/20
Epoch 00004: val_loss improved from 0.83980 to 0.79796, saving model to weights.best.Resnet50.hdf5
Epoch 5/20
Epoch 00005: val_loss improved from 0.79796 to 0.75239, saving model to weights.best.Resnet50.hdf5
Epoch 6/20
Epoch 00006: val_loss did not improve from 0.75239
Epoch 7/20
Epoch 00007: val_loss did not improve from 0.75239
Epoch 8/20
Epoch 00008: val_loss did not improve from 0.75239
Epoch 9/20
Epoch 00009: val_loss did not improve from 0.75239
Epoch 10/20
Epoch 00010: val_loss did not improve from 0.75239
Epoch 11/20
Epoch 00011: val_loss did not improve from 0.75239
Epoch 12/20
Epoch 00012: val_loss did not improve from 0.75239
Epoch 13/20
Epoch

<tensorflow.python.keras.callbacks.History at 0x7faaae6e5b00>
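
The log above shows why the best weights are reloaded after training: the validation loss stops improving after epoch 5, so the weights at the end of training are not the best ones seen. The following sketch mimics the selection logic behind `save_best_only=True` in plain Python. It is not the Keras implementation, and the last three loss values are hypothetical placeholders, since the log does not print the losses for epochs that did not improve.

```python
def best_checkpoint(val_losses):
    """Return (epoch, loss) of the lowest validation loss, epochs counted from 1.

    Mirrors the idea of save_best_only: a checkpoint is written only when the
    monitored value improves on the best seen so far.
    """
    best_epoch, best_loss = None, float("inf")
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_epoch, best_loss = epoch, loss   # "saving model"
    return best_epoch, best_loss

# First five values come from the log above; the last three are hypothetical
# placeholders for epochs that "did not improve".
logged = [0.91071, 0.84793, 0.83980, 0.79796, 0.75239, 0.80, 0.78, 0.82]
print(best_checkpoint(logged))  # (5, 0.75239)
```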

In [24]:
# Load the model weights with the best validation loss.
Resnet50_model.load_weights('weights.best.Resnet50.hdf5')

## Model performance
Using transfer learning (ResNet-50), with twenty epochs and less than two minutes of training, we achieve a test accuracy of about 80% (79.9043% in the run shown below).

In [25]:
# get index of predicted dog breed for each image in test set
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_Resnet50]

# report test accuracy
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print('Test accuracy: %.4f%%' % test_accuracy)

Test accuracy: 79.9043%


# Future improvements
- Augment data
- Tune model
- Try other models
- Create a web/mobile app