![alt text](https://res.cloudinary.com/dk-find-out/image/upload/q_80,w_1920,f_auto/Dog-main_gdcdzd.jpg)

<center>

# Dog Breed Classifier

**Overview**

The aim of this project is to identify a dog's breed from a photo. If the photo contains a human face (or an alien face), the application instead returns the dog breed that the face most resembles.

This project uses Convolutional Neural Networks (CNNs)! I built a pipeline that processes real-world, user-supplied images. Given an image of a dog, the algorithm estimates the canine's breed. Given an image of a human, it identifies the dog breed that the person most resembles.


**Objective**

Build a model that classifies dog images into 133 different breeds.

**The Road Ahead**

I have broken the notebook into separate steps to keep the workflow clear. These are the steps I followed while building the project:

1. Import Datasets
2. Detect Humans
3. Detect Dogs
4. Create a CNN to Classify Dog Breeds (from Scratch)
5. Use a CNN to Classify Dog Breeds (using Transfer Learning)
6. Create a CNN to Classify Dog Breeds (using Transfer Learning)
7. Write your Algorithm
8. Test Your Algorithm

**Step 1: Import Datasets and Necessary Libraries (the datasets are provided by Udacity)**

In the code cell below, I imported the dataset of dog images. I populated a few variables through the use of the load_files function from the scikit-learn library:

* train_files, valid_files, test_files - numpy arrays containing file paths to images
* train_targets, valid_targets, test_targets - numpy arrays containing one-hot encoded classification labels
* dog_names - list of string-valued dog breed names for translating labels
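
To make the one-hot encoded targets concrete, here is a minimal NumPy sketch. The helper name `one_hot` is hypothetical; it only mimics the kind of array a utility such as `to_categorical` produces and is not the notebook's own code.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) float array with a single 1.0 per row."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

# three example images labelled with breed indices 0, 2 and 132 (of 133 breeds)
targets = one_hot([0, 2, 132], num_classes=133)
print(targets.shape)            # (3, 133)
print(targets.argmax(axis=1))   # recovers the original breed indices
```

Taking the argmax of a row gives back the integer label, which is how a one-hot target can be mapped to an entry of dog_names.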

In [1]:
import numpy as np
from glob import glob
from pathlib import Path
import random
import cv2                
import matplotlib.pyplot as plt                        
%matplotlib inline 
from tqdm import tqdm
import os

import tensorflow as tf
from tensorflow.keras.losses import categorical_crossentropy
from tensorflow.keras.optimizers import Adam

from sklearn.datasets import load_files    
from tensorflow.keras.applications.mobilenet import preprocess_input, decode_predictions 
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator                  
from tensorflow.keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D
from tensorflow.keras.layers import Dropout, Flatten, Dense
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.callbacks import ModelCheckpoint 
from io import BytesIO
import requests

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

In [2]:
# define function to load train, test, and validation datasets
def load_dataset(path):
    # load_files treats each sub-directory of `path` as one class
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    # one-hot encode the integer class labels (133 breeds)
    dog_targets = tf.keras.utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets

In [3]:
# Define data directories
basedir = Path.cwd()
data_dir = basedir / 'Data'
train_dir = data_dir / 'dogImages' / 'train'
valid_dir = data_dir / 'dogImages' / 'valid'
test_dir = data_dir / 'dogImages' / 'test'
algo_test_dir = basedir / 'test_images'

In [4]:
# load train, test, and validation datasets
def get_datasets():
    train_files, train_targets = load_dataset(train_dir)
    valid_files, valid_targets = load_dataset(valid_dir)
    test_files, test_targets = load_dataset(test_dir)
    return train_files, train_targets, valid_files, valid_targets, test_files, test_targets

train_files, train_targets, valid_files, valid_targets, test_files, test_targets = get_datasets()

Using TensorFlow backend.


In [5]:
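# load list of dog names; class folders are assumed to be named like '001.Affenpinscher',
# so keep only the part of the folder name after the numeric prefix
dog_names = [Path(item).name.split('.', 1)[-1] for item in sorted(glob("Data/dogImages/train/*/"))]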

# print statistics about the dataset
print('There are %d total dog categories.' % len(dog_names))
print('There are %s total dog images.\n' % len(np.hstack([train_files, valid_files, test_files])))
print('There are %d training dog images.' % len(train_files))
print('There are %d validation dog images.' % len(valid_files))
print('There are %d test dog images.'% len(test_files))

There are 133 total dog categories.
There are 8351 total dog images.

There are 6680 training dog images.
There are 835 validation dog images.
There are 836 test dog images.
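
As a small illustration of how a breed name can be derived from a class folder, here is a sketch using only the standard library. It assumes folders named with a numeric prefix such as `001.Affenpinscher`; the helper `breed_from_dir` and the example paths are hypothetical.

```python
from pathlib import PurePosixPath

def breed_from_dir(class_dir):
    """'Data/dogImages/train/001.Affenpinscher/' -> 'Affenpinscher'."""
    folder = PurePosixPath(class_dir).name   # last path component, trailing slash ignored
    return folder.split(".", 1)[-1]          # drop the numeric prefix before the first dot

print(breed_from_dir("Data/dogImages/train/001.Affenpinscher/"))  # Affenpinscher
```

Working from the folder name rather than a fixed character offset keeps the result independent of how long the directory prefix is.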


**Import Human Dataset**

The following cell imports a dataset of human images; the file paths are stored in the numpy array human_files.

In [None]:
random.seed(32)
#Assigning human dataset directory
human_dir = os.path.join(data_dir, 'lfw/*/*')

#Load file names from the human dataset directory
human_files = np.array(glob(human_dir))
random.shuffle(human_files)

#Print statistics about the dataset
print('There are %d total human images.' %len(human_files))

**Step 2: Detect Humans**

I have used OpenCV's implementation of <a href="https://docs.opencv.org/trunk/db/d28/tutorial_cascade_classifier.html"> Haar feature-based cascade classifiers </a>to detect human faces in images. OpenCV provides many pre-trained face detectors, stored as XML files on <a href="https://github.com/opencv/opencv/tree/master/data/haarcascades">Github</a>. I have downloaded one of these detectors and stored it in the haarcascades directory.

In the next code cell, I demonstrate how to use this detector to find human faces in a sample image.

In [None]:
# extract pre-trained face detector
face_cascade = cv2.CascadeClassifier('./Data/haarcascades/haarcascade_frontalface_alt2.xml')

# load color (BGR) image
img = cv2.imread(human_files[3])
# convert BGR image to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# find faces in image
faces = face_cascade.detectMultiScale(gray)

# print number of faces detected in the image
print('Number of faces detected:', len(faces))

# get bounding box for each detected face
for (x,y,w,h) in faces:
    # add bounding box to color image
    cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
    
# convert BGR image to RGB for plotting
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# display the image, along with bounding box
plt.imshow(cv_rgb)
plt.show()

In [None]:
def face_detector(img_path):
    img = cv2.imread(img_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray)
    return len(faces) > 0

In [None]:
human_files_short = human_files[:100]
dog_files_short = train_files[:100]

num_human_faces_human_files = 0
num_human_faces_dog_files = 0
for i in  tqdm(range(0,100)):
    num_human_faces_human_files += face_detector(human_files_short[i])
    num_human_faces_dog_files += face_detector(dog_files_short[i])    
    
## report the share of images in human_files_short and dog_files_short in which a face is detected
print('% of human faces detected in human files {:2.2%}\n % of human faces detected in dog files {:2.2%}'\
      .format(num_human_faces_human_files/100, num_human_faces_dog_files/100))
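
The counting loop above can also be expressed as a small reusable helper. This is only a sketch: `detection_rate` is a hypothetical name, and the stand-in detector and file names below are made up so the example runs without OpenCV or the datasets.

```python
def detection_rate(detector, paths):
    """Fraction of paths for which detector(path) is truthy."""
    paths = list(paths)
    if not paths:
        return 0.0
    return sum(bool(detector(p)) for p in paths) / len(paths)

# stand-in detector for illustration only: it "detects" a face whenever the
# (made-up) file name contains the word 'person'
fake_detector = lambda path: "person" in path
sample = ["person_01.jpg", "person_02.jpg", "dog_01.jpg", "person_03.jpg"]
print("{:.2%}".format(detection_rate(fake_detector, sample)))  # 75.00%
```

In the notebook, face_detector would be passed as the detector, once with human_files_short and once with dog_files_short.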

**Step 3: Detect Dogs**

In this section, I use a pre-trained NASNetMobile model (used here in place of <a href='http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006'>ResNet-50</a>) to detect dogs in images. My first line of code downloads the NASNetMobile model, along with weights that have been trained on <a href='http://ethereon.github.io/netscope/#/gist/db945b393d40bfa26006'>ImageNet</a>, a very large, very popular dataset used for image classification and other vision tasks. ImageNet contains over 10 million URLs, each linking to an image containing an object from one of <a href='https://github.com/ravi-gopalan/dog_breed_classifier_udacity/blob/master/dog_breed_classifier.ipynb'>1000 categories</a>. Given an image, this pre-trained NASNetMobile model returns a prediction (derived from the available categories in ImageNet) for the object that is contained in the image.

In [None]:
# define NASNetmobile model
NASNetmobile = tf.keras.applications.NASNetMobile(input_shape=(224, 224, 3),
                                                    include_top=True, 
                                                    weights='imagenet')

In [None]:
NASNetmobile.summary()

**Pre-process the Data**

When using TensorFlow as backend, Keras CNNs require a 4D array (which we'll also refer to as a 4D tensor) as input, with shape

$$
(\text{nb_samples}, \text{rows}, \text{columns}, \text{channels}),
$$
where nb_samples corresponds to the total number of images (or samples), and rows, columns, and channels correspond to the number of rows, columns, and channels for each image, respectively.

The path_to_tensor function below takes a string-valued file path to a color image as input and returns a 4D tensor suitable for supplying to a Keras CNN. The function first loads the image and resizes it to a square image that is 224×224 pixels. Next, the image is converted to an array, which is then expanded to a 4D tensor by adding a leading sample dimension. In this case, since we are working with color images, each image has three channels. Likewise, since we are processing a single image (or sample), the returned tensor will always have shape

$$
(1, 224, 224, 3).
$$
The paths_to_tensor function takes a numpy array of string-valued image paths as input and returns a 4D tensor with shape

$$
(\text{nb_samples}, 224, 224, 3).
$$
Here, nb_samples is the number of samples, or number of images, in the supplied array of image paths. It is best to think of nb_samples as the number of 3D tensors (where each 3D tensor corresponds to a different image) in your dataset!
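
The shape bookkeeping described above can be checked with plain NumPy. This is a minimal sketch that uses all-zero arrays as stand-ins for decoded images, so it does not need any image files.

```python
import numpy as np

# stand-ins for three decoded 224x224 RGB images (all zeros, illustration only)
images = [np.zeros((224, 224, 3), dtype="float32") for _ in range(3)]

# add a leading sample axis to each image: (224, 224, 3) -> (1, 224, 224, 3)
single = [np.expand_dims(img, axis=0) for img in images]

# stack along the sample axis: three (1, 224, 224, 3) tensors -> (3, 224, 224, 3)
batch = np.vstack(single)

print(single[0].shape)  # (1, 224, 224, 3)
print(batch.shape)      # (3, 224, 224, 3)
```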

In [6]:
def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in tqdm(img_paths)]
    return np.vstack(list_of_tensors)

**Making Predictions with NASNetMobile**

Getting the 4D tensor ready for a pre-trained Keras model requires some additional processing, and the details depend on the model. ResNet-50, for example, expects the RGB image to be converted to BGR by reordering the channels, followed by subtraction of the mean pixel (expressed in BGR as [103.939, 116.779, 123.68] and calculated from all pixels in all images in ImageNet) from every pixel in each image. NASNetMobile instead expects pixel values scaled to the range [-1, 1]. The preprocess_input function imported above from tensorflow.keras.applications.mobilenet applies this same [-1, 1] scaling, so it is used here.

Now that we have a way to format our image for supplying to NASNetMobile, we are ready to use the model to extract the predictions. This is accomplished with the predict method, which returns an array whose 𝑖-th entry is the model's predicted probability that the image belongs to the 𝑖-th ImageNet category. This is implemented in the NASNetmobile_predict_labels function below.

By taking the argmax of the predicted probability vector, we obtain an integer corresponding to the model's predicted object class, which we can map to an object category through the ImageNet class-index dictionary.
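
As a rough illustration, the two preprocessing conventions I am aware of in Keras applications can be sketched in NumPy. The function names below are hypothetical stand-ins and not the Keras API; in the notebook the imported preprocess_input does this work.

```python
import numpy as np

def scale_to_unit_range(x):
    """Scaling-style preprocessing: map pixel values from [0, 255] to [-1, 1]."""
    return np.asarray(x, dtype="float32") / 127.5 - 1.0

def bgr_mean_subtract(x):
    """Mean-subtraction style: RGB -> BGR, then subtract the ImageNet mean pixel (BGR order)."""
    bgr = np.asarray(x, dtype="float32")[..., ::-1]
    return bgr - np.array([103.939, 116.779, 123.68], dtype="float32")

pixel = np.array([[[[255.0, 0.0, 127.5]]]])   # one RGB pixel, shape (1, 1, 1, 3)
print(scale_to_unit_range(pixel))             # values 1, -1 and 0
print(bgr_mean_subtract(pixel))               # channel order reversed, mean removed
```

Which convention applies depends on the pre-trained model, so the preprocessing function should always be taken from the same family as the model it feeds.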

In [None]:
def NASNetmobile_predict_labels(img_path):
    # returns the index of the most likely ImageNet class for the image located at img_path
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(NASNetmobile.predict(img))

**Write a Dog Detector**

While looking at the dictionary, you will notice that the categories corresponding to dogs appear in an uninterrupted sequence and correspond to dictionary keys 151-268, inclusive, to include all categories from 'Chihuahua' to 'Mexican hairless'. Thus, to check whether the pre-trained NASNetMobile model predicts that an image contains a dog, we need only check if the NASNetmobile_predict_labels function above returns a value between 151 and 268 (inclusive).

We use these ideas to complete the dog_detector function below, which returns True if a dog is detected in an image (and False if not).
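
The argmax-then-range-check idea can be tried without a model. In this sketch the probability vector is made up and `is_dog_index` is a hypothetical helper; only the 151-268 range comes from the text above.

```python
import numpy as np

DOG_CLASS_RANGE = (151, 268)  # inclusive ImageNet indices quoted in the text above

def is_dog_index(class_index):
    """True if the class index falls inside the dog range (both ends included)."""
    low, high = DOG_CLASS_RANGE
    return low <= int(class_index) <= high

# made-up probability vector over 1000 classes, peaked at index 207
probs = np.full(1000, 0.0005)
probs[207] = 0.5005
top = int(np.argmax(probs))
print(top, is_dog_index(top))  # 207 True
```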

In [None]:
### returns "True" if a dog is detected in the image stored at img_path
def dog_detector(img_path):
    prediction = NASNetmobile_predict_labels(img_path)
    return 151 <= prediction <= 268

In [None]:
dog_detected = 0
for im in human_files_short:
    dog_detected += dog_detector(im)
    
print('% of images in human_files with a dog detected: {:2.2%}'.format(dog_detected/len(human_files_short)))

dog_detected = 0
for im in dog_files_short:
    dog_detected += dog_detector(im)
    
print('% of images in dog_files with a dog detected: {:2.2%}'.format(dog_detected/len(dog_files_short)))

In [7]:
from PIL import ImageFile
import timeit
ImageFile.LOAD_TRUNCATED_IMAGES = True                 


model_tensor_creation_time_start = timeit.default_timer()

# pre-process the data for Keras
train_tensors = paths_to_tensor(train_files).astype('float32')/255
valid_tensors = paths_to_tensor(valid_files).astype('float32')/255
test_tensors = paths_to_tensor(test_files).astype('float32')/255

model_tensor_creation_time_stop = timeit.default_timer()
model_tensor_creation_time = model_tensor_creation_time_stop - model_tensor_creation_time_start

100%|█████████████████████████████████████████████████████████████████████████████| 6680/6680 [00:44<00:00, 150.17it/s]
100%|███████████████████████████████████████████████████████████████████████████████| 835/835 [00:05<00:00, 162.96it/s]
100%|███████████████████████████████████████████████████████████████████████████████| 836/836 [00:05<00:00, 163.89it/s]


**Step 4: Create a CNN to Classify Dog Breeds (from Scratch)**

Now that we have functions for detecting humans and dogs in images, we need a way to predict breed from images. In this step, you will create a CNN that classifies dog breeds. You must create your CNN from scratch (so, you can't use transfer learning yet!), and you must attain a test accuracy of at least 1%. In Step 5 of this notebook, you will have the opportunity to use transfer learning to create a CNN that attains greatly improved accuracy.

Be careful with adding too many trainable layers! More parameters means longer training, which means you are more likely to need a GPU to accelerate the training process. Thankfully, Keras provides a handy estimate of the time that each epoch is likely to take; you can extrapolate this estimate to figure out how long it will take for your algorithm to train.

We mention that the task of assigning breed to dogs from images is considered exceptionally challenging. To see why, consider that even a human would have great difficulty in distinguishing between a Brittany and a Welsh Springer Spaniel.

Brittany | Welsh Springer Spaniel
--- | ---
<img src="https://camo.githubusercontent.com/4b3524acdcb73ccb2014d90fdb606af2c81d92d4/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f632f63662f4d6f6e7479506f7274726169742e6a70672f32343070782d4d6f6e7479506f7274726169742e6a7067" alt="Brittany" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/c/cf/MontyPortrait.jpg/240px-MontyPortrait.jpg"> | <img src="https://camo.githubusercontent.com/c014aa15970a8dbd41840f021304b7d48baa0d12/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f352f35652f57656c73685f537072696e6765725f5370616e69656c5f446f672e6a70672f31323170782d57656c73685f537072696e6765725f5370616e69656c5f446f672e6a7067" alt="Welsh Springer Spaniel" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/5/5e/Welsh_Springer_Spaniel_Dog.jpg/121px-Welsh_Springer_Spaniel_Dog.jpg">

Curly-Coated Retriever | American Water Spaniel
--- | ---
<img src="https://camo.githubusercontent.com/7c6c0d6db30a1efdd5ea0731d23a4e9e13c98154/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f312f31642f4375726c795f436f617465645f5265747269657665725f2d5f3030312d322d322e6a70672f31383070782d4375726c795f436f617465645f5265747269657665725f2d5f3030312d322d322e6a7067" alt="Curly_Coated_Retriever" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/1/1d/Curly_Coated_Retriever_-_001-2-2.jpg/180px-Curly_Coated_Retriever_-_001-2-2.jpg"> | <img src="https://camo.githubusercontent.com/a8a9f1ad70291b64a02da21a9c848ba91fe6436b/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f332f33662f416d65726963616e5f57617465725f5370616e69656c5f507570706965735f30322e6a70672f31343870782d416d65726963616e5f57617465725f5370616e69656c5f507570706965735f30322e6a7067" alt="American Water Spaniel" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/3f/American_Water_Spaniel_Puppies_02.jpg/148px-American_Water_Spaniel_Puppies_02.jpg">


It is not difficult to find other dog breed pairs with minimal inter-class variation (for instance, Curly-Coated Retrievers and American Water Spaniels).

Likewise, recall that labradors come in yellow, chocolate, and black. Your vision-based algorithm will have to conquer this high intra-class variation to determine how to classify all of these different shades as the same breed.

Yellow Labrador | Chocolate Labrador | Black Labrador
--- | --- | ---
<img src="https://camo.githubusercontent.com/f00781af2fd17663d3408b88b1121685d4275c78/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f332f33382f59454c4c4f575f4c41425241444f525f5245545249455645522e6a70672f31323870782d59454c4c4f575f4c41425241444f525f5245545249455645522e6a7067" alt="Yellow Labrador" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/3/38/YELLOW_LABRADOR_RETRIEVER.jpg/128px-YELLOW_LABRADOR_RETRIEVER.jpg"> | <img src="https://camo.githubusercontent.com/4b4c6f8d37380040e3a5c513985c6978eb729e41/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f342f34362f43686f636f6c6174655f4c61627261646f725f253238363832393836303330332532392e6a70672f31363870782d43686f636f6c6174655f4c61627261646f725f253238363832393836303330332532392e6a7067" alt="Chocolate Labrador" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/46/Chocolate_Labrador_%286829860303%29.jpg/168px-Chocolate_Labrador_%286829860303%29.jpg"> | <img src="https://camo.githubusercontent.com/eccedae572a8f4166ef3a756006804d6e4317346/68747470733a2f2f75706c6f61642e77696b696d656469612e6f72672f77696b6970656469612f636f6d6d6f6e732f7468756d622f342f34612f426c61636b5f4c61627261646f725f5265747269657665725f706f7274726169742e6a70672f31363870782d426c61636b5f4c61627261646f725f5265747269657665725f706f7274726169742e6a7067" alt="Black Labrador" data-canonical-src="https://upload.wikimedia.org/wikipedia/commons/thumb/4/4a/Black_Labrador_Retriever_portrait.jpg/168px-Black_Labrador_Retriever_portrait.jpg">


We also mention that random chance presents an exceptionally low bar: setting aside the fact that the classes are slightly imbalanced, a random guess will provide a correct answer roughly 1 in 133 times, which corresponds to an accuracy of less than 1%.
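
The random-chance baseline is easy to compute. The sketch below also prints the cross-entropy of a uniform guess over 133 classes; that second number is my own addition, included because it gives a reference point for reading validation losses.

```python
import math

num_breeds = 133
chance_accuracy = 1 / num_breeds          # accuracy of a uniform random guess
chance_loss = math.log(num_breeds)        # cross-entropy of a uniform prediction

print("{:.2%}".format(chance_accuracy))   # 0.75%
print("{:.3f}".format(chance_loss))       # 4.890
```

A model whose validation loss sits well below this uniform-guess value has learned something beyond chance.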

Remember that the practice is far ahead of the theory in deep learning. Experiment with many different architectures, and trust your intuition. And, of course, have fun!

In [None]:
### TODO: Define your architecture.

def conv_layer(x, num_filters, ks, pad='same', strides=1, scope='conv', batch_norm=False, **opts):
    # convolution; 'same' padding keeps the spatial size, so only the pooling layers shrink the
    # feature maps (with 'valid' 3x3 convolutions the last pooling layer would receive a 1x1 map)
    x = tf.keras.layers.Conv2D(filters=num_filters, kernel_size=ks, strides=strides, padding=pad,
                               name=scope + '/conv', **opts)(x)
    # optional batch normalization, followed by the ReLU non-linearity
    x = tf.keras.layers.BatchNormalization(name=scope + '/bn')(x) if batch_norm else x
    x = tf.keras.layers.ReLU(name=scope + '/relu')(x)
    return x


def conv_block(x, filters, ks=3, dropout_rate=None, scope='block', **opts):
    block_name = scope
    if dropout_rate:
        x = tf.keras.layers.SpatialDropout2D(dropout_rate, name=block_name + '/dp')(x)
    x = conv_layer(x, filters[0], ks, 'same', scope=block_name + '/conv1', **opts)
    x = conv_layer(x, filters[1], ks, scope=block_name + '/conv2', **opts)
    x = tf.keras.layers.MaxPooling2D(2, name=block_name + '/pool')(x)
    return x 

def simple_model(input_size=[224, 224], **kwargs):
    crop = tf.keras.Input(shape=tuple(input_size) + (3,), name='crop')
    opts = {
        'kernel_initializer': tf.keras.initializers.VarianceScaling(mode='fan_in', distribution='uniform'),
        'bias_initializer': tf.keras.initializers.Constant(0.1)
    }
    opts = dict(opts, **kwargs)
    
    x = crop 
    x = conv_layer(x, 16, 3, scope='conv1', **opts)
    x = tf.keras.layers.MaxPooling2D(3, name='pool1')(x)
    x = tf.keras.layers.Dropout(0.1, name='dp1')(x)
    
    x = tf.keras.layers.BatchNormalization(name='bn1')(x)
    x = conv_layer(x, 32, 3, scope='conv2', **opts)
    x = tf.keras.layers.MaxPooling2D(3, name='pool2')(x)
    x = tf.keras.layers.Dropout(0.1, name='dp2')(x)
    
    x = tf.keras.layers.BatchNormalization(name='bn4')(x)
    x = conv_layer(x, 32, 3, scope='conv5', **opts)
    x = tf.keras.layers.MaxPooling2D(2, name='pool5')(x)
    x = tf.keras.layers.Dropout(0.25, name='dp5')(x)
    
    
    x = tf.keras.layers.BatchNormalization(name='bn2')(x)
    x = conv_layer(x, 64, 3, scope='conv3', **opts)
    x = tf.keras.layers.MaxPooling2D(3, name='pool3')(x)
    x = tf.keras.layers.Dropout(0.2, name='dp3')(x)
    
    x = tf.keras.layers.BatchNormalization(name='bn3')(x)
    x = conv_layer(x, 128, 3, scope='conv4', **opts)
    x = tf.keras.layers.MaxPooling2D(3, name='pool4')(x)
    x = tf.keras.layers.Dropout(0.2, name='dp4')(x)
    
#     x = conv_block(x, filters=[16, 32], dropout_rate=0.1, scope='block1', **opts)
#     x = conv_block(x, filters=[32, 48], dropout_rate=0.1, scope='block2', **opts)
#     x = conv_block(x, filters=[48, 64], dropout_rate=0.1, scope='block3', **opts)
#     x = conv_block(x, filters=[64, 128], dropout_rate=0.1, scope='block4', **opts)
    
    x = tf.keras.layers.GlobalAveragePooling2D(name='gap')(x)
    x = tf.keras.layers.Dense(133, name='fc1')(x)
    out = tf.keras.layers.Softmax(name='prob')(x)
    model = tf.keras.Model(crop, out, name='simple_model')
    return model
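
The five pooling layers in simple_model shrink the 224×224 input quickly, so it helps to trace the spatial size by hand. This is a rough sketch of the arithmetic, assuming the Keras defaults for MaxPooling2D (stride equal to the pool size, 'valid' padding); the helper names are hypothetical.

```python
def pool_out(size, pool):
    """Output length of 'valid' max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1

def trace_sizes(size, pools, conv_shrink=0):
    """Spatial size after each conv+pool stage; conv_shrink is what each convolution
    removes (0 for 'same' padding, 2 for a 3x3 'valid' convolution)."""
    sizes = []
    for pool in pools:
        size = pool_out(size - conv_shrink, pool)
        sizes.append(size)
    return sizes

pools = [3, 3, 2, 3, 3]              # pool1, pool2, pool5, pool3, pool4 in simple_model
print(trace_sizes(224, pools))       # [74, 24, 12, 4, 1]
print(trace_sizes(224, pools, 2))    # [74, 24, 11, 3, 0] -> the last stage has no output
```

The second trace shows why unpadded 3×3 convolutions do not fit this pooling schedule: the feature map runs out before the final pooling layer.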

In [23]:
model = Sequential()

### TODO: Define your architecture.
model.add(Conv2D(filters=16, kernel_size=3, padding='same', activation='relu', 
                        input_shape=(224, 224, 3)))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(Conv2D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.1))
model.add(BatchNormalization())

model.add(Conv2D(filters=32, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.1))
model.add(BatchNormalization())

model.add(Conv2D(filters=64, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(Conv2D(filters=128, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(Conv2D(filters=512, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.1))
model.add(BatchNormalization())
model.add(GlobalAveragePooling2D())

model.add(Dense(133, activation='softmax'))

model.summary()

Model: "sequential_6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_28 (Conv2D)           (None, 224, 224, 16)      448       
_________________________________________________________________
max_pooling2d_27 (MaxPooling (None, 112, 112, 16)      0         
_________________________________________________________________
dropout_27 (Dropout)         (None, 112, 112, 16)      0         
_________________________________________________________________
batch_normalization_27 (Batc (None, 112, 112, 16)      64        
_________________________________________________________________
conv2d_29 (Conv2D)           (None, 112, 112, 32)      4640      
_________________________________________________________________
max_pooling2d_28 (MaxPooling (None, 56, 56, 32)        0         
_________________________________________________________________
dropout_28 (Dropout)         (None, 56, 56, 32)       
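
The parameter counts in the summary above can be reproduced by hand. A minimal sketch, with hypothetical helper names, for square-kernel convolutions with a bias and for batch normalization:

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance for each channel."""
    return 4 * channels

print(conv2d_params(3, 3, 16))    # 448  (conv2d_28)
print(batchnorm_params(16))       # 64   (batch_normalization_27)
print(conv2d_params(3, 16, 32))   # 4640 (conv2d_29)
```

Because the count grows with the product of input channels and filters, the later, wider layers dominate the total.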

In [27]:
mini = compile_model(model)

In [28]:
epochs = 50

### Do NOT modify the code below this line.
checkpointer = ModelCheckpoint(filepath='Data/saved_models/weights.best.from_scratch.hdf5', 
                               verbose=2, save_best_only=True)

mini.fit(train_tensors, train_targets, 
          validation_data=(valid_tensors, valid_targets),
          epochs=epochs, batch_size=32, callbacks=[checkpointer])

Train on 6680 samples, validate on 835 samples
Epoch 1/50
Epoch 00001: val_loss improved from inf to 5.93593, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 2/50
Epoch 00002: val_loss improved from 5.93593 to 5.34971, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 3/50
Epoch 00003: val_loss improved from 5.34971 to 4.14732, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 4/50
Epoch 00004: val_loss improved from 4.14732 to 4.01320, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 5/50
Epoch 00005: val_loss improved from 4.01320 to 3.45702, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 6/50
Epoch 00006: val_loss improved from 3.45702 to 3.36810, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 7/50
Epoch 00007: val_loss improved from 3.36810 to 3.22712, saving model to Data/saved_models/weights.best.from_scratch.hdf5
Epoch 8/50
Epoch 00008: val_loss 

Epoch 27/50
Epoch 00027: val_loss did not improve from 2.95710
Epoch 28/50
Epoch 00028: val_loss did not improve from 2.95710
Epoch 29/50
Epoch 00029: val_loss did not improve from 2.95710
Epoch 30/50
Epoch 00030: val_loss did not improve from 2.95710
Epoch 31/50
Epoch 00031: val_loss did not improve from 2.95710
Epoch 32/50
Epoch 00032: val_loss did not improve from 2.95710
Epoch 33/50
Epoch 00033: val_loss did not improve from 2.95710
Epoch 34/50
Epoch 00034: val_loss did not improve from 2.95710
Epoch 35/50
Epoch 00035: val_loss did not improve from 2.95710
Epoch 36/50
Epoch 00036: val_loss did not improve from 2.95710
Epoch 37/50
Epoch 00037: val_loss did not improve from 2.95710
Epoch 38/50
Epoch 00038: val_loss did not improve from 2.95710
Epoch 39/50
Epoch 00039: val_loss did not improve from 2.95710
Epoch 40/50
Epoch 00040: val_loss did not improve from 2.95710
Epoch 41/50
Epoch 00041: val_loss did not improve from 2.95710
Epoch 42/50
Epoch 00042: val_loss did not improve from 

<tensorflow.python.keras.callbacks.History at 0x1636971e308>
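
The "improved / did not improve" messages in the log follow from the checkpoint's save_best_only behaviour. The sketch below imitates that logic in plain Python; it is not the Keras implementation, and the function name is hypothetical.

```python
def best_only_saves(val_losses):
    """Return the 1-based epochs at which a 'save best only' checkpoint would be written."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:          # only a strictly lower validation loss triggers a save
            best = loss
            saved.append(epoch)
    return saved

# the first seven values are copied from the log above; the last two are made up
losses = [5.93593, 5.34971, 4.14732, 4.01320, 3.45702, 3.36810, 3.22712, 3.30, 3.10]
print(best_only_saves(losses))  # [1, 2, 3, 4, 5, 6, 7, 9]
```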

In [None]:
smodel = simple_model(kernel_regularizer=tf.keras.regularizers.l2(0.01))

In [None]:
smodel.summary()

In [26]:
def compile_model(model, **kwargs):
    # keyword arguments understood here: 'schedule' (bool) and 'initial_learning_rate' (float)
    initial_lr = kwargs.get('initial_learning_rate', 0.001)
    if kwargs.get('schedule'):
        # divide the learning rate by 10 every 7000 optimizer steps
        lr = tf.keras.optimizers.schedules.ExponentialDecay(initial_learning_rate=initial_lr,
                                                            decay_steps=7000,
                                                            decay_rate=0.1,
                                                            staircase=True, name='lr_scheduler')
    else:
        lr = initial_lr
    # kwargs are not forwarded: neither ExponentialDecay nor Adam accepts 'schedule'
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr, name='optimizer')
    model.compile(optimizer=optimizer,
                  loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
                  metrics=['accuracy'])
    return model
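
To see what the staircase schedule in compile_model does, here is a plain-Python sketch of the decay rule as I understand it (initial rate times decay_rate raised to the number of completed decay periods). The function name is hypothetical; the constants are the ones used above.

```python
def staircase_decay(step, initial_lr=0.001, decay_steps=7000, decay_rate=0.1):
    """Learning rate after `step` optimizer updates under a staircase exponential decay."""
    return initial_lr * decay_rate ** (step // decay_steps)

# 6680 training images at batch size 32 give ceil(6680 / 32) updates per epoch
steps_per_epoch = -(-6680 // 32)
print(steps_per_epoch)                    # 209
print(staircase_decay(0))                 # initial rate
print(staircase_decay(6999))              # still the initial rate
print(staircase_decay(7000))              # one tenth of the initial rate
print(round(7000 / steps_per_epoch, 1))   # roughly 33.5 epochs before the first drop
```

With these settings the learning rate would drop only once during a 50-epoch run.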

In [None]:
def pretrained_cnn(input_size=(224,224,3), **kwargs):
    inputs = tf.keras.Input(shape=input_size)
    base_model = tf.keras.applications.MobileNetV2(input_shape=input_size,
                                               include_top=False,
                                               weights='imagenet')
    base_model.trainable = True
    base_model = model_with_regularization(base_model, regularizer=kwargs.get('l2'))
    base_model.summary()
    x = inputs 
    x = base_model(x)
    x = tf.keras.layers.GlobalAveragePooling2D(name='global_average')(x)
    x = tf.keras.layers.Dropout(0.5, name='dropout1')(x)
    x = tf.keras.layers.Dense(133, name='out')(x)
    x = tf.keras.layers.Softmax(name='prob')(x)
    final_model = tf.keras.Model(inputs, x, name='transfered_model')
    return final_model
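
The classification head in pretrained_cnn (global average pooling, dense layer, softmax) can be mimicked in NumPy to check the shapes. This is only a sketch with random stand-in data: the (7, 7, 1280) feature-map shape is what I would expect from MobileNetV2 at 224×224 input, the weights are made up, and dropout and the bias are left out.

```python
import numpy as np

rng = np.random.default_rng(0)

# stand-in for the base model's output on a batch of 2 images (assumed shape)
features = rng.normal(size=(2, 7, 7, 1280))

pooled = features.mean(axis=(1, 2))                 # global average pooling -> (2, 1280)
weights = rng.normal(scale=0.01, size=(1280, 133))  # made-up dense-layer weights
logits = pooled @ weights                           # dense layer without bias -> (2, 133)

shifted = logits - logits.max(axis=1, keepdims=True)   # subtract the max for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

print(pooled.shape, probs.shape)   # (2, 1280) (2, 133)
print(probs.sum(axis=1))           # each row sums to 1 (up to rounding)
```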

In [None]:
def model_with_regularization(model, regularizer):
    import tempfile
    import os
    if not isinstance(regularizer, tf.keras.regularizers.Regularizer):
        print("Regularizer must be a subclass of tf.keras.regularizers.Regularizer")
        return model

    for layer in model.layers:
        for attr in ['kernel_regularizer']:
            if hasattr(layer, attr):
                setattr(layer, attr, regularizer)

    # When we change the layers attributes, the change only happens in the model config file
    model_json = model.to_json()

    # Save the weights before reloading the model.
    tmp_weights_path = os.path.join(tempfile.gettempdir(), 'tmp_weights.h5')
    model.save_weights(tmp_weights_path)

    # load the model from the config
    model = tf.keras.models.model_from_json(model_json)
    
    # Reload the model weights
    model.load_weights(tmp_weights_path, by_name=True)
    return model
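
For intuition about what an L2 kernel regularizer contributes, here is a small NumPy sketch of the penalty term (the factor times the sum of squared weights). The helper name and the example kernel are made up; the factors 0.0001 and 0.01 are the ones passed to the regularizers in this notebook.

```python
import numpy as np

def l2_penalty(weights, factor):
    """factor * sum of squared weights: the term an L2 kernel regularizer adds to the loss."""
    return factor * float(np.sum(np.square(weights)))

kernel = np.array([[1.0, -2.0], [0.5, 0.0]])   # made-up 2x2 kernel, squared sum = 5.25
print(l2_penalty(kernel, 0.0001))              # penalty with the transfer-learning factor
print(l2_penalty(kernel, 0.01))                # penalty with the from-scratch factor
```

The penalty is summed over every regularized layer, so a larger factor pushes all kernels toward smaller weights more strongly.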

In [None]:
model = pretrained_cnn(l2=tf.keras.regularizers.l2(0.0001))

In [None]:
model.summary()

In [None]:
def train_model(model, **kwargs):
    checkpointer = ModelCheckpoint(filepath='Data/saved_models/weights.best.from_scratch.hdf5', save_best_only=True)
    model = compile_model(model)
    history = model.fit(train_tensors, train_targets, 
                        validation_data=(valid_tensors, valid_targets),
                        epochs=50, batch_size=32, callbacks=[checkpointer])
    return history

In [None]:
history = train_model(smodel)

In [None]:
def train_model(model):
    checkpointer = ModelCheckpoint(filepath='Data/saved_models/weights.best.from_scratch.hdf5', save_best_only=True)

    model.fit(train_tensors, train_targets, 
          validation_data=(valid_tensors, valid_targets),
          epochs=2, batch_size=32, callbacks=[checkpointer])
    
#     train_score = model.evaluate(train_tensors)
#     print('train loss, train acc:', train_score)

#     validation_score = model.evaluate(valid_tensors)
#     print('validation loss, validation acc:', validation_score)


if __name__ == '__main__':
    # compile the model defined above, then train it
    model = compile_model(model)
    train_model(model)

# epochs = 10

# ### Do NOT modify the code below this line.
# checkpointer = ModelCheckpoint(filepath='saved_models/weights.best.from_scratch.hdf5', 
#                                verbose=2, save_best_only=True)

# model.fit(train_tensors, train_targets, 
#           validation_data=(valid_tensors, valid_targets),
#           epochs=epochs, batch_size=32, callbacks=[checkpointer])