# Implementing Transfer Learning

* You will download a given dataset and then use transfer learning to build a classifier (to determine if an image contains cats or dogs).

* You should use ResNet50

* Evaluate your model's accuracy on the held-out images

* Fine-tune the feature extractor

In [None]:
import os  # needed below to build the dataset paths

import numpy as np
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Model

* Read about the image_dataset_from_directory() function: https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory


* You will use a dataset containing several thousand images of cats and dogs. Download and extract a zip file containing the images, then create a tf.data.Dataset for training and validation using the tf.keras.utils.image_dataset_from_directory utility.

In [None]:
# Set the URL to the .zip file we will download
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'

# Download the file
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)

# Set the path
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')

# We will process the images in the directory in batches, here of size 32
BATCH_SIZE = 32

# Resizing
IMG_SIZE = (160, 160)

Downloading data from https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip


* The approach below avoids loading all the data into memory by creating a tf.data.Dataset, which reads the images in batches from the hard drive.

* Read up on the argument label_mode. Its setting determines how the labels are encoded, and therefore which loss function to use and how many units (and which activation) the last layer needs.
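
As a rough guide to how label_mode pairs with the output layer and loss, the sketch below encodes the same class indices three ways in plain Python. The helper `encode_labels` and the `PAIRINGS` table are illustrative names, not part of Keras, and the pairings listed are the conventional ones rather than the only valid choices.

```python
# Conventional pairings of label_mode with final layer and loss.
# PAIRINGS and encode_labels are hypothetical helpers for illustration only.
PAIRINGS = {
    "int": ("Dense(num_classes, activation='softmax')",
            "sparse_categorical_crossentropy"),
    "categorical": ("Dense(num_classes, activation='softmax')",
                    "categorical_crossentropy"),
    "binary": ("Dense(1, activation='sigmoid')",
               "binary_crossentropy"),
}


def encode_labels(class_indices, label_mode, num_classes=2):
    """Mimic the label layout that each label_mode produces for one batch."""
    if label_mode == "int":
        return list(class_indices)                      # shape (batch,)
    if label_mode == "binary":
        return [[float(i)] for i in class_indices]      # shape (batch, 1)
    if label_mode == "categorical":
        return [[1.0 if k == i else 0.0 for k in range(num_classes)]
                for i in class_indices]                 # shape (batch, num_classes)
    raise ValueError(f"unsupported label_mode: {label_mode!r}")


# Class indices follow the alphabetical order of the folder names: cats=0, dogs=1.
batch = [0, 1, 1]
for mode, (layer, loss) in PAIRINGS.items():
    print(mode, encode_labels(batch, mode), layer, loss)
```

This notebook uses "categorical", which is why the model later ends in a two-unit softmax layer trained with categorical cross-entropy.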

In [None]:
train_dataset = tf.keras.utils.image_dataset_from_directory(train_dir,
                                                            shuffle=True,
                                                            batch_size=BATCH_SIZE,
                                                            image_size=IMG_SIZE,
                                                            label_mode = "categorical")

Found 2000 files belonging to 2 classes.


In [None]:
validation_dataset = tf.keras.utils.image_dataset_from_directory(validation_dir,
                                                                 shuffle=True,
                                                                 batch_size=BATCH_SIZE,
                                                                 image_size=IMG_SIZE,
                                                                 label_mode = "categorical")

Found 1000 files belonging to 2 classes.


## Some pre-processing needed

* In a moment, you will download ResNet50 for use as your base model. 

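* This model expects images converted from RGB to BGR, with each colour channel zero-centred on the ImageNet means (there is no rescaling to [-1, 1]). At this point, your images are RGB with pixel values in [0, 255].

* To convert them, use the preprocessing method included with the model.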

In [None]:
preprocess_input = tf.keras.applications.resnet50.preprocess_input
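
To make the preprocessing concrete, here is a NumPy sketch of what `resnet50.preprocess_input` does in its default mode: reverse the channel order from RGB to BGR, then subtract the per-channel ImageNet means. The function name `caffe_style_preprocess` is made up for this example, and the mean values are the ones commonly quoted for this scheme, so treat it as an approximation of the library routine, not a replacement.

```python
import numpy as np

# ImageNet channel means in BGR order, as commonly quoted for this scheme.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def caffe_style_preprocess(images_rgb):
    """Approximate ResNet50 preprocessing for a batch shaped (N, H, W, 3) in [0, 255]."""
    images = np.asarray(images_rgb, dtype=np.float32)
    images_bgr = images[..., ::-1]          # RGB -> BGR
    return images_bgr - IMAGENET_BGR_MEAN   # zero-centre each channel, no rescaling


# A single white pixel: the result is not confined to [-1, 1].
white = np.full((1, 1, 1, 3), 255.0)
print(caffe_style_preprocess(white))
```

The slicing and bias-add layers that show up in `model.summary()` correspond to these two steps.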

## Import ResNet50

* Because the data must be pre-processed first, the model is built slightly differently. First define an Input, then call the ```preprocess_input()``` function on it, and then proceed as normal.

```
inputs = tf.keras.Input(shape=(160, 160, 3))
x = preprocess_input(inputs)
x = base_model(x)
```

* Note that ```Model()``` must then be given this ```inputs``` tensor as its input, together with your final output tensor.

In [None]:
from tensorflow.keras.applications import ResNet50

In [None]:
base_model = ResNet50(weights='imagenet',
                      include_top=False,
                      input_shape=(160, 160, 3))

In [None]:
# First we set the entire feature extractor to non-trainable
base_model.trainable = False

In [None]:
inputs = tf.keras.Input(shape=(160, 160, 3))
x = preprocess_input(inputs)
x = base_model(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = Dense(2, activation = "softmax")(x)
model = tf.keras.Model(inputs, outputs)

In [None]:
model.summary()

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_6 (InputLayer)        [(None, 160, 160, 3)]     0         
                                                                 
 tf.__operators__.getitem_1   (None, 160, 160, 3)      0         
 (SlicingOpLambda)                                               
                                                                 
 tf.nn.bias_add_1 (TFOpLambd  (None, 160, 160, 3)      0         
 a)                                                              
                                                                 
 resnet50 (Functional)       (None, 5, 5, 2048)        23587712  
                                                                 
 global_average_pooling2d_1   (None, 2048)             0         
 (GlobalAveragePooling2D)                                        
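
The summary output above stops before the final Dense layer. As a quick check on what that layer adds, its parameter count follows from the 2048 pooled features and the 2 output units; `dense_param_count` is a helper written for this example.

```python
def dense_param_count(in_features, units, use_bias=True):
    """Weights plus optional biases of a fully connected layer."""
    return in_features * units + (units if use_bias else 0)


base_params = 23_587_712                    # resnet50 row in the summary above
head_params = dense_param_count(2048, 2)    # Dense(2) on the pooled features
print(head_params, base_params + head_params)
```

When the base is frozen, only the head's parameters are updated during training.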
                                                           

In [None]:
model.compile(loss='categorical_crossentropy',
             optimizer='adam',
             metrics=['accuracy'])

In [None]:
# The dataset is already batched, so batch_size must not be passed to fit()
model.fit(train_dataset, epochs=10, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7ff9e373bc10>

In [None]:
# Some code to help with predictions as now we are using 
# the image_dataset_from_directory function which generated
# a tf.data.Dataset
predictions = []
labels =  []

for x, y in validation_dataset:
  predictions.extend(np.argmax(model.predict(x), axis=-1))
  labels.extend(np.argmax(y.numpy(), axis=-1))
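
The loop above turns softmax outputs and one-hot labels into class indices before comparing them. On a toy batch with made-up numbers, the same steps look like this in NumPy:

```python
import numpy as np

# Made-up softmax outputs for four images (columns: cat, dog).
probs = np.array([[0.9, 0.1],
                  [0.2, 0.8],
                  [0.6, 0.4],
                  [0.3, 0.7]])
# One-hot labels, laid out as label_mode="categorical" produces them.
one_hot = np.array([[1, 0],
                    [0, 1],
                    [0, 1],
                    [0, 1]])

pred_classes = np.argmax(probs, axis=-1)     # index of the most probable class
true_classes = np.argmax(one_hot, axis=-1)   # index of the 1 in each row
accuracy = np.mean(pred_classes == true_classes)
print(pred_classes, true_classes, accuracy)
```

Since the model was compiled with `metrics=['accuracy']`, calling `model.evaluate(validation_dataset)` should report the same kind of figure without the manual loop.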

In [None]:
from sklearn.metrics import accuracy_score
# accuracy_score expects the true labels first, then the predictions
accuracy_score(labels, predictions)*100

89.2
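
For the fine-tuning step in the task list, a common recipe is to unfreeze only the last few layers of the base and continue training with a much smaller learning rate. The helper below is a sketch of the unfreezing part. The name `unfreeze_top_layers` and the choice of 30 layers in the usage comment are assumptions for illustration, and the helper relies only on the `layers` and `trainable` attributes that Keras models expose.

```python
def unfreeze_top_layers(base_model, num_layers):
    """Make the last `num_layers` layers trainable and freeze the rest.

    Returns the number of layers left trainable.
    """
    base_model.trainable = True                  # let per-layer flags take effect
    cutoff = max(len(base_model.layers) - num_layers, 0)
    for index, layer in enumerate(base_model.layers):
        layer.trainable = index >= cutoff        # freeze everything below the cutoff
    return len(base_model.layers) - cutoff


# Hypothetical usage with the notebook's objects (values are illustrative):
#   unfreeze_top_layers(base_model, 30)
#   model.compile(loss='categorical_crossentropy',
#                 optimizer=tf.keras.optimizers.Adam(1e-5),
#                 metrics=['accuracy'])
#   model.fit(train_dataset, epochs=5, validation_data=validation_dataset)
```

The model has to be recompiled after changing `trainable` for the change to take effect, and the small learning rate is meant to keep the pretrained weights from being overwritten too aggressively.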