<a href="https://colab.research.google.com/github/weiyunna/Deep-Learning-with-Tensorflow/blob/master/Transfer_Learning_with_ResNet50.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transfer Learning with ResNet 50

This kernel is intended to be a tutorial on Keras around image files handling for Transfer Learning using pre-trained weights from ResNet50 convnet.

Though loading all train & test images resized (224 x 224 x 3) in memory would have incurred ~4.9GB of memory, the plan was to batch source image data during the training, validation & testing pipeline. Keras ImageDataGenerator supports batch sourcing image data for all training, validation and testing. Actually, it is quite clean and easy to use Keras ImageDataGenerator except few limitations (listed at the end).

Keras ImageDataGenerator expects labeled training images to be available in certain folder heirarchy, 'train' data was manually split into 10k for training & 2.5k for validation and re-arranged into the desired folder hierarchy. Even 'test' images had to rearranged due to a known issue in flow_from_directory.

In [0]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline 

import cv2

import os

In [10]:
import tensorflow as tf
from tensorflow.python.keras.applications import ResNet50
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense


print(tf.__version__)

1.13.1


## Connect with the Google Drive

In [23]:
import os
print(os.listdir("."))

['The Hello World of Neural Network', 'Week1', 'Week1-Exercise1', 'Week2-Computer Vision on Fashion MINST', 'MINST Digits Recognition', 'Improving Computer Vision Accuracy using Convolutions', 'CNN_Filters and Pools', 'Horse or Human - Image Classification', 'Face Recognition', 'An End-to-End-Data-Science-Framework', 'Azure-aml-real-time-ai', 'Introduction to Pandas', 'First-Steps-With-Tensorflow', 'Introduction to Keras', 'Introduction to Neural Nets', 'Introduction to Matplotlib', 'Introduction to Bokeh', 'Images', 'resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5', 'Transfer Learning with ResNet50']


In [25]:
# Authorization

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

E: Package 'python-software-properties' has no installation candidate


In [0]:
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

In [0]:

# Mount drive to "drive/" folder

!mkdir -p drive
!google-drive-ocamlfuse drive

In [0]:
print(os.listdir("drive/Colab Notebooks/"))
os.chdir("drive/Colab Notebooks/")

['The Hello World of Neural Network', 'Week1', 'Week1-Exercise1', 'Week2-Computer Vision on Fashion MINST', 'MINST Digits Recognition', 'Improving Computer Vision Accuracy using Convolutions', 'CNN_Filters and Pools', 'Horse or Human - Image Classification', 'Face Recognition', 'An End-to-End-Data-Science-Framework', 'Azure-aml-real-time-ai', 'Introduction to Pandas', 'First-Steps-With-Tensorflow', 'Introduction to Keras', 'Introduction to Neural Nets', 'Introduction to Matplotlib', 'Introduction to Bokeh', 'Transfer Learning with ResNet50']


In [24]:
print(os.listdir("."))

['Introduction to Pandas', 'First-Steps-With-Tensorflow', 'Azure-aml-real-time-ai', 'Week2-Computer Vision on Fashion MINST', 'CNN_Filters and Pools', 'Week1', 'An End-to-End-Data-Science-Framework', 'Transfer Learning with ResNet50', 'Introduction to Neural Nets', 'Week1-Exercise1', 'Improving Computer Vision Accuracy using Convolutions', 'Images', 'The Hello World of Neural Network', 'MINST Digits Recognition', 'Face Recognition', 'Introduction to Bokeh', 'Introduction to Keras', 'resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5', 'Introduction to Matplotlib', 'Horse or Human - Image Classification']


## Define the Global Constants

In [0]:
# Fixed for our Cats & Dogs classes
NUM_CLASSES = 2

# Fixed for Cats & Dogs color images
CHANNELS = 3

IMAGE_RESIZE = 224
RESNET50_POOLING_AVERAGE = 'avg'
DENSE_LAYER_ACTIVATION = 'softmax'
OBJECTIVE_FUNCTION = 'categorical_crossentropy'

# Common accuracy metric for all outputs, but can use different metrics for different output
LOSS_METRICS = ['accuracy']

# EARLY_STOP_PATIENCE must be < NUM_EPOCHS
NUM_EPOCHS = 10
EARLY_STOP_PATIENCE = 3

# These steps value should be proper FACTOR of no.-of-images in train & valid folders respectively
# Training images processed in each step would be no.-of-train-images / STEPS_PER_EPOCH_TRAINING
STEPS_PER_EPOCH_TRAINING = 10
STEPS_PER_EPOCH_VALIDATION = 10

# These steps value should be proper FACTOR of no.-of-images in train & valid folders respectively
# NOTE that these BATCH* are for Keras ImageDataGenerator batching to fill epoch step input
BATCH_SIZE_TRAINING = 100
BATCH_SIZE_VALIDATION = 100

# Using 1 to easily manage mapping between test_generator & prediction for submission preparation
BATCH_SIZE_TESTING = 1

## ResNet50

* Notice that resnet50 folder has 2 pre-trained weights files... xyz_tf_kernels.h5 & xyz_tf_kernels_NOTOP.h5

* The xyz_tf_kernels.h5 weights is useful for pure prediction of test image and this prediction will rely completely on ResNet50 pre-trained weights, i.e., it does not expected any training from our side

* Out intention in this kernel is Transfer Learning by using ResNet50 pre-trained weights except its TOP layer, i.e., the xyz_tf_kernels_NOTOP.h5 weights... Use this weights as initial weight for training new layer using train images

In [22]:
print(os.listdir("drive/Colab Notebooks/"))
os.chdir("drive/Colab Notebooks/")

FileNotFoundError: ignored

In [0]:
resnet_weights_path = '/Colab Notebooks/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'

## Define Our Transfer Learning Network Model Consisting of 2 Layers
Here, we are preparing specification or blueprint of the TensorFlow DAG (directed acyclcic graph) for just the MODEL part.

In [21]:
#Still not talking about our train/test data or any pre-processing.

model = Sequential()

# 1st layer as the lumpsum weights from resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
# NOTE that this layer will be set below as NOT TRAINABLE, i.e., use it as is
model.add(ResNet50(include_top = False, pooling = RESNET50_POOLING_AVERAGE, weights = resnet_weights_path))

# 2nd layer as Dense for 2-class classification, i.e., dog or cat using SoftMax activation
model.add(Dense(NUM_CLASSES, activation = DENSE_LAYER_ACTIVATION))

# Say not to train first layer (ResNet) model as it is already trained
model.layers[0].trainable = False

ValueError: ignored

In [16]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
resnet50 (Model)             (None, 2048)              23587712  
_________________________________________________________________
dense_1 (Dense)              (None, 2)                 4098      
Total params: 23,591,810
Trainable params: 4,098
Non-trainable params: 23,587,712
_________________________________________________________________


## Compile Our Transfer Learning Model

In [0]:
from tensorflow.python.keras import optimizers

sgd = optimizers.SGD(lr = 0.01, decay = 1e-6, momentum = 0.9, nesterov = True)
model.compile(optimizer = sgd, loss = OBJECTIVE_FUNCTION, metrics = LOSS_METRICS)

## Prepare Keras Data Generators

Keras `ImageDataGenerator(...)` generates batches of tensor image data with real-time data augmentation. The data will be looped over (in batches). It is useful with large dataset to source, pre-process (resize, color conversion, image augmentation, batch normalize) & supply resulting images in batches to downstream Keras modeling components, namely `fit_generator(...)` &` predict_generator(...) `-vs- `fit(...)` & `predict(...)` for small dataset.

Kaggle competition rule expects Dog & Cat to be labeled as 1 & 0. Keras >> `ImageDataGenerator >> flow_from_directory` takes in 'classes' list for mapping it to LABEL indices otherwise treats sub-folders enumerated classes in alphabetical order, i.e., Cat is 0 & Dog is 1.

In [8]:
from keras.applications.resnet50 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator

image_size = IMAGE_RESIZE

# preprocessing_function is applied on each image but only after re-sizing & augmentation (resize => augment => pre-process)
# Each of the keras.application.resnet* preprocess_input MOSTLY mean BATCH NORMALIZATION (applied on each batch) stabilize the inputs to nonlinear activation functions
# Batch Normalization helps in faster convergence
data_generator = ImageDataGenerator(preprocessing_function=preprocess_input)

# flow_From_directory generates batches of augmented data (where augmentation can be color conversion, etc)
# Both train & valid folders must have NUM_CLASSES sub-folders
train_generator = data_generator.flow_from_directory(
        '/Images/Train',
        target_size=(image_size, image_size),
        batch_size=BATCH_SIZE_TRAINING,
        class_mode='categorical')

validation_generator = data_generator.flow_from_directory(
        '/Images/Validation',
        target_size=(image_size, image_size),
        batch_size=BATCH_SIZE_VALIDATION,
        class_mode='categorical') 

Using TensorFlow backend.


NameError: ignored

## Connect with the Google Drive

In [1]:
import os
print(os.listdir("."))

['.config', 'sample_data']


In [2]:
# Authorization

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

E: Package 'python-software-properties' has no installation candidate
Selecting previously unselected package google-drive-ocamlfuse.
(Reading database ... 131294 files and directories currently installed.)
Preparing to unpack .../google-drive-ocamlfuse_0.7.3-0ubuntu1~ubuntu18.04.1_amd64.deb ...
Unpacking google-drive-ocamlfuse (0.7.3-0ubuntu1~ubuntu18.04.1) ...
Setting up google-drive-ocamlfuse (0.7.3-0ubuntu1~ubuntu18.04.1) ...
Processing triggers for man-db (2.8.3-2ubuntu0.1) ...


In [3]:
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Please, open the following URL in a web browser: https://accounts.google.com/o/oauth2/auth?client_id=32555940559.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force
··········
Please, open the following URL in a web browser: https://accounts.google.com/o/oauth2/auth?client_id=32555940559.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force
Please enter the verification code: Access token retrieved correctly.


In [0]:

# Mount drive to "drive/" folder

!mkdir -p drive
!google-drive-ocamlfuse drive

In [6]:
print(os.listdir("drive/Colab Notebooks/"))
os.chdir("drive/Colab Notebooks/")

['The Hello World of Neural Network', 'Week1', 'Week1-Exercise1', 'Week2-Computer Vision on Fashion MINST', 'MINST Digits Recognition', 'Improving Computer Vision Accuracy using Convolutions', 'CNN_Filters and Pools', 'Horse or Human - Image Classification', 'Face Recognition', 'An End-to-End-Data-Science-Framework', 'Azure-aml-real-time-ai', 'Introduction to Pandas', 'First-Steps-With-Tensorflow', 'Introduction to Keras', 'Introduction to Neural Nets', 'Introduction to Matplotlib', 'Introduction to Bokeh', 'Transfer Learning with ResNet50']


In [7]:
print(os.listdir("."))

['Introduction to Keras', 'Introduction to Bokeh', 'Azure-aml-real-time-ai', 'An End-to-End-Data-Science-Framework', 'Introduction to Neural Nets', 'MINST Digits Recognition', 'Face Recognition', 'CNN_Filters and Pools', 'Week2-Computer Vision on Fashion MINST', 'Week1-Exercise1', 'The Hello World of Neural Network', 'Improving Computer Vision Accuracy using Convolutions', 'Week1', 'Horse or Human - Image Classification', 'First-Steps-With-Tensorflow', 'Introduction to Pandas', 'Transfer Learning with ResNet50', 'Introduction to Matplotlib']
