<a href="https://colab.research.google.com/github/SaketMunda/human-classifier-unsplash-dataset/blob/master/human_classifier_transfer_learning_feature_extraction.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Building a Human Classifier Using Transfer Learning Feature Extraction

This notebook builds a transfer learning feature extraction model for a binary human classification problem.

For transfer learning, we'll take two models from [`tensorflow_hub`](https://www.tensorflow.org/hub) and compare the performance of both experiments using [`TensorBoard`](https://www.tensorflow.org/tensorboard).


We use only 10% of the images originally extracted from Unsplash. Because we are doing feature extraction transfer learning, we can often get good results with less data.



## Problem Definition

We want to take a pre-trained model, keep the weights and biases it learned on another dataset frozen, and add our own custom output layer on top. We then train that layer on our Unsplash images to classify whether an image contains a **human** or not.

## Creating data loaders (preparing the data)

There are a couple of ways to load the data and prepare it for our network. The most commonly used is `ImageDataGenerator`; another is the `image_dataset_from_directory` function.

For now let's use the basic `ImageDataGenerator`, which is fine for a dataset this small. For a larger dataset we should prefer `image_dataset_from_directory`, since it creates a `tf.data.Dataset` object rather than a generator.

Since this is a binary classification problem, the dataset is organised with one sub-directory per class.

The train and test directories have the following layout:
- train/human, train/non-human
- test/human, test/non-human
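
`flow_from_directory` infers labels from the sub-directory names and assigns class indices in alphanumeric order, so with this layout `human` becomes label 0 and `non-human` becomes label 1 (this can be confirmed on the real data via `train_data_10_percent.class_indices`). The sketch below mimics that mapping with the standard library only and counts the images per class. The helper name `summarize_split`, the set of accepted file extensions, and the throwaway demo directory are assumptions made for illustration, not part of the original notebook.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def summarize_split(split_dir):
    """Return {class_name: (label_index, image_count)} for one split directory."""
    split_dir = Path(split_dir)
    # One sub-directory per class; sorted names give the label indices.
    class_names = sorted(p.name for p in split_dir.iterdir() if p.is_dir())
    summary = {}
    for index, name in enumerate(class_names):
        count = sum(
            1
            for f in (split_dir / name).rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        summary[name] = (index, count)
    return summary


# Demo on a throwaway directory that mimics the train/ layout above.
with tempfile.TemporaryDirectory() as tmp:
    for name, n_files in [("human", 3), ("non-human", 2)]:
        class_dir = Path(tmp) / "train" / name
        class_dir.mkdir(parents=True)
        for i in range(n_files):
            (class_dir / f"img_{i}.jpg").touch()
    demo_summary = summarize_split(Path(tmp) / "train")

print(demo_summary)
```

Running the same helper on the real `train_dir` and `test_dir` is a quick way to check that both classes are present and roughly balanced before training.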


In [1]:
# setup the data inputs
from tensorflow.keras.preprocessing.image import ImageDataGenerator

IMG_SHAPE = (224, 224)
BATCH_SIZE = 32

# declaring constant drive paths
drive_path = 'drive/MyDrive/Data Science/HumanClassifier/'
train_dir = drive_path + 'photos/train/'
test_dir = drive_path + 'photos/test/'

# Instantiating ImageDataGenerator
# rescale pixel values from the 0-255 range to 0-1
train_datagen = ImageDataGenerator(rescale=1/255.) 
test_datagen = ImageDataGenerator(rescale=1/255.)

print('Training images:')
train_data_10_percent = train_datagen.flow_from_directory(train_dir,
                                                          target_size=IMG_SHAPE,
                                                          batch_size=BATCH_SIZE,
                                                          class_mode='binary')

print('Test images:')
test_data = test_datagen.flow_from_directory(test_dir,
                                             target_size=IMG_SHAPE,
                                             batch_size=BATCH_SIZE,
                                             class_mode='binary')

Training images:
Found 350 images belonging to 2 classes.
Test images:
Found 350 images belonging to 2 classes.


## Setting Up Callbacks

Before building our model, let's set up `callbacks`. Callbacks run during or after training and let us track how the model is performing and whether it needs further training.

Since we want to visualize and compare the performance of two TensorFlow Hub models, we'll create a `TensorBoard` callback, which feeds a dashboard for inspecting training metrics and network parameters.

The TensorBoard callback can be accessed using `tf.keras.callbacks.TensorBoard()`.

Its main job is to save a model's training performance metrics to a specified `log_dir`.

To track our modelling experiments with TensorBoard, we'll write a function that creates a new TensorBoard callback, with its own timestamped log directory, each time we fit a model.



In [2]:
# Create tensorboard callback
import datetime
import tensorflow

def create_tensorboard_callback(dir_name, experiment_name):
  log_dir = dir_name + "/" + experiment_name + "/" + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
  tensorboard_callback = tensorflow.keras.callbacks.TensorBoard(
      log_dir = log_dir
  )
  print(f'Saving Tensorboard Log files to: {log_dir}')
  return tensorboard_callback
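
The path logic inside `create_tensorboard_callback` can be checked in isolation. The sketch below reproduces only the directory naming, with the timestamp passed in so the result is predictable; the helper name `make_log_dir` and its `now` parameter are assumptions added for illustration.

```python
import datetime


def make_log_dir(dir_name, experiment_name, now=None):
    """Build '<dir_name>/<experiment_name>/<YYYYmmdd-HHMMSS>' for one training run."""
    now = now or datetime.datetime.now()
    return "/".join([dir_name, experiment_name, now.strftime("%Y%m%d-%H%M%S")])


# Two fits started at different times end up in separate run directories.
first_run = make_log_dir("tensorflow_hub", "resnet50v2",
                         datetime.datetime(2022, 12, 31, 8, 8, 20))
second_run = make_log_dir("tensorflow_hub", "efficientnet",
                          datetime.datetime(2022, 12, 31, 8, 48, 30))

print(first_run)   # tensorflow_hub/resnet50v2/20221231-080820
print(second_run)  # tensorflow_hub/efficientnet/20221231-084830
```

Because every run gets its own sub-directory, TensorBoard can show the runs side by side instead of overwriting earlier logs.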

## Creating models using TensorFlow Hub

We're going to use two models from TensorFlow Hub:

- [ResNet50V2](https://arxiv.org/abs/1603.05027): a state-of-the-art computer vision model architecture from 2016
- [EfficientNetB0](https://arxiv.org/abs/1905.11946): a state-of-the-art computer vision model architecture from 2019

> 💡 *The Tesla Vehicle AI processes huge doses of information in real-time. So the Computer Vision workflow runs all the tasks on a shared backbone called **ResNet-50** that has the ability to run 1000×1000 images at a time*.

Let's build our models using the two architectures above from TensorFlow Hub.


In [6]:
# import libraries for tensorflow and tensorflow hub
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers

Now we'll get the feature vector URLs of two common computer vision architectures, [EfficientNetB0 (2019)](https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1) and [ResNet50V2 (2016)](https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4), from TensorFlow Hub.

In [3]:
# Resnet50V2 feature vector
resnet_url = "https://tfhub.dev/google/imagenet/resnet_v2_50/feature_vector/4"

# EfficientNetB0 feature vector
efficientnet_url = "https://tfhub.dev/tensorflow/efficientnet/b0/feature-vector/1"

Each URL points to a saved pre-trained model on TensorFlow Hub.

When we use one in our model, it is downloaded automatically.

To do this, we can use the `KerasLayer()` class from the TensorFlow Hub library, which wraps the saved model as a Keras layer.

Since we're going to compare two models, we'll save ourselves some code by creating a function, `create_model()`. It takes a model's TensorFlow Hub URL, instantiates a Keras Sequential model with a single sigmoid output unit (enough for binary classification), compiles the model and returns it.

In [12]:
def create_model(model_url):
  """
  Takes a TensorFlow Hub URL and creates a Keras Sequential model with it.

  Args:
    model_url (str): A TensorFlow Hub feature extraction URL.
  Returns:
    A compiled Keras Sequential model with model_url as a frozen feature
    extractor layer and a single-unit sigmoid Dense output layer.
  """

  # Download the pretrained model and save it as Keras Layer
  feature_extractor_layer = hub.KerasLayer(model_url, 
                                           trainable=False, # freeze the underlying patterns
                                           name='feature_extractor_layer',
                                           input_shape = IMG_SHAPE+(3,))
  
  # Create our own model
  model = tf.keras.Sequential([
      feature_extractor_layer, # Use feature extraction layer as the base
      layers.Dense(1, activation='sigmoid', name='output_layer') # create our own output layer
  ])

  # compile our model
  model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
                optimizer='Adam',
                metrics=['accuracy'])
  
  return model
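
To make the output layer and loss in `create_model()` more concrete, here is a minimal pure-Python sketch of what a single sigmoid unit and binary cross-entropy compute for one example. The function names and the `eps` clipping value are choices made for this sketch; it illustrates the formulas and is not how Keras implements them internally.

```python
import math


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, p, eps=1e-7):
    """Loss for one example with label y_true (0 or 1) and predicted probability p."""
    p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


# A raw score of 0 maps to a probability of 0.5 (maximally unsure).
unsure = sigmoid(0.0)

# For a label of 1, a confident correct prediction costs far less than a confident wrong one.
loss_good = binary_cross_entropy(1, 0.9)
loss_bad = binary_cross_entropy(1, 0.1)

print(unsure, loss_good, loss_bad)
```

The single output is the predicted probability of whichever class has label 1, so the class-to-index mapping of the data loader determines how a prediction should be read.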

Great! Now that we have a function for creating a model, we'll first use it to create and compile a model with ResNet50V2 as the feature extractor.

Then we'll fit the model on our training data, validate it on the test data, and pass in the TensorBoard callback.
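
The `fit()` call uses `len(train_data_10_percent)` for `steps_per_epoch`, which is the number of batches the generator yields per pass over the data. A small sketch of that arithmetic, using the 350 images and batch size of 32 reported above (the helper name `batches_per_epoch` is made up for illustration):

```python
import math


def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (last batch may be smaller)."""
    return math.ceil(num_images / batch_size)


steps = batches_per_epoch(350, 32)
last_batch_size = 350 - (steps - 1) * 32

print(steps)            # 11
print(last_batch_size)  # 30
```

So each epoch runs 11 steps: ten full batches of 32 images and a final batch of 30.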

In [13]:
# create and compile the model
resnet_model = create_model(resnet_url)

# fit the model
resnet_history = resnet_model.fit(train_data_10_percent,
                                  epochs=5,
                                  steps_per_epoch=len(train_data_10_percent),
                                  validation_data=test_data,
                                  validation_steps=len(test_data),
                                  callbacks=[create_tensorboard_callback(dir_name='tensorflow_hub',
                                                                         experiment_name='resnet50v2')])

Saving Tensorboard Log files to: tensorflow_hub/resnet50v2/20221231-080820
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Not bad, but the model performs noticeably worse on the test data than on the training data, which suggests it is overfitting the small training set rather than generalizing well.
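
One way to put a number on that difference is to compare training and validation accuracy epoch by epoch. The sketch below assumes the usual `History.history` dictionary produced when a model is compiled with `metrics=['accuracy']`, i.e. keys named `accuracy` and `val_accuracy`; the helper name `accuracy_gaps` is hypothetical.

```python
def accuracy_gaps(history_dict):
    """Per-epoch difference between training and validation accuracy."""
    train = history_dict["accuracy"]
    val = history_dict["val_accuracy"]
    return [round(t - v, 4) for t, v in zip(train, val)]


# Intended usage after training (not executed here):
# gaps = accuracy_gaps(resnet_history.history)
# print(gaps)
```

A gap that stays large or grows across epochs is a typical sign of overfitting.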

However, maximizing accuracy is not our goal here; we're exploring how to build a model on top of a pre-trained one.

So let's continue our experiment.

This time we'll use the EfficientNetB0 URL to create and fit the model.

In [14]:
efficient_model = create_model(efficientnet_url)

# fit the model
efficient_history = efficient_model.fit(train_data_10_percent,
                                  epochs=5,
                                  steps_per_epoch=len(train_data_10_percent),
                                  validation_data=test_data,
                                  validation_steps=len(test_data),
                                  callbacks=[create_tensorboard_callback(dir_name='tensorflow_hub',
                                                                         experiment_name='efficientnet')])

Saving Tensorboard Log files to: tensorflow_hub/efficientnet/20221231-084830
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


## Comparing Models using TensorBoard

Since we created a callback that saves the logs of each experiment, we can now inspect those logs using TensorBoard.
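
Before opening TensorBoard, it can help to confirm which runs were actually written. The sketch below walks a log root laid out as `<log_root>/<experiment>/<timestamp>/`, which is the structure the callback function above produces. The helper name `list_runs` and the temporary demo directory are assumptions for illustration; on the real logs you would call `list_runs('tensorflow_hub')`.

```python
import tempfile
from pathlib import Path


def list_runs(log_root):
    """Return sorted (experiment, run) name pairs found under log_root."""
    log_root = Path(log_root)
    return sorted(
        (exp.name, run.name)
        for exp in log_root.iterdir() if exp.is_dir()
        for run in exp.iterdir() if run.is_dir()
    )


# Demo on a throwaway directory that mirrors the two runs logged above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "tensorflow_hub"
    (root / "resnet50v2" / "20221231-080820").mkdir(parents=True)
    (root / "efficientnet" / "20221231-084830").mkdir(parents=True)
    demo_runs = list_runs(root)

print(demo_runs)
```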

To visualize them, we can upload the results to [TensorBoard.dev](https://tensorboard.dev/).

Uploading to TensorBoard.dev also lets us share the results with others.

To upload a series of TensorFlow logs to TensorBoard.dev, we can use the following command:

In [None]:
!tensorboard dev upload --logdir ./tensorflow_hub/ \
  --name "EfficientNetB0 Vs ResNet50V2" \
  --description "Comparing two different TF Hub feature extraction model architectures using 10% of Unsplash images of humans and non-humans" \
  --one_shot # exit the uploader once the upload is complete

TensorBoard experiment URL : https://tensorboard.dev/experiment/6QZoenQwQ4KHX2T42h6DMw/