# Training a Single Species Animal Detector with Transfer Learning
### Mike Hilton, Eckerd College, 31 August 2021

This Colab notebook uses transfer learning to create an image classifier that detects whether a focal species is present in an image.  The detector produced by this Colab is suitable for use with the autocopy.py program in the Eckerd Camera Trap Tools suite.

The EfficientNetB0 model is used as the starting point.  This model expects a 224 x 224-pixel 24-bit color image as input.

The training data should be provided in two folders: a folder named "background" containing images that do not include the focal species, and a folder named "present" containing images that do include the focal species.  At least 150 images of each class should be provided and the number of images in each folder should be roughly the same.  
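
Before training, it can help to confirm that both folders meet these guidelines. The following is a minimal sketch and not part of the original notebook: the helper names `count_images` and `check_class_balance`, the list of image extensions, and the 20% tolerance used to interpret "roughly the same" are all assumptions chosen for illustration.

```python
import os

# Hypothetical settings, not from the original notebook.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")
MIN_IMAGES_PER_CLASS = 150   # guideline stated above
MAX_IMBALANCE = 0.20         # assumed tolerance for "roughly the same"

def count_images(folder):
    """Counts files in folder whose extension looks like an image."""
    return sum(1 for name in os.listdir(folder)
               if name.lower().endswith(IMAGE_EXTENSIONS))

def check_class_balance(dataset_path):
    """Returns (counts, problems) for the 'background' and 'present' folders."""
    counts = {label: count_images(os.path.join(dataset_path, label))
              for label in ("background", "present")}
    problems = []
    for label, n in counts.items():
        if n < MIN_IMAGES_PER_CLASS:
            problems.append(f"'{label}' has only {n} images")
    larger, smaller = max(counts.values()), min(counts.values())
    if larger > 0 and (larger - smaller) / larger > MAX_IMBALANCE:
        problems.append(f"class sizes differ by more than {MAX_IMBALANCE:.0%}")
    return counts, problems
```

Calling `check_class_balance` on the training data folder returns the per-class counts and a list of warnings; an empty list means both guidelines are met under the assumed tolerance.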



The image folders can exist on your Google Drive, or be uploaded to the runtime Colab instance using the folder icon in the leftmost Colab pane. This notebook demonstrates the Google Drive approach.

Mount Google Drive to get access to the training dataset.  Follow the instructions printed on the screen when you run the code section below.

In [None]:
from google.colab import drive
drive.mount('/content/gdrive')

Load the required Python modules.

In [None]:

import matplotlib.pyplot as plt
import numpy as np
import os
import pandas
import pickle
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from sklearn.preprocessing import LabelBinarizer
import tensorflow as tf
import tensorflow.keras as keras

Initialize global variables controlling the training process.  You should modify the values as needed.

In [None]:
# Location of training data folder
DATASET_PATH = "/content/gdrive/MyDrive/Colab Notebooks/training_images"  

# Path to saved model
SAVED_MODEL_PATH = "/content/gdrive/MyDrive/Colab Notebooks/saved_models/generic_detector.h5"  

# Path to the saved category label binarizer
LABEL_ENCODER_PATH = "/content/gdrive/MyDrive/Colab Notebooks/saved_models/generic_detector_label_encoder.pickle"

# Number of training images in each class to use; -1 means use all images in dataset.
# If you have a lot of images, you might consider first testing this notebook on a
# small subset of the images to make sure everything works to your satisfaction.  Then
# you can change IMAGE_COUNT to -1 to train with all images.
IMAGE_COUNT = 20                       

# Size of images handled by EfficientNetB0.  You should not change this value unless
# you switch to using a different CNN architecture.
INPUT_DIMS = (224, 224)                 

# Training batch size.  If you are not familiar with how CNNs are trained, I
# recommend not changing this value.
BS = 32 

## PHASE ONE of transfer learning is when the classification head of the CNN is
## trained.  The feature extraction network is frozen and does not change during 
## phase one.
# Number of training epochs during phase 1 of training.  An epoch is one complete 
# pass over the training dataset.
PHASE1_EPOCHS = 40 
# learning rate during phase 1 of training
PHASE1_LR = 1e-3   

## PHASE TWO of transfer learning is when the layers of the feature extraction
## network are unfrozen and fine-tuned to learn about your dataset.
PHASE2_EPOCHS = 10                       
PHASE2_LR = 1e-4                       

Define functions to load an image dataset from disk and prepare it for use with EfficientNet.

In [None]:
def load_image(image_path):
  """
  Loads an image and prepares it for use with EfficientNet.
  Returns a numpy image.
  """
  image = keras.preprocessing.image.load_img(image_path, target_size=INPUT_DIMS)
  image = keras.preprocessing.image.img_to_array(image)
  image = keras.applications.efficientnet.preprocess_input(image)
  return image

def load_image_set(set_path, label, count):
  """
  Loads an image dataset.
  Inputs:
      set_path   string; folder where dataset lives
      label      string; category label associated with these images
      count      integer; maximum number of images to load; -1 means load all images in folder
  Returns:
      Pair (images, labels) where
          images is a list of numpy images
          labels is a list of strings
  """  
  # get the names of all files in set_path
  filenames = tf.io.gfile.listdir(set_path)

  # loop over the images
  data = []
  labels = []
  for filename in filenames:
    # load the input image and preprocess it
    image = load_image(os.path.join(set_path, filename))
    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)
    # stop once `count` images have been loaded; a count of -1 never
    # reaches zero, so every image in the folder is loaded
    count = count - 1
    if count == 0:
      break

  return data, labels
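
`load_image_set` passes every entry in the folder to `load_image`, so a stray non-image file (for example a hidden system file) would be handed to the image loader as well and would likely fail to load. One possible guard, sketched here as an assumption rather than as part of the original notebook, is to filter the listing by extension first. The helper name `keep_image_files` and the extension set are hypothetical.

```python
import os

# Assumed set of extensions; extend it to match your own camera's output.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

def keep_image_files(filenames):
    """Returns the names with an image extension, sorted for a repeatable order."""
    return sorted(name for name in filenames
                  if not name.startswith(".")
                  and os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS)

# Hidden files and non-image files are dropped from the listing.
print(keep_image_files(["IMG_0002.JPG", ".DS_Store", "notes.txt", "IMG_0001.jpg"]))
```

Inside `load_image_set`, the listing line could then read `filenames = keep_image_files(tf.io.gfile.listdir(set_path))`.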

Load the dataset from disk.  The DATASET_PATH folder contains two subfolders, 'background' and 'present'.  Inside the subfolders are JPEG images.

In [None]:
# load the image dataset
data, labels = load_image_set(DATASET_PATH + "/present", "present", IMAGE_COUNT)
data1, labels1 = load_image_set(DATASET_PATH + "/background", "background", IMAGE_COUNT)
data.extend(data1)
labels.extend(labels1)

# convert the data and labels to NumPy arrays
data = np.array(data, dtype="float32")
labels = np.array(labels)

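# perform one-hot encoding on the labels; with two classes LabelBinarizer
# returns a single 0/1 column, which to_categorical expands to two columns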
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
labels = keras.utils.to_categorical(labels)

# save the label binarizer to a pickle file
with open(LABEL_ENCODER_PATH, "wb") as f:
  pickle.dump(lb, f)

# partition the data into training, validation and testing sets
(trainX_full, testX, trainY_full, testY) = train_test_split(data, labels, test_size=0.20, random_state=42)
(trainX, validationX, trainY, validationY) = train_test_split(trainX_full, trainY_full, test_size=0.20, random_state=42)
print(len(trainY), "images in training set")
print(len(validationY), "images in validation set")
print(len(testY), "images in test set")
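
The two successive 80/20 splits leave roughly 64% of the images for training, 16% for validation, and 20% for testing. The sketch below works through that arithmetic. It assumes that `train_test_split` rounds the held-out portion up to a whole image, and the helper name `split_sizes` is hypothetical.

```python
import math

def split_sizes(total_images, holdout_fraction=0.20):
    """Returns (train, validation, test) sizes for two successive splits."""
    test = math.ceil(total_images * holdout_fraction)
    train_full = total_images - test
    validation = math.ceil(train_full * holdout_fraction)
    train = train_full - validation
    return train, validation, test

# IMAGE_COUNT = 20 loads 20 images per class, 40 in total.
train, validation, test = split_sizes(40)
print(train, "training,", validation, "validation,", test, "test")

# Batches per epoch with BS = 32 (the value set earlier in this notebook).
print(math.ceil(train / 32), "batches per epoch")
```

With the small test setting of `IMAGE_COUNT = 20`, only a couple of dozen images remain for training, so the printed accuracy figures say little about real performance. They are mainly a check that the notebook runs end to end.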

Create the new model based on EfficientNetB0

In [None]:
# load the EfficientNetB0 network, ensuring the head layers are left off
baseModel = keras.applications.EfficientNetB0(weights="imagenet", include_top=False,
	 input_tensor=keras.layers.Input(shape=(224, 224, 3)))

# construct the head of the model that will be placed on top of
# the base model
headModel = baseModel.output
headModel = keras.layers.AveragePooling2D(pool_size=(7, 7))(headModel)
headModel = keras.layers.Flatten(name="flatten")(headModel)
headModel = keras.layers.Dense(128, activation="relu")(headModel)
headModel = keras.layers.Dropout(0.5)(headModel)
headModel = keras.layers.Dense(2, activation="softmax")(headModel)

# place the new head on top of the base model (this will become the actual model we will train)
model = keras.models.Model(inputs=baseModel.input, outputs=headModel)
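
The size of the new head can be checked by hand, which helps when reading the output of `model.summary()`. The sketch below assumes that EfficientNetB0 without its top produces a 7 x 7 x 1280 feature map for a 224 x 224 input; the helper name `dense_params` is hypothetical.

```python
def dense_params(inputs, units):
    """Weights plus biases in a fully connected layer."""
    return inputs * units + units

feature_map = (7, 7, 1280)   # assumed output shape of the base model
pool = (7, 7)                # pool_size used by AveragePooling2D

# 7x7 average pooling over a 7x7 map leaves one value per channel
pooled = (feature_map[0] // pool[0], feature_map[1] // pool[1], feature_map[2])
flattened = pooled[0] * pooled[1] * pooled[2]

hidden = dense_params(flattened, 128)   # Dense(128, activation="relu")
output = dense_params(128, 2)           # Dense(2, activation="softmax")

print("flattened features:", flattened)
print("head parameters:", hidden + output)
```

The pooling, flatten, and dropout layers have no weights, so under this assumption the two dense layers account for all of the parameters trained in phase 1.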
  

Phase 1 training, on the head of the model

In [None]:
# freeze the layers in the base model so they will *not* be updated during the first training process
for layer in baseModel.layers:
	layer.trainable = False  

# compile our model
model.compile(
    loss="binary_crossentropy", 
    optimizer=keras.optimizers.Adam(learning_rate=PHASE1_LR), 
    metrics=["accuracy"])
model.summary()

# train the head of the network
history = model.fit(trainX, trainY,
	batch_size=BS,
	validation_data=(validationX, validationY),
	epochs=PHASE1_EPOCHS)


Evaluate the performance of the model

In [None]:
print("\nEvaluate Phase 1 model performance")

# predict the class of each example in the test set
predY = np.argmax(model.predict(testX), axis=1)

# show a nicely formatted classification report
print(classification_report(testY.argmax(axis=1), predY, target_names=lb.classes_))  

# plot the training history
pandas.DataFrame(history.history).plot() 
plt.grid(True)
plt.title("Phase 1 Training Performance")
plt.xlabel("Epoch")
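
The classification report lists precision, recall, and F1 score for each class. As a reminder of what those columns mean, here is a minimal sketch that computes them for one class from plain Python lists. The function name `precision_recall_f1` and the sample labels are made up for illustration; they are not results from this notebook.

```python
def precision_recall_f1(true_labels, predicted_labels, positive):
    """Computes precision, recall, and F1 for the class named `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Made-up example: 4 images with the species present, 4 without.
truth     = ["present"] * 4 + ["background"] * 4
predicted = ["present", "present", "present", "background",
             "present", "background", "background", "background"]
print(precision_recall_f1(truth, predicted, "present"))
```

For a camera trap detector, low recall on the "present" class means images of the focal species are being missed, while low precision means empty images are being flagged as containing it.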

Phase 2 of training, where we fine-tune the weights in the convolutional layers.

In [None]:
# unfreeze the layers of the base model
for layer in baseModel.layers:
	layer.trainable = True  

# compile our model again
model.compile(
    loss="binary_crossentropy", 
    optimizer=keras.optimizers.Adam(learning_rate=PHASE2_LR), 
    metrics=["accuracy"])   

# train the entire network
history = model.fit(trainX, trainY,
	batch_size=BS,
	validation_data=(validationX, validationY),
	epochs=PHASE2_EPOCHS)

Evaluate the performance of the fine-tuned model

In [None]:
print("\nEvaluate Phase 2 model performance")

# predict the class of each example in the test set
predY = np.argmax(model.predict(testX), axis=1)

# show a nicely formatted classification report
print(classification_report(testY.argmax(axis=1), predY, target_names=lb.classes_))    
 
# plot the training history
pandas.DataFrame(history.history).plot() 
plt.grid(True)
plt.title("Phase 2 Training Performance")
plt.xlabel("Epoch")

Save the model.  A warning will be printed regarding custom masks; you can ignore this warning, as it does not pertain to this script.

In [None]:
model.save(SAVED_MODEL_PATH) 