# Transfer Learning is for the Birds

Patrick Wagstrom <patrick@wagstrom.net>

May 2018

## Project Abstract

One of the most popular applications of deep learning is image classification. A truly great classifier usually requires a network with dozens of layers that may take weeks to train on a distributed cluster of GPUs. That cost makes training a high-accuracy custom classifier a daunting prospect for an individual. However, there is another way. Transfer learning lets you reuse most of the feature detection built into a complex deep neural network without the extensive training time. In this talk I'll share the early stages of collecting data and training a custom classifier in a different domain: identification of common birds. The talk covers the basics of neural networks and transfer learning, and shows how to apply transfer learning to pre-trained image classification networks from Google to build high-precision classifiers with only a small amount of training time.

## Project Background
This is a simple project that I'm using to try to generate models that automatically classify the birds that visit the feeders in my yard.

Useful references:
* https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10
* https://github.com/tzutalin/labelImg
* https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/#3

# Overview

* Four General Problems in AI
* A Tiny Bit about Neural Networks
* Using ImageNet
* A Tiny Bit about Transfer Learning

* High Level General Problem
* Getting Started Quickly - Caltech-UCSD Birds Data Set
* Transfer Learning for CUB Data
* Exploring the CUB Data
  - Potential Problems with the CUB Data
* Capturing Your Own Data
* Tagging the Data with labelImg
* Training Your Own Model
* Potential Challenges
  - Biased Sampling
  - Rare Bird Species

# Four General Problems in AI

* Regression
* Clustering
* Dimensionality Reduction
* Classification

# Two General Data Strategies

* Supervised
* Unsupervised

# Neural Networks

A neural network is a machine learning model that happens to be really good at classifying objects. Fundamentally, it consists of many cells (neurons), each of which works on a small amount of data.
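
As a rough illustration of what one layer of such cells computes, here is a minimal sketch in NumPy. The function name `dense_layer` and all of the numbers are made up for this example; real networks stack many such layers and learn the weights from data.

```python
import numpy as np


def dense_layer(inputs, weights, biases):
    """One layer of cells: each cell takes a weighted sum of the inputs,
    adds its bias, and applies a ReLU (negative results become zero)."""
    return np.maximum(0.0, inputs @ weights + biases)


# Hypothetical toy values: three inputs feeding two cells.
inputs = np.array([1.0, 2.0, 3.0])
weights = np.array([[0.5, -1.0],
                    [0.25, 0.0],
                    [-0.5, 1.0]])  # one column of weights per cell
biases = np.array([0.5, -1.0])

activations = dense_layer(inputs, weights, biases)
print(activations)  # the first cell outputs 0.0, the second outputs 1.0
```

Each column of `weights` belongs to one cell, so adding a cell to the layer means adding a column and one more bias.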

<img src="images/neural network.png">

<img src="images/inception architecture.png">
<!-- image from: https://hackathonprojects.files.wordpress.com/2016/09/74911-image03.png via https://hacktilldawn.com/2016/09/25/inception-modules-explained-and-implemented/ -->

# Using MobileNet

For a lot more detail, check out [An Analysis of Deep Neural Networks for Practical Applications](https://arxiv.org/abs/1605.07678), which contains an analysis of the different network architectures and compares their top-1 accuracy against computational complexity and memory needed.

## MobileNet Performance vs Computational Complexity

<table border="1" cellpadding="1" cellspacing="0" style="width: 100%;"><tbody><tr> <td><div style="background-color: lightblue; text-align: center;"><b>Model Checkpoint</b></div></td> <td><div style="background-color: lightblue; text-align: center;"><b>Million MACs</b></div></td> <td><div style="background-color: lightblue; text-align: center;"><b>Million Parameters</b></div></td> <td><div style="background-color: lightblue; text-align: center;"><b>Top-1 Accuracy</b></div></td> <td><div style="background-color: lightblue; text-align: center;"><b>Top-5 Accuracy</b></div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_1.0_224_2017_06_14.tar.gz">MobileNet_v1_1.0_224</a></div></td> <td><div style="text-align: center;">569</div></td> <td><div style="text-align: center;">4.24</div></td> <td><div style="text-align: center;">70.7</div></td> <td><div style="text-align: center;">89.5</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_1.0_192_2017_06_14.tar.gz">MobileNet_v1_1.0_192</a></div></td> <td><div style="text-align: center;">418</div></td> <td><div style="text-align: center;">4.24</div></td> <td><div style="text-align: center;">69.3</div></td> <td><div style="text-align: center;">88.9</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_1.0_160_2017_06_14.tar.gz">MobileNet_v1_1.0_160</a></div></td> <td><div style="text-align: center;">291</div></td> <td><div style="text-align: center;">4.24</div></td> <td><div style="text-align: center;">67.2</div></td> <td><div style="text-align: center;">87.5</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_1.0_128_2017_06_14.tar.gz">MobileNet_v1_1.0_128</a></div></td> 
<td><div style="text-align: center;">186</div></td> <td><div style="text-align: center;">4.24</div></td> <td><div style="text-align: center;">64.1</div></td> <td><div style="text-align: center;">85.3</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.75_224_2017_06_14.tar.gz">MobileNet_v1_0.75_224</a></div></td> <td><div style="text-align: center;">317</div></td> <td><div style="text-align: center;">2.59</div></td> <td><div style="text-align: center;">68.4</div></td> <td><div style="text-align: center;">88.2</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.75_192_2017_06_14.tar.gz">MobileNet_v1_0.75_192</a></div></td> <td><div style="text-align: center;">233</div></td> <td><div style="text-align: center;">2.59</div></td> <td><div style="text-align: center;">67.4</div></td> <td><div style="text-align: center;">87.3</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.75_160_2017_06_14.tar.gz">MobileNet_v1_0.75_160</a></div></td> <td><div style="text-align: center;">162</div></td> <td><div style="text-align: center;">2.59</div></td> <td><div style="text-align: center;">65.2</div></td> <td><div style="text-align: center;">86.1</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.75_128_2017_06_14.tar.gz">MobileNet_v1_0.75_128</a></div></td> <td><div style="text-align: center;">104</div></td> <td><div style="text-align: center;">2.59</div></td> <td><div style="text-align: center;">61.8</div></td> <td><div style="text-align: center;">83.6</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a 
href="http://download.tensorflow.org/models/mobilenet_v1_0.50_224_2017_06_14.tar.gz">MobileNet_v1_0.50_224</a></div></td> <td><div style="text-align: center;">150</div></td> <td><div style="text-align: center;">1.34</div></td> <td><div style="text-align: center;">64.0</div></td> <td><div style="text-align: center;">85.4</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.50_192_2017_06_14.tar.gz">MobileNet_v1_0.50_192</a></div></td> <td><div style="text-align: center;">110</div></td> <td><div style="text-align: center;">1.34</div></td> <td><div style="text-align: center;">62.1</div></td> <td><div style="text-align: center;">84.0</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.50_160_2017_06_14.tar.gz">MobileNet_v1_0.50_160</a></div></td> <td><div style="text-align: center;">77</div></td> <td><div style="text-align: center;">1.34</div></td> <td><div style="text-align: center;">59.9</div></td> <td><div style="text-align: center;">82.5</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.50_128_2017_06_14.tar.gz">MobileNet_v1_0.50_128</a></div></td> <td><div style="text-align: center;">49</div></td> <td><div style="text-align: center;">1.34</div></td> <td><div style="text-align: center;">56.2</div></td> <td><div style="text-align: center;">79.6</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.25_224_2017_06_14.tar.gz">MobileNet_v1_0.25_224</a></div></td> <td><div style="text-align: center;">41</div></td> <td><div style="text-align: center;">0.47</div></td> <td><div style="text-align: center;">50.6</div></td> <td><div style="text-align: center;">75.0</div></td> </tr><tr> <td><div 
style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.25_192_2017_06_14.tar.gz">MobileNet_v1_0.25_192</a></div></td> <td><div style="text-align: center;">34</div></td> <td><div style="text-align: center;">0.47</div></td> <td><div style="text-align: center;">49.0</div></td> <td><div style="text-align: center;">73.6</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.25_160_2017_06_14.tar.gz">MobileNet_v1_0.25_160</a></div></td> <td><div style="text-align: center;">21</div></td> <td><div style="text-align: center;">0.47</div></td> <td><div style="text-align: center;">46.0</div></td> <td><div style="text-align: center;">70.7</div></td> </tr><tr> <td><div style="background-color: orange; text-align: center;"><a href="http://download.tensorflow.org/models/mobilenet_v1_0.25_128_2017_06_14.tar.gz">MobileNet_v1_0.25_128</a></div></td> <td><div style="text-align: center;">14</div></td> <td><div style="text-align: center;">0.47</div></td> <td><div style="text-align: center;">41.3</div></td> <td><div style="text-align: center;">66.2</div></td></tr></tbody></table>
<div style="text-align: center;">From <a href="https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html">https://ai.googleblog.com/2017/06/mobilenets-open-source-models-for.html</a></div>
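
To make the trade-off in the table easier to read, the sketch below takes the four 224-pixel rows and compares each one against the largest model. It assumes the checkpoint names encode the width multiplier and the input resolution (as in `MobileNet_v1_0.50_224`); the helper `parse_checkpoint` is a hypothetical name introduced only for this example.

```python
# Four rows copied from the table above: (checkpoint, million MACs, top-1 accuracy).
rows = [
    ("MobileNet_v1_1.0_224", 569, 70.7),
    ("MobileNet_v1_0.75_224", 317, 68.4),
    ("MobileNet_v1_0.50_224", 150, 64.0),
    ("MobileNet_v1_0.25_224", 41, 50.6),
]


def parse_checkpoint(name):
    """Split a checkpoint name into (width multiplier, input resolution).

    Assumes the naming pattern MobileNet_v1_<width>_<resolution>.
    """
    _, _, width, resolution = name.split("_")
    return float(width), int(resolution)


# Compare every row against the largest model in the list.
_, baseline_macs, baseline_top1 = rows[0]
for name, macs, top1 in rows:
    width, resolution = parse_checkpoint(name)
    print(f"{name}: width={width}, resolution={resolution}, "
          f"{macs / baseline_macs:.0%} of the MACs, "
          f"{top1 - baseline_top1:+.1f} top-1 points")
```

Reading the output row by row shows how much accuracy each reduction in computation costs, which is the decision you face when picking a checkpoint for a small device.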

# Prepping the environment

In [1]:
from tqdm.notebook import tqdm  # replaces the deprecated tqdm_notebook import
import os
from typing import Mapping, Any

import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image
import numpy as np
import ipywidgets as widgets
from ipywidgets import interact

In [2]:
DATA_DIR = 'CUB/CUB_200_2011'

In [3]:
test_train = pd.read_table(os.path.join(DATA_DIR, "train_test_split.txt"), names=["id", "set"], sep=" ")
image_classes = pd.read_table(os.path.join(DATA_DIR, "image_class_labels.txt"), names=["id", "class_id"], sep=" ")
classes = pd.read_table(os.path.join(DATA_DIR, "classes.txt"), names=["class_id", "class_name"], sep=" ")
images = pd.read_table(os.path.join(DATA_DIR, "images.txt"), names=["id", "filename"], sep=" ")
bounding_boxes = pd.read_table(os.path.join(DATA_DIR, "bounding_boxes.txt"), names=["id", "bb_x", "bb_y", "bb_width", "bb_height"], sep=" ")
images = images.merge(test_train, on="id")
images = images.merge(image_classes, on="id")
images = images.merge(bounding_boxes, on="id")
images = images.merge(classes, on="class_id")
images["filename"] = images["filename"].map(lambda x: os.path.join(DATA_DIR, "images", x))
images.head()

|   | id | filename | set | class_id | bb_x | bb_y | bb_width | bb_height | class_name |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | CUB/CUB_200_2011/images/001.Black_footed_Albat... | 0 | 1 | 60.0 | 27.0 | 325.0 | 304.0 | 001.Black_footed_Albatross |
| 1 | 2 | CUB/CUB_200_2011/images/001.Black_footed_Albat... | 1 | 1 | 139.0 | 30.0 | 153.0 | 264.0 | 001.Black_footed_Albatross |
| 2 | 3 | CUB/CUB_200_2011/images/001.Black_footed_Albat... | 0 | 1 | 14.0 | 112.0 | 388.0 | 186.0 | 001.Black_footed_Albatross |
| 3 | 4 | CUB/CUB_200_2011/images/001.Black_footed_Albat... | 1 | 1 | 112.0 | 90.0 | 255.0 | 242.0 | 001.Black_footed_Albatross |
| 4 | 5 | CUB/CUB_200_2011/images/001.Black_footed_Albat... | 1 | 1 | 70.0 | 50.0 | 134.0 | 303.0 | 001.Black_footed_Albatross |
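
The `set` column comes from `train_test_split.txt`. A minimal sketch of splitting on it follows, assuming that 1 marks a training image and 0 a test image (check the data set's README before relying on this). The `toy_images` frame is a stand-in built from the five rows shown above so that the example runs without the CUB files.

```python
import pandas as pd

# Toy stand-in for the merged `images` frame, using the first five rows above.
toy_images = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "set": [0, 1, 0, 1, 1],
    "class_name": ["001.Black_footed_Albatross"] * 5,
})

# Assumption: set == 1 means "training image", set == 0 means "test image".
train = toy_images[toy_images["set"] == 1]
test = toy_images[toy_images["set"] == 0]

print(len(train), "training rows,", len(test), "test rows")
```

The same two filter lines should work unchanged on the real `images` frame, since it has the same `set` column.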


In [6]:
def show_row(row: Mapping[str, Any]) -> None:
    """Display one image with its bounding box in red and its class name as the title."""
    im = np.array(Image.open(row["filename"]), dtype=np.uint8)

    # Create figure and axes
    fig, ax = plt.subplots(1)

    # Display the image
    ax.imshow(im)

    # Create a Rectangle patch
    rect = patches.Rectangle((row["bb_x"], row["bb_y"]),
                             row["bb_width"], row["bb_height"],
                             linewidth=3, edgecolor='r', facecolor='none')

    # Add the patch to the Axes
    ax.add_patch(rect)
    ax.set_axis_off()

    ax.set_title(row["class_name"])
    plt.show()
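
The same bounding box columns can also be used to crop an image down to just the bird, which may be useful when preparing training images. The sketch below assumes the box is stored as a top-left corner plus width and height, as the column names suggest. The helper `bbox_to_crop_box` and the blank stand-in image are hypothetical, used only so the example runs without the data set.

```python
from PIL import Image


def bbox_to_crop_box(bb_x, bb_y, bb_width, bb_height):
    """Convert (x, y, width, height) into the (left, upper, right, lower)
    tuple that PIL's Image.crop expects."""
    left, upper = int(round(bb_x)), int(round(bb_y))
    right = left + int(round(bb_width))
    lower = upper + int(round(bb_height))
    return (left, upper, right, lower)


# Blank stand-in image; a real call would use Image.open(row["filename"]).
image = Image.new("RGB", (500, 400))

# Bounding box values taken from the first row of the table above.
crop_box = bbox_to_crop_box(60.0, 27.0, 325.0, 304.0)
bird = image.crop(crop_box)

print(crop_box, bird.size)
```

The cropped image has exactly the width and height given by the bounding box columns.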

In [7]:
def slider_plot(i):
    show_row(images.iloc[[i]].squeeze())

interact(slider_plot, i=(0, images.shape[0]-1))


# Using ImageNet

# A Tiny Bit About Transfer Learning
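
The core idea from the abstract is to keep a network's learned feature detection and train only a new classifier on top of it. It can be sketched without any deep learning framework. Everything below is a toy under stated assumptions: a fixed random projection stands in for the pre-trained network, two Gaussian blobs stand in for the images, and the new classifier is a simple logistic regression. It is not the training procedure used for the bird models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained network: a fixed projection that is never updated.
frozen_weights = rng.normal(size=(2, 16))
frozen_snapshot = frozen_weights.copy()


def extract_features(x):
    """Frozen feature extractor; its weights are only read, never changed."""
    return np.tanh(x @ frozen_weights)


# Hypothetical two-class data set: two Gaussian blobs.
x = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Features are computed once because the extractor never changes.
features = extract_features(x)


def predict(w, b):
    """Probability of class 1 from the new classifier head."""
    return 1.0 / (1.0 + np.exp(-(features @ w + b)))


def log_loss(w, b):
    p = predict(w, b)
    return -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))


# The only trainable parameters: the weights and bias of the new head.
head_w = np.zeros(16)
head_b = 0.0
initial_loss = log_loss(head_w, head_b)

for _ in range(500):
    grad = predict(head_w, head_b) - y          # gradient of the loss w.r.t. the logits
    head_w -= 0.1 * features.T @ grad / len(y)  # update the head only
    head_b -= 0.1 * grad.mean()

final_loss = log_loss(head_w, head_b)
print(f"loss before: {initial_loss:.3f}, after: {final_loss:.3f}")
```

Because only `head_w` and `head_b` are updated, training touches 17 numbers instead of the whole network, which is why this approach needs so little training time.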