# Transfer Learning for Classifying Cat and Dog Images Using TensorFlow

## Introduction

Modern object recognition models have millions of parameters and can take weeks to fully train. Transfer learning is a technique that shortcuts much of this work by taking a model fully trained on a set of categories, such as ImageNet, and retraining from the existing weights for new classes.

In this notebook, Google's pretrained Inception v3 model for image classification will be used. Inception v3 was trained for the [ImageNet](http://image-net.org/) Large Visual Recognition Challenge using the data from 2012, and it can differentiate between 1,000 different classes, like Dalmatian or dishwasher.

The chart below shows how the data flows in the Inception v3 model, which is a Convolutional Neural Network with many layers and a complicated structure. The [research paper](http://arxiv.org/pdf/1512.00567v3.pdf) gives more details on how the Inception model is constructed and why it is designed that way.

![](./images/inception_flowchart.png)

The Inception model is quite capable of extracting useful information from an image, so we could train it on another dataset, which for this notebook is the Cat vs Dog dataset. However, fully training the Inception model on a new dataset takes several weeks on a very powerful and expensive computer.

We can instead re-use the pre-trained Inception model and merely replace the layer that does the final classification. This is called Transfer Learning.

Though it's not as good as a full training run, this is surprisingly effective for many applications, and can be run in as little as thirty minutes on a laptop, without requiring a GPU. 

Transfer Learning will be used to retrain the last layers of the Inception model to be able to classify Cat and Dog images.

## Dataset

Reference :<br>
[Animals on the Web](http://tamaraberg.com/papers/berg_animals.pdf) <br>
[Tamara L. Berg](http://tamaraberg.com/), [David A. Forsyth](http://luthuli.cs.uiuc.edu/~daf) <br> 
*Computer Vision and Pattern Recognition (CVPR), 2006* <br>

The preprocessed dataset is included in the GitHub repository. You may want to use other animal images for classification.

## Data Preprocessing

The positive samples saved in the directory may not be optimal for transfer learning on the Inception model. The Inception v3 graph used here decodes JPEG data and takes **299 x 299** input images, whereas our positive samples contain some GIFs and some images in PNG format. Hence the dataset needs preprocessing.

### gifs2jpg.py 

In [1]:
# convert gif to jpg
from PIL import Image
import glob
import os,sys,shutil

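def processImage(infile):
    try:
        im = Image.open(infile)
    except IOError:
        print("Can't load", infile)
        sys.exit(1)
    i = 0
    mypalette = im.getpalette()
    infilename = os.path.splitext(infile)[0]

    try:
        while 1:
            # Re-apply the first frame's palette; skip images that have none.
            if mypalette is not None:
                im.putpalette(mypalette)
            # JPEG has no alpha channel, so paste each frame onto an RGB canvas.
            new_im = Image.new("RGB", im.size)
            new_im.paste(im)
            # Save under the name without the '.gif' extension, one file per frame.
            outfilename = infilename + str(i) + '.jpg'
            new_im.save(outfilename)
            print('Saved : ' + outfilename)
            i += 1
            im.seek(im.tell() + 1)

    except EOFError:
        pass # end of sequence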

if False:   # Don't call this as the preprocessed dataset is already included.
            # Call this if you want to carry out the preprocessing yourself
    # Create the destination folder once (ensure_dir_exists is only defined later).
    if not os.path.exists('Converted_gifs'):
        os.makedirs('Converted_gifs')
    for single_gif in glob.glob('*.gif'):
        print(single_gif)
        processImage(single_gif)
        shutil.move(single_gif,'Converted_gifs/'+single_gif)

### resizeImages.py

In [2]:
# resize image to 299 X 299
import os, sys, glob
from PIL import Image

size = (299, 299)
filetypes =['*.jpg','*.jpeg']

def resize(infile):
    outfile = os.path.splitext(infile)[0] + "_resized.jpg"
    if infile != outfile:
        try:
            im = Image.open(infile)
            # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter.
            im.thumbnail(size, Image.LANCZOS)
            old_im_size = im.size
            
            ## By default, black colour would be used as the background for padding!
            new_im = Image.new("RGB", size)

            new_im.paste(im, (int((size[0]-old_im_size[0])/2),int((size[1]-old_im_size[1])/2)))
            
            # The output directory must exist before saving.
            if not os.path.exists('Resized_JPGs'):
                os.makedirs('Resized_JPGs')
            new_im.save('Resized_JPGs/' + outfile, "JPEG")
        except IOError:
            print ("Cannot resize '%s'" % infile)

if False:   # Don't call this as the preprocessed dataset is already included.
            # Call this if you want to carry out the preprocessing yourself
    for filetype in filetypes:
        for single_jpg in glob.glob(filetype):
            print (single_jpg)
            resize(single_jpg)

    print ("Done")


This script scales each image to fit within 299 x 299 while preserving its aspect ratio, then centres it on a 299 x 299 canvas so that the remaining border is padded with black.

For example it resizes 

<img src="images/original.jpg">

to : 

<img src="images/resized.jpg">
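
The centring step can be illustrated on its own. The sketch below reproduces only the offset arithmetic from `resize`; the helper name `paste_offset` and the thumbnail sizes are hypothetical and chosen for illustration.

```python
SIZE = (299, 299)  # target canvas, as in resizeImages.py

def paste_offset(inner_size, canvas_size=SIZE):
    # Split the leftover space evenly so the image is centred on the canvas.
    left = int((canvas_size[0] - inner_size[0]) / 2)
    top = int((canvas_size[1] - inner_size[1]) / 2)
    return left, top

# Hypothetical thumbnail sizes after scaling to fit within 299 x 299.
print(paste_offset((299, 150)))  # wide image: black bars above and below
print(paste_offset((200, 299)))  # tall image: black bars left and right
```

A wide image gets a zero horizontal offset and a positive vertical one, which is why the padding appears as horizontal bars.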

Now that we have our datasets ready, we can move on to coding our image classifier.

# Retraining

The following scripts demonstrate how to take an Inception v3 architecture model trained on
ImageNet images, and train a new top layer that can recognize other classes of
images.

The top layer receives as input a 2048-dimensional vector for each image. We
train a softmax layer on top of this representation. Assuming the softmax layer
contains N labels, this corresponds to learning N + 2048*N model parameters
corresponding to the learned biases and weights.
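
As a quick check of that count, the sketch below works out the number of trainable parameters for a given number of labels. The helper `softmax_param_count` is hypothetical and is not part of the notebook.

```python
# Hypothetical helper (not part of the notebook): counts the trainable
# parameters of a softmax layer placed on top of the bottleneck vector.
BOTTLENECK_TENSOR_SIZE = 2048  # size of the bottleneck vector, as stated above

def softmax_param_count(num_labels, bottleneck_size=BOTTLENECK_TENSOR_SIZE):
    weights = bottleneck_size * num_labels  # one weight per (feature, label) pair
    biases = num_labels                     # one bias per label
    return weights + biases

print(softmax_param_count(2))     # two classes: 4098
print(softmax_param_count(1000))  # 1,000 classes: 2049000
```

With two classes only a few thousand parameters are trained, which is a large part of why retraining the top layer is so much cheaper than a full training run.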

We have a folder called `Animal_Data` (the value of `image_dir` in the parameters below) containing class-named
subfolders, 'Leopard' and 'Giraffe', each full of images of that animal.
The subfolder names are important, since the label applied to each image is taken from
the name of the subfolder it's in, but the filenames themselves don't matter.
Retraining produces a new model file that can be loaded and run by any TensorFlow
program, for example the label_image sample code.
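
To make the folder-to-label rule concrete, here is a minimal sketch that applies the same normalisation `create_image_lists` uses later in the notebook. The helper name `label_from_dir_name` and the folder `Snow_Leopard` are hypothetical examples.

```python
import re

def label_from_dir_name(dir_name):
    # Same normalisation as create_image_lists: lowercase, then collapse every
    # run of characters other than a-z and 0-9 into a single space.
    return re.sub(r'[^a-z0-9]+', ' ', dir_name.lower())

for folder in ['Leopard', 'Giraffe', 'Snow_Leopard']:
    print(folder, '->', label_from_dir_name(folder))
```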

## Imports

In [3]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import hashlib
import os.path
import random
import re
import struct
import sys
import tarfile

import numpy as np
from six.moves import urllib
import tensorflow as tf

from tensorflow.python.framework import graph_util
from tensorflow.python.framework import tensor_shape
from tensorflow.python.platform import gfile
from tensorflow.python.util import compat

The following are all parameters that are tied to the particular model architecture
we're using for Inception v3. These include things like tensor names and their
sizes. If you want to adapt this script to work with another model, you will
need to update these to reflect the values in the network you're using.

In [4]:
import os
image_dir = os.path.join("Animal_Data")
output_graph = "output_graph.pb"
output_labels = "output_labels.txt"
summaries_dir = "/Users/user_adax/github/google_io_extended_KL/temp/"
how_many_training_steps = 4000
learning_rate = 0.01
testing_percentage = 10
validation_percentage = 10
eval_step_interval = 10
train_batch_size = 100
test_batch_size = -1
validation_batch_size = 100
print_misclassified_test_images = False
model_dir = os.path.join('inception')
bottleneck_dir = "bottlenecks"
final_tensor_name = "final_result"
flip_left_right = False
random_crop = 0
random_scale = 0
random_brightness = 0
DATA_URL = 'http://download.tensorflow.org/models/image/imagenet/inception-2015-12-05.tgz'

# pylint: enable=line-too-long
BOTTLENECK_TENSOR_NAME = 'pool_3/_reshape:0'
BOTTLENECK_TENSOR_SIZE = 2048
MODEL_INPUT_WIDTH = 299
MODEL_INPUT_HEIGHT = 299
MODEL_INPUT_DEPTH = 3
JPEG_DATA_TENSOR_NAME = 'DecodeJpeg/contents:0'
RESIZED_INPUT_TENSOR_NAME = 'ResizeBilinear:0'
MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1  # ~134M

In [5]:
def create_image_lists(image_dir, testing_percentage, validation_percentage):
  if not gfile.Exists(image_dir):
    print("Image directory '" + image_dir + "' not found.")
    return None
  result = {}
  sub_dirs = [x[0] for x in gfile.Walk(image_dir)]
  # The root directory comes first, so skip it.
  is_root_dir = True
  for sub_dir in sub_dirs:
    if is_root_dir:
      is_root_dir = False
      continue
    extensions = ['jpg', 'jpeg', 'JPG', 'JPEG']
    file_list = []
    dir_name = os.path.basename(sub_dir)
    if dir_name == image_dir:
      continue
    print("Looking for images in '" + dir_name + "'")
    for extension in extensions:
      file_glob = os.path.join(image_dir, dir_name, '*.' + extension)
      file_list.extend(gfile.Glob(file_glob))
    if not file_list:
      print('No files found')
      continue
    if len(file_list) < 20:
      print('WARNING: Folder has less than 20 images, which may cause issues.')
    elif len(file_list) > MAX_NUM_IMAGES_PER_CLASS:
      print('WARNING: Folder {} has more than {} images. Some images will '
            'never be selected.'.format(dir_name, MAX_NUM_IMAGES_PER_CLASS))
    label_name = re.sub(r'[^a-z0-9]+', ' ', dir_name.lower())
    training_images = []
    testing_images = []
    validation_images = []
    for file_name in file_list:
      base_name = os.path.basename(file_name)
      # We want to ignore anything after '_nohash_' in the file name when
      # deciding which set to put an image in, the data set creator has a way of
      # grouping photos that are close variations of each other. For example
      # this is used in the plant disease data set to group multiple pictures of
      # the same leaf.
      hash_name = re.sub(r'_nohash_.*$', '', file_name)
      # This looks a bit magical, but we need to decide whether this file should
      # go into the training, testing, or validation sets, and we want to keep
      # existing files in the same set even if more files are subsequently
      # added.
      # To do that, we need a stable way of deciding based on just the file name
      # itself, so we do a hash of that and then use that to generate a
      # probability value that we use to assign it.
      hash_name_hashed = hashlib.sha1(compat.as_bytes(hash_name)).hexdigest()
      percentage_hash = ((int(hash_name_hashed, 16) %
                          (MAX_NUM_IMAGES_PER_CLASS + 1)) *
                         (100.0 / MAX_NUM_IMAGES_PER_CLASS))
      if percentage_hash < validation_percentage:
        validation_images.append(base_name)
      elif percentage_hash < (testing_percentage + validation_percentage):
        testing_images.append(base_name)
      else:
        training_images.append(base_name)
    result[label_name] = {
        'dir': dir_name,
        'training': training_images,
        'testing': testing_images,
        'validation': validation_images,
    }
  return result
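
The hash-based split used above can be tried in isolation. The sketch below copies the arithmetic from `create_image_lists` into a hypothetical helper, `assign_split`, and uses `str.encode` in place of `compat.as_bytes`; the file names are made up for illustration.

```python
import hashlib
import re

MAX_NUM_IMAGES_PER_CLASS = 2 ** 27 - 1  # same constant as in the notebook

def assign_split(file_name, testing_percentage=10, validation_percentage=10):
    # Drop anything after '_nohash_' so close variations share one split.
    hash_name = re.sub(r'_nohash_.*$', '', file_name)
    hashed = hashlib.sha1(hash_name.encode('utf-8')).hexdigest()
    percentage_hash = ((int(hashed, 16) % (MAX_NUM_IMAGES_PER_CLASS + 1)) *
                       (100.0 / MAX_NUM_IMAGES_PER_CLASS))
    if percentage_hash < validation_percentage:
        return 'validation'
    elif percentage_hash < (testing_percentage + validation_percentage):
        return 'testing'
    return 'training'

# Hypothetical file names; the result depends only on the name.
print(assign_split('leaf_01_nohash_a.jpg') == assign_split('leaf_01_nohash_b.jpg'))
print(assign_split('cat_001.jpg') == assign_split('cat_001.jpg'))
```

Because the assignment depends only on the file name, adding new images later does not move existing images between the training, testing, and validation sets.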

In [6]:
# function returns a path to an image for a label at the given index
def get_image_path(image_lists, label_name, index, image_dir, category):
  if label_name not in image_lists:
    tf.logging.fatal('Label does not exist %s.', label_name)
  label_lists = image_lists[label_name]
  if category not in label_lists:
    tf.logging.fatal('Category does not exist %s.', category)
  category_list = label_lists[category]
  if not category_list:
    tf.logging.fatal('Label %s has no images in the category %s.',
                     label_name, category)
  mod_index = index % len(category_list)
  base_name = category_list[mod_index]
  sub_dir = label_lists['dir']
  full_path = os.path.join(image_dir, sub_dir, base_name)
  return full_path
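
The modulo in `get_image_path` is what lets callers pass an arbitrarily large random index. A minimal sketch of that idea, with a hypothetical helper `pick_by_index` and made-up file names:

```python
def pick_by_index(category_list, index):
    # Any non-negative index is valid: it wraps around the list length.
    return category_list[index % len(category_list)]

files = ['a.jpg', 'b.jpg', 'c.jpg']  # hypothetical file names
print(pick_by_index(files, 1))          # b.jpg
print(pick_by_index(files, 3))          # wraps around to a.jpg
print(pick_by_index(files, 2**27 - 1))  # still a valid choice
```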


In [7]:
# returns a path to a bottleneck file for a label at the given index
def get_bottleneck_path(image_lists, label_name, index, bottleneck_dir,
                        category):
  
  return get_image_path(image_lists, label_name, index, bottleneck_dir,
                        category) + '.txt'


In [8]:
# create a graph from saved GraphDef file and returns a Graph object
def create_inception_graph():
  with tf.Session() as sess:
    model_filename = os.path.join(
        model_dir, 'classify_image_graph_def.pb')
    with gfile.FastGFile(model_filename, 'rb') as f:
      graph_def = tf.GraphDef()
      graph_def.ParseFromString(f.read())
      bottleneck_tensor, jpeg_data_tensor, resized_input_tensor = (
          tf.import_graph_def(graph_def, name='', return_elements=[
              BOTTLENECK_TENSOR_NAME, JPEG_DATA_TENSOR_NAME,
              RESIZED_INPUT_TENSOR_NAME]))
  return sess.graph, bottleneck_tensor, jpeg_data_tensor, resized_input_tensor


In [9]:
# inference on an image to extract the 'bottleneck' summary layer.
def run_bottleneck_on_image(sess, image_data, image_data_tensor,
                            bottleneck_tensor):
  bottleneck_values = sess.run(
      bottleneck_tensor,
      {image_data_tensor: image_data})
  bottleneck_values = np.squeeze(bottleneck_values)
  return bottleneck_values


In [10]:
# downloads and extracts model tar file
def maybe_download_and_extract():
  dest_directory = model_dir
  if not os.path.exists(dest_directory):
    os.makedirs(dest_directory)
  filename = DATA_URL.split('/')[-1]
  filepath = os.path.join(dest_directory, filename)
  if not os.path.exists(filepath):

    def _progress(count, block_size, total_size):
      sys.stdout.write('\r>> Downloading %s %.1f%%' %
                       (filename,
                        float(count * block_size) / float(total_size) * 100.0))
      sys.stdout.flush()

    filepath, _ = urllib.request.urlretrieve(DATA_URL,
                                             filepath,
                                             _progress)
    print()
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  tarfile.open(filepath, 'r:gz').extractall(dest_directory)


In [11]:
# make sure the folder exists on disk
def ensure_dir_exists(dir_name):
  if not os.path.exists(dir_name):
    os.makedirs(dir_name)


In [12]:
# write a given list of floats to a binary file
def write_list_of_floats_to_file(list_of_floats , file_path):
  s = struct.pack('d' * BOTTLENECK_TENSOR_SIZE, *list_of_floats)
  with open(file_path, 'wb') as f:
    f.write(s)


In [13]:
# read list of floats from a given file
def read_list_of_floats_from_file(file_path):

  with open(file_path, 'rb') as f:
    s = struct.unpack('d' * BOTTLENECK_TENSOR_SIZE, f.read())
    return list(s)
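
A small round trip shows how the two helpers above fit together. This sketch uses a 4-value vector instead of the 2048-value bottleneck, and the names `VECTOR_SIZE`, `write_floats`, and `read_floats` are stand-ins for illustration.

```python
import os
import struct
import tempfile

VECTOR_SIZE = 4  # stand-in for BOTTLENECK_TENSOR_SIZE (2048 in the notebook)

def write_floats(values, file_path):
    # 'd' packs each value as an 8-byte double.
    with open(file_path, 'wb') as f:
        f.write(struct.pack('d' * VECTOR_SIZE, *values))

def read_floats(file_path):
    with open(file_path, 'rb') as f:
        return list(struct.unpack('d' * VECTOR_SIZE, f.read()))

values = [0.0, 0.5, -1.25, 3.0]
path = os.path.join(tempfile.mkdtemp(), 'example.bin')
write_floats(values, path)
restored = read_floats(path)
print(restored == values, os.path.getsize(path))
```

The format string fixes the number of values, so `struct.unpack` raises an error if the file holds a different number of bytes than expected.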


In [14]:
bottleneck_path_2_bottleneck_values = {}

def create_bottleneck_file(bottleneck_path, image_lists, label_name, index,
                           image_dir, category, sess, jpeg_data_tensor, bottleneck_tensor):
  print('Creating bottleneck at ' + bottleneck_path)
  image_path = get_image_path(image_lists, label_name, index, image_dir, category)
  if not gfile.Exists(image_path):
    tf.logging.fatal('File does not exist %s', image_path)
  image_data = gfile.FastGFile(image_path, 'rb').read()
  bottleneck_values = run_bottleneck_on_image(sess, image_data, jpeg_data_tensor, bottleneck_tensor)
  bottleneck_string = ','.join(str(x) for x in bottleneck_values)
  with open(bottleneck_path, 'w') as bottleneck_file:
    bottleneck_file.write(bottleneck_string)


In [15]:
# retrieves or calculates bottleneck values for an image.
def get_or_create_bottleneck(sess, image_lists, label_name, index, image_dir,
                             category, bottleneck_dir, jpeg_data_tensor,
                             bottleneck_tensor):
  label_lists = image_lists[label_name]
  sub_dir = label_lists['dir']
  sub_dir_path = os.path.join(bottleneck_dir, sub_dir)
  ensure_dir_exists(sub_dir_path)
  bottleneck_path = get_bottleneck_path(image_lists, label_name, index, bottleneck_dir, category)
  if not os.path.exists(bottleneck_path):
    create_bottleneck_file(bottleneck_path, image_lists, label_name, index, image_dir, category, sess, jpeg_data_tensor, bottleneck_tensor)
  with open(bottleneck_path, 'r') as bottleneck_file:
    bottleneck_string = bottleneck_file.read()
  did_hit_error = False
  try:
    bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
  except ValueError:
    print("Invalid float found, recreating bottleneck")
    did_hit_error = True
  if did_hit_error:
    create_bottleneck_file(bottleneck_path, image_lists, label_name, index, image_dir, category, sess, jpeg_data_tensor, bottleneck_tensor)
    with open(bottleneck_path, 'r') as bottleneck_file:
      bottleneck_string = bottleneck_file.read()
    # Allow exceptions to propagate here, since they shouldn't happen after a fresh creation
    bottleneck_values = [float(x) for x in bottleneck_string.split(',')]
  return bottleneck_values
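
The cache files written by `create_bottleneck_file` are plain comma-separated text, and a corrupt file shows up as a failed float conversion. A minimal sketch of that round trip, using a hypothetical helper `parse_bottleneck` and a 3-value vector:

```python
def parse_bottleneck(bottleneck_string):
    # Returns the list of floats, or None when the cached text is corrupt.
    try:
        return [float(x) for x in bottleneck_string.split(',')]
    except ValueError:
        return None

values = [0.25, 1.5, 0.0]
cached = ','.join(str(x) for x in values)  # same encoding as the cache files
print(parse_bottleneck(cached) == values)
print(parse_bottleneck('0.25,1.5,oops'))   # corrupt entry
```

In the notebook, the corrupt case triggers a fresh call to `create_bottleneck_file` rather than returning `None`.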


In [16]:
# ensure all the training, testing, and validation bottlenecks are cached
def cache_bottlenecks(sess, image_lists, image_dir, bottleneck_dir,
                      jpeg_data_tensor, bottleneck_tensor):
  how_many_bottlenecks = 0
  ensure_dir_exists(bottleneck_dir)
  for label_name, label_lists in image_lists.items():
    for category in ['training', 'testing', 'validation']:
      category_list = label_lists[category]
      for index, unused_base_name in enumerate(category_list):
        get_or_create_bottleneck(sess, image_lists, label_name, index,
                                 image_dir, category, bottleneck_dir,
                                 jpeg_data_tensor, bottleneck_tensor)

        how_many_bottlenecks += 1
        if how_many_bottlenecks % 100 == 0:
          print(str(how_many_bottlenecks) + ' bottleneck files created.')


In [17]:
# retrieve bottleneck values for cached images
def get_random_cached_bottlenecks(sess, image_lists, how_many, category,
                                  bottleneck_dir, image_dir, jpeg_data_tensor,
                                  bottleneck_tensor):
  class_count = len(image_lists.keys())
  bottlenecks = []
  ground_truths = []
  filenames = []
  if how_many >= 0:
    # Retrieve a random sample of bottlenecks.
    for unused_i in range(how_many):
      label_index = random.randrange(class_count)
      label_name = list(image_lists.keys())[label_index]
      image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
      image_name = get_image_path(image_lists, label_name, image_index,
                                  image_dir, category)
      bottleneck = get_or_create_bottleneck(sess, image_lists, label_name,
                                            image_index, image_dir, category,
                                            bottleneck_dir, jpeg_data_tensor,
                                            bottleneck_tensor)
      ground_truth = np.zeros(class_count, dtype=np.float32)
      ground_truth[label_index] = 1.0
      bottlenecks.append(bottleneck)
      ground_truths.append(ground_truth)
      filenames.append(image_name)
  else:
    # Retrieve all bottlenecks.
    for label_index, label_name in enumerate(image_lists.keys()):
      for image_index, image_name in enumerate(
          image_lists[label_name][category]):
        image_name = get_image_path(image_lists, label_name, image_index,
                                    image_dir, category)
        bottleneck = get_or_create_bottleneck(sess, image_lists, label_name,
                                              image_index, image_dir, category,
                                              bottleneck_dir, jpeg_data_tensor,
                                              bottleneck_tensor)
        ground_truth = np.zeros(class_count, dtype=np.float32)
        ground_truth[label_index] = 1.0
        bottlenecks.append(bottleneck)
        ground_truths.append(ground_truth)
        filenames.append(image_name)
  return bottlenecks, ground_truths, filenames


The following function retrieves bottleneck values for training images, after distortions.
If we're training with distortions like crops, scales, or flips, we have to recalculate the full model for every image, and so we can't use cached bottleneck values. Instead we find random images for the requested category, run them through the distortion graph, and then the full graph to get the bottleneck results for each.


In [18]:
def get_random_distorted_bottlenecks(
    sess, image_lists, how_many, category, image_dir, input_jpeg_tensor,
    distorted_image, resized_input_tensor, bottleneck_tensor):

  class_count = len(image_lists.keys())
  bottlenecks = []
  ground_truths = []
  for unused_i in range(how_many):
    label_index = random.randrange(class_count)
    label_name = list(image_lists.keys())[label_index]
    image_index = random.randrange(MAX_NUM_IMAGES_PER_CLASS + 1)
    image_path = get_image_path(image_lists, label_name, image_index, image_dir,
                                category)
    if not gfile.Exists(image_path):
      tf.logging.fatal('File does not exist %s', image_path)
    jpeg_data = gfile.FastGFile(image_path, 'rb').read()
    # Note that we materialize the distorted_image_data as a numpy array before
    # running inference on the image. This involves 2 memory copies and
    # might be optimized in other implementations.
    distorted_image_data = sess.run(distorted_image,
                                    {input_jpeg_tensor: jpeg_data})
    bottleneck = run_bottleneck_on_image(sess, distorted_image_data,
                                         resized_input_tensor,
                                         bottleneck_tensor)
    ground_truth = np.zeros(class_count, dtype=np.float32)
    ground_truth[label_index] = 1.0
    bottlenecks.append(bottleneck)
    ground_truths.append(ground_truth)
  return bottlenecks, ground_truths



In [19]:
# return whether any distortions are enabled
def should_distort_images(flip_left_right, random_crop, random_scale,
                          random_brightness):
  return (flip_left_right or (random_crop != 0) or (random_scale != 0) or
          (random_brightness != 0))



The following function creates the operations to apply the specified distortions.
  During training it can help to improve the results if we run the images
  through simple distortions like crops, scales, and flips. These reflect the
  kind of variations we expect in the real world, and so can help train the
  model to cope with natural data more effectively. Here we take the supplied
  parameters and construct a network of operations to apply them to an image.
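
The size arithmetic behind the crop and scale distortions can be sketched without TensorFlow. The helper `precrop_size` below is hypothetical; it uses Python's `random` module in place of `tf.random_uniform` and truncates to integers as the cast to `int32` does.

```python
import random

MODEL_INPUT_WIDTH = 299
MODEL_INPUT_HEIGHT = 299

def precrop_size(random_crop, random_scale, rng=random):
    # random_crop and random_scale are percentages, as in the notebook.
    margin_scale = 1.0 + (random_crop / 100.0)
    resize_scale = 1.0 + (random_scale / 100.0)
    # Draw the per-image scale uniformly between 1.0 and resize_scale.
    scale_value = margin_scale * rng.uniform(1.0, resize_scale)
    return (int(scale_value * MODEL_INPUT_HEIGHT),
            int(scale_value * MODEL_INPUT_WIDTH))

print(precrop_size(0, 0))    # no distortion: (299, 299)
print(precrop_size(10, 20))  # between roughly 1.1x and 1.32x the input size
```

The image is first enlarged to this size, and a random 299 x 299 window is then cropped out of it, so larger percentages leave more room for the crop to move.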

In [20]:
def add_input_distortions(flip_left_right, random_crop, random_scale,
                          random_brightness):
  
  jpeg_data = tf.placeholder(tf.string, name='DistortJPGInput')
  decoded_image = tf.image.decode_jpeg(jpeg_data, channels=MODEL_INPUT_DEPTH)
  decoded_image_as_float = tf.cast(decoded_image, dtype=tf.float32)
  decoded_image_4d = tf.expand_dims(decoded_image_as_float, 0)
  margin_scale = 1.0 + (random_crop / 100.0)
  resize_scale = 1.0 + (random_scale / 100.0)
  margin_scale_value = tf.constant(margin_scale)
  resize_scale_value = tf.random_uniform(tensor_shape.scalar(),
                                         minval=1.0,
                                         maxval=resize_scale)
  scale_value = tf.multiply(margin_scale_value, resize_scale_value)
  precrop_width = tf.multiply(scale_value, MODEL_INPUT_WIDTH)
  precrop_height = tf.multiply(scale_value, MODEL_INPUT_HEIGHT)
  precrop_shape = tf.stack([precrop_height, precrop_width])
  precrop_shape_as_int = tf.cast(precrop_shape, dtype=tf.int32)
  precropped_image = tf.image.resize_bilinear(decoded_image_4d,
                                              precrop_shape_as_int)
  precropped_image_3d = tf.squeeze(precropped_image, squeeze_dims=[0])
  cropped_image = tf.random_crop(precropped_image_3d,
                                 [MODEL_INPUT_HEIGHT, MODEL_INPUT_WIDTH,
                                  MODEL_INPUT_DEPTH])
  if flip_left_right:
    flipped_image = tf.image.random_flip_left_right(cropped_image)
  else:
    flipped_image = cropped_image
  brightness_min = 1.0 - (random_brightness / 100.0)
  brightness_max = 1.0 + (random_brightness / 100.0)
  brightness_value = tf.random_uniform(tensor_shape.scalar(),
                                       minval=brightness_min,
                                       maxval=brightness_max)
  brightened_image = tf.multiply(flipped_image, brightness_value)
  distort_result = tf.expand_dims(brightened_image, 0, name='DistortResult')
  return jpeg_data, distort_result



The following function attaches a number of summaries to a tensor (for TensorBoard visualization).

In [21]:
def variable_summaries(var):
  
  with tf.name_scope('summaries'):
    mean = tf.reduce_mean(var)
    tf.summary.scalar('mean', mean)
    with tf.name_scope('stddev'):
      stddev = tf.sqrt(tf.reduce_mean(tf.square(var - mean)))
    tf.summary.scalar('stddev', stddev)
    tf.summary.scalar('max', tf.reduce_max(var))
    tf.summary.scalar('min', tf.reduce_min(var))
    tf.summary.histogram('histogram', var)



The following function adds a new softmax and fully-connected layer for training.
  We need to retrain the top layer to identify our new classes, so this function
  adds the right operations to the graph, along with some variables to hold the
  weights, and then sets up all the gradients for the backward pass.
  The set up for the softmax and fully-connected layers is based on:
  https://tensorflow.org/versions/master/tutorials/mnist/beginners/index.html
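
To show what the new layer computes, here is a pure-Python sketch of the forward pass and loss: a fully-connected layer, a softmax, and the cross-entropy against a one-hot label. It uses toy sizes and hypothetical helper names, and it leaves out the gradient step that TensorFlow handles.

```python
import math

def softmax(logits):
    # Subtract the max for numerical stability; the result is unchanged.
    shifted = [z - max(logits) for z in logits]
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, biases):
    # logits[j] = sum_i x[i] * weights[i][j] + biases[j]
    return [sum(x[i] * weights[i][j] for i in range(len(x))) + biases[j]
            for j in range(len(biases))]

def cross_entropy(probs, one_hot):
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

# Toy sizes: a 3-value "bottleneck" and 2 classes (the notebook uses 2048).
x = [0.5, 1.0, -0.5]
weights = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]  # zeros for a predictable result
biases = [0.0, 0.0]

probs = softmax(dense(x, weights, biases))
print(probs)                             # equal logits give [0.5, 0.5]
print(cross_entropy(probs, [1.0, 0.0]))  # -log(0.5), about 0.693
```

With near-zero initial weights, as in the notebook's `stddev=0.001` initialisation, training starts close to this uniform prediction.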

In [22]:
def add_final_training_ops(class_count, final_tensor_name, bottleneck_tensor):
  
  with tf.name_scope('input'):
    bottleneck_input = tf.placeholder_with_default(
        bottleneck_tensor, shape=[None, BOTTLENECK_TENSOR_SIZE],
        name='BottleneckInputPlaceholder')

    ground_truth_input = tf.placeholder(tf.float32,
                                        [None, class_count],
                                        name='GroundTruthInput')

  # Organizing the following ops as `final_training_ops` so they're easier
  # to see in TensorBoard
  layer_name = 'final_training_ops'
  with tf.name_scope(layer_name):
    with tf.name_scope('weights'):
      layer_weights = tf.Variable(tf.truncated_normal([BOTTLENECK_TENSOR_SIZE, class_count], stddev=0.001), name='final_weights')
      variable_summaries(layer_weights)
    with tf.name_scope('biases'):
      layer_biases = tf.Variable(tf.zeros([class_count]), name='final_biases')
      variable_summaries(layer_biases)
    with tf.name_scope('Wx_plus_b'):
      logits = tf.matmul(bottleneck_input, layer_weights) + layer_biases
      tf.summary.histogram('pre_activations', logits)

  final_tensor = tf.nn.softmax(logits, name=final_tensor_name)
  tf.summary.histogram('activations', final_tensor)

  with tf.name_scope('cross_entropy'):
    cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
        labels=ground_truth_input, logits=logits)
    with tf.name_scope('total'):
      cross_entropy_mean = tf.reduce_mean(cross_entropy)
  tf.summary.scalar('cross_entropy', cross_entropy_mean)

  with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(
        cross_entropy_mean)

  return (train_step, cross_entropy_mean, bottleneck_input, ground_truth_input,
          final_tensor)



The following function inserts the operations we need to evaluate the accuracy of our results.
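
The accuracy metric itself is simple: the fraction of images whose most probable class matches the label. A pure-Python sketch with made-up predictions, using the hypothetical helpers `argmax` and `accuracy`:

```python
def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(predicted_probs, one_hot_labels):
    # A prediction is correct when the most probable class matches the label.
    correct = [argmax(p) == argmax(t)
               for p, t in zip(predicted_probs, one_hot_labels)]
    return sum(correct) / float(len(correct))

probs = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]
labels = [[1, 0], [0, 1], [0, 1], [0, 1]]
print(accuracy(probs, labels))  # 3 of 4 correct: 0.75
```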

In [23]:
def add_evaluation_step(result_tensor, ground_truth_tensor):

  with tf.name_scope('accuracy'):
    with tf.name_scope('correct_prediction'):
      prediction = tf.argmax(result_tensor, 1)
      correct_prediction = tf.equal(
          prediction, tf.argmax(ground_truth_tensor, 1))
    with tf.name_scope('accuracy'):
      evaluation_step = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
  tf.summary.scalar('accuracy', evaluation_step)
  return evaluation_step, prediction



Main script:

In [24]:
# Setup the directory we'll write summaries to for TensorBoard
if tf.gfile.Exists(summaries_dir):
    tf.gfile.DeleteRecursively(summaries_dir)
tf.gfile.MakeDirs(summaries_dir)

# Set up the pre-trained graph.
maybe_download_and_extract()
graph, bottleneck_tensor, jpeg_data_tensor, resized_image_tensor = (
      create_inception_graph())

# Look at the folder structure, and create lists of all the images.
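image_lists = create_image_lists(image_dir, testing_percentage,
                                   validation_percentage)
# create_image_lists returns None when image_dir is missing.
if image_lists is None:
    raise ValueError("Image directory '" + image_dir + "' not found.")
class_count = len(image_lists.keys())
# Stop here rather than failing later with a less helpful error.
if class_count == 0:
    raise ValueError('No valid folders of images found at ' + image_dir)
if class_count == 1:
    raise ValueError('Only one valid folder of images found at ' + image_dir +
                     ' - multiple classes are needed for classification.')
# See if the parameters set above mean we're applying any distortions.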
do_distort_images = should_distort_images(
      flip_left_right, random_crop, random_scale,
      random_brightness)
sess = tf.Session()

if do_distort_images:
    # We will be applying distortions, so setup the operations we'll need.
    distorted_jpeg_data_tensor, distorted_image_tensor = add_input_distortions(
        flip_left_right, random_crop, random_scale,
        random_brightness)
else:
    # We'll make sure we've calculated the 'bottleneck' image summaries and
    # cached them on disk.
    cache_bottlenecks(sess, image_lists, image_dir, bottleneck_dir,
                      jpeg_data_tensor, bottleneck_tensor)

# Add the new layer that we'll be training.
(train_step, cross_entropy, bottleneck_input, ground_truth_input,
   final_tensor) = add_final_training_ops(len(image_lists.keys()),
                                          final_tensor_name,
                                          bottleneck_tensor)

# Create the operations we need to evaluate the accuracy of our new layer.
evaluation_step, prediction = add_evaluation_step(
      final_tensor, ground_truth_input)

# Merge all the summaries and write them out to summaries_dir
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(summaries_dir + '/train',sess.graph)
validation_writer = tf.summary.FileWriter(summaries_dir + '/validation')

# Set up all our weights to their initial default values.
init = tf.global_variables_initializer()
sess.run(init)

# Run the training for as many cycles as set in how_many_training_steps.
for i in range(how_many_training_steps):
    # Get a batch of input bottleneck values, either calculated fresh every time
    # with distortions applied, or from the cache stored on disk.
    if do_distort_images:
        train_bottlenecks, train_ground_truth = get_random_distorted_bottlenecks(
            sess, image_lists, train_batch_size, 'training',
            image_dir, distorted_jpeg_data_tensor,
            distorted_image_tensor, resized_image_tensor, bottleneck_tensor)
    else:
        train_bottlenecks, train_ground_truth, _ = get_random_cached_bottlenecks(
            sess, image_lists, train_batch_size, 'training',
            bottleneck_dir, image_dir, jpeg_data_tensor,
            bottleneck_tensor)
    # Feed the bottlenecks and ground truth into the graph, and run a training
    # step. Capture training summaries for TensorBoard with the `merged` op.
    train_summary, _ = sess.run([merged, train_step],
                                feed_dict={bottleneck_input: train_bottlenecks,
                                           ground_truth_input: train_ground_truth})
    train_writer.add_summary(train_summary, i)

    # Every so often, print out how well the graph is training.
    is_last_step = (i + 1 == how_many_training_steps)
    if (i % eval_step_interval) == 0 or is_last_step:
        train_accuracy, cross_entropy_value = sess.run(
            [evaluation_step, cross_entropy],
            feed_dict={bottleneck_input: train_bottlenecks,
                       ground_truth_input: train_ground_truth})
        print('%s: Step %d: Train accuracy = %.1f%%' % (datetime.now(), i,
                                                      train_accuracy * 100))
        print('%s: Step %d: Cross entropy = %f' % (datetime.now(), i,
                                                 cross_entropy_value))
        validation_bottlenecks, validation_ground_truth, _ = (
            get_random_cached_bottlenecks(
                sess, image_lists, validation_batch_size, 'validation',
                bottleneck_dir, image_dir, jpeg_data_tensor,
                bottleneck_tensor))
        # Run a validation step and capture validation summaries for
        # TensorBoard with the `merged` op.
        validation_summary, validation_accuracy = sess.run(
            [merged, evaluation_step],
            feed_dict={bottleneck_input: validation_bottlenecks,
                       ground_truth_input: validation_ground_truth})
        validation_writer.add_summary(validation_summary, i)
        print('%s: Step %d: Validation accuracy = %.1f%% (N=%d)' %
              (datetime.now(), i, validation_accuracy * 100,
               len(validation_bottlenecks)))

# We've completed all our training, so run a final test evaluation on
# some new images we haven't used before.
test_bottlenecks, test_ground_truth, test_filenames = (
    get_random_cached_bottlenecks(sess, image_lists, test_batch_size,
                                  'testing', bottleneck_dir,
                                  image_dir, jpeg_data_tensor,
                                  bottleneck_tensor))
test_accuracy, predictions = sess.run(
    [evaluation_step, prediction],
    feed_dict={bottleneck_input: test_bottlenecks,
               ground_truth_input: test_ground_truth})
print('Final test accuracy = %.1f%% (N=%d)' % (
    test_accuracy * 100, len(test_bottlenecks)))

if print_misclassified_test_images:
    print('=== MISCLASSIFIED TEST IMAGES ===')
    for i, test_filename in enumerate(test_filenames):
        if predictions[i] != test_ground_truth[i].argmax():
            # Wrap keys() in list(): dict views cannot be indexed in Python 3.
            print('%70s  %s' % (test_filename,
                                list(image_lists.keys())[predictions[i]]))

# Write out the trained graph and labels with the weights stored as constants.
output_graph_def = graph_util.convert_variables_to_constants(
    sess, graph.as_graph_def(), [final_tensor_name])
with gfile.FastGFile(output_graph, 'wb') as f:
    f.write(output_graph_def.SerializeToString())
with gfile.FastGFile(output_labels, 'w') as f:
    f.write('\n'.join(image_lists.keys()) + '\n')

Looking for images in 'Cat'
Looking for images in 'Dog'
100 bottleneck files created.
200 bottleneck files created.
300 bottleneck files created.
400 bottleneck files created.
500 bottleneck files created.
600 bottleneck files created.
700 bottleneck files created.
800 bottleneck files created.
900 bottleneck files created.
1000 bottleneck files created.
1100 bottleneck files created.
1200 bottleneck files created.
1300 bottleneck files created.
1400 bottleneck files created.
1500 bottleneck files created.
1600 bottleneck files created.
1700 bottleneck files created.
1800 bottleneck files created.
1900 bottleneck files created.
2000 bottleneck files created.
2017-07-21 22:22:03.757749: Step 0: Train accuracy = 81.0%
2017-07-21 22:22:03.758340: Step 0: Cross entropy = 0.614495
2017-07-21 22:22:03.893697: Step 0: Validation accuracy = 70.0% (N=100)
2017-07-21 22:22:05.036778: Step 10: Train accuracy = 99.0%
2017-07-21 22:22:05.036898: Step 10: Cross entropy = 0.307937
...

2017-07-21 22:22:48.192773: Step 390: Train accuracy = 99.0%
2017-07-21 22:22:48.192917: Step 390: Cross entropy = 0.045716
2017-07-21 22:22:48.280000: Step 390: Validation accuracy = 100.0% (N=100)
2017-07-21 22:22:49.204175: Step 400: Train accuracy = 99.0%
2017-07-21 22:22:49.204358: Step 400: Cross entropy = 0.030957
2017-07-21 22:22:49.294798: Step 400: Validation accuracy = 100.0% (N=100)
2017-07-21 22:22:50.219040: Step 410: Train accuracy = 100.0%
2017-07-21 22:22:50.219182: Step 410: Cross entropy = 0.021518
2017-07-21 22:22:50.308461: Step 410: Validation accuracy = 99.0% (N=100)
2017-07-21 22:22:51.222844: Step 420: Train accuracy = 99.0%
2017-07-21 22:22:51.222977: Step 420: Cross entropy = 0.058526
2017-07-21 22:22:51.311375: Step 420: Validation accuracy = 100.0% (N=100)
2017-07-21 22:22:52.246439: Step 430: Train accuracy = 100.0%
2017-07-21 22:22:52.246571: Step 430: Cross entropy = 0.023944
2017-07-21 22:22:52.341502: Step 430: Validation accuracy = 100.0% (N=100)
...

2017-07-21 22:23:35.990301: Step 810: Train accuracy = 100.0%
2017-07-21 22:23:35.990452: Step 810: Cross entropy = 0.011945
2017-07-21 22:23:36.086049: Step 810: Validation accuracy = 98.0% (N=100)
2017-07-21 22:23:37.005374: Step 820: Train accuracy = 100.0%
2017-07-21 22:23:37.005499: Step 820: Cross entropy = 0.022721
2017-07-21 22:23:37.096909: Step 820: Validation accuracy = 100.0% (N=100)
2017-07-21 22:23:38.117932: Step 830: Train accuracy = 100.0%
2017-07-21 22:23:38.118159: Step 830: Cross entropy = 0.015184
2017-07-21 22:23:38.238124: Step 830: Validation accuracy = 100.0% (N=100)
2017-07-21 22:23:39.295819: Step 840: Train accuracy = 100.0%
2017-07-21 22:23:39.295944: Step 840: Cross entropy = 0.019347
2017-07-21 22:23:39.389060: Step 840: Validation accuracy = 99.0% (N=100)
2017-07-21 22:23:40.429541: Step 850: Train accuracy = 100.0%
2017-07-21 22:23:40.429702: Step 850: Cross entropy = 0.009476
2017-07-21 22:23:40.519605: Step 850: Validation accuracy = 100.0% (N=100)
...

2017-07-21 22:24:25.262591: Step 1220: Train accuracy = 100.0%
2017-07-21 22:24:25.262733: Step 1220: Cross entropy = 0.010188
2017-07-21 22:24:25.360566: Step 1220: Validation accuracy = 99.0% (N=100)
2017-07-21 22:24:26.330837: Step 1230: Train accuracy = 100.0%
2017-07-21 22:24:26.330967: Step 1230: Cross entropy = 0.010207
2017-07-21 22:24:26.426554: Step 1230: Validation accuracy = 99.0% (N=100)
2017-07-21 22:24:27.369870: Step 1240: Train accuracy = 100.0%
2017-07-21 22:24:27.370309: Step 1240: Cross entropy = 0.010098
2017-07-21 22:24:27.455991: Step 1240: Validation accuracy = 100.0% (N=100)
2017-07-21 22:24:28.407331: Step 1250: Train accuracy = 100.0%
2017-07-21 22:24:28.407463: Step 1250: Cross entropy = 0.013183
2017-07-21 22:24:28.495572: Step 1250: Validation accuracy = 98.0% (N=100)
2017-07-21 22:24:29.438158: Step 1260: Train accuracy = 100.0%
2017-07-21 22:24:29.438292: Step 1260: Cross entropy = 0.007632
2017-07-21 22:24:29.531714: Step 1260: Validation accuracy = 100

2017-07-21 22:25:15.398783: Step 1630: Train accuracy = 100.0%
2017-07-21 22:25:15.398921: Step 1630: Cross entropy = 0.015500
2017-07-21 22:25:15.492202: Step 1630: Validation accuracy = 100.0% (N=100)
2017-07-21 22:25:16.544478: Step 1640: Train accuracy = 100.0%
2017-07-21 22:25:16.544714: Step 1640: Cross entropy = 0.010914
2017-07-21 22:25:16.704829: Step 1640: Validation accuracy = 99.0% (N=100)
2017-07-21 22:25:17.855720: Step 1650: Train accuracy = 100.0%
2017-07-21 22:25:17.855980: Step 1650: Cross entropy = 0.015837
2017-07-21 22:25:17.957512: Step 1650: Validation accuracy = 100.0% (N=100)
2017-07-21 22:25:19.255513: Step 1660: Train accuracy = 100.0%
2017-07-21 22:25:19.255686: Step 1660: Cross entropy = 0.010123
2017-07-21 22:25:19.421023: Step 1660: Validation accuracy = 99.0% (N=100)
2017-07-21 22:25:20.594215: Step 1670: Train accuracy = 100.0%
2017-07-21 22:25:20.594529: Step 1670: Cross entropy = 0.008306
2017-07-21 22:25:20.724208: Step 1670: Validation accuracy = 99

2017-07-21 22:26:13.489135: Step 2040: Train accuracy = 100.0%
2017-07-21 22:26:13.489440: Step 2040: Cross entropy = 0.009009
2017-07-21 22:26:13.578605: Step 2040: Validation accuracy = 98.0% (N=100)
2017-07-21 22:26:14.508622: Step 2050: Train accuracy = 100.0%
2017-07-21 22:26:14.508970: Step 2050: Cross entropy = 0.014645
2017-07-21 22:26:14.598680: Step 2050: Validation accuracy = 99.0% (N=100)
2017-07-21 22:26:15.522261: Step 2060: Train accuracy = 100.0%
2017-07-21 22:26:15.522394: Step 2060: Cross entropy = 0.005870
2017-07-21 22:26:15.613931: Step 2060: Validation accuracy = 98.0% (N=100)
2017-07-21 22:26:16.541485: Step 2070: Train accuracy = 100.0%
2017-07-21 22:26:16.541661: Step 2070: Cross entropy = 0.009745
2017-07-21 22:26:16.630144: Step 2070: Validation accuracy = 100.0% (N=100)
2017-07-21 22:26:17.558956: Step 2080: Train accuracy = 100.0%
2017-07-21 22:26:17.559196: Step 2080: Cross entropy = 0.009463
2017-07-21 22:26:17.649615: Step 2080: Validation accuracy = 98.

2017-07-21 22:26:58.675926: Step 2450: Train accuracy = 100.0%
2017-07-21 22:26:58.676060: Step 2450: Cross entropy = 0.006234
2017-07-21 22:26:58.768637: Step 2450: Validation accuracy = 96.0% (N=100)
2017-07-21 22:26:59.691672: Step 2460: Train accuracy = 100.0%
2017-07-21 22:26:59.691808: Step 2460: Cross entropy = 0.014942
2017-07-21 22:26:59.782295: Step 2460: Validation accuracy = 100.0% (N=100)
2017-07-21 22:27:00.708276: Step 2470: Train accuracy = 99.0%
2017-07-21 22:27:00.708402: Step 2470: Cross entropy = 0.032425
2017-07-21 22:27:00.797502: Step 2470: Validation accuracy = 99.0% (N=100)
2017-07-21 22:27:01.739669: Step 2480: Train accuracy = 99.0%
2017-07-21 22:27:01.740012: Step 2480: Cross entropy = 0.028361
2017-07-21 22:27:01.829183: Step 2480: Validation accuracy = 98.0% (N=100)
2017-07-21 22:27:02.768230: Step 2490: Train accuracy = 100.0%
2017-07-21 22:27:02.768428: Step 2490: Cross entropy = 0.005557
2017-07-21 22:27:02.857314: Step 2490: Validation accuracy = 98.0%

2017-07-21 22:27:47.558626: Step 2860: Train accuracy = 100.0%
2017-07-21 22:27:47.558763: Step 2860: Cross entropy = 0.003662
2017-07-21 22:27:47.656386: Step 2860: Validation accuracy = 100.0% (N=100)
2017-07-21 22:27:48.637520: Step 2870: Train accuracy = 100.0%
2017-07-21 22:27:48.637652: Step 2870: Cross entropy = 0.005893
2017-07-21 22:27:48.733690: Step 2870: Validation accuracy = 100.0% (N=100)
2017-07-21 22:27:49.695489: Step 2880: Train accuracy = 100.0%
2017-07-21 22:27:49.695625: Step 2880: Cross entropy = 0.005108
2017-07-21 22:27:49.785438: Step 2880: Validation accuracy = 99.0% (N=100)
2017-07-21 22:27:50.742956: Step 2890: Train accuracy = 99.0%
2017-07-21 22:27:50.743106: Step 2890: Cross entropy = 0.023075
2017-07-21 22:27:50.835212: Step 2890: Validation accuracy = 100.0% (N=100)
2017-07-21 22:27:51.754817: Step 2900: Train accuracy = 99.0%
2017-07-21 22:27:51.755182: Step 2900: Cross entropy = 0.032624
2017-07-21 22:27:51.854984: Step 2900: Validation accuracy = 96.

2017-07-21 22:28:35.831986: Step 3270: Train accuracy = 100.0%
2017-07-21 22:28:35.832126: Step 3270: Cross entropy = 0.005454
2017-07-21 22:28:35.928781: Step 3270: Validation accuracy = 100.0% (N=100)
2017-07-21 22:28:36.969314: Step 3280: Train accuracy = 100.0%
2017-07-21 22:28:36.969453: Step 3280: Cross entropy = 0.005256
2017-07-21 22:28:37.063631: Step 3280: Validation accuracy = 98.0% (N=100)
2017-07-21 22:28:38.045905: Step 3290: Train accuracy = 100.0%
2017-07-21 22:28:38.046041: Step 3290: Cross entropy = 0.004550
2017-07-21 22:28:38.137861: Step 3290: Validation accuracy = 100.0% (N=100)
2017-07-21 22:28:39.099005: Step 3300: Train accuracy = 100.0%
2017-07-21 22:28:39.099129: Step 3300: Cross entropy = 0.007138
2017-07-21 22:28:39.189565: Step 3300: Validation accuracy = 100.0% (N=100)
2017-07-21 22:28:40.257833: Step 3310: Train accuracy = 100.0%
2017-07-21 22:28:40.258273: Step 3310: Cross entropy = 0.004056
2017-07-21 22:28:40.361394: Step 3310: Validation accuracy = 9

2017-07-21 22:29:29.372036: Step 3680: Train accuracy = 100.0%
2017-07-21 22:29:29.372386: Step 3680: Cross entropy = 0.006766
2017-07-21 22:29:29.467553: Step 3680: Validation accuracy = 99.0% (N=100)
2017-07-21 22:29:30.470433: Step 3690: Train accuracy = 100.0%
2017-07-21 22:29:30.470569: Step 3690: Cross entropy = 0.002454
2017-07-21 22:29:30.562975: Step 3690: Validation accuracy = 100.0% (N=100)
2017-07-21 22:29:31.528686: Step 3700: Train accuracy = 100.0%
2017-07-21 22:29:31.529025: Step 3700: Cross entropy = 0.006468
2017-07-21 22:29:31.621074: Step 3700: Validation accuracy = 99.0% (N=100)
2017-07-21 22:29:32.568297: Step 3710: Train accuracy = 100.0%
2017-07-21 22:29:32.568431: Step 3710: Cross entropy = 0.006676
2017-07-21 22:29:32.663187: Step 3710: Validation accuracy = 100.0% (N=100)
2017-07-21 22:29:33.605027: Step 3720: Train accuracy = 100.0%
2017-07-21 22:29:33.605157: Step 3720: Cross entropy = 0.004632
2017-07-21 22:29:33.699426: Step 3720: Validation accuracy = 10

In [25]:
# visualize with TensorBoard
!tensorboard --logdir=temp

Starting TensorBoard b'47' at http://0.0.0.0:6006
(Press CTRL+C to quit)
^C


## Summary

The final test accuracy is almost 99% for our two classes, Cat and Dog, which is impressive given that our training set contained only approximately 1,000 images for each class.
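To make the accuracy figure concrete, here is a minimal NumPy sketch of how accuracy and the list of misclassified files can be derived from predicted class indices and one-hot ground truth, mirroring the `argmax` comparison used in the misclassified-images loop of the training cell. The function name `accuracy_and_errors`, the file names and the tiny arrays are hypothetical and exist only for illustration; they are not results from this notebook.

```python
import numpy as np


def accuracy_and_errors(predictions, ground_truth, filenames):
    """Return the fraction of correct predictions and the misclassified files."""
    # Ground truth is one-hot encoded, so argmax recovers the class index.
    true_labels = np.argmax(ground_truth, axis=1)
    correct = np.asarray(predictions) == true_labels
    misclassified = [name for name, ok in zip(filenames, correct) if not ok]
    return correct.mean(), misclassified


# Hypothetical toy data: class 0 = Cat, class 1 = Dog.
ground_truth = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])
predictions = [0, 1, 0, 0]
filenames = ['cat_1.jpg', 'dog_1.jpg', 'dog_2.jpg', 'cat_2.jpg']

accuracy, misclassified = accuracy_and_errors(predictions, ground_truth, filenames)
print('Accuracy = %.1f%% (N=%d)' % (accuracy * 100, len(filenames)))
print('Misclassified:', misclassified)
```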

This is where Transfer Learning really shines. All we did was reuse the trained Inception model, which had already learned basic features such as lines and shapes, along with features that become increasingly abstract towards the final layers of the model.

We retrained only the last layers on training images of Cats and Dogs. Using its pre-learned features, the model could "learn" the features specific to these two classes, which resulted in a classification accuracy of about 99%.
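The idea of training only a final classifier on top of frozen features can be sketched without TensorFlow. The example below is a simplified illustration, not the notebook's implementation: it trains a single softmax layer with plain gradient descent on synthetic 8-dimensional vectors that stand in for the bottleneck values. The function names, the learning rate, the step count and the synthetic data are all assumptions made for this sketch.

```python
import numpy as np


def train_softmax_head(features, labels, num_classes, learning_rate=0.1, steps=200):
    """Train one softmax layer on fixed feature vectors with gradient descent."""
    num_examples, feature_dim = features.shape
    weights = np.zeros((feature_dim, num_classes))
    biases = np.zeros(num_classes)
    one_hot = np.eye(num_classes)[labels]
    for _ in range(steps):
        logits = features @ weights + biases
        logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient of the mean cross-entropy with respect to the logits.
        grad_logits = (probs - one_hot) / num_examples
        weights -= learning_rate * (features.T @ grad_logits)
        biases -= learning_rate * grad_logits.sum(axis=0)
    return weights, biases


def predict(features, weights, biases):
    """Return the index of the highest-scoring class for each feature vector."""
    return np.argmax(features @ weights + biases, axis=1)


# Synthetic stand-ins for bottleneck vectors (assumption: two separable clusters).
rng = np.random.default_rng(0)
cat_features = rng.normal(loc=-1.0, scale=1.0, size=(100, 8))
dog_features = rng.normal(loc=1.0, scale=1.0, size=(100, 8))
features = np.vstack([cat_features, dog_features])
labels = np.array([0] * 100 + [1] * 100)

weights, biases = train_softmax_head(features, labels, num_classes=2)
accuracy = np.mean(predict(features, weights, biases) == labels)
print('Train accuracy = %.1f%%' % (accuracy * 100))
```

Because the feature extractor is never updated, each training step only touches a small weight matrix, which is why this kind of retraining is fast compared with training the full network.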

## Exercise : Try your own ideas

The whole training procedure here is driven by the way the image directories are structured, so building your own example should not be difficult.
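As a rough illustration of what "driven by the directory structure" means, the sketch below maps each sub-directory name to the image files it contains, so that the folder name acts as the class label. The helper `find_class_images` is hypothetical and much simpler than the notebook's own image-list code (for example, it does not split images into training, validation and testing sets); the temporary directory tree is created only so that the example runs on its own.

```python
import os
import tempfile


def find_class_images(image_dir, extensions=('.jpg', '.jpeg')):
    """Map each sub-directory name (the class label) to its sorted image files."""
    image_lists = {}
    for class_name in sorted(os.listdir(image_dir)):
        class_dir = os.path.join(image_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        print("Looking for images in '%s'" % class_name)
        image_lists[class_name] = sorted(
            name for name in os.listdir(class_dir)
            if name.lower().endswith(extensions))
    return image_lists


# Build a throwaway directory tree with empty placeholder files.
with tempfile.TemporaryDirectory() as image_dir:
    for class_name, count in (('Cat', 3), ('Dog', 2)):
        os.makedirs(os.path.join(image_dir, class_name))
        for index in range(count):
            path = os.path.join(image_dir, class_name, '%d.jpg' % index)
            open(path, 'wb').close()
    image_lists = find_class_images(image_dir)

class_counts = {name: len(files) for name, files in image_lists.items()}
print(class_counts)
```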

Suppose you wanted to classify Cheetah and Tiger images:
* Create one directory each for Cheetah and Tiger and fill it with training images of that class
* The images will be automatically resized so that their smallest dimension is 299, and then a square 'crop' area is taken from their centers (since Inception v3 takes 299x299 input images)
* Test images should be put in the `test_images` directory
* Finally, re-run everything, checking that the training images are read in correctly, that there are no errors along the way, and that the class predictions on the test set come out as expected.
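The resize-then-crop step described in the list above comes down to a little arithmetic. The sketch below computes the resized dimensions and the centered square crop box for a given image size; it assumes Python 3 division and rounding, and the function name `resize_and_center_crop_box` is hypothetical. It illustrates the description only and is not necessarily the exact preprocessing performed by the graph.

```python
def resize_and_center_crop_box(width, height, target=299):
    """Return the resized (width, height) and the centered square crop box.

    The image is scaled so that its smallest side equals `target`, then a
    `target` x `target` box is taken from the center. The box is given as
    (left, top, right, bottom) in resized-image pixel coordinates.
    """
    scale = target / min(width, height)
    # max() guards against rounding the short side to just below the target.
    new_width = max(target, round(width * scale))
    new_height = max(target, round(height * scale))
    left = (new_width - target) // 2
    top = (new_height - target) // 2
    return (new_width, new_height), (left, top, left + target, top + target)


landscape = resize_and_center_crop_box(640, 480)
portrait = resize_and_center_crop_box(300, 600)
print(landscape)
print(portrait)
```

Note that a center crop discards the outer parts of the longer side, so subjects near the edge of a wide or tall photo may be partly cut off.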

*The code in this notebook is inspired by the GitHub repository that can be found [here](https://github.com/Aniruddha-Tapas/Transfer-Learning-for-Animal-Classification-in-Tensorflow).*