# Dogs vs Cats
## Transfer Learning with Inception

To improve on the convnet's performance without a significant increase in training time, I now take advantage of the image classification capabilities of Google's [Inception v4](http://arxiv.org/abs/1602.07261), pretrained on ImageNet. Because ImageNet is such a large dataset, this network should be able to detect fairly generic features in our images. Moreover, since ImageNet includes many breeds of cat and dog among its classes, it may even have learnt features specific to identifying cats and dogs.

I start by computing the bottlenecks (i.e. the penultimate-layer activations) of Inception for each of our images, and then train a single layer (i.e. logistic regression) on these features to predict the probability that a given image contains a dog. Once the bottlenecks have been computed and saved to TFRecord files, they can be fed into the network by an input pipeline, just as the images themselves were for the convnet. Where `dataset.py` handled reading and writing the image TFRecords, `bottleneck.py` does the same for the bottleneck TFRecords.
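
Written out, the single layer on top of the bottlenecks is ordinary logistic regression. The symbols below are notation introduced here for illustration and do not come from the code: $x$ is the bottleneck vector of an image, $w$ and $b$ are the layer's weights and bias, and $y \in \{0, 1\}$ is the label, with 1 assumed to mean "dog".

```latex
\hat{p} = \sigma(w^\top x + b) = \frac{1}{1 + e^{-(w^\top x + b)}},
\qquad
\ell(\hat{p}, y) = -\,y \log \hat{p} \;-\; (1 - y) \log(1 - \hat{p})
```

Here $\hat{p}$ is the predicted probability of a dog and $\ell$ is the per-image cross entropy.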

Unlike for some older Inception networks, Google does not provide a ProtoBuf GraphDef file for Inception v4. Instead, it provides Python source files that recreate the network using [TF-Slim](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim), a high-level wrapper for TensorFlow, together with checkpoint files containing the pretrained weights, biases and other variables. I downloaded the following files from the TF-Slim [models page](https://github.com/tensorflow/models/tree/master/slim#pre-trained-models) in order to recreate the Inception v4 network:

* `inception_v4.py`: the main model building file; I modified this slightly for compatibility with Python 3 and the latest TensorFlow (see comments in file for details)
* `inception_utils.py`: helper functions needed by `inception_v4.py`
* `inception_v4.ckpt`: the model checkpoint

Since bottleneck creation is handled by the `bottleneck.py` script, all that remains is to train a logistic regression model on the cached bottleneck values. I could use, say, `scikit-learn` for this, which would give access to well-tested routines for training linear models, as well as to alternative linear classifiers such as SVMs. However, in the name of simplicity and consistency, I instead treat the linear model as a fully connected network with no hidden layer, so that it can be trained using TensorFlow and my `tfutil` helper functions as in the previous notebook.
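
To make the equivalence concrete, here is a minimal sketch in plain Python of logistic regression written as a fully connected layer with one output unit and no hidden layer, trained by stochastic gradient descent. It is not the `tfutil` implementation: the function names, the learning rate, the epoch count and the two-dimensional toy data standing in for bottlenecks are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, features):
    """One fully connected layer with a single output unit, followed by a sigmoid."""
    logit = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(logit)


def train(examples, num_features, learning_rate=0.5, num_epochs=200):
    """Fit the layer by stochastic gradient descent on the cross-entropy loss."""
    weights, bias = [0.0] * num_features, 0.0
    for _ in range(num_epochs):
        for features, label in examples:
            # For sigmoid + cross entropy, d(loss)/d(logit) = prediction - label.
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Toy stand-ins for cached bottlenecks: 2-dimensional and linearly separable.
rng = random.Random(0)
toy_examples = [([rng.gauss(-1.0, 0.3), rng.gauss(-1.0, 0.3)], 0) for _ in range(20)]
toy_examples += [([rng.gauss(1.0, 0.3), rng.gauss(1.0, 0.3)], 1) for _ in range(20)]

weights, bias = train(toy_examples, num_features=2)
accuracy = sum(
    (predict(weights, bias, x) > 0.5) == bool(y) for x, y in toy_examples
) / len(toy_examples)
print(f"toy training accuracy: {accuracy:.2f}")
```

The only trainable parameters are one weight per input feature plus a bias, which is exactly what a fully connected layer with a single output channel holds.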

### Results
The results are indeed much better than with the basic convnet: accuracy is above 99% on the training, validation and test sets.

#### Training Summary
| learning rate | epoch count |
|---------------|-------------|
| 1e-4          | 25          |

#### Scoring
| dataset    | accuracy | loss    |
|------------|----------|---------|
| train      | 99.6%    | 0.031   |
| validation | 99.5%    | 0.033   |
| test       | 99.7%    | 0.033   |
| kaggle     |          | 0.0733  |

In [1]:
import tensorflow as tf

import bottleneck
import dataset
import tfutil as tfu

Inception expects 299x299 pixel images as input:

In [2]:
dataset.image_dim()

(299, 299)

In [3]:
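def lm(images, reg_terms, train=True, share=False):
    """Linear model on the bottlenecks: one fully connected layer, no hidden layer."""
    with tf.variable_scope('lm', reuse=share):
        # `images` holds the cached bottleneck vectors; dropout is applied only while training
        keep_prob = 0.5 if train else 1.0
        h = tf.nn.dropout(images, keep_prob=keep_prob)
        # a single output channel with no ReLU, i.e. one logit per example
        h = tfu.fc_op(
            h, channels_in=bottleneck.FLAGS['BOTTLENECK_SIZE'], channels_out=1,
            reg_terms=reg_terms, alpha=0.01, name='out', relu=False)

    return h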
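
The `lm` function applies dropout directly to the bottleneck inputs, with `keep_prob` set to 0.5 during training and 1.0 otherwise. As a rough illustration of what that does, the sketch below reimplements the documented behaviour of `tf.nn.dropout` in plain Python: each element is kept with probability `keep_prob` and scaled by `1 / keep_prob`, otherwise it is set to zero. The helper name `dropout` and the constant toy vector are made up for this example.

```python
import random


def dropout(values, keep_prob, rng):
    """Zero each value with probability 1 - keep_prob and rescale the survivors."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    # Dividing by keep_prob keeps the expected value of each entry unchanged,
    # so no rescaling is needed when dropout is switched off for evaluation.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
bottleneck_vector = [1.0] * 10000  # hypothetical stand-in for one bottleneck

train_out = dropout(bottleneck_vector, keep_prob=0.5, rng=rng)
eval_out = dropout(bottleneck_vector, keep_prob=1.0, rng=rng)

mean_train = sum(train_out) / len(train_out)
print(f"mean activation while training: {mean_train:.3f}")
```

With `keep_prob=1.0` the input passes through unchanged, which is why the same function can serve for both training and evaluation.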

In [4]:
NAME = 'inception_lm'
lm_reg_terms = {}

args = {
    'name': NAME,
    'inference_op': lm,
    'reg_terms': lm_reg_terms,
    'inputs': bottleneck.inputs
}

training_args = {
    **args,
    'optimizer': tf.train.AdamOptimizer,
}

In [5]:
tfu.run_cleanup(name=NAME)
tfu.run_setup(name=NAME)

In [6]:
tfu.run_training(
    learning_rate=1e-4,
    num_epochs=25,
    **training_args
);

Train Accuracy: 50.6%
Validation Accuracy: 48.6%
Train Loss: 0.729
Validation Loss: 0.751
Cross Entropy: 0.739
Cross Entropy: 0.205
Cross Entropy: 0.118
Cross Entropy: 0.109
Train Accuracy: 98.9%
Validation Accuracy: 99.1%
Train Loss: 0.068
Validation Loss: 0.067
Cross Entropy: 0.043
Cross Entropy: 0.068
Cross Entropy: 0.042
Cross Entropy: 0.062
Train Accuracy: 99.6%
Validation Accuracy: 99.4%
Train Loss: 0.051
Validation Loss: 0.048
Cross Entropy: 0.039
Cross Entropy: 0.037
Cross Entropy: 0.043
Cross Entropy: 0.031
Train Accuracy: 99.2%
Validation Accuracy: 99.3%
Train Loss: 0.038
Validation Loss: 0.042
Cross Entropy: 0.032
Cross Entropy: 0.033
Cross Entropy: 0.069
Cross Entropy: 0.032
Train Accuracy: 99.4%
Validation Accuracy: 99.3%
Train Loss: 0.041
Validation Loss: 0.038
Cross Entropy: 0.027
Cross Entropy: 0.047
Cross Entropy: 0.053
Cross Entropy: 0.032
Done training for 4930 steps.
Train Accuracy: 99.3%
Validation Accuracy: 99.4%
Train Loss: 0.031
Validation Loss: 0.036


In [7]:
tfu.run_eval(**args)

Train Accuracy: 99.5%
Validation Accuracy: 99.3%
Test Accuracy: 99.7%
Train Loss: 0.037
Validation Loss: 0.037
Test Loss: 0.033


In [8]:
tfu.run_prediction(**args, clip=True)
tfu.run_prediction(**args, clip=False)

Wrote predictions to ./data/inception_lm_clipped.csv
Wrote predictions to ./data/inception_lm.csv
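
The reason for writing a clipped set of predictions is that log loss, which I assume is what the Kaggle score measures, penalises confident mistakes very heavily. The sketch below shows the effect on a handful of made-up predictions. The clipping bounds of 0.02 and 0.98 are illustrative assumptions, not the values used by `tfu.run_prediction`, and the function names are likewise hypothetical.

```python
import math


def clip(p, low=0.02, high=0.98):
    """Confine a probability to [low, high]; the bounds are illustrative assumptions."""
    return min(max(p, low), high)


def log_loss(labels, probabilities):
    """Mean binary cross entropy (log loss) over a set of predictions."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


# Made-up example: three confident correct predictions and one confident mistake.
labels = [1, 0, 1, 0]
raw = [0.999, 0.001, 0.999, 0.999]
clipped = [clip(p) for p in raw]

print(f"raw log loss:     {log_loss(labels, raw):.4f}")
print(f"clipped log loss: {log_loss(labels, clipped):.4f}")

# Clipping is a trade-off: when every prediction is right, it makes the loss slightly worse.
all_correct_raw = log_loss(labels[:3], raw[:3])
all_correct_clipped = log_loss(labels[:3], clipped[:3])
```

Clipping caps the cost of each mistake at `-log(low)` in exchange for a small fixed cost on every correct prediction, so whether it helps depends on how often the model is confidently wrong.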
