# Using Convolutional Networks

## Basic setup

In [1]:
%matplotlib inline

Define the path to the data:

In [2]:
path = 'data/dogscats/'

A few basic libraries that we'll need for the initial exercises:

In [3]:
#from __future__ import division,print_function
import os, json, sys
from glob import glob
import numpy as np
np.set_printoptions(precision=4, linewidth=100)
from matplotlib import pyplot as plt

We have created a file most imaginatively called 'utils.py' to store any little convenience functions we'll want to use. We will discuss these as we use them.

In [4]:
%pwd

'/Users/hoangnguyen/Documents/Github/fastai_na'

In [8]:
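# Add the local utils directory to the import path.
# os.path.join discards earlier components when a later one starts with '/',
# so the folder must be given as the relative name 'utils', not '/utils/'.
sys.path.insert(1, os.path.join(os.getcwd(), 'utils'))
from utils import *
# Assumes the class lives in vgg16.py, the module name used in the later import cell.
from vgg16 import Vgg16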

## Use a pretrained VGG model with our Vgg16 class

In [6]:
# Use the largest batch size your GPU can handle; going above 64 is not recommended.
# On an older or cheaper GPU you may run out of memory and will have to decrease this.
batch_size = 64
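
As a rough illustration of why batch size is limited by GPU memory, the sketch below computes the size of the input tensor alone. It assumes 224x224 RGB images stored as 32-bit floats (the usual VGG16 input size, which this notebook does not state), and input_batch_mib() is a hypothetical helper, not part of utils.py. The intermediate activations inside the network also grow with batch size and take considerably more memory than the input.

```python
def input_batch_mib(batch_size, height=224, width=224, channels=3, bytes_per_value=4):
    """Size in MiB of one batch of input images (input tensor only)."""
    total_bytes = batch_size * height * width * channels * bytes_per_value
    return total_bytes / 1024.0 ** 2


full_batch = input_batch_mib(64)   # batch size used in this notebook
half_batch = input_batch_mib(32)   # halving the batch halves the memory
```

Under these assumptions a batch of 64 images takes 36.75 MiB, and the figure scales linearly with the batch size.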

In [7]:
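# Import our class (it is instantiated in the next cell).
# reload() picks up edits to vgg16.py without restarting the kernel.
from importlib import reload  # needed on Python 3, where reload is no longer a builtin
# Use the relative name 'utils': a leading '/' makes os.path.join drop the base directory.
sys.path.insert(1, os.path.join(os.getcwd(), 'utils'))
import vgg16; reload(vgg16)
from vgg16 import Vgg16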

In [None]:
vgg = Vgg16()
# Grab a few images at a time for training and validation.
# NB: They must be in subdirectories named based on their category
batches = vgg.get_batches(path+'train', batch_size=batch_size)
val_batches = vgg.get_batches(path+'valid', batch_size=batch_size*2)
vgg.finetune(batches)
vgg.fit(batches, val_batches, nb_epoch=1)

## Use Vgg16 for basic image recognition

First, create a Vgg16 object:

In [None]:
vgg = Vgg16()

Vgg16 is built on top of Keras (which we will be learning much more about shortly!), a flexible, easy-to-use deep learning library that sits on top of Theano or TensorFlow. Keras reads groups of images and labels in batches, using a fixed directory structure in which the training images for each category must be placed in a separate folder.
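
The directory convention described above can be checked before training. The following is a minimal sketch, assuming one subfolder per category and images stored as .jpg files; count_images_per_class() is a hypothetical helper, not part of utils.py or Keras.

```python
import os
from glob import glob


def count_images_per_class(root, pattern="*.jpg"):
    """Return {class_name: number_of_images} for each subfolder of root."""
    counts = {}
    for name in sorted(os.listdir(root)):
        class_dir = os.path.join(root, name)
        # Only subfolders count as categories; loose files in root are ignored.
        if os.path.isdir(class_dir):
            counts[name] = len(glob(os.path.join(class_dir, pattern)))
    return counts
```

Calling count_images_per_class(path + 'train') should return one entry per category folder. An empty dictionary, or an unexpected key, suggests the data is not laid out the way the batch reader expects.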

Let's grab batches of data from our training folder:

In [None]:
batches = vgg.get_batches(path+'train', batch_size=4)

The batches object is just a regular Python iterator. Each iteration returns both the images and their labels.
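
To illustrate what "a regular Python iterator" means here, the sketch below builds a generator that yields (images, labels) pairs from in-memory arrays. It is not how get_batches() is implemented, since that reads images from disk through Keras; batch_iterator() and the array shapes are assumptions made for illustration.

```python
import numpy as np


def batch_iterator(images, labels, batch_size):
    """Yield (images, labels) batches forever, cycling through the data."""
    n = len(images)
    start = 0
    while True:
        # Wrap around the end of the data so every batch has batch_size items.
        idx = [(start + i) % n for i in range(batch_size)]
        yield images[idx], labels[idx]
        start = (start + batch_size) % n


# Ten fake 8x8 RGB images with one-hot labels for two classes.
fake_images = np.zeros((10, 8, 8, 3), dtype=np.float32)
fake_labels = np.eye(2, dtype=np.float32)[np.arange(10) % 2]

demo_batches = batch_iterator(fake_images, fake_labels, batch_size=4)
demo_imgs, demo_labels = next(demo_batches)  # same call pattern as in the notebook
```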

In [None]:
imgs, labels = next(batches)

As you can see, the labels for each image are an array containing a 1 in the first position if it's a cat, and in the second position if it's a dog. This approach to encoding categorical variables, where an array contains just a single 1 in the position corresponding to the category, is very common in deep learning. It is called one-hot encoding.

The arrays contain two elements, because we have two categories (cat, and dog). If we had three categories (e.g. cats, dogs, and kangaroos), then the arrays would each contain two 0's, and one 1.
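
One-hot encoding is easy to reproduce by hand. The sketch below uses NumPy; one_hot() is a hypothetical helper written for this example, and the class indices are made up.

```python
import numpy as np


def one_hot(class_indices, num_classes):
    """Turn integer class indices into one-hot rows."""
    class_indices = np.asarray(class_indices)
    encoded = np.zeros((class_indices.size, num_classes), dtype=np.float32)
    # Put a single 1 in each row, at the column given by that row's class index.
    encoded[np.arange(class_indices.size), class_indices] = 1.0
    return encoded


# Two categories: index 0 = cat, index 1 = dog.
two_class = one_hot([0, 1, 1], num_classes=2)
# Three categories: each row has two 0s and a single 1.
three_class = one_hot([0, 1, 2], num_classes=3)
```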

In [None]:
plots(imgs, titles=labels)

We can now pass the images to Vgg16's predict() function to get back probabilities, category indexes, and category names for each image's VGG prediction:

In [None]:
vgg.predict(imgs, True)
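
The three kinds of value relate to each other in a simple way: the index is the position of the largest probability, and the name is the entry of the class list at that index. Here is a minimal sketch of that relationship, using made-up probabilities and a hypothetical four-entry class list rather than real VGG outputs:

```python
import numpy as np

# Hypothetical class list and per-image probabilities (each row sums to 1).
demo_classes = ["class_a", "class_b", "class_c", "class_d"]
demo_probs = np.array([
    [0.10, 0.70, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
])

top_idxs = np.argmax(demo_probs, axis=1)                      # most likely class per image
top_probs = demo_probs[np.arange(len(demo_probs)), top_idxs]  # probability of that class
top_names = [demo_classes[i] for i in top_idxs]               # name of that class
```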

The category indexes are based on the ordering of categories used in the VGG model, e.g. here are the first four:

In [None]:
vgg.classes[:4]

## Use our Vgg16 class to finetune a Dogs vs Cats model

To change our model so that it outputs "cat" vs "dog", instead of one of 1,000 very specific categories, we need to use a process called "finetuning". From the outside, finetuning looks identical to normal machine learning training: we provide a training set with data and labels to learn from, and a validation set to test against. The model learns a set of parameters based on the data provided.
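
One common way to finetune is to keep the pretrained layers fixed and train only a new output layer. This is an assumption here, since the notebook has not yet said how vgg.finetune() works internally. The NumPy sketch below shows that idea on toy data: a fixed random matrix stands in for the pretrained network, and every name and number in it is made up for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in for the pretrained network: a fixed ("frozen") feature extractor.
frozen_weights = rng.randn(5, 8)


def extract_features(x):
    """ReLU features from the frozen layer; these weights are never updated."""
    return np.maximum(x.dot(frozen_weights), 0.0)


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


# Toy two-class data with one-hot labels.
inputs = rng.randn(40, 5)
targets = np.eye(2)[(inputs[:, 0] > 0).astype(int)]
features = extract_features(inputs)

# New output layer: the only parameters that training changes.
head_weights = np.zeros((8, 2))


def cross_entropy(w):
    probs = softmax(features.dot(w))
    return -np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1))


loss_before = cross_entropy(head_weights)
for _ in range(500):
    probs = softmax(features.dot(head_weights))
    grad = features.T.dot(probs - targets) / len(inputs)  # gradient w.r.t. the head only
    head_weights -= 0.01 * grad
loss_after = cross_entropy(head_weights)
```

The training loop looks like any other: data and labels go in, and parameters are updated to reduce the loss. The only difference is which parameters are allowed to change.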