The goal of this notebook is to experiment with different CNN architectures for image classification (I will likely use the cats/dogs dataset for ease). I will try both my own designs and architectures inspired by other sources. The hope is that I will:
1. Be able to produce classifiers that achieve >90% accuracy
2. Understand what goes into a good classifier, and how and why its structure makes it so.

In [1]:
%matplotlib inline
from __future__ import division,print_function
import os, json
from glob import glob
import numpy as np
import scipy
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import confusion_matrix
np.set_printoptions(precision=4, linewidth=100)
from matplotlib import pyplot as plt
import utils; reload(utils)
from utils import plots, get_batches, plot_confusion_matrix, get_data

Using cuDNN version 5103 on context None
Preallocating 10867/11439 Mb (0.950000) on cuda
Mapped name None to device cuda: Tesla K40c (0000:81:00.0)
Using Theano backend.


In [2]:
#import relevant keras stuff
from numpy.random import random, permutation
from scipy import misc, ndimage
from scipy.ndimage.interpolation import zoom

import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential
from keras.layers import Input
from keras.layers import Dense, Activation
from keras.layers.core import Flatten, Dropout, Lambda
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD, RMSprop
from keras.preprocessing import image

In [3]:
batch_size = 4

In [4]:
path = "data/dogscats/"
model_path = "data/dogscats/models/"

In [5]:
train_batches = get_batches(path+'train',batch_size=batch_size)
val_batches = get_batches(path+'valid',batch_size=batch_size)

Found 23000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.


In [6]:
train_data = get_data(path+'train')

Found 23000 images belonging to 2 classes.


In [7]:
val_data = get_data(path+'valid')

Found 2000 images belonging to 2 classes.


In [8]:
import bcolz
def save_array(fname, arr): c=bcolz.carray(arr, rootdir=fname, mode='w'); c.flush()
def load_array(fname): return bcolz.open(fname)[:]

In [9]:
save_array('train_data.bc',train_data)
save_array('val_data.bc',val_data)

In [10]:
train_data = load_array('train_data.bc')
val_data = load_array('val_data.bc')

In [11]:
def onehot(x): return np.array(OneHotEncoder().fit_transform(x.reshape(-1,1)).todense())

One-hot encoding converts an array of n-class integer labels (here the class indices 0 through n-1) into a matrix with the same number of rows and n columns. A label k in the original vector is encoded by a 1 in the column for class k, while the rest of that row is filled with zeros.
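
To make the encoding concrete, here is a minimal NumPy-only sketch of the same idea. It assumes the labels are integers starting at 0 (as `train_batches.classes` appears to be in this notebook); `onehot_numpy` is a name I made up for illustration and is not part of scikit-learn or `utils`.

```python
import numpy as np

def onehot_numpy(labels, n_classes=None):
    """One-hot encode integer labels in 0..n_classes-1 (illustrative helper)."""
    labels = np.asarray(labels, dtype=int).ravel()
    if n_classes is None:
        # Assumption: class ids run from 0 up to the largest label seen.
        n_classes = labels.max() + 1
    encoded = np.zeros((labels.size, n_classes))
    # Put a 1 in row i at the column given by labels[i].
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

example = onehot_numpy([0, 1, 1, 0])
print(example)
```

Each row contains exactly one 1, so the row sums are all 1 and `argmax` along axis 1 recovers the original labels.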

In [12]:
val_classes = val_batches.classes
train_classes = train_batches.classes
val_labels = onehot(val_classes)
train_labels = onehot(train_classes)

In [13]:
train_labels.shape

(23000, 2)

In [14]:
train_classes[:4]

array([0, 0, 0, 0], dtype=int32)

In [15]:
train_labels[:4]

array([[ 1.,  0.],
       [ 1.,  0.],
       [ 1.,  0.],
       [ 1.,  0.]])

In [16]:
train_data.shape

(23000, 3, 224, 224)

In [17]:
sgd = keras.optimizers.SGD(lr=0.0001, momentum=0.0, decay=0.0, nesterov=False)

Below is a basic model that I wrote myself, more or less at random. It does only slightly better than chance: the run below reaches about 61% accuracy on the validation set, so it is not a great classifier.

In [18]:
modelOne = Sequential()
#input = 3x224x224 images (channels first)
#first arg is the # of filters, then the kernel size (here 64 filters of size 3x3, passed as two ints rather than a tuple)
modelOne.add(Convolution2D(64,3,3, activation='relu', input_shape=train_data.shape[1:]))
modelOne.add(MaxPooling2D(pool_size=(2,2)))
modelOne.add(Convolution2D(32,3,3, activation='relu'))
modelOne.add(MaxPooling2D(pool_size=(2,2)))
modelOne.add(Convolution2D(32,1,1, activation='relu'))
modelOne.add(MaxPooling2D(pool_size=(2,2)))
modelOne.add(Convolution2D(16,1,1, activation='relu'))
modelOne.add(MaxPooling2D(pool_size=(2,2)))
modelOne.add(Flatten()) #flattens each sample's 3D feature map (channels, height, width) into a 1D vector for the Dense layer
#modelOne.add(Dropout(0.25))
modelOne.add(Dense(2,activation='softmax'))
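
To see what the final Dense layer actually receives, the spatial size can be traced through the stack by hand. The sketch below is plain Python, not Keras; it assumes the layer defaults I believe apply here (unpadded "valid" convolutions with stride 1, and non-overlapping 2x2 pooling that drops partial windows). The helper names `conv_out` and `pool_out` are my own.

```python
def conv_out(size, kernel):
    # Assumed: 'valid' convolution with stride 1 shrinks each side by kernel - 1.
    return size - kernel + 1

def pool_out(size, pool):
    # Assumed: non-overlapping pooling, incomplete windows at the edge are dropped.
    return size // pool

# (layer kind, kernel or pool size) for modelOne, in order.
stack = [('conv', 3), ('pool', 2), ('conv', 3), ('pool', 2),
         ('conv', 1), ('pool', 2), ('conv', 1), ('pool', 2)]

size = 224
trace = [size]
for kind, k in stack:
    size = conv_out(size, k) if kind == 'conv' else pool_out(size, k)
    trace.append(size)

# The last convolution has 16 filters, so Flatten yields 16 * size * size values.
flat_features = 16 * size * size
print(trace)
print(flat_features)
```

Under these assumptions the four pooling steps shrink the 224x224 input to a small grid, and the two 1x1 convolutions leave the spatial size unchanged (they only mix channels).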

In [19]:
modelOne.compile(optimizer=sgd,loss='categorical_crossentropy',metrics=['accuracy'])

In [20]:
modelOne.fit(train_data,train_labels,nb_epoch=2,batch_size=32)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7fc04b8cf3d0>

In [21]:
score = modelOne.evaluate(val_data,val_labels, batch_size=32)



In [22]:
modelOne.save('secondTry.h5')

In [23]:
print(score)

[0.66453420639038085, 0.61399999999999999]
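
Because the model was compiled with `metrics=['accuracy']`, the two numbers should be the validation loss followed by the validation accuracy. For context, a two-class model that always predicts 50/50 has a categorical cross-entropy of ln 2, roughly 0.693, so a loss of 0.665 is only a little better than guessing. The sketch below recomputes both quantities with NumPy on a tiny made-up example; the arrays are hypothetical and not taken from the dataset.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean over samples of -sum_k y_true[k] * log(y_prob[k]).
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))

def accuracy(y_true, y_prob):
    # Fraction of samples whose most probable class matches the label.
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))

# Hypothetical one-hot labels and predicted probabilities (illustration only).
y_true = np.array([[1., 0.], [0., 1.], [1., 0.], [0., 1.]])
toy_probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.3, 0.7]])

# Baseline: always predict 0.5 for each class.
chance_loss = categorical_crossentropy(y_true, np.full_like(y_true, 0.5))
toy_loss = categorical_crossentropy(y_true, toy_probs)
toy_acc = accuracy(y_true, toy_probs)
print(chance_loss, toy_loss, toy_acc)
```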


Next, I'll try to implement a model based on the "let's keep it simple" paper by Hasanpour et al. 
The paper can be found here: https://arxiv.org/ftp/arxiv/papers/1608/1608.06037.pdf

In [24]:
simpleModel = Sequential()

simpleModel.add(Convolution2D(64,3,3, activation='relu', input_shape=train_data.shape[1:]))
simpleModel.add(Convolution2D(128,3,3,activation='relu'))
simpleModel.add(MaxPooling2D(pool_size=(2,2)))
simpleModel.add(MaxPooling2D(pool_size=(2,2)))
simpleModel.add(Convolution2D(128,3,3, activation='relu'))
simpleModel.add(MaxPooling2D(pool_size=(2,2)))
simpleModel.add(Convolution2D(128,3,3, activation='relu'))
simpleModel.add(Convolution2D(128,3,3, activation='relu'))
simpleModel.add(MaxPooling2D(pool_size=(2,2)))
simpleModel.add(Convolution2D(128,3,3, activation='relu'))
#NOTE: The 11th and 12th layers utilize 1x1 convolutional kernels instead of 3x3.
simpleModel.add(Convolution2D(128,1,1, activation='relu'))
simpleModel.add(Convolution2D(128,1,1, activation='relu'))
simpleModel.add(Convolution2D(128,3,3, activation='relu'))
simpleModel.add(Flatten())
simpleModel.add(Dense(2,activation='softmax'))
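
As a rough sanity check on the size of this network, the trainable parameters can be counted by hand. This is a plain-Python sketch under the same assumptions as before (unpadded stride-1 convolutions, non-overlapping 2x2 pooling) plus the usual formula of (kernel_h * kernel_w * in_channels + 1) * filters weights per convolution, with the +1 for the bias. The layer list mirrors the cell above; the helper name `conv_params` is my own.

```python
def conv_params(in_channels, filters, k):
    # k*k*in_channels weights plus one bias, per filter.
    return (k * k * in_channels + 1) * filters

# ('conv', filters, kernel) or ('pool', None, pool_size), in order.
layers = [('conv', 64, 3), ('conv', 128, 3), ('pool', None, 2), ('pool', None, 2),
          ('conv', 128, 3), ('pool', None, 2), ('conv', 128, 3), ('conv', 128, 3),
          ('pool', None, 2), ('conv', 128, 3), ('conv', 128, 1), ('conv', 128, 1),
          ('conv', 128, 3)]

size, channels, total = 224, 3, 0
for kind, filters, k in layers:
    if kind == 'conv':
        total += conv_params(channels, filters, k)
        size = size - k + 1      # assumed 'valid' convolution, stride 1
        channels = filters
    else:
        size = size // k         # assumed non-overlapping pooling

flat_features = channels * size * size
dense_params = (flat_features + 1) * 2   # final Dense(2) layer with bias
total += dense_params
print(size, flat_features, total)
```

If these assumptions hold, almost all of the parameters sit in the 3x3 convolutions with 128 input channels; the two 1x1 layers and the final Dense layer are comparatively cheap.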

In [25]:
simpleModel.compile(optimizer=sgd,loss='categorical_crossentropy',metrics=['accuracy'])

In [26]:
simpleModel.fit(train_data,train_labels,nb_epoch=2,batch_size=32)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x7fc03f3fcbd0>

In [None]:
simpleModel.save('thirdTry.h5')

Below is another network I will attempt (Mek's example)

In [27]:
batch_size = 128
num_classes = 2  # cats and dogs; must match the 2 columns of train_labels
epochs = 12

In [28]:
model = Sequential()
model.add(Convolution2D(32,3,3,
                 activation='relu',
                 input_shape=train_data.shape[1:]))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='softmax'))

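# Pass the loss by name, as in the earlier cells; the string form does not depend on the keras.losses module being available
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.Adadelta(),
              metrics=['accuracy'])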

NameError: name 'Conv2D' is not defined

In [None]:
model.fit(train_data,train_labels,nb_epoch=2,batch_size=32)