# Session 09: Introduction to Neural Networks

In this notebook we introduce the basics of neural networks and how to 
apply them to the cat/dog classification task.

## Setup

We need to load the modules within each notebook. Here, we load the
same set as in the previous session.

In [1]:
%pylab inline

import numpy as np
import scipy as sp
import pandas as pd
import sklearn
from sklearn import linear_model
import urllib

import os
from os.path import join

Populating the interactive namespace from numpy and matplotlib


In [2]:
import matplotlib.pyplot as plt
import matplotlib.patches as patches

plt.rcParams["figure.figsize"] = (8,8)

## Cats and dogs

Read in the cats and dogs dataset once again:

In [3]:
df = pd.read_csv(join("..", "data", "catdog.csv"))
df

Unnamed: 0,filename,animal
0,cat.0.jpg,cat
1,cat.1.jpg,cat
2,cat.10.jpg,cat
3,cat.100.jpg,cat
4,cat.101.jpg,cat
5,cat.102.jpg,cat
6,cat.103.jpg,cat
7,cat.104.jpg,cat
8,cat.105.jpg,cat
9,cat.106.jpg,cat


## Neural networks

We will mostly describe the basics of neural networks on the whiteboard. In
short, neural networks function by chaining together relatively simple models
in sequence. For images, these consist of a number of **convolutional** layers
followed by **dense** layers. Convolutions function like the texture features
and dense layers function like linear regression.
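
To make the two layer types concrete, here is a minimal NumPy sketch of one convolution followed by one dense layer. The toy image, the edge-detecting kernel, and the dense weights are all made up for illustration; a real network learns these values from data.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Slide a small kernel over a 2-D image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            # Weighted sum of the patch under the kernel
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * kernel)
    return out

# Toy 5x5 grayscale "image": dark on the left, bright on the right.
img = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Hypothetical 3x3 kernel that responds to dark-to-bright changes.
edge_kernel = np.array([[-1, 0, 1]] * 3, dtype=float)

# Convolution followed by a ReLU activation (negative responses set to 0).
features = np.maximum(conv2d_valid(img, edge_kernel), 0)

# A dense layer is a linear model on the flattened features.
dense_weights = np.ones((features.size, 2))  # made-up weights, 2 outputs
dense_bias = np.zeros(2)
scores = features.ravel() @ dense_weights + dense_bias
```

The convolution responds only where the brightness changes, which is why it behaves like a learned texture feature; the dense layer then combines those responses with a weighted sum, just as linear regression would.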

## keras

In order to build neural networks in Python, we are going to use the keras library.
Let's read in several functions that will be useful.

In [4]:
from keras.models import Sequential
from keras.layers import Dense, Activation, Conv2D, MaxPooling2D, Flatten
from keras.preprocessing import image
from keras.utils import to_categorical
from keras.optimizers import SGD, RMSprop

Using TensorFlow backend.


To start, we need to read in all of the images and store the entire corpus.
To make neural networks work, all of the images need to be the same size.
For simplicity, we will start by resizing every image to 32 by 32 pixels.

In [5]:
img_list = []

for i in range(len(df)):
    img_path = join("..", "images", "catdog", df.filename[i])
    img = image.load_img(img_path, target_size=(32, 32))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    img_list.append(x)
    
X = np.vstack(img_list) / 255
y = np.int32(df.animal.values == "dog")
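
To see what shape the loop above produces, here is a small sketch that uses random arrays as stand-ins for decoded images (the pixel values and the three labels are made up; nothing is read from disk).

```python
import numpy as np

# Stand-ins for three decoded 32x32 RGB images (random pixels, not real data).
rng = np.random.default_rng(0)
fake_images = [
    rng.integers(0, 256, size=(32, 32, 3)).astype("float32") for _ in range(3)
]

# Add a leading "sample" axis to each image, then stack along that axis
# and rescale the pixel intensities from [0, 255] to [0, 1].
batch = np.vstack([np.expand_dims(im, axis=0) for im in fake_images]) / 255

# Labels: 1 for "dog", 0 for "cat", built the same way as in the cell above.
animals = np.array(["cat", "dog", "dog"])
labels = np.int32(animals == "dog")

print(batch.shape)  # (samples, height, width, channels)
```

The result is a four-dimensional array with one row per image, which is the layout the convolutional layer expects.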

Let's build a model with a single convolutional layer.

In [None]:
model = Sequential()
model.add(Conv2D(32, input_shape=X.shape[1:], kernel_size=(3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=2))

model.add(Flatten())
model.add(Dense(units=2, activation="softmax"))

The final dense layer matches the number of categories (2: dogs and cats)
in the output. Let's look at the model:

In [None]:
model.summary()
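
The total reported by the summary can be checked by hand from the layer shapes. The sketch below assumes the keras defaults for the convolution (no padding, stride of 1) and uses variable names chosen only for this illustration.

```python
# Shapes taken from the model above: 32x32 RGB input, 32 filters of size 3x3.
in_h, in_w, in_channels = 32, 32, 3
filters, k = 32, 3

# Each filter has k*k*in_channels weights plus one bias term.
conv_params = k * k * in_channels * filters + filters

# Without padding, a 3x3 kernel trims one pixel from each border: 32 -> 30.
conv_h, conv_w = in_h - k + 1, in_w - k + 1

# 2x2 max pooling halves each spatial dimension: 30 -> 15. It has no parameters.
pool_h, pool_w = conv_h // 2, conv_w // 2
flat_units = pool_h * pool_w * filters

# The dense layer has one weight per (input unit, output unit) pair plus 2 biases.
dense_params = flat_units * 2 + 2

total_params = conv_params + dense_params
print(conv_params, dense_params, total_params)
```

Most of the parameters sit in the dense layer, because every one of the flattened units is connected to both outputs.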

It has an impressive number of parameters, over 15 thousand of them!
Before we use data to learn these parameters, we need to *compile* the
model. This attaches the loss function, optimizer, and metrics, and builds
efficient code that makes the process of learning the parameters as fast
as possible.

In [None]:
model.compile(loss='sparse_categorical_crossentropy',
              optimizer=SGD(lr=0.03, momentum=0.8, decay=0.0, nesterov=True),
              metrics=['accuracy'])
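
As a rough guide to what the `sparse_categorical_crossentropy` loss measures, here is a NumPy sketch. It is an illustration written for this notebook, not the keras implementation, and the probabilities are made up. "Sparse" means the labels are plain integers (0 or 1) rather than one-hot vectors.

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs, eps=1e-7):
    """Negative log of the probability assigned to the correct class."""
    probs = np.clip(probs, eps, 1.0)  # avoid log(0)
    picked = probs[np.arange(len(y_true)), y_true]
    return -np.log(picked)

# Made-up softmax outputs for three images: columns are P(cat), P(dog).
probs = np.array([[0.9, 0.1],
                  [0.5, 0.5],
                  [0.2, 0.8]])
y_true = np.array([0, 1, 1])  # integer labels: 0 = cat, 1 = dog

losses = sparse_categorical_crossentropy(y_true, probs)
mean_loss = losses.mean()
```

A confident correct prediction (0.9) gives a small loss, while an undecided one (0.5) is penalized more heavily; training tries to push the mean of these values down.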

As before, we will also create a training and testing set from the data.

In [None]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y)

Now, we will learn the parameters that (we hope) lead to a model that does a
good job of predicting the type of animal in the image. Note that this is much
less straightforward than the linear regression example: training a neural
network is not a convex optimization problem, so the optimizer may settle on
different solutions from run to run and is not guaranteed to find the best one.

In [None]:
model.fit(X_train, y_train, epochs=25, batch_size=32,
          validation_data=(X_test, y_test))
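
As a rough illustration of what the optimizer does on each batch, here is the commonly described SGD update with Nesterov momentum, applied to a toy one-dimensional function. The function `nesterov_sgd_step` and the loss `w**2` are made up for this sketch, and the exact keras implementation may differ in detail. Note that the toy loss is convex, unlike a neural network's, so it only shows the mechanics of the update.

```python
def nesterov_sgd_step(w, velocity, grad, lr=0.03, momentum=0.8):
    """One parameter update; lr and momentum match the compile cell above."""
    # The velocity keeps a decaying memory of past gradients.
    velocity = momentum * velocity - lr * grad
    # Nesterov variant: step along the updated velocity plus the fresh gradient.
    w = w + momentum * velocity - lr * grad
    return w, velocity

# Toy loss f(w) = w**2 with gradient 2*w; its minimum is at w = 0.
w, velocity = 5.0, 0.0
for _ in range(200):
    grad = 2.0 * w
    w, velocity = nesterov_sgd_step(w, velocity, grad)

print(w)
```

In the real model, `w` is the full set of roughly 15 thousand parameters and the gradient is computed from one batch of 32 images at a time.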

Looking at the output, we (should) see that the prediction is better than what
we obtained with the features we created last time:

In [None]:
from sklearn.metrics import accuracy_score
yhat = np.argmax(model.predict(X_test), axis=1)  # most probable class: 0 = cat, 1 = dog
accuracy_score(y_test, yhat)
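
Turning the model's softmax outputs into class labels amounts to picking the column with the highest probability, and the accuracy is the share of labels that match. Here is a small sketch with made-up probabilities; with a fitted model, the `probs` array would come from `model.predict(X_test)`.

```python
import numpy as np

# Made-up softmax outputs for four test images: columns are P(cat), P(dog).
probs = np.array([[0.70, 0.30],
                  [0.20, 0.80],
                  [0.55, 0.45],
                  [0.10, 0.90]])
y_true = np.array([0, 1, 1, 1])  # 0 = cat, 1 = dog

# Predicted class = index of the most probable column in each row.
yhat = np.argmax(probs, axis=1)

# Accuracy = fraction of predictions that match the true labels.
accuracy = np.mean(yhat == y_true)
print(yhat, accuracy)
```

Here the third image is a dog that the made-up model labels as a cat, so three of the four predictions are correct.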

The model is still not perfect, but this is more a function of not having
enough data than a problem with the model itself. We will see how to work
with a larger dataset in the next section.