# Convolutional Neural Networks

The goal of this project is to get familiar with convolutional neural networks and to solve the "cats vs dogs" CAPTCHA challenge, in which images have to be classified automatically as cats or dogs.

## Getting Started
To get started with Convnets, read up on the concepts in the [Deep Learning Book, Chapters 9.1-9.3](http://www.deeplearningbook.org/contents/convnets.html).
The Stanford course on neural networks also has a [great explanation](http://cs231n.github.io/convolutional-networks/).

### Understanding Convolutions
Before diving into convolutional neural nets, it's a good idea to understand how convolutions work. A convolution is simply a filter (as in signal processing) applied at every position of a signal. Very simple filters are those that smooth or blur, and those that compute derivatives.
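
To make "applied at every position" concrete, here is a minimal sketch that slides a filter along a short signal by hand and compares the result with ``np.convolve``. The helper name ``convolve_valid`` and the toy signal are made up for illustration; only positions where the filter fully overlaps the signal are computed.

```python
import numpy as np


def convolve_valid(signal, kernel):
    """Slide the flipped kernel over the signal and take a dot product at each position."""
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    flipped = kernel[::-1]  # convolution flips the filter; correlation would not
    n_positions = len(signal) - len(kernel) + 1
    return np.array([
        np.dot(signal[i:i + len(kernel)], flipped)
        for i in range(n_positions)
    ])


toy_signal = np.array([0.0, 0.0, 3.0, 0.0, 0.0])
box_filter = np.ones(3) / 3  # averaging (smoothing) filter

by_hand = convolve_valid(toy_signal, box_filter)
reference = np.convolve(toy_signal, box_filter, mode="valid")
print(by_hand)
print(reference)
```

Both print the same values: the single spike of height 3 is spread out over three neighbouring positions.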

### 1d convolutions
Let's start with smoothing a 1d signal.

In [None]:
# create a noisy signal
import numpy as np
# use a fixed seed so the results are reproducible
rng = np.random.RandomState(0)
noise = rng.normal(size=(200,))
random_walk = np.cumsum(noise)
print(random_walk[: 10])

In [None]:
# plot
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(random_walk)

Convolutions in n dimensions are implemented in ``scipy.ndimage.convolve``. Let's start with smoothing our signal. Use ``convolve`` to convolve the signal with a constant filter (``np.ones()``). Start with length 3, and then try different lengths. Visualize the result of the convolution together with the original signal.

In [None]:
from scipy.ndimage import convolve
# solution here ...
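
One possible way to approach this exercise is sketched below. It rebuilds the random walk so that it runs on its own. Dividing the constant filter by its length is a choice made here (not required by the exercise) so that the smoothed signal stays on the same scale as the original; ``box_smooth`` and ``roughness`` are hypothetical helper names.

```python
import numpy as np
from scipy.ndimage import convolve

# same noisy signal as above
rng = np.random.RandomState(0)
random_walk = np.cumsum(rng.normal(size=(200,)))


def box_smooth(signal, length):
    """Convolve with a constant filter, normalized so its entries sum to one."""
    box_filter = np.ones(length) / length
    return convolve(signal, box_filter)


def roughness(signal):
    """Mean absolute step between neighbouring samples (smaller means smoother)."""
    return np.abs(np.diff(signal)).mean()


smoothed_3 = box_smooth(random_walk, 3)
smoothed_15 = box_smooth(random_walk, 15)

print("original :", roughness(random_walk))
print("length 3 :", roughness(smoothed_3))
print("length 15:", roughness(smoothed_15))
```

Plotting ``random_walk``, ``smoothed_3`` and ``smoothed_15`` with ``plt.plot`` in the same cell shows the effect directly: longer filters remove more of the small wiggles.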

Now let's try a slightly more complicated filter: a Gaussian filter.

In [None]:
gaussian_filter = np.exp(-np.linspace(-2, 2, 15) ** 2)
gaussian_filter /= gaussian_filter.sum()
plt.plot(gaussian_filter)
gaussian_filter

Convolve the signal with the Gaussian filter and compare the results to the constant filter. What does changing the radius or size of the filter do?

In [None]:
# Solution...
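
A possible sketch for the comparison follows. ``make_gaussian`` is a hypothetical helper that rebuilds the filter from the cell above with a configurable number of taps; here "size" is interpreted as the number of taps over the fixed range ``[-2, 2]``, which is an assumption about what the exercise intends.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.RandomState(0)
random_walk = np.cumsum(rng.normal(size=(200,)))


def make_gaussian(size, extent=2.0):
    """Gaussian filter sampled on [-extent, extent], normalized to sum to one."""
    g = np.exp(-np.linspace(-extent, extent, size) ** 2)
    return g / g.sum()


gaussian_filter = make_gaussian(15)
box_filter = np.ones(15) / 15

smoothed_gauss = convolve(random_walk, gaussian_filter)
smoothed_box = convolve(random_walk, box_filter)
# more taps over the same range means a wider filter and stronger smoothing
smoothed_wide = convolve(random_walk, make_gaussian(45))


def roughness(signal):
    return np.abs(np.diff(signal)).mean()


print("gaussian, 15 taps:", roughness(smoothed_gauss))
print("box, 15 taps     :", roughness(smoothed_box))
print("gaussian, 45 taps:", roughness(smoothed_wide))
```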

A somewhat more interesting filter is the derivative. There are many ways to encode a derivative; the simplest is the filter ``[-1, 1]``. Use this filter on the signal.

In [None]:
# solution ..
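
A minimal sketch of this step, again rebuilding the signal so the cell runs on its own. One detail worth knowing: because convolution flips the filter and ``scipy.ndimage.convolve`` centers it on each sample, the output at position ``i`` works out to ``x[i] - x[i + 1]``, which is the forward difference with the sign reversed.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.RandomState(0)
noise = rng.normal(size=(200,))
random_walk = np.cumsum(noise)

derivative_filter = np.array([-1.0, 1.0])
derivative = convolve(random_walk, derivative_filter)

# Compare with NumPy's finite difference x[i + 1] - x[i].
# The last sample is excluded because it depends on the boundary handling.
forward_difference = np.diff(random_walk)
print(np.allclose(derivative[:-1], -forward_difference))
```

Since the signal is a cumulative sum of ``noise``, its derivative recovers the noise (up to the sign noted above), which is why the output looks much rougher than the input.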

Sometimes a better way to compute the derivative is to smooth first. Convolutions are associative, meaning applying one filter and then another is the same as first convolving the two filters, then applying the result.

Apply the ``[-1, 1]`` filter to the ``gaussian_filter`` above and visualize the result. Then, convolve the original signal with this smoothed derivative filter.
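
The following sketch builds the smoothed derivative filter and checks the associativity claim numerically. It uses ``np.convolve`` (full mode) to combine the two filters, which is a choice made here to avoid boundary effects on the short filter; the comparison is restricted to the interior of the signal because the two routes treat the borders differently.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.RandomState(0)
random_walk = np.cumsum(rng.normal(size=(200,)))

gaussian_filter = np.exp(-np.linspace(-2, 2, 15) ** 2)
gaussian_filter /= gaussian_filter.sum()
derivative_filter = np.array([-1.0, 1.0])

# combine the two filters first: 15 + 2 - 1 = 16 taps
smoothed_derivative_filter = np.convolve(gaussian_filter, derivative_filter)

# route 1: apply the combined filter once
one_step = convolve(random_walk, smoothed_derivative_filter)
# route 2: smooth first, then differentiate
two_step = convolve(convolve(random_walk, gaussian_filter), derivative_filter)

interior = slice(20, -20)  # ignore samples influenced by the boundary
print(np.allclose(one_step[interior], two_step[interior]))
```

Plotting ``smoothed_derivative_filter`` shows one positive and one negative lobe, and its entries sum to zero, so a constant signal gives zero response.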

### 2d convolutions
Next, we'll apply the same principles to a 2d signal.


In [None]:
# scipy.misc.imread was removed in SciPy 1.2; matplotlib can read PNG files directly
image = plt.imread("thisisdog.png")
plt.imshow(image)

In [None]:
# for simplicity, convert to grayscale for now:
gray = image.mean(axis=2)
plt.imshow(gray, cmap=plt.cm.Greys_r)

In [None]:
# Creating a 2d Gaussian filter from the 1d Gaussian filter above:

In [None]:
gaussian_2d = gaussian_filter * gaussian_filter[:, np.newaxis]
plt.matshow(gaussian_2d)

Now smooth the image with the 2d Gaussian filter:

In [None]:
# Solution
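
A sketch of the smoothing step is given below. So that it runs without the image file, it uses a synthetic noisy test image (a bright square on a dark background) as a stand-in for ``gray``; in the notebook you would pass the real ``gray`` array instead.

```python
import numpy as np
from scipy.ndimage import convolve

# stand-in for the grayscale dog image: a bright square plus pixel noise
rng = np.random.RandomState(0)
gray = np.zeros((64, 64))
gray[16:48, 16:48] = 1.0
gray += 0.2 * rng.normal(size=gray.shape)

# 2d Gaussian built as the outer product of the 1d filter with itself
gaussian_filter = np.exp(-np.linspace(-2, 2, 15) ** 2)
gaussian_filter /= gaussian_filter.sum()
gaussian_2d = gaussian_filter * gaussian_filter[:, np.newaxis]

smoothed_image = convolve(gray, gaussian_2d)

# noise level in a background corner, before and after smoothing
print(gray[:8, :8].std(), smoothed_image[:8, :8].std())
```

Displaying ``smoothed_image`` with ``plt.imshow(smoothed_image, cmap=plt.cm.Greys_r)`` next to the input shows the blur.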

Computing gradients in an image is the same as doing "edge detection". Convolve the 2d Gaussian with the ``[-1, 1]`` filter to get an edge detection filter. Then run the filter on the image. Create filters for both horizontal and vertical edge detection.

In [None]:
# Solution
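
One way to build the two edge filters is sketched here. It uses ``scipy.signal.convolve2d`` to combine the 2d Gaussian with the derivative filter, which is a choice made for this sketch, and a synthetic image with a single vertical edge as a stand-in for ``gray``. The names ``filter_dx`` and ``filter_dy`` are hypothetical; they refer to the direction in which the derivative is taken.

```python
import numpy as np
from scipy.ndimage import convolve
from scipy.signal import convolve2d

gaussian_filter = np.exp(-np.linspace(-2, 2, 15) ** 2)
gaussian_filter /= gaussian_filter.sum()
gaussian_2d = gaussian_filter * gaussian_filter[:, np.newaxis]

# derivative along columns (responds to vertical edges)
filter_dx = convolve2d(gaussian_2d, np.array([[-1.0, 1.0]]))
# derivative along rows (responds to horizontal edges)
filter_dy = convolve2d(gaussian_2d, np.array([[-1.0], [1.0]]))

# stand-in image: dark left half, bright right half, so one vertical edge at column 32
test_image = np.zeros((64, 64))
test_image[:, 32:] = 1.0

edges_dx = convolve(test_image, filter_dx)
edges_dy = convolve(test_image, filter_dy)

strongest_column = np.abs(edges_dx).max(axis=0).argmax()
print("strongest response near column", strongest_column)
print("largest response of the other filter:", np.abs(edges_dy).max())
```

On this test image only ``filter_dx`` responds, and its response peaks next to the edge. On a real photo, combining both responses, for example with ``np.hypot(edges_dx, edges_dy)``, gives an edge strength that is independent of orientation.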

# A baseline network on MNIST
For many years, the MNIST dataset of handwritten digits (much larger than the ``digits`` dataset we were using) was a standard benchmark.
Start by applying a standard multilayer perceptron (no convolutions) to the MNIST dataset as a baseline.
If you get stuck, you can consult the example [here](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py).

In [None]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout

batch_size = 128
num_classes = 10
epochs = 20

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

# build sequential model as yesterday
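
A possible baseline model is sketched below. The layer sizes, dropout rates, and optimizer are assumptions chosen for illustration, not values prescribed by this notebook, and ``build_mlp`` is a hypothetical helper name. The sketch assumes a recent Keras version in which ``keras.Input`` can be used as the first entry of a ``Sequential`` model.

```python
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout


def build_mlp(input_dim=784, num_classes=10):
    """Two hidden layers with dropout, followed by a softmax over the classes."""
    model = Sequential([
        keras.Input(shape=(input_dim,)),
        Dense(512, activation="relu"),
        Dropout(0.2),
        Dense(512, activation="relu"),
        Dropout(0.2),
        Dense(num_classes, activation="softmax"),
    ])
    model.compile(loss="categorical_crossentropy",
                  optimizer="rmsprop",
                  metrics=["accuracy"])
    return model


mlp = build_mlp()
mlp.summary()

# sanity check on a dummy batch: one probability vector per sample
dummy_batch = np.zeros((2, 784), dtype="float32")
dummy_probs = mlp.predict(dummy_batch, verbose=0)
print(dummy_probs.shape)
```

With the arrays prepared in the cell above, training would then be ``mlp.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(x_test, y_test))``, followed by ``mlp.evaluate(x_test, y_test)``.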

# Convolutional neural networks with Keras

Now compare this to a convolutional neural network built with Keras.
You can again use the ``Sequential`` model, but this time with ``Conv2D`` and ``MaxPooling2D`` layers (see the Keras docs).
If you get stuck, you can look at [this example](https://github.com/keras-team/keras/blob/master/examples/mnist_cnn.py).

In [None]:
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K

batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28

# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
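
One possible convolutional model for the data prepared above is sketched here. The architecture (two convolution layers, one pooling layer, one dense layer) and the optimizer are assumptions for illustration, and ``build_cnn`` is a hypothetical helper name. The default ``input_shape`` assumes the ``channels_last`` data format; pass the ``input_shape`` variable from the cell above to cover both cases.

```python
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense


def build_cnn(input_shape=(28, 28, 1), num_classes=10):
    """Small convnet: conv, conv, max-pool, then a dense classifier."""
    model = Sequential([
        keras.Input(shape=input_shape),
        Conv2D(32, kernel_size=(3, 3), activation="relu"),
        Conv2D(64, kernel_size=(3, 3), activation="relu"),
        MaxPooling2D(pool_size=(2, 2)),
        Dropout(0.25),
        Flatten(),  # turn the feature maps into one vector per image
        Dense(128, activation="relu"),
        Dropout(0.5),
        Dense(num_classes, activation="softmax"),
    ])
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["accuracy"])
    return model


cnn = build_cnn()
cnn.summary()

# sanity check on a dummy batch of two blank images
dummy_images = np.zeros((2, 28, 28, 1), dtype="float32")
cnn_probs = cnn.predict(dummy_images, verbose=0)
print(cnn_probs.shape)
```

Training works as for the baseline: ``cnn.fit(x_train, y_train, batch_size=batch_size, epochs=epochs, validation_data=(x_test, y_test))``. Comparing the test accuracy with the multilayer perceptron is the point of the exercise.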

# The cats and dogs dataset

You could try to train a network like this on the cats and dogs dataset.
However, the dataset is unlikely to contain enough data for that. Instead, we will use a neural network that was trained on the much larger ImageNet dataset.

Start by downloading the cats and dogs data here:

https://www.kaggle.com/c/dogs-vs-cats/data

or if you don't have a kaggle account here:

https://www.microsoft.com/en-us/download/confirmation.aspx?id=54765

Then load one of the models shipped with Keras. Start with VGG16:
https://keras.io/applications/#vgg16

In [None]:
# Load model here
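
A minimal sketch of loading VGG16 as a feature extractor follows. Dropping the classification layers with ``include_top=False`` and averaging the last feature maps with ``pooling="avg"`` are choices made for this sketch; they give one 512-dimensional feature vector per image. ``load_feature_extractor`` is a hypothetical helper name.

```python
from keras.applications.vgg16 import VGG16


def load_feature_extractor(weights="imagenet", input_shape=(224, 224, 3)):
    """VGG16 without its classifier head; outputs one feature vector per image.

    weights="imagenet" downloads the pretrained weights on first use.
    weights=None builds the same architecture with random weights (only useful for testing).
    """
    return VGG16(weights=weights,
                 include_top=False,
                 pooling="avg",
                 input_shape=input_shape)
```

In the notebook you would call ``model = load_feature_extractor()`` once and reuse ``model`` in the cells below.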

In [None]:
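# Prepare the data similarly to MNIST above.
# X is assumed to be a float array of RGB images with shape (n_samples, 224, 224, 3),
# the default input size of VGG16; it has to be built from the downloaded image files first.
from keras.applications.vgg16 import preprocess_input
X_pre = preprocess_input(X)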

In [None]:
# use the vgg net to extract features:
features = model.predict(X_pre)

In [None]:
# Now train a scikit-learn model, like logistic regression, to distinguish cats and dogs based on these features.
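
The last step could look like the sketch below. So that it runs on its own, it uses synthetic 512-dimensional vectors as a stand-in for the VGG features and made-up labels (0 for cat, 1 for dog); in the notebook you would use the real ``features`` array and the labels derived from the image file names.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# synthetic stand-in for the extracted features: two shifted Gaussian clouds
rng = np.random.RandomState(0)
n_per_class, n_features = 200, 512
features = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, n_features)),  # "cats"
    rng.normal(0.5, 1.0, size=(n_per_class, n_features)),  # "dogs"
])
labels = np.repeat([0, 1], n_per_class)

# hold out part of the data to estimate how well the classifier generalizes
X_train, X_test, y_train, y_test = train_test_split(
    features, labels, stratify=labels, random_state=0)

clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)

test_accuracy = clf.score(X_test, y_test)
print("test accuracy:", test_accuracy)
```

The accuracy on the synthetic data says nothing about cats and dogs; it only confirms that the pipeline runs end to end.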