# Introduction to Keras

#### Alec Chapman

This tutorial was adapted from [this Keras blog post](https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html).


The data comes from a Kaggle competition to classify images as cats or dogs. It can be downloaded [here](https://www.kaggle.com/c/dogs-vs-cats/data) after signing in to Kaggle (with a Kaggle account or via Google, Facebook, or Yahoo!).

## What is Keras?

Keras is a high-level deep learning API written in Python. It runs on top of one of several backends: [TensorFlow](https://www.tensorflow.org/) (Google), [CNTK](https://github.com/Microsoft/cntk) (Microsoft), or [Theano](http://deeplearning.net/software/theano/) (University of Montreal).



### Import some modules needed for our tutorial

In [None]:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense
from keras import applications
import glob, os

### Set Paths to Data

In [None]:
DATADIR = '/home/jovyan/DATA/keras_cat_dog/data'
TRAINDIR = os.path.join(DATADIR, 'train')
VALDIR = os.path.join(DATADIR, 'val')

In [None]:
# dimensions of our images.
img_width, img_height = 150, 150

top_model_weights_path = 'bottleneck_fc_model.h5'

train_data_dir = TRAINDIR
validation_data_dir = VALDIR
nb_train_samples = 2000
nb_validation_samples = 500
epochs = 50
batch_size = 16

## Loading and Modifying Pre-trained Model

```Python
def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')
```

### Preprocessing

```Python
    datagen = ImageDataGenerator(rescale=1. / 255)
```
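[`ImageDataGenerator`](https://keras.io/preprocessing/image/) reads images in batches and can transform them on the fly. Here, `rescale=1. / 255` multiplies every pixel value by 1/255, mapping the 0-255 range of 8-bit images to [0, 1].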
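
The effect of `rescale=1. / 255` can be reproduced with plain NumPy. The following is a minimal sketch, assuming a made-up 2x2 RGB image (the `image` array is illustrative and not part of the dataset); it only mirrors the arithmetic and does not call Keras.

```python
import numpy as np

# Hypothetical 2x2 RGB image with 8-bit pixel values (0-255), standing in
# for an image loaded from disk.
image = np.array(
    [[[0, 128, 255], [64, 64, 64]],
     [[255, 255, 255], [10, 20, 30]]],
    dtype=np.uint8,
)

# Same arithmetic as rescale=1. / 255: multiply every pixel by 1/255.
rescaled = image.astype(np.float32) * (1.0 / 255)

# After rescaling, all values lie between 0 and 1.
print(rescaled.min(), rescaled.max())
```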
### Pre-trained Weights

Keras comes with several pre-defined model architectures in `keras.applications`. The first time a model is instantiated with pre-trained weights, those weights are downloaded.

```Python
    weights='imagenet'
```

This downloads the weights of the model as trained on the ImageNet dataset. If `weights=None`, the weights are initialized randomly instead.

In [None]:
def save_bottlebeck_features():
    datagen = ImageDataGenerator(rescale=1. / 255)

    # build the VGG16 network
    model = applications.VGG16(include_top=False, weights='imagenet')

    generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
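    # The second argument is the number of batches to draw. It should be an
    # integer, so round up to include a possible final partial batch.
    bottleneck_features_train = model.predict_generator(
        generator, int(np.ceil(nb_train_samples / batch_size)))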
    with open('bottleneck_features_train.npy', 'wb') as fp:
        np.save(fp,
                bottleneck_features_train)

    generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
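    # 500 / 16 is not a whole number, so round up to an integer number of
    # batches; rounding down would drop the last few images.
    bottleneck_features_validation = model.predict_generator(
        generator, int(np.ceil(nb_validation_samples / batch_size)))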
    with open('bottleneck_features_validation.npy', 'wb') as fp:
        np.save(fp,
                bottleneck_features_validation)

In [None]:
#save_bottlebeck_features()
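
The second argument to `predict_generator` is a number of batches, not a number of images, so the division by `batch_size` matters when the sample count is not a multiple of the batch size. The sketch below works through the arithmetic for the two sample counts used in this tutorial; `steps_to_cover` is a hypothetical helper name introduced only for illustration.

```python
import math


def steps_to_cover(n_samples, batch_size):
    """Batches needed to see every sample once (the last batch may be partial)."""
    return math.ceil(n_samples / batch_size)


batch_size = 16
for n_samples in (2000, 500):
    floor_steps = n_samples // batch_size      # whole batches only
    covered = floor_steps * batch_size         # images seen with whole batches
    ceil_steps = steps_to_cover(n_samples, batch_size)
    print(n_samples, floor_steps, covered, ceil_steps)
```

With 2000 training images, both roundings give 125 batches. With 500 validation images, rounding down gives 31 batches covering only 496 images, while rounding up gives 32 batches, the last one partial.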

In [None]:
def train_top_model():
    with open('bottleneck_features_train.npy', 'rb') as fp:
        train_data = np.load(fp)
    train_labels = np.array(
        [0] * int(nb_train_samples / 2) + [1] * int(nb_train_samples / 2))

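    # The number of saved validation features may differ from
    # nb_validation_samples (500 is not a multiple of batch_size = 16), so the
    # labels are sized from the loaded array, assuming two equal-sized classes.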
    with open('bottleneck_features_validation.npy', 'rb') as fp:
        validation_data = np.load(fp)
    validation_labels = np.array(
        [0] * int(len(validation_data) / 2) + [1] * int(len(validation_data) / 2))

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(1, activation='sigmoid'))

    model.compile(optimizer='rmsprop',
                  loss='binary_crossentropy', metrics=['accuracy'])

    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights(top_model_weights_path)
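
The label arrays in `train_top_model` rely on the generators having been created with `shuffle=False`: `flow_from_directory` then yields images one class subdirectory at a time, in alphanumeric order of the directory names, so the saved features come out as all of class 0 followed by all of class 1. Below is a minimal sketch of that construction, assuming illustrative per-class counts; `ordered_binary_labels` is a hypothetical helper, not part of Keras.

```python
import numpy as np


def ordered_binary_labels(n_class0, n_class1):
    """Labels for features stored as all class-0 rows followed by all class-1 rows."""
    return np.array([0] * n_class0 + [1] * n_class1)


# Assumed balanced training set: 1000 images per class (2000 in total).
train_labels = ordered_binary_labels(1000, 1000)

# The boundary between the two classes sits exactly at the halfway point.
print(train_labels[999], train_labels[1000], len(train_labels))
```

Splitting the array in half is only correct when both classes contribute the same number of rows. If some rows are missing from the end of the feature array, the true boundary is no longer at the midpoint and a few samples near it would be mislabeled.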

In [None]:
train_top_model()
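
To make the small classifier in `train_top_model` more concrete, here is a rough NumPy sketch of its forward pass at inference time. It uses random stand-in weights rather than trained ones, and it assumes the bottleneck features have shape 4x4x512 per image, which is what VGG16 without its top layers should produce for 150x150 inputs. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical batch of 4 bottleneck feature maps (assumed shape 4x4x512 each).
features = rng.random((4, 4, 4, 512)).astype(np.float32)

# Random stand-ins for the trained weights of the two Dense layers.
w1 = rng.normal(0.0, 0.01, (4 * 4 * 512, 256)).astype(np.float32)
b1 = np.zeros(256, dtype=np.float32)
w2 = rng.normal(0.0, 0.01, (256, 1)).astype(np.float32)
b2 = np.zeros(1, dtype=np.float32)

# Flatten: one row of 8192 values per image.
flat = features.reshape(len(features), -1)

# Dense(256, activation='relu')
hidden = np.maximum(flat @ w1 + b1, 0.0)

# Dropout is only active during training, so it is skipped here.

# Dense(1, activation='sigmoid'): one probability per image.
probs = 1.0 / (1.0 + np.exp(-(hidden @ w2 + b2)))

print(flat.shape, hidden.shape, probs.shape)
```

The single sigmoid output is interpreted as the probability of class 1, which is why the model is compiled with `binary_crossentropy` and the labels are 0 or 1.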