# Keras tutorial - the Happy House (CNN)

In Keras you can develop, tune and experiment with NN models easily and rapidly.

Keras is a high-level neural networks API (programming framework) on top of several lower-level frameworks including TensorFlow and CNTK.
However, Keras is more restrictive than the lower-level frameworks, so there are some very complex models that you can implement in TensorFlow but not (without more difficulty) in Keras. That being said, Keras will work fine for many common models. 

## Keras modelling

There are 4 steps in Keras modelling:  
1. **Build**: Define the model's architecture and forward prop: layers, activations, regularisers
2. **Compile**: Define the model's back prop: loss function, optimiser and metrics
3. **Train**: Fit the model according to a training schedule: learning rate, number of epochs, batch size
4. **Evaluate**: Compute the performance and iteratively tune the hyperparameters in the above steps  

Once these steps are complete, the model is ready to make predictions on new data.

### Setup notebook

In [None]:
import numpy as np

# keras
from keras import layers
from keras.layers import Input, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D
from keras.layers import AveragePooling2D, MaxPooling2D, Dropout, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras.applications.imagenet_utils import preprocess_input

# Plotting keras model 
import pydot
from IPython.display import SVG
from keras.utils.vis_utils import model_to_dot
from keras.utils import plot_model

# keras backend
import keras.backend as K
K.set_image_data_format('channels_last')

# mpl
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow

%matplotlib inline

from mymods.lauthom import *

**Note**: As you can see, we've imported a lot of functions from Keras. You can use them easily just by calling them directly in the notebook. Ex: `X = Input(...)` or `X = ZeroPadding2D(...)`.

## Case: The Happy House 

For your next vacation, you decided to spend a week with five of your friends from school. You will stay in a very convenient house with many things to do nearby. But the most important benefit is that everybody has committed to be happy when they are in the house. So anyone wanting to enter the house must prove their current state of happiness.

<img src="../../data/conv_images/happy-house.jpg" width=500>
<caption><center><font color='purple'><u>**Figure 1**</u>: **the Happy House**</font></center></caption>


As a deep learning expert, to make sure the "Happy" rule is strictly applied, you are going to build an algorithm that uses pictures from the front-door camera to check whether the person is happy. The door should open only if the person is happy. 

You have gathered pictures of your friends and yourself, taken by the front-door camera. The dataset is labeled. 

<img src="../../data/conv_images/house-members.png" width=500>

Run the following code to normalize the dataset and learn about its shapes.

In [None]:
def load_dataset():
    """Load dataset of format .h5"""
    import h5py
    train_dataset = h5py.File('../../data/conv_datasets/train_happy.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

    test_dataset = h5py.File('../../data/conv_datasets/test_happy.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    
    train_set_y_orig = train_set_y_orig.reshape((1, -1))
    test_set_y_orig = test_set_y_orig.reshape((1, -1))
    
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

In [None]:
X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

# Normalize image vectors
X_train = X_train_orig/255.
X_test = X_test_orig/255.

# Reshape
Y_train = Y_train_orig.T
Y_test = Y_test_orig.T

print("number of training examples = " + str(X_train.shape[0]))
print("number of test examples = " + str(X_test.shape[0]))
print("X_train shape: " + str(X_train.shape))
print("Y_train shape: " + str(Y_train.shape))
print("X_test shape: " + str(X_test.shape))
print("Y_test shape: " + str(Y_test.shape))

## Building a model in Keras

In Keras you connect the NN layers by applying each layer to `X` and re-assigning the result to the same variable: `X = ...(X)`. In other words, **chain the computations (layers) and keep updating the same variable `X`**.  
`X_input` must not be reassigned, since it is needed at the end to create the Keras model instance (`model = Model(inputs=X_input, ...)` below).

If there is not much training data, prefer a small network with few hidden layers to avoid overfitting.

In [None]:
def HappyModel(input_shape):
    """
    Implementation of the HappyModel.
    
    Arguments:
    input_shape -- shape of the images of the dataset

    Returns:
    model -- a Model() instance in Keras
    """
    
    # chain layers: X = layer()(X)
    
    # Define the input placeholder as a tensor with shape input_shape. Think of this as your input image!
    X_input = Input(input_shape)

    # Zero-Padding: pads the border of X_input with zeroes
    # Experiment with different padding sizes
    padding = (3, 3)
    X = ZeroPadding2D(padding)(X_input)

    # CONV -> BN -> RELU Block applied to X
    # Experiment with different convolution layer sizes, strides
    conv_size = (7, 7)
    X = Conv2D(32, conv_size, strides=(1, 1), name='conv0')(X)
    X = BatchNormalization(axis=3, name='bn0')(X)
    X = Activation('relu')(X)

    # MAXPOOL
    # Experiment with different pooling sizes, strides
    pooling = (2, 2)
    X = MaxPooling2D(pooling, name='max_pool')(X)

    # FLATTEN X (means convert it to a vector(1D array)) + FULLYCONNECTED
    X = Flatten()(X)
    X = Dense(1, activation='sigmoid', name='fc')(X)

    # Create model. This creates your Keras model instance, you'll use this instance to train/test the model.
    model = Model(inputs=X_input, outputs=X, name='HappyModel')

    return model
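
To sanity-check the architecture, it can help to trace the tensor shapes and parameter counts by hand. The sketch below does this in plain Python for the layers used in `HappyModel`, assuming a 64×64×3 input image (an assumption here; the real value is `X_train.shape[1:]`). The helper name `conv_output_size` is hypothetical, and the totals should agree with `model.summary()` only if that input shape holds.

```python
def conv_output_size(size, kernel, stride=1):
    """Output height or width of a 'valid' convolution or pooling window."""
    return (size - kernel) // stride + 1

# Assumed input shape (height, width, channels); check X_train.shape[1:] for the real one
h, w, c = 64, 64, 3

# ZeroPadding2D((3, 3)) adds 3 pixels on every side
h, w = h + 2 * 3, w + 2 * 3

# Conv2D(32, (7, 7), strides=(1, 1)) with the default 'valid' padding
filters, k = 32, 7
conv_params = k * k * c * filters + filters   # kernel weights + one bias per filter
h, w = conv_output_size(h, k), conv_output_size(w, k)

# BatchNormalization on the channel axis: gamma, beta, moving mean, moving variance
bn_params = 4 * filters

# MaxPooling2D((2, 2)): the stride defaults to the pool size
h, w = conv_output_size(h, 2, stride=2), conv_output_size(w, 2, stride=2)

# Flatten followed by Dense(1): one weight per flattened unit plus one bias
flat = h * w * filters
dense_params = flat * 1 + 1

total_params = conv_params + bn_params + dense_params
print("shape before flatten:", (h, w, filters))
print("flattened units:", flat)
print("total parameters:", total_params)
```

Most of the parameters sit in the final dense layer, which is one reason to reduce height and width with pooling before flattening.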

You have now built a function to describe your model. To train and test this model, there are four steps in Keras:
1. Create the model by calling the function above
2. Compile the model by calling `model.compile(optimizer = "...", loss = "...", metrics = ["accuracy"])`
3. Train the model on train data by calling `model.fit(x = ..., y = ..., epochs = ..., batch_size = ...)`
4. Test the model on test data by calling `model.evaluate(x = ..., y = ...)`

If you want to know more about `model.compile()`, `model.fit()`, `model.evaluate()` and their arguments, refer to the official [Keras documentation](https://keras.io/models/model/).

### 1. Create the model

Instantiate the model with the input shape (the image shape).

In [None]:
happyModel = HappyModel(X_train.shape[1:])

### 2. Compile the model to configure the learning process
Choose the 3 arguments of `compile()` wisely. Hint: the Happy Challenge is a binary classification problem.

In [None]:
happyModel.compile(optimizer='adam', 
                   loss='binary_crossentropy', 
                   metrics=['accuracy'])
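
To see why `binary_crossentropy` suits a yes/no problem, the loss for a label $y$ and predicted probability $p$ is $-[y \log p + (1-y)\log(1-p)]$. The following is a minimal pure-Python sketch of that formula, not Keras's own implementation; the function name and the clipping value `eps` are assumptions made for illustration.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch of labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # clip to avoid log(0); eps is an assumed value
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident and correct predictions give a small loss,
# confident and wrong predictions give a large one
confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right, confident_wrong)
```

The loss penalises confident mistakes heavily, which is the behaviour you want when a single sigmoid unit outputs the probability of "happy".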

### 3. Train the model
Choose the number of epochs and the batch size.  
 - Start with a few epochs to check the model's performance.
 - Smaller batch sizes need less memory.  
 - Batch sizes that are powers of two ($2^n$) are a common choice.

In [None]:
happyModel.fit(X_train, 
               Y_train, 
               epochs=15, 
               batch_size=32)
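
The batch size also determines how many weight updates happen per epoch. A small sketch of that arithmetic follows; the function name is hypothetical and the 600 training examples are an illustrative number, not a value taken from the dataset.

```python
import math

def updates_per_epoch(num_examples, batch_size):
    """Number of mini-batches (gradient updates) in one pass over the data."""
    return math.ceil(num_examples / batch_size)

num_examples = 600   # illustrative value only
epochs = 15
for batch_size in (16, 32, 64):
    steps = updates_per_epoch(num_examples, batch_size)
    print(batch_size, steps, steps * epochs)
```

Halving the batch size roughly doubles the number of updates per epoch, so comparisons between runs are fairer when you look at total updates as well as epochs.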

###### Note that if you run `fit()` again, the `model` will continue to train with the parameters it has already learnt instead of reinitializing them.

### 4. Test/evaluate the model

In [None]:
loss, accuracy = happyModel.evaluate(X_test, 
                                     Y_test, 
                                     batch_size=32, 
                                     verbose=1, 
                                     sample_weight=None)

print ("\nLoss = " + str(loss))
print ("Test Accuracy = " + str(accuracy))
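
The accuracy metric can be reproduced by hand: turn each sigmoid output into a class by thresholding, then count the matches. The sketch below assumes a threshold of 0.5 and uses made-up labels and probabilities; the function name is hypothetical.

```python
def accuracy_from_probabilities(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded probability matches the label."""
    preds = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(p == y) for p, y in zip(preds, y_true))
    return correct / len(y_true)

labels = [1, 0, 0, 1]          # made-up ground truth
probs = [0.9, 0.2, 0.6, 0.4]   # made-up sigmoid outputs
print(accuracy_from_probabilities(labels, probs))
```

The same thresholding step is needed whenever you turn `model.predict()` output into a smiling / not smiling decision.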

If the `HappyModel()` function worked, the test accuracy should be at least 75%. 

The model gets around **95% test accuracy in 40 epochs** (and 99% train accuracy) with a mini batch size of 16 and "adam" optimizer. 

The model gets decent accuracy after just 2-5 epochs, so if you're comparing different models you can also train a variety of models on just a few epochs and see how they compare. 

- Try using blocks of CONV->BATCHNORM->RELU such as:
```python
X = Conv2D(32, (3, 3), strides=(1, 1), name='conv0')(X)
X = BatchNormalization(axis=3, name='bn0')(X)
X = Activation('relu')(X)
```

Repeat such blocks until the height and width dimensions of the layer output are quite low and the number of channels is quite large (≈32 for example). You are encoding useful information in a volume with a lot of channels. You can then flatten the volume and use a fully-connected layer.
- You can use MAXPOOL after such blocks. It will help you lower the dimension in height and width.
- Change your optimizer. We find Adam works well. 
- If the model struggles to run and you get memory issues, lower your `batch_size` (12 is usually a good compromise).
- Run on more epochs, until you see the train accuracy plateauing.
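
To judge how many CONV and MAXPOOL blocks it takes before height and width become "quite low", you can track the spatial size block by block. This is a rough sketch that assumes 'valid' 3×3 convolutions with stride 1 and 2×2 pooling, with a 64-pixel input as an illustrative starting size; the function name is hypothetical.

```python
def spatial_sizes(size, num_blocks, conv_kernel=3, pool=2):
    """Height (or width) after each CONV ('valid', stride 1) + MAXPOOL block."""
    sizes = []
    for _ in range(num_blocks):
        size = size - conv_kernel + 1      # 'valid' convolution, stride 1
        size = (size - pool) // pool + 1   # max pooling with stride equal to the pool size
        sizes.append(size)
    return sizes

print(spatial_sizes(64, 3))
```

With these assumptions the size drops by a little more than half per block, so three blocks already bring a 64-pixel side down to single digits.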

###### **Note**: 
If you perform hyperparameter tuning on your model, the test set actually becomes a dev set, and your model might end up overfitting to the test (dev) set. Moreover the training/test sets were quite similar; for example, all the pictures were taken against the same background (since a front door camera is always mounted in the same position). This makes the problem easier, but a model trained on this data may or may not work on your own data.
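
One way to avoid tuning against the test set is to hold out a separate dev set. The sketch below splits example indices into train, dev and test parts; the function name, the fractions and the seed are assumptions chosen for illustration, not part of the original exercise.

```python
import random

def split_indices(n, dev_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle range(n) and cut it into disjoint train / dev / test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # seeded so the split is reproducible
    n_dev = int(n * dev_fraction)
    n_test = int(n * test_fraction)
    dev = indices[:n_dev]
    test = indices[n_dev:n_dev + n_test]
    train = indices[n_dev + n_test:]
    return train, dev, test

train_idx, dev_idx, test_idx = split_indices(100)
print(len(train_idx), len(dev_idx), len(test_idx))
```

You would tune hyperparameters against the dev indices and evaluate on the test indices only once, at the end.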

#### Test set results

In [None]:
import random

k = 15
test_indices = random.sample(range(X_test.shape[0]), k)
fig, axes = plt.subplots((k+4)//5, 5, figsize=(20,k))

# Test set
for i, idx in enumerate(test_indices):
    ax = axes[i//5, i%5]
    x = X_test[idx]  # normalized image, consistent with the training data
    _ = ax.imshow(x)
    _ = ax.axis('off')
    x = np.expand_dims(x, axis=0)
    # the sigmoid output is a probability; threshold it at 0.5 to get a class
    y_pred = int(happyModel.predict(x)[0][0] > 0.5)
    y = Y_test_orig[0].reshape(-1)[idx]
    col = 'green' if y_pred == y else 'red'
    title = 'Prediction: {}\nLabel: {}'.format(['not smiling','smiling'][y_pred], ['not smiling','smiling'][y])
    _ = ax.set_title(title, color=col)
    
_ = plt.show()

## Apply model

In [None]:
img_path = '../../data/conv_images/my_image.jpg'

img = image.load_img(img_path, target_size=(64, 64))
_ = plt.imshow(img)

x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = x / 255.  # same normalization as the training data

# the sigmoid output is a probability; threshold it at 0.5
print('Enter?:', happyModel.predict(x)[0][0] > 0.5)

## Visualise model architecture

Two other basic features of Keras that you'll find useful are:
- `model.summary()`: prints the details of your layers in a table with the sizes of its inputs/outputs
- `plot_model()`: plots your graph in a nice layout. You can save it as a ".png" file through the `to_file` argument, or display it inline with `SVG()`, if you'd like to share it on social media ;). The saved file can be found under "File" then "Open..." in the upper bar of the notebook.

In [None]:
happyModel.summary()

In [None]:
def svg_model(model, filename):
    """Visualise Keras NN model as flowchart"""
    plot_model(model, to_file=filename)
    return SVG(model_to_dot(model).create(prog='dot', format='svg'))

svg_model(happyModel, 'HappyModel.png')

## Conclusion

- Keras is a recommended tool for rapid prototyping. It allows you to quickly try out different model architectures.
- Keras models are built and evaluated in just 4 steps: Create -> Compile -> Train -> Test