# Keras tutorial - Emotion Detection in Images of Faces

Welcome to the first assignment of week 2. In this assignment, you will:
1. Learn to use Keras, a high-level neural networks API (programming framework), written in Python and capable of running on top of several lower-level frameworks including TensorFlow and CNTK. 
2. See how you can build a deep learning algorithm in a couple of hours.

#### Why are we using Keras? 

* Keras was developed to enable deep learning engineers to build and experiment with different models very quickly. 
* Just as TensorFlow is a higher-level framework than Python, Keras is an even higher-level framework and provides additional abstractions. 
* Being able to go from idea to result with the least possible delay is key to finding good models. 
* However, Keras is more restrictive than the lower-level frameworks, so there are some very complex models that you would still implement in TensorFlow rather than in Keras. 
* That being said, Keras will work fine for many common models. 

## <font color='darkblue'>Updates</font>

#### If you were working on the notebook before this update...
* The current notebook is version "v2a".
* You can find your original work saved in the notebook with the previous version name ("v2").
* To view the file directory, go to the menu "File->Open", and this will open a new tab that shows the file directory.

#### List of updates
* Changed back-story of model to "emotion detection" from "happy house."
* Cleaned/organized wording of instructions and commentary.
* Added instructions on how to set `input_shape`
* Added explanation of "objects as functions" syntax.
* Clarified explanation of variable naming convention.
* Added hints for steps 1,2,3,4

## Load packages
* In this exercise, you'll work on the "Emotion detection" model, which we'll explain below. 
* Let's load the required packages.

In [2]:
import numpy as np
import csv
import pandas as pd
import datetime, os
import tensorflow as tf
from sklearn.metrics import classification_report
from tensorflow import keras
from tensorflow.keras import layers
#from keras import layers
from tensorflow.keras.layers import Input, Dense, Activation, ZeroPadding2D, BatchNormalization, Flatten, Conv2D
from tensorflow.keras.layers import AveragePooling2D, MaxPooling2D, Dropout, GlobalMaxPooling2D, GlobalAveragePooling2D
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from tensorflow.keras.applications.imagenet_utils import preprocess_input
import pydot
from IPython.display import SVG
from tensorflow.keras.utils import model_to_dot, plot_model
from kt_utils import *
from tensorflow.keras.callbacks import TensorBoard
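from tensorflow.keras.callbacks import ModelCheckpoint
# Import the backend explicitly instead of relying on a star import to provide `K`.
from tensorflow.keras import backend as K
K.set_image_data_format('channels_last')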
import matplotlib.pyplot as plt
from matplotlib.pyplot import imshow
# Load the TensorBoard notebook extension
%load_ext tensorboard
%matplotlib inline

logdir = '/Users/bassler/Desktop/Documents/dl_chem_genetics/logs/'
tensorboard = TensorBoard(log_dir=logdir)

Using TensorFlow backend.


In [3]:
# Seed value
# Apparently you may use different seed values at each stage
seed_value= 0

# 1. Set `PYTHONHASHSEED` environment variable at a fixed value
import os
os.environ['PYTHONHASHSEED']=str(seed_value)

# 2. Set `python` built-in pseudo-random generator at a fixed value
import random
random.seed(seed_value)

# 3. Set `numpy` pseudo-random generator at a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set the `tensorflow` pseudo-random generator at a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value)
# TensorFlow 1.x-style equivalent (compatibility API):
# tf.compat.v1.set_random_seed(seed_value)
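
The effect of fixing a seed can be checked with a small sketch that uses only Python's built-in generator. The helper `draw` is a hypothetical name introduced for illustration. The same idea applies to the numpy and TensorFlow generators seeded above, although full reproducibility of a training run can also depend on hardware and library versions.

```python
import random

def draw(seed, n=3):
    """Return n pseudo-random floats after seeding Python's built-in generator."""
    random.seed(seed)
    return [random.random() for _ in range(n)]

first = draw(0)
second = draw(0)      # same seed: the sequence repeats
different = draw(1)   # different seed: a different sequence

print(first == second)
print(first == different)
```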

**Note**: As you can see, we've imported a lot of functions from Keras. You can use them by calling them directly in your code. Ex: `X = Input(...)` or `X = ZeroPadding2D(...)`. 

In other words, unlike TensorFlow 1.x, you don't have to create the graph and then make a separate `sess.run()` call to evaluate those variables.

## 1 - Emotion Tracking

* A nearby community health clinic is helping the local residents monitor their mental health.  
* As part of their study, they are asking volunteers to record their emotions throughout the day.
* To help the participants more easily track their emotions, you are asked to create an app that will classify their emotions based on some pictures that the volunteers will take of their facial expressions.
* As a proof-of-concept, you first train your model to detect if someone's emotion is classified as "happy" or "not happy."

To build and train this model, you have gathered pictures of some volunteers in a nearby neighborhood. The dataset is labeled.
<img src="images/face_images.png" style="width:550px;height:250px;">

Run the following code to normalize the dataset and learn about its shapes.

In [4]:
#X_train_orig, Y_train_orig, X_test_orig, Y_test_orig, classes = load_dataset()

# Normalize image vectors
#X_train = X_train_orig/255.
#X_test = X_test_orig/255.

# Reshape
#Y_train = Y_train_orig.T
#Y_test = Y_test_orig.T

#print ("number of training examples = " + str(X_train.shape[0]))
#print ("number of test examples = " + str(X_test.shape[0]))
#print ("X_train shape: " + str(X_train.shape))
#print ("Y_train shape: " + str(Y_train.shape))
#print ("X_test shape: " + str(X_test.shape))
#print ("Y_test shape: " + str(Y_test.shape))

In [5]:
#####Predict MoA from chemical genetics#####

#X_train_test = np.random.randn(600, 4000)
#Y_train_test = np.random.randn(600, 1)


#X_test_test = np.random.randn(200, 4000)
#Y_test_test = np.random.randn(200, 1)


#data = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/MoA/X_test.csv")))
#X_test = np.array(data).astype("float").T

#data = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/MoA/X_train.csv")))
#X_train = np.array(data).astype("float").T

#data = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/MoA/X_predict.csv")))
#X_predict = np.array(data).astype("float").T

#data = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/MoA/Y_test.csv")))
#Y_test = np.array(data).astype("float").T
#Y_test = tf.keras.backend.one_hot(Y_test, 4) [0]

#data = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/MoA/Y_train.csv")))
#Y_train = np.array(data).astype("float").T
#Y_train = tf.keras.backend.one_hot(Y_train, 4) [0]

#names_predictions = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/MoA/Predict_names.csv")))

#print ("number of training examples = " + str(X_train.shape[0]))
#print ("number of test examples = " + str(X_test.shape[0]))
#print ("X_train shape: " + str(X_train.shape))
#print ("Y_train shape: " + str(Y_train.shape))
#print ("X_test shape: " + str(X_test.shape))
#print ("Y_test shape: " + str(Y_test.shape))
#print ("X_predict shape: " + str(X_predict.shape))

In [6]:
#####Predict drug-drug interactions#####

DATA_DIR = "/Users/bassler/Desktop/Documents/dl_chem_genetics"

def load_csv(filename):
    """Read a numeric CSV file from DATA_DIR and return it transposed."""
    # The context manager closes the file once it has been read.
    with open(os.path.join(DATA_DIR, filename), newline="") as f:
        return np.array(list(csv.reader(f))).astype("float").T

def load_labels(filename, num_classes=3):
    """Load class indices and one-hot encode them."""
    indices = load_csv(filename).astype("int32")  # tf.one_hot expects integer indices
    # [0] keeps the first row of the transposed label array, giving shape (n_examples, num_classes).
    return tf.one_hot(indices, num_classes)[0]

X_test = load_csv("X_test.csv")
X_train = load_csv("X_train.csv")
X_predict = load_csv("X_predict.csv")

Y_test = load_labels("Y_test.csv")
Y_train = load_labels("Y_train.csv")
Y_predict = load_labels("Y_predict.csv")


#names_predictions = list(csv.reader(open("/Users/bassler/Desktop/Documents/dl_chem_genetics/Predict_names.csv")))

print ("number of training examples = " + str(X_train.shape[0]))
print ("number of test examples = " + str(X_test.shape[0]))
print ("X_train shape: " + str(X_train.shape))
print ("Y_train shape: " + str(Y_train.shape))
print ("X_test shape: " + str(X_test.shape))
print ("Y_test shape: " + str(Y_test.shape))
print ("X_predict shape: " + str(X_predict.shape))
print ("Y_predict shape: " + str(Y_predict.shape))

number of training examples = 1182
number of test examples = 393
X_train shape: (1182, 2604)
Y_train shape: (1182, 3)
X_test shape: (393, 2604)
Y_test shape: (393, 3)
X_predict shape: (393, 2604)
Y_predict shape: (393, 3)
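
The label shapes printed above come from one-hot encoding: each class index becomes a row with a single 1. The following is a minimal numpy sketch of that step, assuming the labels arrive as one row of class indices after transposing. The label values are made up for illustration, and the notebook itself uses TensorFlow's `one_hot` rather than this numpy version.

```python
import numpy as np

# Hypothetical labels as they might look after loading and transposing: shape (1, 5).
labels = np.array([[0., 2., 1., 1., 0.]])
num_classes = 3

# Indexing an identity matrix with integer labels yields one-hot rows.
one_hot = np.eye(num_classes)[labels.astype(int)]   # shape (1, 5, 3)
Y = one_hot[0]                                       # shape (5, 3)

print(one_hot.shape, Y.shape)
print(Y)
```

Taking `[0]` removes the leading axis so that the result has one row per example and one column per class.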


**Details of the "Face" dataset**:
- Images are of shape (64,64,3)
- Training: 600 pictures
- Test: 150 pictures

## 2 - Building a model in Keras

Keras is very good for rapid prototyping. In just a short time you will be able to build a model that achieves outstanding results.

Here is an example of a model in Keras:

```python
def model(input_shape):
    """
    input_shape: The height, width and channels as a tuple.  
        Note that this does not include the 'batch' as a dimension.
        If you have a batch like 'X_train', 
        then you can provide the input_shape using
        X_train.shape[1:]
    """
    
    # Define the input placeholder as a tensor with shape input_shape. Think of this as your input image!
    X_input = Input(input_shape)

    # Zero-Padding: pads the border of X_input with zeroes
    X = ZeroPadding2D((3, 3))(X_input)

    # CONV -> BN -> RELU Block applied to X
    X = Conv2D(32, (7, 7), strides = (1, 1), name = 'conv0')(X)
    X = BatchNormalization(axis = 3, name = 'bn0')(X)
    X = Activation('relu')(X)

    # MAXPOOL
    X = MaxPooling2D((2, 2), name='max_pool')(X)

    # FLATTEN X (means convert it to a vector) + FULLYCONNECTED (just Dense is necessary for bliss score prediction)
    X = Flatten()(X)
    X = Dense(1, activation='sigmoid', name='fc')(X)

    # Create model. This creates your Keras model instance, you'll use this instance to train/test the model.
    model = Model(inputs = X_input, outputs = X, name='HappyModel')
    
    return model
```

#### Variable naming convention

* Note that Keras uses a different convention with variable names than we've previously used with numpy and TensorFlow. 
* Instead of creating unique variable names for each step and each layer, such as 
```
X = ...
Z1 = ...
A1 = ...
```
* Keras re-uses and overwrites the same variable at each step:
```
X = ...
X = ...
X = ...
```
* The exception is `X_input`, which we kept separate since it's needed later.

#### Objects as functions
* Notice how there are two pairs of parentheses in each statement.  For example:
```
X = ZeroPadding2D((3, 3))(X_input)
```
* The first is a constructor call which creates an object (ZeroPadding2D).
* In Python, objects can be called as functions. Search for 'python object as function' and you can read this blog post: [Python Pandemonium](https://medium.com/python-pandemonium/function-as-objects-in-python-d5215e6d1b0d).  See the section titled "Objects as functions."
* The single line is equivalent to this:
```
ZP = ZeroPadding2D((3, 3)) # ZP is an object that can be called as a function
X = ZP(X_input) 
```
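
The same pattern can be reproduced in plain Python with a class that defines `__call__`. This is a minimal sketch: the class `Scale` is a hypothetical stand-in for a layer and is not part of Keras.

```python
class Scale:
    """A tiny stand-in for a layer: configured once, then called on inputs."""

    def __init__(self, factor):
        self.factor = factor          # configuration stored by the constructor

    def __call__(self, values):
        # Defining __call__ is what lets an instance be used like a function.
        return [self.factor * v for v in values]

# Two steps: build the object, then call it.
double = Scale(2)
step_by_step = double([1, 2, 3])

# One line, mirroring the two pairs of parentheses used with layers.
one_line = Scale(2)([1, 2, 3])

print(step_by_step, one_line)
```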

**Exercise**: Implement a `HappyModel()`.  
* This assignment is more open-ended than most. 
* Start by implementing a model using the architecture we suggest, and run through the rest of this assignment using that as your initial model.
* Later, come back and try out other model architectures. 
* For example, you might take inspiration from the model above, but then vary the network architecture and hyperparameters however you wish. 
* You can also use other functions such as `AveragePooling2D()`, `GlobalMaxPooling2D()`, `Dropout()`. 

**Note**: Be careful with your data's shapes. Use what you've learned in the videos to make sure your convolutional, pooling and fully-connected layers are adapted to the volumes you're applying them to.

In [7]:
# GRADED FUNCTION: HappyModel

def HappyModel(input_shape, n_layers, l1_coef, l2_coef):
    """
    Implementation of the HappyModel.
    
    Arguments:
    input_shape -- shape of the images of the dataset
        (height, width, channels) as a tuple.  
        Note that this does not include the 'batch' as a dimension.
        If you have a batch like 'X_train', 
        then you can provide the input_shape using
        X_train.shape[1:]
    n_layers -- number of hidden Dense blocks; only used by the commented-out architecture (1)
    l1_coef, l2_coef -- L1 and L2 regularization coefficients; only used by the commented-out architecture (1)

    Returns:
    model -- a Model() instance in Keras
    """
    
    ### START CODE HERE ###
    # Feel free to use the suggested outline in the text above to get started, and run through the whole
    # exercise (including the later portions of this notebook) once. Then come back and try out other
    # network architectures as well. 
    
    # Define the input placeholder as a tensor with shape input_shape. Think of this as your input image!
    X_input = Input(input_shape)
    
    

    # Deep_neural network with  n_layers (1)
    #X = Dense(64, activation= "linear", name='fc_start', use_bias='false', kernel_initializer="glorot_normal",
    #          kernel_regularizer=tf.keras.regularizers.l1_l2(l1_coef, l2_coef))(X_input)
    #X = BatchNormalization(axis = 1, name = 'bn_start')(X)
    #X = Activation('relu')(X)
    
    #for i in range(n_layers):
    #    X = Dense(64, activation= "linear", name='fc_'+str(i), use_bias='false', kernel_initializer="glorot_normal",
    #          kernel_regularizer=tf.keras.regularizers.l1_l2(l1_coef, l2_coef))(X)
    #    X = BatchNormalization(axis = 1, name = 'bn_'+str(i))(X)
    #    X = Activation('relu')(X)
        
    #X = Dense(1, activation= "relu", name='fc_end', use_bias='True', kernel_initializer="glorot_normal",
    #          kernel_regularizer=tf.keras.regularizers.l1_l2(l1_coef, l2_coef))(X)
    #X = BatchNormalization(axis = 1, name = 'bn_end')(X)
    
    #X = Dense(3, use_bias=True)(X)
    #X = Activation('softmax')(X)
    
    #1/2 fully_connected layer neural network to predict interactions (Francois Chollet) (2)
    X = Dense(32, activation="relu")(X_input)
    X = Dropout(0.5)(X)
    X = Dense(32, activation="relu")(X)
    X = Dense(3, activation="softmax")(X)
    
    
    #1 fully_connected layer neural network to predict MoA (Francois Chollet) (3)
    #X = Dense(64, activation="relu")(X_input)
    #X = Dropout(0.5)(X)
    #X = Dense(1, activation="softmax")(X)

    # Create model. This creates your Keras model instance, you'll use this instance to train/test the model.
    model = Model(inputs = X_input, outputs = X, name='HappyModel')
    
    ### END CODE HERE ###
    
    return model

You have now built a function to describe your model. To train and test this model, there are four steps in Keras:
1. Create the model by calling the function above  

2. Compile the model by calling `model.compile(optimizer = "...", loss = "...", metrics = ["accuracy"])`  

3. Train the model on train data by calling `model.fit(x = ..., y = ..., epochs = ..., batch_size = ...)`  

4. Test the model on test data by calling `model.evaluate(x = ..., y = ...)`  

If you want to know more about `model.compile()`, `model.fit()`, `model.evaluate()` and their arguments, refer to the official [Keras documentation](https://keras.io/models/model/).

#### Step 1: create the model.  
**Hint**:  
The `input_shape` parameter is a tuple (height, width, channels).  It excludes the batch number.  
Try `X_train.shape[1:]` as the `input_shape`.
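
As a quick check of what `shape[1:]` returns, here is a small numpy sketch. Both arrays are hypothetical placeholders filled with zeros: one mimics a batch of 64x64 RGB images, the other a batch of flat feature vectors.

```python
import numpy as np

# Hypothetical batches: 8 RGB images of 64x64, and 8 flat feature vectors of length 2604.
image_batch = np.zeros((8, 64, 64, 3))
vector_batch = np.zeros((8, 2604))

# Slicing off the first axis drops the batch dimension and keeps the per-example shape.
print(image_batch.shape[1:])
print(vector_batch.shape[1:])
```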

In [10]:
n_layers = 10
l1_coef = 0.1
l2_coef = 0.1
### START CODE HERE ### (1 line)
happyModel = HappyModel(X_train.shape[1:],n_layers, l1_coef, l2_coef)
### END CODE HERE ###

(2604,)

#### Step 2: compile the model

**Hint**:  
Optimizers you can try include `'adam'`, `'sgd'` or others.  See the documentation for [optimizers](https://keras.io/optimizers/)  
The "happiness detection" is a binary classification problem.  The loss function that you can use is `'binary_crossentropy'`.  Note that `'categorical_crossentropy'` won't work with your data set as it's formatted, because the data is an array of 0 or 1 rather than two arrays (one for each category).  For one-hot labels with more than two classes, `'categorical_crossentropy'` is the usual choice.  Documentation for [losses](https://keras.io/losses/)
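
To see how the two losses differ on one-hot labels, here is a minimal numpy sketch. The two functions are hand-written illustrations of the standard formulas, not Keras's own implementations, and the label and probability values are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples of -sum_k y_k * log(p_k)."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean over samples and outputs of the per-output two-class cross-entropy."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    per_output = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(per_output))

# One hypothetical example whose true class is 1, with predicted class probabilities.
y_true = np.array([[0.0, 1.0, 0.0]])
y_prob = np.array([[0.2, 0.7, 0.1]])

cce = categorical_crossentropy(y_true, y_prob)
bce = binary_crossentropy(y_true, y_prob)
print(cce, bce)
```

The categorical loss only looks at the probability assigned to the true class, while the binary loss treats every output as its own yes/no problem and averages over them, so the two values are not directly comparable.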

In [9]:
happyModel.compile(optimizer = "adam", loss = "binary_crossentropy", metrics = ["accuracy", tf.keras.metrics.Precision()])

logdir = os.path.join("logs", datetime.datetime.now().strftime("%Y%m%d-%H%M%S"))
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

#### Step 3: train the model

**Hint**:  
Use the `X_train` and `Y_train` variables.  Use integers for `epochs` and `batch_size`.

**Note**: If you run `fit()` again, the `model` will continue to train with the parameters it has already learned instead of reinitializing them.

In [66]:
class_weight = {0: 100.,
                1: 1.,
                2: 500.}

happyModel.fit(x = X_train, y = Y_train, epochs = 300, steps_per_epoch=16, class_weight=class_weight,  callbacks=[tensorboard_callback]) # batch_size = 64

Epoch 1/300
...
Epoch 300/300


<tensorflow.python.keras.callbacks.History at 0x7f8f88dfd580>
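
The `class_weight` dictionary passed to `fit()` above makes errors on classes 0 and 2 count more than errors on class 1. One common heuristic for choosing such weights, which is not necessarily how the values above were picked, is to weight each class inversely to its frequency. The helper name `balanced_class_weights` and the class counts below are hypothetical.

```python
def balanced_class_weights(counts):
    """Weight each class by n_samples / (n_classes * class_count)."""
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

# Hypothetical class counts, for illustration only.
counts = {0: 100, 1: 800, 2: 100}
weights = balanced_class_weights(counts)
print(weights)
```

With this formula rare classes get weights above 1 and common classes get weights below 1, and the weighted example count stays equal to the original number of examples.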

#### Step 4: evaluate model  
**Hint**:  
Use the `'X_test'` and `'Y_test'` variables to evaluate the model's performance.

In [67]:
### START CODE HERE ### (1 line)
preds = happyModel.evaluate(X_test, Y_test, steps=1)

### END CODE HERE ###
print()
print ("Loss = " + str(preds[0]))
print ("Test Accuracy = " + str(preds[1]))
print ("Test Precision = " + str(preds[2]))
print(preds)


# Convert predicted probabilities to one-hot rows: 1 for the most probable class, 0 elsewhere.
probs = happyModel.predict(X_predict)
y_pred = (probs == probs.max(axis=1, keepdims=True)).astype(int)


y_true = Y_predict
target_names = ['antagonism', 'neutral', 'synergy'] 


print(classification_report(y_true, y_pred, target_names=target_names))


#from sklearn.metrics import f1_score, precision_score, recall_score, confusion_matrix
#y_pred1 = model.predict(X_test)
#y_pred = np.argmax(y_pred1, axis=1)

# Print f1, precision, and recall scores
#print(precision_score(y_test, y_pred , average="macro"))
#print(recall_score(y_test, y_pred , average="macro"))
#print(f1_score(y_test, y_pred , average="macro"))




Loss = 0.9403237104415894
Test Accuracy = 0.580152690410614
Test Precision = 0.5805626511573792
[0.9403237104415894, 0.580152690410614, 0.5805626511573792]
              precision    recall  f1-score   support

  antagonism       0.19      0.39      0.25        31
     neutral       0.87      0.58      0.70       325
     synergy       0.13      0.38      0.19        37

   micro avg       0.55      0.55      0.55       393
   macro avg       0.40      0.45      0.38       393
weighted avg       0.75      0.55      0.62       393
 samples avg       0.55      0.55      0.55       393
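
The per-class precision and recall in the report above can be reproduced by hand. The sketch below uses numpy on made-up probabilities and labels (none of the numbers come from this notebook) and picks the predicted class with `argmax`.

```python
import numpy as np

# Hypothetical predicted probabilities for 4 examples and 3 classes.
probs = np.array([
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.5, 0.4, 0.1],
])
# Hypothetical true classes, one-hot encoded.
y_true = np.eye(3, dtype=int)[[0, 1, 1, 0]]

# argmax picks exactly one class per row, even when two probabilities tie.
y_pred = np.eye(3, dtype=int)[probs.argmax(axis=1)]

true_pos = (y_true & y_pred).sum(axis=0)   # correct predictions per class
predicted = y_pred.sum(axis=0)             # how often each class was predicted
actual = y_true.sum(axis=0)                # how often each class truly occurs

# Guard against division by zero for classes never predicted or never present.
precision = np.divide(true_pos, predicted, out=np.zeros(3), where=predicted > 0)
recall = np.divide(true_pos, actual, out=np.zeros(3), where=actual > 0)

print(precision, recall)
```

Comparing each row against its maximum, as the evaluation cell does, marks every tied class with a 1, whereas `argmax` always selects a single class. The two approaches agree whenever the maximum is unique.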



In [11]:
happyModel.summary()

Model: "HappyModel"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         [(None, 2604)]            0         
_________________________________________________________________
dense (Dense)                (None, 32)                83360     
_________________________________________________________________
dropout (Dropout)            (None, 32)                0         
_________________________________________________________________
dense_1 (Dense)              (None, 32)                1056      
_________________________________________________________________
dense_2 (Dense)              (None, 3)                 99        
Total params: 84,515
Trainable params: 84,515
Non-trainable params: 0
_________________________________________________________________
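
The parameter counts in the summary above can be verified with simple arithmetic: a fully-connected layer has one weight per input-output pair plus one bias per unit, and dropout has no parameters. The helper `dense_params` is a hypothetical name used only for this check.

```python
def dense_params(n_inputs, n_units, use_bias=True):
    """Weights (n_inputs * n_units) plus one bias per unit when use_bias is set."""
    return n_inputs * n_units + (n_units if use_bias else 0)

# (inputs, units) for the three Dense layers listed in the summary.
layer_sizes = [(2604, 32), (32, 32), (32, 3)]
per_layer = [dense_params(n_in, n_out) for n_in, n_out in layer_sizes]
total = sum(per_layer)

print(per_layer, total)  # [83360, 1056, 99] 84515
```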


In [12]:
%tensorboard --logdir logs

ERROR: Could not find `tensorboard`. Please ensure that your PATH
contains an executable `tensorboard` program, or explicitly specify
the path to a TensorBoard binary by setting the `TENSORBOARD_BINARY`
environment variable.

In [13]:
#For MoA prediction
#print(happyModel.predict(X_predict))
#prediction_data = pd.DataFrame(happyModel.predict(X_predict))

#prediction_data['drugs'] = names_predictions
#prediction_data.columns = ['cell_wall', 'membrane_stess', 'dna', 'protein_synthesis', "drugs"]
#cols = prediction_data.columns.tolist()
#cols = cols[-1:] + cols[:-1]
#prediction_data = prediction_data[cols]
#print(prediction_data)

#prediction_data.to_csv("/Users/bassler/Desktop/Documents/dl_chem_genetics/Prediction_neural_network.csv", index=False, sep='\t')


#### Expected performance   
If your `HappyModel()` function worked, its accuracy should be better than random guessing (50% accuracy).

To give you a point of comparison, our model gets around **95% test accuracy in 40 epochs** (and 99% train accuracy) with a mini batch size of 16 and "adam" optimizer.

#### Tips for improving your model

If you have not yet achieved a very good accuracy (>= 80%), here are some tips:

- Use blocks of CONV->BATCHNORM->RELU such as:
```python
X = Conv2D(32, (3, 3), strides = (1, 1), name = 'conv0')(X)
X = BatchNormalization(axis = 3, name = 'bn0')(X)
X = Activation('relu')(X)
```
until your height and width dimensions are quite low and your number of channels quite large (≈32 for example).  
You can then flatten the volume and use a fully-connected layer.
- Use MAXPOOL after such blocks.  It will help you lower the dimension in height and width.
- Change your optimizer. We find 'adam' works well. 
- If you get memory issues, lower your batch_size (e.g. 12).
- Run more epochs until you see the train accuracy no longer improves. 

**Note**: If you perform hyperparameter tuning on your model, the test set actually becomes a dev set, and your model might end up overfitting to the test (dev) set. Normally, you'll want separate dev and test sets.  The dev set is used for parameter tuning, and the test set is used once to estimate the model's performance in production.
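
A separate dev set can be carved out by shuffling example indices once and slicing them. This is a minimal numpy sketch: the function name `split_indices`, the 15% fractions, and the example count are all assumptions chosen for illustration.

```python
import numpy as np

def split_indices(n_examples, dev_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle indices once and slice them into disjoint train, dev, and test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_examples)
    n_test = int(round(n_examples * test_fraction))
    n_dev = int(round(n_examples * dev_fraction))
    test_idx = order[:n_test]
    dev_idx = order[n_test:n_test + n_dev]
    train_idx = order[n_test + n_dev:]
    return train_idx, dev_idx, test_idx

train_idx, dev_idx, test_idx = split_indices(1000)
print(len(train_idx), len(dev_idx), len(test_idx))
```

The index arrays can then be used to select rows from the feature and label arrays, for example `X[train_idx]`, so that hyperparameters are tuned on the dev rows and the test rows are touched only once.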

## 3 - Conclusion

Congratulations, you have created a proof of concept for "happiness detection"!

## Key Points to remember
- Keras is a tool we recommend for rapid prototyping. It allows you to quickly try out different model architectures.
- Remember the four steps in Keras: 


1. Create  
2. Compile  
3. Fit/Train  
4. Evaluate/Test  

## 4 - Other useful functions in Keras (Optional)

Two other basic features of Keras that you'll find useful are:
- `model.summary()`: prints the details of your layers in a table with the sizes of its inputs/outputs
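- `plot_model()`: plots your graph in a nice layout. You can save it as a ".png" file with the `to_file` argument, or render it inline in the notebook with `SVG()`, if you'd like to share it on social media ;). The saved file appears in the file directory, which you can open via "File" then "Open..." in the upper bar of the notebook.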

Run the following code.

In [14]:
#happyModel.summary()

In [15]:
#plot_model(happyModel, to_file='HappyModel.png', show_shapes=True, rankdir="LR")
#SVG(model_to_dot(happyModel).create(prog='dot', format='svg'))