Before you turn this lab in, make sure everything runs as expected. First, **restart the kernel** (in the menubar, select Kernel$\rightarrow$Restart) and then **run all cells** (in the menubar, select Cell$\rightarrow$Run All).

Make sure you fill in any place that says `YOUR CODE HERE` or "YOUR ANSWER HERE", as well as your name and collaborators below:

In [1]:
DRIVER = "Mike"
NAVIGATOR = "Jonathan"

# Optimization II Lab

Welcome to the Optimization II lab! By the end of this lab you will have:

- Performed optimization on a deep learning model with several different optimizers
- Visualized optimizer dynamics via TensorBoard
- Visualized optimizer performance via SacredBoard

Let's get started!

# Unit Test Variables

The following code defines variables that will be used in subsequent unit tests. Do not attempt to redefine any of these variables throughout the notebook!

In [2]:
from IPython.display import HTML

def passed():
    print('✅')

## Task

- Define a convolutional neural network trainer `CNNTrainer` in `trainers.py`

## Remarks

- You can either fill in the code below or directly edit `trainers.py`

In [3]:
from keras.models import Sequential
from keras.layers import Conv2D, Dense, MaxPooling2D, Dropout, Flatten

Using TensorFlow backend.


In [4]:
from trainer import Trainer

class CNNTrainer(Trainer):
    """Convolutional Neural Network Classifier"""

    def build_model(self):
        from keras.models import Sequential
        from keras.layers import Conv2D, Dense, MaxPooling2D, Dropout, Flatten

        # One convolutional block followed by a small fully connected classifier.
        model = Sequential()
        model.add(Conv2D(36, kernel_size=2, strides=1, input_shape=self.X[0].shape, activation='relu'))
        model.add(MaxPooling2D(2))
        model.add(Flatten())
        model.add(Dense(100, activation='relu'))
        # Softmax output layer with one unit per class.
        model.add(Dense(self.Y[0].shape[0], activation='softmax'))

        self.model = model
        
import trainers
trainers.CNNTrainer = CNNTrainer
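
As a sanity check on the architecture above, the layer shapes and parameter counts can be worked out by hand. The sketch below assumes 28×28×1 inputs and 10 output classes (the shapes the unit test below asserts) and Keras's default `'valid'` padding. The helper names are illustrative and are not part of `trainers.py`.

```python
def valid_output(size, window, stride=1):
    """Spatial size after a 'valid' (unpadded) convolution or pooling window."""
    return (size - window) // stride + 1

def conv2d_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

conv_side = valid_output(28, window=2)                   # 28 - 2 + 1 = 27
conv_p = conv2d_params(kernel=2, in_channels=1, filters=36)
pool_side = valid_output(conv_side, window=2, stride=2)  # (27 - 2) // 2 + 1 = 13
flat = pool_side * pool_side * 36                        # units after Flatten
hidden_p = dense_params(flat, 100)
out_p = dense_params(100, 10)
total = conv_p + hidden_p + out_p

print(conv_side, pool_side, flat)
print(conv_p, hidden_p, out_p, total)
```

These values should agree with the model summary that Keras prints when the CNN runs start later in the notebook.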

# `CNNTrainer` Tests

In [5]:
import numpy as np
from trainers import CNNTrainer

cnn = CNNTrainer(config={})
cnn.load_data()
cnn.build_model()

nb_conv = len([layer for layer in cnn.model.layers if layer.name.startswith('conv')])
assert nb_conv >= 1
assert cnn.model.input_shape == (None, 28, 28, 1)
assert cnn.model.output_shape == (None, 10)
X_ = np.random.randn(16, 28, 28, 1)
y_pred = cnn.model.predict_classes(X_, verbose=0)
assert np.all(0 <= y_pred) and np.all(y_pred < 10)

passed()

✅


## Task

- Optimize your `CNNTrainer` model with every optimizer in Keras except `TFOptimizer`

## Suggestion

- Start with a simple model like `MLRTrainer` and only a small number of training examples so you can debug quickly. Once you're sure everything works correctly, experiment with the larger, slower `CNNTrainer`

## Requirement

- Keep the sacred [Mongo Observer](http://sacred.readthedocs.io/en/latest/observers.html#mongo-observer) so you can view the results afterward in [sacredboard](https://github.com/chovanecm/sacredboard)

In [6]:
from train import ex
import keras
import keras.backend as K
from sacred.observers import MongoObserver
mongo_observer = MongoObserver.create()
ex.observers.append(mongo_observer)

for trainer in ['MLRTrainer']:
    run = ex.run(config_updates=dict(trainer=trainer), options={'--name': trainer})
    K.clear_session()

INFO - MLRTrainer - Running command 'main'
INFO - MLRTrainer - Started run with ID "17"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
flatten_2 (Flatten)          (None, 784)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 10)                7850      
_________________________________________________________________
activation_1 (Activation)    (None, 10)                0         
Total params: 7,850.0
Trainable params: 7,850.0
Non-trainable params: 0.0
_________________________________________________________________
INFO:tensorflow:Summary name dense_3/kernel:0 is illegal; using dense_3/kernel_0 instead.


INFO - tensorflow - Summary name dense_3/kernel:0 is illegal; using dense_3/kernel_0 instead.


INFO:tensorflow:Summary name dense_3/bias:0 is illegal; using dense_3/bias_0 instead.


INFO - tensorflow - Summary name dense_3/bias:0 is illegal; using dense_3/bias_0 instead.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


INFO - MLRTrainer - Result: 0.9530881419181824
INFO - MLRTrainer - Completed after 0:00:04


## Task

- Visualize optimizer dynamics with TensorBoard

## Requirements

- In TensorBoard, select only the plots that track training loss during your experiment runs by using an appropriate regex (e.g. `loss`)
- Take a screenshot
- Load it into an `IPython.display.Image` object called `tensorboard_screenshot`
- Display it
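
To illustrate how a tag filter behaves, here is a small sketch using Python's `re` module. It assumes the filter matches the pattern anywhere in the tag name (search semantics) and that the scalar tags are named `loss`, `val_loss`, `acc`, and `val_acc`. Those tag names are hypothetical; the real ones depend on how `train.py` configures logging.

```python
import re

# Hypothetical scalar tag names; check the TensorBoard sidebar for the real ones.
tags = ['loss', 'val_loss', 'acc', 'val_acc']

def matching(pattern, names):
    """Return the names in which the pattern is found anywhere (re.search)."""
    return [name for name in names if re.search(pattern, name)]

broad = matching(r'loss', tags)     # unanchored: also picks up validation loss
strict = matching(r'^loss$', tags)  # anchored: training loss only

print(broad)
print(strict)
```

If validation loss is also logged, an anchored pattern such as `^loss$` is one way to keep only the training curve.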

In [7]:
optimizers = ['adam','rmsprop', 'adagrad','adadelta','adam','adamax','nadam','sgd']
for trainer in ['CNNTrainer']:
    for optimizer in optimizers:
        run = ex.run(config_updates=dict(trainer=trainer, optimizer=optimizer), options={'--name': optimizer})
        K.clear_session()

INFO - adam - Running command 'main'
INFO - adam - Started run with ID "18"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
INFO:tensorflow:Summary name conv2d_1/kernel:0 is illegal; using conv2d_1/kernel_0 instead.


INFO - tensorflow - Summary name conv2d_1/kernel:0 is illegal; using conv2d_1/kernel_0 instead.


INFO:tensorflow:Summary name conv2d_1/bias:0 is illegal; using conv2d_1/bias_0 instead.


INFO - tensorflow - Summary name conv2d_1/bias:0 is illegal; using conv2d_1/bias_0 instead.


INFO:tensorflow:Summary name dense_1/kernel:0 is illegal; using dense_1/kernel_0 instead.


INFO - tensorflow - Summary name dense_1/kernel:0 is illegal; using dense_1/kernel_0 instead.


INFO:tensorflow:Summary name dense_1/bias:0 is illegal; using dense_1/bias_0 instead.


INFO - tensorflow - Summary name dense_1/bias:0 is illegal; using dense_1/bias_0 instead.


INFO:tensorflow:Summary name dense_2/kernel:0 is illegal; using dense_2/kernel_0 instead.


INFO - tensorflow - Summary name dense_2/kernel:0 is illegal; using dense_2/kernel_0 instead.


INFO:tensorflow:Summary name dense_2/bias:0 is illegal; using dense_2/bias_0 instead.


INFO - tensorflow - Summary name dense_2/bias:0 is illegal; using dense_2/bias_0 instead.


Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


INFO - adam - Result: 0.55132279753685
INFO - adam - Completed after 0:00:12
INFO - rmsprop - Running command 'main'
INFO - rmsprop - Started run with ID "19"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

INFO - rmsprop - Result: 0.4865875277519226
INFO - rmsprop - Completed after 0:00:14





INFO - adagrad - Running command 'main'
INFO - adagrad - Started run with ID "20"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

INFO - adagrad - Result: 0.5176237754821778
INFO - adagrad - Completed after 0:00:12





INFO - adadelta - Running command 'main'
INFO - adadelta - Started run with ID "21"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

INFO - adadelta - Result: 0.5207791938781738
INFO - adadelta - Completed after 0:00:10





INFO - adam - Running command 'main'
INFO - adam - Started run with ID "22"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


INFO - adam - Result: 0.5383777227401734
INFO - adam - Completed after 0:00:15
INFO - adamax - Running command 'main'
INFO - adamax - Started run with ID "23"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

INFO - adamax - Result: 0.507279653429985
INFO - adamax - Completed after 0:00:13





INFO - nadam - Running command 'main'
INFO - nadam - Started run with ID "24"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

INFO - nadam - Result: 0.5189736087322235
INFO - nadam - Completed after 0:00:13





INFO - sgd - Running command 'main'
INFO - sgd - Started run with ID "25"


_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 27, 27, 36)        180       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 13, 13, 36)        0         
_________________________________________________________________
flatten_1 (Flatten)          (None, 6084)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 100)               608500    
_________________________________________________________________
dense_2 (Dense)              (None, 10)                1010      
Total params: 609,690.0
Trainable params: 609,690.0
Non-trainable params: 0.0
_________________________________________________________________
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10

INFO - sgd - Result: 0.5050789251327514
INFO - sgd - Completed after 0:00:11





# TensorBoard Screenshot Test

In [8]:
from IPython.display import Image
tensorboard_screenshot = Image('tensorboard_screenshot.png')

In [None]:
import IPython

assert type(tensorboard_screenshot) == IPython.core.display.Image
assert 'PNG' in str(tensorboard_screenshot.data) or 'JPG' in str(tensorboard_screenshot.data)

passed()

## Task

- Visualize optimizer performance with sacredboard

## Requirements

- Sort the optimizer runs in sacredboard by loss
- Take a screenshot
- Load it into an `IPython.display.Image` object called `sacredboard_screenshot`
- Display it

In [None]:
# YOUR CODE HERE
raise NotImplementedError()

# SacredBoard Screenshot Test

In [None]:
from IPython.display import Image

In [None]:
sacredboard_screenshot = Image('sacredboard_screenshot.png')

In [None]:
sacredboard_screenshot

In [None]:
import IPython

assert type(sacredboard_screenshot) == IPython.core.display.Image
assert 'PNG' in str(sacredboard_screenshot.data) or 'JPG' in str(sacredboard_screenshot.data)

passed()

## Question

- Which optimizer did you find worked best? Does that surprise you? If so, why?

The best optimizer was Nadam, which is essentially Adam (RMSprop with momentum) using Nesterov momentum. That is not surprising, because it scales each update using running estimates of the mean and (uncentered) variance of the previous gradients.

## Question

- What is the intuition behind that optimizer?

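Momentum helps the optimizer keep moving through saddle points and flat regions. In addition, RMSprop rescales each parameter's step by a running average of recent squared gradients, which keeps updates from becoming too large or too small when gradient magnitudes vary. Combining the two gave the best optimizer in our runs.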
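
This intuition can be made concrete with a minimal NumPy sketch of the update rules on a toy quadratic loss. It is a simplified illustration, not Keras's implementation: the function names, the toy loss, and the hyperparameter values are assumptions chosen for the example. The third rule is plain Adam (momentum plus RMSprop-style scaling, with bias correction); Nadam additionally applies a Nesterov look-ahead to the momentum term, which is omitted here.

```python
import numpy as np

def grad(w):
    """Gradient of the toy loss 0.5 * (100 * w[0]**2 + w[1]**2)."""
    return np.array([100.0, 1.0]) * w

def momentum_step(w, state, lr=0.005, mu=0.9):
    # The velocity accumulates past gradients, so consistent directions speed up.
    state['v'] = mu * state.get('v', 0.0) - lr * grad(w)
    return w + state['v']

def rmsprop_step(w, state, lr=0.01, rho=0.9, eps=1e-8):
    g = grad(w)
    # A running average of squared gradients rescales each coordinate's step.
    state['s'] = rho * state.get('s', 0.0) + (1 - rho) * g ** 2
    return w - lr * g / (np.sqrt(state['s']) + eps)

def adam_step(w, state, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    g = grad(w)
    t = state['t'] = state.get('t', 0) + 1
    state['m'] = b1 * state.get('m', 0.0) + (1 - b1) * g       # first moment
    state['s'] = b2 * state.get('s', 0.0) + (1 - b2) * g ** 2  # second moment
    m_hat = state['m'] / (1 - b1 ** t)  # bias correction for the zero init
    s_hat = state['s'] / (1 - b2 ** t)
    return w - lr * m_hat / (np.sqrt(s_hat) + eps)

def run(step, n_steps=500):
    """Apply one update rule repeatedly, starting from w = (1, 1)."""
    w, state = np.array([1.0, 1.0]), {}
    for _ in range(n_steps):
        w = step(w, state)
    return w

for step in (momentum_step, rmsprop_step, adam_step):
    print(step.__name__, run(step))
```

Note that the two coordinates have curvatures that differ by a factor of 100, yet the RMSprop and Adam steps treat them alike because each coordinate is divided by its own gradient scale.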

# Challenge Activities

- Tune the hyperparameters of the best optimizer
- Do the same thing in TensorFlow
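
One way to approach the first challenge is a small grid search. The sketch below only builds the grid of candidate settings. The hyperparameter names (`lr`, `beta_1`) and their values are assumptions for illustration, and whether they can be passed through `config_updates` depends on what the experiment config in `train.py` exposes.

```python
from itertools import product

def make_grid(**choices):
    """Yield one dict per combination of the supplied hyperparameter values."""
    names = sorted(choices)
    for values in product(*(choices[name] for name in names)):
        yield dict(zip(names, values))

# Hypothetical search space for the best optimizer.
grid = list(make_grid(lr=[1e-3, 2e-3, 4e-3], beta_1=[0.9, 0.95]))

for settings in grid:
    print(settings)
    # Assuming the config accepts these keys, each candidate could be run as:
    # ex.run(config_updates=dict(trainer='CNNTrainer', optimizer='nadam', **settings))
```

Naming each run after its settings would make the candidates easier to compare afterward in sacredboard.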