## Introduction

In this notebook, we will:

1. Get more practice manipulating our training data to fit our neural networks
2. Learn how to use some of the tooling in the Keras ecosystem
3. Learn how to debug issues with our neural networks


In [1]:
%tensorflow_version 1.x
import numpy as np

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.datasets import mnist
from keras.optimizers import Adam
from keras.utils import to_categorical
from keras.callbacks import TensorBoard

%load_ext tensorboard

Using TensorFlow backend.


## Load Dataset

In [2]:
# Load Datset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

print("Training Dataset Size: ", x_train.shape)
print("Training Labels Size: ", y_train.shape)
print("Testing Dataset Size: ", x_test.shape)
print("Testing Labels Size: ", y_test.shape)

Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
Training Dataset Size:  (60000, 28, 28)
Training Labels Size:  (60000,)
Testing Dataset Size:  (10000, 28, 28)
Testing Labels Size:  (10000,)


### Preparing the Data

1. Use the `numpy.resize()` API to transform our input data from a `28x28` image to a `784` 1-D array. See: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.resize.html
2. Each pixel value is an integer between $[0, 255]$.  Scale the input to be in range of $[0, 1]$ by dividing by `255`.
3. Use the `keras.utils.to_categorical` API to convert our digit labels from integers (0, 1, 2,..., 9) to a one-hot encoding format.  See: https://keras.io/utils/#to_categorical

In [3]:
# Resize image to 1D array and values to between [0, 1]
# x_train = ...
# x_test = ...

# Convert our labels to one-hot encoded format
# y_train = ...
# y_test = ...

x_train = np.resize(x_train, (60000, 28 * 28)) / 255.
x_test = np.resize(x_test, (60000, 28 * 28)) / 255.

y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

print("Training Dataset Size: ", x_train.shape)
print("Training Labels Size: ", y_train.shape)
print("Testing Dataset Size: ", x_test.shape)
print("Testing Labels Size: ", y_test.shape)

Training Dataset Size:  (60000, 784)
Training Labels Size:  (60000, 10)
Testing Dataset Size:  (60000, 784)
Testing Labels Size:  (10000, 10)


## Defining Our Neural Network 

Network specificatoins:

* `3` hidden layer neural network
* For each hidden layer `50` hidden units with the `relu` activation function
* Define an output layer with the appropriate activation and output units
* Use an appropriate loss function for categorical data
* Use the `Adam` optimizer with a learning rate of `0.01`
* Add the `accuracy` metric to our training. See: https://keras.io/metrics/

In [4]:
# Define our neural network
model = Sequential()

# Add your neural network layers here
# model.add(...)
# ...
model.add(Dense(50, activation="relu", input_dim=784))
model.add(Dense(50, activation="relu"))
model.add(Dense(50, activation="relu"))
model.add(Dense(10, activation="softmax"))

model.summary()




Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense_1 (Dense)              (None, 50)                39250     
_________________________________________________________________
dense_2 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_3 (Dense)              (None, 50)                2550      
_________________________________________________________________
dense_4 (Dense)              (None, 10)                510       
Total params: 44,860
Trainable params: 44,860
Non-trainable params: 0
_________________________________________________________________


In [5]:
# Compile your model and define your loss and optimization algorithm
# model.compile(...)
model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=0.01), metrics=['accuracy'])





# Training our Neural Network with Additional Tooling

* `batch_size` of `100`
* `epoch` of `5`
* 15% validation split

Add the following additonal callbacks:

* Tensorboard callback: Records data about training so we can visualize it in the TensorBoard tool.  Use the following settings: `log_dir=./logs_2.3, histogram_freq=1, batch_size=32, write_graph=True, write_grads=True, write_images=True`)  See: https://keras.io/callbacks/ and https://www.tensorflow.org/guide/summaries_and_tensorboard


In [6]:
# Fit your model on the training data
# callbacks = []
# history = model.fit(..., callbacks=callbacks)
callbacks = [
    TensorBoard(log_dir='./logs23', histogram_freq=1, batch_size=32,
                write_graph=True, write_grads=True, write_images=True,),
]

history = model.fit(x_train, y_train, batch_size=10, epochs=5, validation_split=0.15,
                    callbacks=callbacks, verbose=0)
history.history

Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where















{'acc': [0.8916666597872972,
  0.9266274433393105,
  0.9358039145142424,
  0.9407646987835566,
  0.9418039144721686],
 'loss': [0.40204413004897765,
  0.30066736097522295,
  0.2829510455171643,
  0.26388072243606003,
  0.26860611337207013],
 'val_acc': [0.9425555490122901,
  0.937333326207267,
  0.9502222167121039,
  0.9323333266046312,
  0.9505555491977268],
 'val_loss': [0.23591159666742897,
  0.2771712839444,
  0.23044781093545477,
  0.38695099426793855,
  0.22923739687304887]}

## Viewing Results in Tensorboard

In [7]:
%tensorboard --logdir logs23

## Questions

1. Look at Tensorboard's `scalars` tab.  Are we overfitting the model?
2. Look at the `Graphs` tab.  Our `model.summary()` only shows four layers above.  Why are there so many more blocks in this diagram of our neural network?
3. Look at the `Histograms` and `Distributions` tab.  What are they showing us?  How can this help use debug our neural network?

## Summary

* Keras callbacks allow you to record metrics (and customize the behavior) during training.
* Tensorboard is a useful tool for the Tensorflow backend to visualize and debug your networks.
* These tools are essential when building your own networks because of all the debugging and hyperparameter tuning that is needed.