# Building Deep Learning Applications with Keras 2.0
**Instructor:** Adam Geitgey

Keras is a popular programming framework for deep learning that simplifies the process of building deep learning applications. Instead of providing all the functionality itself, it uses either TensorFlow or Theano behind the scenes and adds a standard, simplified programming interface on top. In this course, learn how to install Keras and use it to build a simple deep learning model. Explore the many powerful pre-trained deep learning models included in Keras and how to use them. Discover how to deploy Keras models, and how to transfer data between Keras and TensorFlow so that you can take advantage of all the TensorFlow tools while using Keras. When you wrap up this course, you'll be ready to start building and deploying your own models with Keras.

* **Keras:** A high-level framework for building neural networks with only a few lines of code
    * Backends: Keras doesn't do all of the work itself; it uses either **TensorFlow** or **Theano** behind the scenes to do all of its calculations. Keras abstracts away a lot of the complexity of using those tools while still giving you many of the benefits.
    * Best practices are built in to Keras
    * The default settings in Keras are designed to give you good results in most cases
    * Keras comes with several pre-trained built-in models for image recognition; you can use the pre-trained models to recognize common objects in images or **you can adapt these models to create a custom image recognition system with your own data**
    
#### Keras backends
* Keras is a **front-end layer** that relies on a separate deep learning **back-end library** under the hood for the processing
* Keras supports multiple backends, including:
    * TensorFlow (by Google)
    * Theano (by University of Montreal)

#### Theano
* Created at MILA (Montreal Institute for Learning Algorithms) at the University of Montreal
* Theano has been around for a decade (much longer than TensorFlow)
* It has been the tool behind many breakthroughs in ML research
* Works well with Python
* Fully supports GPU acceleration

#### TensorFlow
* Created at Google in 2015
* Google uses TensorFlow internally to build many of its popular services, like Google Translate
* Advanced support for distributed processing across multiple
* TensorFlow works with Google's cloud ML platform
* It's even easy to switch between Theano and TensorFlow with Keras


* The decision to use either Theano or TensorFlow often depends on the other tools you wish to use with your project and which library supports them

#### Using Keras vs. TensorFlow

<img src='../data/keras1.png' width="600" height="300" align="center"/>

* In this course we'll be using Keras with a TensorFlow backend
* TensorFlow gives you more control over almost every detail, whereas Keras offers fast and easy experimentation
* When is using TensorFlow alone a better choice?
    * Researching new types of machine learning models
    * Building a large-scale system to support many users
    * If processing and memory efficiency is more important than time saved while coding
* When to choose Keras?
    * Education and experimentation
    * Prototyping
    * Production systems that don't have highly specialized requirements
    
#### Supervised Learning
* The branch of machine learning where the computer learns how to perform a function by looking at labeled training data

#### Customizable layer settings
* Layer activation function
* Initializer function for node weights
* Regularization function for node weights
* But remember that default settings are a good start
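
As a rough illustration of the settings listed above, the sketch below sets all three on a single `Dense` layer. The specific choices (`relu`, `he_normal`, and an L2 penalty of `0.01`) are arbitrary values picked for this example, not recommendations from the course.

```python
from keras.layers import Dense
from keras import regularizers

# One dense layer with all three settings specified explicitly.
# The values below are illustrative assumptions, not tuned choices.
layer = Dense(
    32,
    activation='relu',                         # layer activation function
    kernel_initializer='he_normal',            # initializer for node weights
    kernel_regularizer=regularizers.l2(0.01),  # regularization for node weights
)

# Inspect the configuration Keras recorded for this layer
config = layer.get_config()
print(config['units'], config['activation'])
```

Leaving these arguments out falls back to the Keras defaults, which is usually a reasonable starting point.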

#### Types of layers
* Dense
    * Example: `keras.layers.Dense()`
* Convolutional
    * Example: `keras.layers.convolutional.Conv2D()`
    * Usually used to process images or spatial data
* Recurrent
    * Example: `keras.layers.recurrent.LSTM()`
    * Special layers that have a "memory" built in to each neuron
    * Useful for processing sequential data (like sentences, where many previous words are used to help determine the meaning and context of a given word).
* You can mix layers of different types in a single model
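
A minimal sketch of mixing layer types in one model, assuming hypothetical 28 x 28 grayscale inputs and 10 output classes (neither comes from the course). A `Flatten` layer is used to connect the convolutional output to the dense layer.

```python
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential()
# Convolutional layer for hypothetical 28x28 single-channel images
model.add(Conv2D(8, (3, 3), activation='relu', input_shape=(28, 28, 1)))
# Flatten the 2D feature maps into a vector so a dense layer can use them
model.add(Flatten())
# Dense output layer with one node per (hypothetical) class
model.add(Dense(10, activation='softmax'))

# The leading None is the batch size, which is not fixed in advance
print(model.output_shape)
```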

### Neural networks train best when the data in each column is scaled to the same range (often 0-1)

```
from sklearn.preprocessing import MinMaxScaler

# Data needs to be scaled to a small range like 0 to 1 for the neural network to work well.
scaler = MinMaxScaler(feature_range=(0, 1))

# Scale both the training inputs and outputs
scaled_training = scaler.fit_transform(training_data_df)
# Reuse the scaling fitted on the training data for the test data
scaled_testing = scaler.transform(test_data_df)

# Print out the adjustment that the scaler applied to the total_earnings column of data
print("total_earnings values were scaled by multiplying by {:.10f} and adding {:.6f}".format(
    scaler.scale_[8], scaler.min_[8]))
```

* **Note:** default `activation` is `linear`
* When we compile the model, we have to specify the loss function. The loss function is how Keras measures how close the NN's predictions are to the expected values. `mse` or `mean_squared_error` is a common loss function when the model predicts numeric values
* A good choice for optimizer that works for most ML tasks is `adam`
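
To make the loss concrete, here is a small NumPy sketch of what mean squared error computes. The expected and predicted values are made up for the illustration.

```python
import numpy as np

# Hypothetical expected values and model predictions (illustrative only)
expected = np.array([0.2, 0.5, 0.9])
predicted = np.array([0.1, 0.5, 1.0])

# Mean squared error: the average of the squared differences
mse = np.mean((expected - predicted) ** 2)
print(mse)
```

Squaring the differences means large misses are penalized much more heavily than small ones.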

#### Training and evaluating the model
* `model.fit()`
* The most important parameters we have to pass in are the training features and the expected output
* **epoch:** A single training pass across the entire training dataset is called an epoch
    * If we do too few passes, the NN won't make accurate predictions
    * If we do too many, it will waste time and might also cause overfitting problems
    * The best way to tune this is to try training the neural network and stop doing additional training passes when the network stops getting more accurate
* We can also ask Keras to shuffle the order of the training data randomly
    * Note that NNs typically train best when the data is shuffled
* `verbose=2` tells Keras to print more detailed information during training so we can watch what's going on.
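
The sketch below puts these `model.fit()` parameters together on synthetic data. It also shows one possible way to stop when the network stops improving, using the `EarlyStopping` callback; that callback, the `validation_split` value, and the `patience` value are assumptions added for this example and are not part of the course notes.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping

# Synthetic stand-in data: 9 input features, 1 output (illustrative only)
rng = np.random.RandomState(0)
X = rng.rand(200, 9)
Y = X.sum(axis=1, keepdims=True) / 9.0

model = Sequential()
model.add(Dense(16, input_shape=(9,), activation='relu'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')

# Stop if the validation loss has not improved for 3 epochs in a row
early_stop = EarlyStopping(monitor='val_loss', patience=3)

history = model.fit(
    X,
    Y,
    epochs=50,             # upper limit on passes over the training data
    shuffle=True,          # shuffle the training data each epoch
    validation_split=0.2,  # hold out 20% of the data to monitor val_loss
    verbose=0,             # set to 2 to print one line per epoch
    callbacks=[early_stop],
)

# Number of epochs that actually ran before training stopped
epochs_run = len(history.history['loss'])
print(epochs_run)
```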

```
# Load the separate test data set
test_data_df = pd.read_csv("sales_data_test_scaled.csv")

X_test = test_data_df.drop('total_earnings', axis=1).values
Y_test = test_data_df[['total_earnings']].values

test_error_rate = model.evaluate(X_test, Y_test, verbose = 0)
print("The mean squared error (MSE) for the test data set is: {}".format(test_error_rate))
```

* The above code may generate some warnings from TensorFlow, but that's normal and nothing to worry about (according to Adam Geitgey)
* The smaller the error term, the better (it means that the neural network is making predictions that are, on average, very close to the actual values)

#### Making predictions
* **One trick to watch out for:** Keras always assumes that we are going to ask for multiple predictions with multiple output values in each prediction, so it **always returns predictions as a 2D array**
* **Also remember:** Usually data is scaled before being processed through a NN. When this is the case, **always perform the reverse scale operations on predictions to calculate actual predictions in their *original units*.**
    * $\star$ This is where having printed the multiplicative and additive constants from the scaling process becomes useful!!
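
A minimal NumPy sketch of both points: pulling a single value out of the 2D prediction array, then reversing a min-max scaling. The earnings figures and the prediction are made-up values, and the names `scale` and `offset` stand in for the multiplicative and additive constants printed during scaling.

```python
import numpy as np

# Hypothetical unscaled target column (e.g. earnings in dollars)
earnings = np.array([[100000.0], [250000.0], [400000.0]])

# Min-max constants for a (0, 1) range: scaled = x * scale + offset
data_min, data_max = earnings.min(), earnings.max()
scale = 1.0 / (data_max - data_min)
offset = -data_min * scale

scaled = earnings * scale + offset

# Pretend this 2D array came back from model.predict(): one row, one output
prediction = np.array([[0.5]])

# Take the first output of the first prediction, then undo the scaling
scaled_value = prediction[0][0]
original_units = (scaled_value - offset) / scale
print(original_units)
```

The reverse operation subtracts the additive constant first and then divides by the multiplicative one, which is the scaling applied in the opposite order.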
   
#### Saving and loading models
* So far, we've always retrained the NN every time we've used it, but *there are problems with this approach!!*
    * Large neural networks can take hours or days to train; instead of retraining each time we run our program, we can train it once and save the results to a file. Then, whenever we want to use the neural network, we can just load it back up and use it
* To save a Keras model:

```
# Save the model to disk
model.save("trained_model.h5")
print("Model saved to disk.")
```
* Note that `hdf5` is the standard format for saving Keras models; the `.h5` extension indicates that the data is stored in the `hdf5` format
* **hdf5** is a binary file format designed for storing large arrays of numerical data; the convention is to use `.h5` as the filename extension
* **When you save the model, it saves both the structure of the neural network and the trained weights that determine how the neural network works.**

#### Load a model saved to disk

```
from keras.models import load_model

model = load_model('trained_model.h5')
```
* Since the file contains the structure and training parameters, this single line recreates our entire trained NN
* We don't have to redeclare the structure of our neural network again
* $\star$ **We can now use the model we loaded exactly like any other model**
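
As a quick check of the save and load round trip, the sketch below saves a tiny model to a temporary `.h5` file, loads it back, and confirms that both copies give the same predictions. The model here is untrained and uncompiled, which is enough to show that structure and weights survive the round trip; the layer sizes and file name are arbitrary, and writing `.h5` files assumes the `h5py` package is installed.

```python
import os
import tempfile

import numpy as np
from keras.models import Sequential, load_model
from keras.layers import Dense

# Tiny stand-in model (sizes are arbitrary for the illustration)
model = Sequential()
model.add(Dense(4, input_shape=(3,), activation='relu'))
model.add(Dense(1, activation='linear'))

# Some random inputs to compare predictions before and after saving
rng = np.random.RandomState(0)
X = rng.rand(5, 3)
before = model.predict(X)

# Save to a temporary folder, then load the model back from disk
path = os.path.join(tempfile.mkdtemp(), "demo_model.h5")
model.save(path)
restored = load_model(path)
after = restored.predict(X)

# The restored model should reproduce the original predictions
print(np.allclose(before, after))
```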

## Pre-trained models in Keras
* Not only can you build your own ML NN models with Keras, but you can use models built by other developers
* Keras provides several popular image recognition models built-in
* Just by installing Keras, you have access to pre-trained image recognition models that you can use in your own projects
* The image recognition models included with Keras are all trained to recognize images from the ImageNet dataset
* The **ImageNet dataset** is a collection of millions of pictures of objects that have been labelled so that you can use them to train computers to recognize those objects
* **ILSVRC:** ImageNet Large-Scale Visual Recognition Challenge
* The pre-trained models included with Keras are trained on the more limited data used by this contest: data set of 1,000 types of common objects

### Image Recognition Models included with Keras (4)

#### VGG (Visual Geometry Group at the University of Oxford)
   * Deep NN with either 16 or 19 layers
   * State-of-the-art in 2014
   * Still widely used, but takes a lot of memory to run

#### ResNet50 (Microsoft Research)
   * State-of-the-art from 2015
   * 50 layer NN 
   * Manages to be more accurate but still use less memory than the VGG design

#### Inception-v3 (Google)
   * Design from 2015 that also performs very well

#### Xception (François Chollet, author of Keras)
   * Xception is an improved version of Inception-v3
   * More accurate than Inception-v3, while using the same amount of memory
   
* $\star$ **Even if you want to recognize an object that's not in the 1,000 object training set, it's much faster to start with a pre-trained model and fine-tune it to your needs, instead of training your own model from scratch.** $\star$

* Also: $\star$ **These models illustrate the progression of the state-of-the-art in image recognition. It's very useful to be familiar with these common neural network designs since they are so often re-used or adapted to solve real-world problems.** $\star$

### ResNet50
* All of the pre-trained models included with Keras are in the `applications` package
* Needs input images to be 224 pixels x 224 pixels
* Note that using low-resolution images (like 224 x 224) is common even in the latest image recognition models.
* **Remember that NNs work best with small numbers so we need to normalize the data before we feed it to the model. ResNet has a built-in normalization function called `preprocess_input()` which will do that for us.**


```
import numpy as np
from keras.preprocessing import image
from keras.applications import resnet50

# Load Keras' ResNet50 model that was pre-trained against the ImageNet database
model = resnet50.ResNet50()

# Load the image file, resizing it to 224x224 pixels (required by this model)
img = image.load_img("bay.jpg", target_size=(224, 224))

# Convert the image to a numpy array
x = image.img_to_array(img)

# Add a fourth dimension since Keras expects a list of images
x = np.expand_dims(x, axis=0)

# Scale the input image to the range used in the trained network
x = resnet50.preprocess_input(x)

# Run the image through the deep neural network to make a prediction
predictions = model.predict(x)

# Look up the names of the predicted classes. Index zero is the results for the first image.
predicted_classes = resnet50.decode_predictions(predictions, top=9)

print("This is an image of:")

for imagenet_id, name, likelihood in predicted_classes[0]:
    print(" - {}: {:.2f} likelihood".format(name, likelihood))
```
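
The `np.expand_dims` step in the code above is easy to overlook, so here is a small sketch of what it does to the array shape. A zero-filled array stands in for a real image.

```python
import numpy as np

# Stand-in for a 224x224 RGB image already converted to an array
single_image = np.zeros((224, 224, 3), dtype="float32")

# Add a leading batch dimension so the array looks like a list of one image
batch = np.expand_dims(single_image, axis=0)

print(single_image.shape)
print(batch.shape)
```

The model always works on batches, so even a single image has to be wrapped in a batch of size one.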

* **Note that `model.predict()` returns a predictions object.** The predictions object is a 1,000 element array of floating point numbers. Each element in the array tells us how likely it is that the picture contains each of the 1,000 objects the model was trained to recognize.
* `resnet50.decode_predictions()` decodes these prediction objects for us and by default gives us the top five most likely predictions
    * However, if we add in `top=9` we'll get, for example, the top 9 most likely predictions
    
* **Note!!:** The first time you run the code, Keras will connect to the internet and download the pre-trained ResNet50 weights. This means you'll need internet access to run it, and about 100MB of data will be downloaded
* **On macOS**, if the download fails with an SSL certificate error:
    * 1) Open a Finder window
    * 2) Go to the Applications folder
    * 3) Find the Python 3 folder
    * 4) Inside that folder, run "Install Certificates.command"
* **Now you can run the model**
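
To illustrate what the `top` argument of `decode_predictions()` amounts to, here is a NumPy sketch that picks the highest-scoring classes from a score array. The six class names and scores are made up; a real ImageNet model returns 1,000 scores per image, and this is not the library's own implementation.

```python
import numpy as np

# Hypothetical class names and scores for one image (illustrative only)
class_names = ["seashore", "lakeside", "dock", "pier", "boathouse", "sandbar"]
scores = np.array([0.41, 0.22, 0.05, 0.12, 0.02, 0.18])

top = 3

# argsort sorts ascending, so reverse it and keep the first `top` indices
top_indices = np.argsort(scores)[::-1][:top]
top_predictions = [(class_names[i], float(scores[i])) for i in top_indices]

for name, likelihood in top_predictions:
    print(" - {}: {:.2f} likelihood".format(name, likelihood))
```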

## Monitoring a Keras model with TensorBoard

### Export Keras logs in TensorFlow format
* TensorFlow comes with a great web-based tool called **TensorBoard**, which lets us visualize our model's structure and monitor its training
* To use TensorBoard, we need our Keras model to write log files in a format that TensorBoard can read
* TensorBoard uses the information in these log files to generate its visualizations
* Below we'll add TensorBoard logging to our Keras model

```
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import *

training_data_df = pd.read_csv("sales_data_training_scaled.csv")

X = training_data_df.drop('total_earnings', axis=1).values
Y = training_data_df[['total_earnings']].values

# Define the model
model = Sequential()
model.add(Dense(50, input_dim=9, activation='relu', name='layer_1'))
model.add(Dense(100, activation='relu', name='layer_2'))
model.add(Dense(50, activation='relu', name='layer_3'))
model.add(Dense(1, activation='linear', name='output_layer'))
model.compile(loss='mean_squared_error', optimizer='adam')

# Create a TensorBoard logger
logger = keras.callbacks.TensorBoard(
    log_dir='logs',
    write_graph=True,
    histogram_freq=5
)

# Train the model
model.fit(
    X,
    Y,
    epochs=50,
    shuffle=True,
    verbose=2,
    callbacks=[logger]
)

# Load the separate test data set
test_data_df = pd.read_csv("sales_data_test_scaled.csv")

X_test = test_data_df.drop('total_earnings', axis=1).values
Y_test = test_data_df[['total_earnings']].values

test_error_rate = model.evaluate(X_test, Y_test, verbose=0)
print("The mean squared error (MSE) for the test data set is: {}".format(test_error_rate))
```

* Note that by default, Keras won't create any TensorFlow log files; to do that, we need to add a few lines of code

#### TensorBoard syntax and parameters
* `logger = keras.callbacks.TensorBoard(...)`
* **Parameters:**
    * `log_dir`: tell Keras which folder to write the log files to 
    * `write_graph`: True/False: tells Keras to log model structure
    * `histogram_freq`: int: log extra statistics on how each layer of our NN is working; the `5` means that for every five passes through the training data, it will write out statistics; the more often you log data, obviously, the bigger the log files get
    
* Customize what data gets logged:
    * By default Keras will only log details of the training process, but it won't log the structure of the model.
    
* **We also need to tell our model to use the logger when we call `model.fit()`**
    * **`callbacks=[logger]`**
    * Note that `callbacks` expects a list
    
#### Naming your layers
* One more tweak to make the visualizations easier to read: name your layers and the name will show up in the TensorBoard visualization

### Visualize the computational graph
* It's always helpful to visualize what's happening with your data. This is where TensorBoard comes in; it lets us visualize exactly what Keras's TensorFlow backend is doing
* The `logdir` parameter tells TensorBoard which set of log files you want to visualize
* **To open TensorBoard, open a terminal window**:
    * `tensorboard --logdir=06/logs`

<img src='../data/tensorboard1.png' width="800" height="400" align="center"/>

<img src='../data/tensorboard2.png' width="900" height="450" align="center"/>

<img src='../data/tensorboard3.png' width="900" height="450" align="center"/>

<img src='../data/tensorboard4.png' width="600" height="300" align="center"/>

* Each line represents a tensor or array of data being passed between the layers. The numbers along the lines represent the size of the tensor or array
* `? x 9`: `?` represents batch size
* Click on a layer to expand:

<img src='../data/tensorboard5.png' width="600" height="300" align="center"/>

* A "neat" feature in TensorBoard is the ability to trace the path of data through the graph
* Let's say that we want to see what is required to generate an output from the neural network
* Very helpful to debug what's going on, especially for complex models
* There's a lot going on here (above) other than just the neural network itself
    * When you use Keras, it will tend to build more complex TensorFlow models than what you would have built yourself if you were using TensorFlow directly
    * Keras will try to build models that capture best practices that you may not even be aware of 
    * For example, **gradient clipping** can help prevent issues when you're training very deep neural networks, and Keras always includes gradient clipping in your model by default

### Visualizing training progress
* You'll often want to compare different neural network designs to see which gives you the best results with your training data
* Using TensorBoard, we can visually monitor the progress of training as it happens and even compare different training runs against each other
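
One common way to keep training runs separate so they can be compared is to give each run its own subfolder under a shared log folder. The sketch below only builds the path; `RUN_NAME` and its value are hypothetical labels chosen for the example.

```python
import os

# Hypothetical label for this experiment; change it for each training run
RUN_NAME = "run 1 with 50 nodes"

# Each run gets its own subfolder under the shared "logs" folder
run_log_dir = os.path.join("logs", RUN_NAME)
print(run_log_dir)

# This path would then be passed as log_dir when creating the logger,
# e.g. keras.callbacks.TensorBoard(log_dir=run_log_dir)
```

Pointing TensorBoard at the parent folder (for example `tensorboard --logdir=logs`) then shows each subfolder as a separate run.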

### Exporting Google Cloud-compatible models
* Now, you want to be able to scale your Keras model up in production to serve lots of users
* Since we're using Keras with a TensorFlow backend, we can export our Keras model as a TensorFlow model
* Once we have a TensorFlow model, we can upload that to the Google Cloud ML service
* Using the Google Cloud ML service, we can serve as many users as we need
* To export this model as a TensorFlow model, we have to use some TensorFlow specific code
* We create a TensorFlow `saved_model` builder object
    * This object lets us save a TensorFlow model with custom options
    * The only parameter is the name of the folder we want to save the model in 
* We also need to declare what the inputs and outputs of our model are
* TensorFlow models can have several inputs and outputs so we need to tell TensorFlow which specific inputs and outputs we'll use when making predictions
* Keras makes this easy; it keeps track of the input and output of our model for us so we just need to pass that to TensorFlow: `model.input` and `model.output`
* We also have to create a TensorFlow **`signature_def`**:
    * a `signature_def` is sort of like a function declaration in a programming language
    * TF looks for this to know how to run the prediction function of our model
    * **This code will be the same every time**
* We also have to call TF's **`.add_meta_graph_and_variables()`** function:
    * This tells TF that we want to save both the structure of our model and the current trained weights of the model 
    * First pass reference to current Keras session : `K.get_session()`
    * Then assign the model a special tag via `tags` so that TF knows this model is meant for serving users
    * Lastly, pass in the `signature_def` created above
        * This code will also be the same every time. 
    * **Don't forget to save the model with `model_builder.save()`**

### Configuring a new Google Cloud account and project
* A **Project** is a workspace within Google Cloud
* Each application you build can be set up as its own project
* Your project will get assigned a unique project ID corresponding to the project title; this is the ID you'll use to access this project from your programs
* Since the Google Cloud platform has so many features, not all of them are enabled by default
    * For example, we need to enable the Google Cloud ML service before we can use it
    
#### Google Cloud SDK
* The Google Cloud SDK is a set of tools that lets us work with the Google Cloud services from our computer
* `cloud.google.com/sdk/downloads`
* The advantage of hosting your Keras model in Google Cloud is that it is accessible from anywhere in the world
* Google servers will run the model and you are only charged based on how many requests are made
* It's a great solution for using a Keras model in production if you don't want to maintain your own servers