# Introducing the Keras Sequential API

**Learning Objectives**
  - Build a DNN model using the Keras Sequential API
  - Learn how to use feature columns in a Keras model
  - Learn how to save/load, and deploy a Keras model on GCP

## Introduction

The [Keras Sequential API](https://keras.io/models/sequential/) allows you to create TensorFlow models layer by layer. This is useful for building most kinds of machine learning models, but it does not allow you to create models that share layers, reuse layers, or have multiple inputs or outputs. In this lab, we'll see how to build a simple deep neural network model using the Keras Sequential API and feature columns. Once we have trained our model, we will deploy it using AI Platform and see how to call our model for online prediction.


In [None]:
# Ensure that we have the latest version of TensorFlow installed.
!pip3 freeze | grep tf-nightly-2.0-preview || pip3 install tf-nightly-2.0-preview

Start by importing the necessary libraries for this lab.

In [None]:
import shutil, os, datetime
import numpy as np
import tensorflow as tf

from matplotlib import pyplot as plt
from tensorflow import keras

print(tf.__version__)

## Load raw data 

We will start with the CSV files that we wrote out in the [first notebook](../01_explore/taxifare.ipynb) of this sequence. So that you don't have to run that notebook, we saved a copy in `../data`.

In [None]:
!ls -l ../data/*.csv

In [None]:
!head ../data/taxi*.csv

## Use tf.data to read the CSV files

We wrote these functions for reading data in the [third notebook](../03_tfdata/input_pipeline.ipynb) of this sequence. We set up the column names from our CSV files, denote the label column, and specify the default values for each column.

In [None]:
CSV_COLUMNS  = ['fare_amount',  'pickup_datetime',
                'pickup_longitude', 'pickup_latitude', 
                'dropoff_longitude', 'dropoff_latitude', 
                'passenger_count', 'key']
LABEL_COLUMN = 'fare_amount'
DEFAULTS     = [[0.0],['na'],[0.0],[0.0],[0.0],[0.0],[0.0],['na']]

def features_and_labels(row_data):
    for unwanted_col in ['pickup_datetime', 'key']:
        row_data.pop(unwanted_col)
    label = row_data.pop(LABEL_COLUMN)
    return row_data, label  # features, label

# load the training data
def load_dataset(pattern, batch_size=1, mode=tf.estimator.ModeKeys.EVAL):
    dataset = (tf.data.experimental.make_csv_dataset(pattern, batch_size, CSV_COLUMNS, DEFAULTS)
             .map(features_and_labels) # features, label
             .cache())
    if mode == tf.estimator.ModeKeys.TRAIN:
        dataset = dataset.shuffle(buffer_size=1000).repeat()
    dataset = dataset.prefetch(1)  # overlap input preparation with training; buffer size is 1 batch (not AUTOTUNE)
    return dataset
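
To make the effect of `features_and_labels` concrete, here is a minimal sketch that applies the same logic to a plain Python dict. The dict stands in for one parsed CSV row and its values are made up for illustration; in the real pipeline each value is a batch of tensors rather than a scalar.

```python
LABEL_COLUMN = 'fare_amount'
UNWANTED_COLS = ['pickup_datetime', 'key']

def features_and_labels(row_data):
    # Drop the columns that the model does not consume.
    for unwanted_col in UNWANTED_COLS:
        row_data.pop(unwanted_col)
    # Remove the label so that only input features remain in the dict.
    label = row_data.pop(LABEL_COLUMN)
    return row_data, label

# Hypothetical row; the values are illustrative only.
example_row = {
    'fare_amount': 8.5,
    'pickup_datetime': '2019-01-01 00:00:00 UTC',
    'pickup_longitude': -73.98,
    'pickup_latitude': 40.74,
    'dropoff_longitude': -73.99,
    'dropoff_latitude': 40.75,
    'passenger_count': 2.0,
    'key': 'row-0',
}

# Pass a copy, because pop() mutates the dict it is called on.
features, label = features_and_labels(dict(example_row))
print(sorted(features))
print(label)
```

The function returns a `(features, label)` pair, which is the shape of data that Keras expects from a dataset passed to `fit()`.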

## Build a simple keras DNN model

We will use feature columns to connect our raw data to our Keras DNN model. Feature columns make it easy to perform common types of feature engineering on your raw data. For example, you can one-hot encode categorical data, create feature crosses, embeddings, and more. We'll cover these later in the course, but if you want a sneak peek, browse the official TensorFlow [feature columns guide](https://www.tensorflow.org/guide/feature_columns).

In our case, we won't do any feature engineering. However, we still need to create a list of feature columns to specify the numeric values that will be passed on to our model. To do this, we use `tf.feature_column.numeric_column()`.

We use a Python dictionary comprehension to create the feature columns for our model, which is just a compact alternative to a for loop.

In [None]:
INPUT_COLS = ['pickup_longitude', 'pickup_latitude', 
              'dropoff_longitude', 'dropoff_latitude', 
              'passenger_count']

# input layer of feature columns
feature_columns = {
    colname : tf.feature_column.numeric_column(colname)
       for colname in INPUT_COLS
}
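
As a small sketch of that equivalence, the snippet below builds the same mapping twice, once with a dictionary comprehension and once with an explicit for loop. `NumericColumn` is a hypothetical stand-in for `tf.feature_column.numeric_column`, used only so that the example runs without TensorFlow.

```python
from collections import namedtuple

# Hypothetical stand-in for tf.feature_column.numeric_column.
NumericColumn = namedtuple('NumericColumn', ['key'])

INPUT_COLS = ['pickup_longitude', 'pickup_latitude',
              'dropoff_longitude', 'dropoff_latitude',
              'passenger_count']

# Dictionary comprehension: one expression builds the whole mapping.
columns_from_comprehension = {
    colname: NumericColumn(colname) for colname in INPUT_COLS
}

# Equivalent for loop: start with an empty dict and fill it key by key.
columns_from_loop = {}
for colname in INPUT_COLS:
    columns_from_loop[colname] = NumericColumn(colname)

print(columns_from_comprehension == columns_from_loop)  # True
```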

Next, we create the DNN model. The Sequential model is a linear stack of layers and when building a model using the Sequential API, you configure each layer of the model in turn. Once all the layers have been added, you compile the model. 

Before training a model, you must configure the learning process, which is done using the compile method. The compile method takes three arguments:

* An optimizer. This could be the string identifier of an existing optimizer (such as `rmsprop` or `adagrad`), or an instance of the [Optimizer class](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/optimizers).
* A loss function. This is the objective that the model will try to minimize. It can be the string identifier of an existing loss function from the [Losses class](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/losses) (such as categorical_crossentropy or mse), or it can be a custom objective function.
* A list of metrics. For any machine learning problem you will want a set of metrics to evaluate your model. A metric could be the string identifier of an existing metric or a custom metric function.
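
As a minimal sketch of the object form of these three arguments (string identifiers being the other option), the following compiles a throwaway model with an optimizer instance, a loss instance, and a metric instance. The name `sketch_model`, its layer sizes, and the learning rate of 0.001 are illustrative assumptions, not values used in this lab.

```python
import tensorflow as tf

# A throwaway two-layer model, used only to demonstrate compile().
sketch_model = tf.keras.Sequential([
    tf.keras.layers.Dense(units=8, activation='relu'),
    tf.keras.layers.Dense(units=1),
])

sketch_model.compile(
    # An optimizer instance lets you set hyperparameters such as the learning rate.
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    # A loss instance instead of the string 'mse'.
    loss=tf.keras.losses.MeanSquaredError(),
    # A built-in metric object instead of a string or a custom function.
    metrics=[tf.keras.metrics.RootMeanSquaredError()],
)
```

Passing instances is mostly useful when the defaults behind a string identifier are not what you want.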

We will add an additional custom metric called `rmse` to our list of metrics which will return the root mean square error. 

In [None]:
# Build a keras DNN model using Sequential API
model = keras.models.Sequential()    
model.add(keras.layers.DenseFeatures(feature_columns=feature_columns.values()))
model.add(keras.layers.Dense(units=32, activation="relu", name="h1"))
model.add(keras.layers.Dense(units=8, activation="relu", name="h2"))
model.add(keras.layers.Dense(units=1, activation="linear", name="output"))

# Create a custom evaluation metric
def rmse(y_true, y_pred):
    return tf.sqrt(tf.reduce_mean(tf.square(y_pred - y_true))) 

# Compile the keras model
model.compile(optimizer='adam', loss='mse', metrics=[rmse, 'mse'])
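
To show what the `rmse` function computes, here is a pure-Python sketch of the same formula on two hypothetical batches. It also illustrates a subtlety: when a metric is supplied as a plain function, Keras typically reports a running average of the per-batch values, which is not the same number as the RMSE computed over all examples at once. The batch values below are made up.

```python
import math

def rmse(y_true, y_pred):
    # Root of the mean of squared differences, mirroring the TensorFlow version.
    squared_errors = [(p - t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Two hypothetical batches of (true fares, predicted fares).
batch_a = ([10.0, 12.0], [10.0, 12.0])  # perfect predictions
batch_b = ([8.0, 6.0], [10.0, 8.0])     # every prediction is off by 2.0

rmse_a = rmse(*batch_a)
rmse_b = rmse(*batch_b)

# Averaging the per-batch values...
mean_of_batch_rmse = (rmse_a + rmse_b) / 2
# ...versus computing RMSE once over all four examples.
overall_rmse = rmse(batch_a[0] + batch_b[0], batch_a[1] + batch_b[1])

print(rmse_a, rmse_b)       # 0.0 2.0
print(mean_of_batch_rmse)   # 1.0
print(overall_rmse)         # sqrt(2), about 1.414
```

The two numbers differ because the square root is applied before averaging in one case and after in the other, so the `rmse` curve is best read as an approximation.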

## Train the model

To train our model, we typically use the `fit()` function. 

First, we'll set up some training variables like the batch size, the number of training examples, number of evaluations and number of evaluation examples. We'll choose a large number of training examples, since the training dataset repeats (similarly for the eval examples). Next, we create the training and evaluation dataset using the `load_dataset` function we wrote above. 

In [None]:
TRAIN_BATCH_SIZE = 32
NUM_TRAIN_EXAMPLES = 10000 * 5 # training dataset repeats, so it will wrap around
NUM_EVALS = 5  # how many times to evaluate
NUM_EVAL_EXAMPLES = 10000 # enough to get a reasonable sample, but not so much that it slows down

trainds = load_dataset(pattern='../data/taxi-train*', 
                       batch_size=TRAIN_BATCH_SIZE,
                       mode=tf.estimator.ModeKeys.TRAIN)

evalds = load_dataset(pattern='../data/taxi-valid*',
                      batch_size=1000,
                      mode=tf.estimator.ModeKeys.EVAL).take(NUM_EVAL_EXAMPLES//1000)

There are various arguments you can set when calling the [.fit() function](https://www.tensorflow.org/api_docs/python/tf/keras/Model#fit). Here we will specify the training data, the evaluation data, the number of epochs for training, and a parameter called `steps_per_epoch`.

The `steps_per_epoch` parameter is used to mark the end of training for a single epoch. Since the training dataset repeats and continuously feeds training samples into the `.fit()` method, it is not possible to know the end of an epoch unless the parameter `steps_per_epoch` is specified.

In [None]:
%%time
steps_per_epoch = NUM_TRAIN_EXAMPLES // (TRAIN_BATCH_SIZE * NUM_EVALS)

history = model.fit(x=trainds,
                    validation_data=evalds,
                    epochs=NUM_EVALS,
                    steps_per_epoch=steps_per_epoch)
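
The arithmetic behind these settings can be checked with plain Python. This sketch reuses the constants from the cells above; `EVAL_BATCH_SIZE` is a name introduced here for the literal `1000` passed to `load_dataset`.

```python
TRAIN_BATCH_SIZE = 32
NUM_TRAIN_EXAMPLES = 10000 * 5
NUM_EVALS = 5
NUM_EVAL_EXAMPLES = 10000
EVAL_BATCH_SIZE = 1000  # name introduced for this sketch

# Each "epoch" is one fifth of the training budget, measured in batches.
steps_per_epoch = NUM_TRAIN_EXAMPLES // (TRAIN_BATCH_SIZE * NUM_EVALS)
examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE
total_examples_seen = examples_per_epoch * NUM_EVALS

# take(n) keeps the first n batches of the evaluation dataset.
eval_batches = NUM_EVAL_EXAMPLES // EVAL_BATCH_SIZE

print(steps_per_epoch)      # 312
print(examples_per_epoch)   # 9984
print(total_examples_seen)  # 49920
print(eval_batches)         # 10
```

Because of the integer division, slightly fewer than `NUM_TRAIN_EXAMPLES` examples are consumed in total, and validation runs once at the end of each of the 5 epochs.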

### High-level model evaluation

Once we've run data through the model, we can call `.summary()` on the model to get a high-level summary of our network. We can also use the Keras utilities to plot a diagram of our model architecture, and plot the training and evaluation curves for the metrics we computed above.

In [None]:
model.summary()
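
The parameter counts reported by the summary can be reproduced by hand. This sketch assumes that the `DenseFeatures` layer outputs one value per numeric column (5 in total) and has no trainable weights of its own, and that each `Dense` layer has one weight per input-unit pair plus one bias per unit.

```python
def dense_param_count(n_inputs, n_units):
    # Weight matrix (n_inputs x n_units) plus one bias per unit.
    return n_inputs * n_units + n_units

# (layer name, number of units), matching the Sequential model above.
layer_sizes = [('h1', 32), ('h2', 8), ('output', 1)]

n_inputs = 5  # five numeric feature columns
param_counts = {}
for name, units in layer_sizes:
    param_counts[name] = dense_param_count(n_inputs, units)
    n_inputs = units  # this layer's output feeds the next layer

total_params = sum(param_counts.values())
print(param_counts)  # {'h1': 192, 'h2': 264, 'output': 9}
print(total_params)  # 465
```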

In [None]:
tf.keras.utils.plot_model(model, 'dnn_model.png', show_shapes=True, rankdir='LR')

In [None]:
# plot
nrows = 1
ncols = 2
fig = plt.figure(figsize=(10, 5))

for idx, key in enumerate(['loss', 'rmse']):
    ax = fig.add_subplot(nrows, ncols, idx+1)
    plt.plot(history.history[key])
    plt.plot(history.history['val_{}'.format(key)])
    plt.title('model {}'.format(key))
    plt.ylabel(key)
    plt.xlabel('epoch')
    plt.legend(['train', 'validation'], loc='upper left');

## Making predictions with our model

To make predictions with our trained model, we can call the [predict method](https://www.tensorflow.org/api_docs/python/tf/keras/Model#predict), passing to it a dictionary of values. The `steps` parameter determines the total number of steps before declaring the prediction round finished. Here, since we have just one example, we set `steps=1` (setting `steps=None` would also work). Note, however, that if `x` is a `tf.data` dataset or a dataset iterator, and `steps` is set to `None`, `predict` will run until the input dataset is exhausted.

In [None]:
model.predict(x={'pickup_longitude': tf.convert_to_tensor([-73.982683]),
                 'pickup_latitude': tf.convert_to_tensor([40.742104]),
                 'dropoff_longitude': tf.convert_to_tensor([-73.983766]),
                 'dropoff_latitude': tf.convert_to_tensor([40.755174]),
                 'passenger_count': tf.convert_to_tensor([3.0])}, 
              steps=1)

## Export and deploy our model

Of course, making individual predictions is not realistic, because we can't expect client code to have a model object in memory. For others to use our trained model, we'll have to export our model to a file, and expect client code to instantiate the model from that exported file. 

We'll export the model to a TensorFlow SavedModel format. Once we have a model in this format, we have lots of ways to "serve" the model, from a web application, from JavaScript, from mobile applications, etc.

In [None]:
OUTPUT_DIR = "./export/savedmodel"
shutil.rmtree(OUTPUT_DIR, ignore_errors=True)
EXPORT_PATH = os.path.join(OUTPUT_DIR, datetime.datetime.now().strftime("%Y%m%d%H%M%S"))
tf.saved_model.save(model, EXPORT_PATH) # with default serving function
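
One practical consequence of the `%Y%m%d%H%M%S` naming used for `EXPORT_PATH` is that directory names sort in chronological order. The sketch below uses two hypothetical export times; `export_path_for` is a helper introduced here for illustration.

```python
import datetime
import os

OUTPUT_DIR = './export/savedmodel'

def export_path_for(moment):
    # Same format string as the export cell: year, month, day, hour, minute, second.
    return os.path.join(OUTPUT_DIR, moment.strftime('%Y%m%d%H%M%S'))

# Two hypothetical export times on the same day.
earlier = export_path_for(datetime.datetime(2019, 6, 1, 9, 30, 0))
later = export_path_for(datetime.datetime(2019, 6, 1, 14, 5, 0))

print(os.path.basename(earlier))  # 20190601093000
print(os.path.basename(later))    # 20190601140500

# Plain string comparison picks the most recent export.
latest = max([later, earlier])
print(latest == later)            # True
```

Note that the export cell deletes `OUTPUT_DIR` before saving, so in this lab only one timestamped directory exists at a time.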

In [None]:
!saved_model_cli show --tag_set serve --signature_def serving_default --dir {EXPORT_PATH}
!find {EXPORT_PATH}
os.environ['EXPORT_PATH'] = EXPORT_PATH
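
For client code, the counterpart of `tf.saved_model.save` is `tf.saved_model.load`. The round trip can be sketched in isolation with a tiny `tf.Module`; the `Doubler` class and the temporary directory are assumptions made for this example and are not part of the lab's model.

```python
import tempfile

import tensorflow as tf

class Doubler(tf.Module):
    # The input signature fixes the shapes and dtypes stored in the SavedModel.
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def __call__(self, x):
        return 2.0 * x

# Save to a throwaway directory, then load it back as client code would.
export_dir = tempfile.mkdtemp()
tf.saved_model.save(Doubler(), export_dir)
restored = tf.saved_model.load(export_dir)

result = restored(tf.constant([1.0, 2.5])).numpy().tolist()
print(result)  # [2.0, 5.0]
```

The restored object does not need the original Python class definition, which is what lets other programs use the exported file.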

### Deploy our model to AI Platform

Finally, we will deploy our trained model to AI Platform and see how we can make online predictions.

In [None]:
%%bash
PROJECT=munn-sandbox
BUCKET=${PROJECT}
REGION=us-east1
MODEL_NAME=taxifare
VERSION_NAME=dnn

if [[ $(gcloud ai-platform models list --format='value(name)' | grep $MODEL_NAME) ]]; then
    echo "$MODEL_NAME already exists"
else
    # create model
    echo "Creating $MODEL_NAME"
    gcloud ai-platform models create --regions=$REGION $MODEL_NAME
fi

if [[ $(gcloud ai-platform versions list --model $MODEL_NAME --format='value(name)' | grep $VERSION_NAME) ]]; then
    echo "Deleting already existing $MODEL_NAME:$VERSION_NAME ... "
    gcloud ai-platform versions delete --model=$MODEL_NAME $VERSION_NAME
    echo "Please run this cell again if you don't see a Creating message ... "
    sleep 2
fi

# create model version
echo "Creating $MODEL_NAME:$VERSION_NAME"
gcloud ai-platform versions create --model=$MODEL_NAME $VERSION_NAME --async \
       --framework=tensorflow --python-version=3.5 --runtime-version=1.14 \
       --origin=$EXPORT_PATH --staging-bucket=gs://$BUCKET

In [None]:
%%writefile input.json
{"pickup_longitude": -73.982683, "pickup_latitude": 40.742104,"dropoff_longitude": -73.983766,"dropoff_latitude": 40.755174,"passenger_count": 3.0}  

In [None]:
!gcloud ai-platform predict --model taxifare --json-instances input.json --version dnn
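
The file passed to `--json-instances` holds one JSON object per line, with one line per instance. As a sketch, the file could also be generated from Python. The second instance and the temporary file location are made up for illustration.

```python
import json
import os
import tempfile

instances = [
    # Same values as input.json above.
    {'pickup_longitude': -73.982683, 'pickup_latitude': 40.742104,
     'dropoff_longitude': -73.983766, 'dropoff_latitude': 40.755174,
     'passenger_count': 3.0},
    # A second, hypothetical ride.
    {'pickup_longitude': -73.99, 'pickup_latitude': 40.73,
     'dropoff_longitude': -73.97, 'dropoff_latitude': 40.76,
     'passenger_count': 1.0},
]

# Write newline-delimited JSON: one instance per line.
instances_path = os.path.join(tempfile.mkdtemp(), 'input.json')
with open(instances_path, 'w') as f:
    for instance in instances:
        f.write(json.dumps(instance) + '\n')

# Read the file back to confirm that each line parses on its own.
with open(instances_path) as f:
    parsed = [json.loads(line) for line in f]

print(len(parsed))  # 2
```

The keys of each instance must match the input names of the exported serving signature, which are the five feature column names here.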

Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License