# **Introducing the Keras Sequential API**

**Learning objectives**

1. Build a DNN model using the Keras Sequential API
2. Learn how to use feature columns in a Keras model
3. Learn how to train a model with Keras
4. Learn how to save/load and deploy a Keras model on GCP
5. Learn how to deploy and make predictions with a Keras model

## **Introduction**

The Keras Sequential API allows you to create TensorFlow models **layer-by-layer**. This is useful for building most kinds of machine learning models, but **it does not allow you to create models that share layers, re-use layers, or have multiple inputs or outputs**.
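
As a minimal sketch of what "layer-by-layer" means, the snippet below stacks three `Dense` layers into a `Sequential` model and passes a dummy batch through it. The layer widths (32, 8, 1), the five-feature input shape, and the name `sketch_model` are illustrative assumptions, not values prescribed by this lab.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Layers are listed in the order the data flows through them.
# The input shape (5 features) and the layer widths are assumptions.
sketch_model = Sequential([
    tf.keras.Input(shape=(5,)),
    Dense(32, activation="relu"),
    Dense(8, activation="relu"),
    Dense(1),  # single linear output, as in a regression model
])

# Run a dummy batch of 4 examples through the stack.
dummy_batch = np.zeros((4, 5), dtype="float32")
predictions = sketch_model(dummy_batch)
print(predictions.shape)
```

Each layer has exactly one input and one output here, which is the kind of linear stack the Sequential API is designed for.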

In this lab, we'll see how to build a simple deep neural network (DNN) model using the Keras Sequential API and feature columns. Once we have trained our model, we will deploy it using AI Platform and see how to call our model for online prediction.

In [1]:
import datetime
import os
import shutil

import numpy as np
import pandas as pd
import tensorflow as tf

from matplotlib import pyplot as plt
%matplotlib inline
from tensorflow import keras

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, DenseFeatures
from tensorflow.keras.callbacks import TensorBoard

print(tf.__version__)

2.4.1


## **Load raw data**

We will use the taxi data set, stored as CSV files in the `data/` directory.

In [6]:
!ls -l data/

total 171324
-rw-rw-r-- 1 antounes antounes     33558 mars  14 21:22 images.tfrecords
-rw-r--r-- 1 antounes antounes  63460629 mars  12 13:46 taxi-test.csv
-rw-r--r-- 1 antounes antounes 110926109 mars  12 13:44 taxi-train.csv
-rw-rw-r-- 1 antounes antounes   1003818 mars  14 21:01 test.tfrecord


In [5]:
!head data/taxi*

==> data/taxi-test.csv <==
id,pickup_datetime,passenger_count,pickup_longitude,pickup_latitude,dropoff_longitude,dropoff_latitude
id3004672,2016-06-30 23:59:58,1,-73.9881286621094,40.7320289611816,-73.9901733398438,40.7566795349121
id3505355,2016-06-30 23:59:53,1,-73.9642028808594,40.6799926757813,-73.9598083496094,40.655403137207
id1217141,2016-06-30 23:59:47,1,-73.9974365234375,40.7375831604004,-73.9861602783203,40.7295227050781
id2150126,2016-06-30 23:59:41,1,-73.9560699462891,40.771900177002,-73.9864273071289,40.73046875
id1598245,2016-06-30 23:59:33,1,-73.97021484375,40.761474609375,-73.9615097045899,40.7558898925781
id0668992,2016-06-30 23:59:30,1,-73.9913024902344,40.7497978210449,-73.9805145263672,40.786548614502
id1765014,2016-06-30 23:59:15,1,-73.9783096313477,40.7415504455566,-73.9520721435547,40.7170028686523
id0898117,2016-06-30 23:59:09,2,-74.0127105712891,40.7015266418457,-73.9864807128906,40.7195091247559
id3905224,2016-06-30 23:58:55,2,-73.9923324584961,40.730510711669

## **Use `tf.data` to read the CSV files**

In [7]:
# Defining the feature names into a list `CSV_COLUMNS`

CSV_COLUMNS = [
    "trip_duration",
    "pickup_datetime",
    "pickup_longitude",
    "pickup_latitude",
    "dropoff_longitude",
    "dropoff_latitude",
    "passenger_count",
    "id"
]

LABEL_COLUMN = "trip_duration"
# Defining the default values into a list `DEFAULTS`
DEFAULTS = [[0.0], ["na"], [0.0], [0.0], [0.0], [0.0], [0.0], ["na"]]
UNWANTED_COLS = ["pickup_datetime", "id"]

def features_and_labels(row_data):
    # `.pop()` removes the key from the dictionary of columns and returns its value
    label = row_data.pop(LABEL_COLUMN)
    features = row_data
    
    for unwanted_col in UNWANTED_COLS:
        features.pop(unwanted_col)
        
    return features, label

def create_dataset(pattern, batch_size=1, mode="eval"):
    # The `tf.data.experimental.make_csv_dataset()` method reads CSV files into a data set
    dataset = tf.data.experimental.make_csv_dataset(
        pattern, batch_size, CSV_COLUMNS, DEFAULTS
    )
    
    # `Dataset.map()` applies `features_and_labels` to every element (batch) of the dataset
    dataset = dataset.map(features_and_labels)
    
    if mode == "train":
        # `Dataset.shuffle()` draws elements at random from a buffer of `buffer_size` elements;
        # `.repeat()` makes the dataset loop indefinitely
        dataset = dataset.shuffle(buffer_size=1000).repeat()
        
    # Prefetch 1 element so that data loading overlaps with training;
    # pass `tf.data.AUTOTUNE` instead to let tf.data choose the buffer size
    dataset = dataset.prefetch(1)
    return dataset
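
To see what `features_and_labels` does to a single batch, the sketch below applies the same logic to a plain Python dictionary. The Python lists stand in for the batched tensors that `make_csv_dataset` yields, and all values in `example_batch` are made up for illustration.

```python
LABEL_COLUMN = "trip_duration"
UNWANTED_COLS = ["pickup_datetime", "id"]

def features_and_labels(row_data):
    # Same logic as the function above: split off the label,
    # then drop the columns that are not model inputs.
    label = row_data.pop(LABEL_COLUMN)
    features = row_data
    for unwanted_col in UNWANTED_COLS:
        features.pop(unwanted_col)
    return features, label

# Hypothetical batch of two rows; lists stand in for tensors.
example_batch = {
    "trip_duration": [455.0, 663.0],
    "pickup_datetime": ["2016-01-01 00:00:00", "2016-01-01 00:05:00"],
    "pickup_longitude": [-73.98, -73.96],
    "pickup_latitude": [40.73, 40.68],
    "dropoff_longitude": [-73.99, -73.95],
    "dropoff_latitude": [40.75, 40.65],
    "passenger_count": [1.0, 2.0],
    "id": ["id0000001", "id0000002"],
}

features, label = features_and_labels(example_batch)
print(sorted(features))  # the five remaining input columns
print(label)             # the trip durations
```

Note that the function modifies the dictionary it receives: after the call, `example_batch` and `features` are the same object and no longer contain the label or the unwanted columns.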

## **Build a simple Keras DNN model**

We will use feature columns to connect our raw data to our Keras DNN model. **Feature columns make it easy to perform common types of feature engineering on your raw data**. For example, you can one-hot-encode categorical data, create feature crosses, embeddings, and more.

In our case we won't do any feature engineering. However, we still need to create a list of feature columns to specify the numeric values which will be passed on to our model. To do this, we use `tf.feature_column.numeric_column()`.

We use a Python dictionary comprehension to create the feature columns for our model, which is just an elegant alternative to a `for` loop.

In [8]:
# Defining the feature names into a list `INPUT_COLS`
INPUT_COLS = [
    "pickup_longitude",
    "pickup_latitude",
    "dropoff_longitude",
    "dropoff_latitude",
    "passenger_count"
]

# Create a dictionary mapping each input column name to a numeric feature column
feature_columns = {
    colname: tf.feature_column.numeric_column(colname) for colname in INPUT_COLS
}
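
For comparison, the dictionary comprehension above can be written as an explicit `for` loop. In this sketch, `make_column` is a hypothetical stand-in for `tf.feature_column.numeric_column` so that the snippet runs without TensorFlow; the two ways of building the dictionary give the same result.

```python
INPUT_COLS = [
    "pickup_longitude",
    "pickup_latitude",
    "dropoff_longitude",
    "dropoff_latitude",
    "passenger_count",
]

def make_column(colname):
    # Hypothetical stand-in for tf.feature_column.numeric_column
    return ("numeric_column", colname)

# Dictionary comprehension, in the style used in this lab
columns_from_comprehension = {
    colname: make_column(colname) for colname in INPUT_COLS
}

# Equivalent explicit loop
columns_from_loop = {}
for colname in INPUT_COLS:
    columns_from_loop[colname] = make_column(colname)

print(columns_from_comprehension == columns_from_loop)
```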