**Chapter 10 – Introduction to Artificial Neural Networks with Keras**

# Regression using the California housing dataset

## Setup

In [None]:
# Common imports
import sys
import os
import sklearn
import numpy as np
import pandas as pd
import tensorflow as tf
from tensorflow import keras

# to make this notebook's output stable across runs
np.random.seed(42)
tf.random.set_seed(42)

# To plot pretty figures
%matplotlib inline
import matplotlib.pyplot as plt

# Ignore useless warnings (see SciPy issue #5998)
import warnings
warnings.filterwarnings(action="ignore", message="^internal gelsd")

## Load, split and scale the dataset

In [None]:
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

housing = fetch_california_housing()

X_train_full, X_test, y_train_full, y_test = train_test_split(housing.data, housing.target, random_state=42)
X_train, X_valid, y_train, y_valid = train_test_split(X_train_full, y_train_full, random_state=42)

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_valid = scaler.transform(X_valid)
X_test = scaler.transform(X_test)
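
The scaler is fitted on the training set only, and the same statistics are then reused for the validation and test sets. The following is a minimal NumPy sketch of that idea, not Scikit-Learn's actual implementation; the `*_demo` arrays are made-up stand-in data, not the housing dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in data with 8 features, like the housing dataset.
X_train_demo = rng.normal(loc=5.0, scale=3.0, size=(1000, 8))
X_valid_demo = rng.normal(loc=5.0, scale=3.0, size=(200, 8))

# "fit": learn the per-feature mean and standard deviation from the training set only.
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)

# "transform": apply the training statistics to every split.
X_train_scaled = (X_train_demo - mean) / std
X_valid_scaled = (X_valid_demo - mean) / std

# The training set ends up with mean 0 and standard deviation 1 per feature;
# the validation set is only approximately standardized.
print(X_train_scaled.mean(axis=0).round(6))
print(X_valid_scaled.mean(axis=0).round(2))
```

Reusing the training statistics keeps information about the validation and test sets out of the preprocessing step.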

In [None]:
# Compute the mean median_house_value.
# The target values are scaled down by a factor of 100000 compared to the
# housing dataset used in chapter 2. Keep this in mind when comparing the RMSE
# of this model with the RMSEs of the other models that were trained and tested
# on the dataset from chapter 2.
housing.target.mean()

In [None]:
X_train.shape[1:]

## Build, compile, train and evaluate a model

In [None]:
# Build a model.
# We don't need a separate input layer (e.g. Flatten), since the input array is already
# 2D (instances x features); passing input_shape to the first Dense layer is enough.
# For regression problems, we don't use an activation function in the output layer.

model = keras.models.Sequential([
    # hidden layer
    keras.layers.Dense(30, activation="relu", input_shape=X_train.shape[1:]),
    # output layer
    keras.layers.Dense(1)
])

# Compile the model.
# For regression problems, we use the "mean_squared_error" as loss function.
model.compile(loss="mean_squared_error", optimizer="sgd")
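
To make the architecture concrete, here is a rough NumPy sketch of the computation such a network performs: one ReLU hidden layer with 30 units followed by a linear output unit. The weights are random placeholders (not trained values) and `X_demo` is made-up input, so only the shapes and the parameter count are meaningful.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_hidden_units = 8, 30

# Random placeholder weights, standing in for what training would learn.
W1 = rng.normal(size=(n_features, n_hidden_units))
b1 = np.zeros(n_hidden_units)
W2 = rng.normal(size=(n_hidden_units, 1))
b2 = np.zeros(1)

def forward(X):
    hidden = np.maximum(0.0, X @ W1 + b1)  # like Dense(30, activation="relu")
    return hidden @ W2 + b2                # like Dense(1): no activation

X_demo = rng.normal(size=(3, n_features))  # three hypothetical scaled instances
y_demo = forward(X_demo)

# 8*30 weights + 30 biases + 30*1 weights + 1 bias
n_params = W1.size + b1.size + W2.size + b2.size
print(y_demo.shape, n_params)  # (3, 1) 301
```

Because the output layer has no activation function, the prediction can take any real value, which is what we want for regression.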

In [None]:
# Train the model.
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_valid, y_valid))

In [None]:
# Show the learning curves.
pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1)
plt.show()

In [None]:
# Evaluate the model on the test set (outputs the loss, i.e. the MSE).
model.evaluate(X_test, y_test)

<b>Observation:</b><br/>
The loss is the MSE, so if we take its square root to get the RMSE and multiply the result by 100000, we can compare it with the values obtained by the best models that I trained and tested on the dataset used in chapter 2.

Gradient Boosted Forest: 47480 (best model that I trained and tested on the dataset used in chapter 2).
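
Since the model was compiled with the mean squared error as its loss, `evaluate()` returns an MSE in squared target units. A small sketch of the conversion to an RMSE in dollars follows; `mse_to_rmse_dollars` is a hypothetical helper and `example_mse` is a made-up value, not a result from this notebook.

```python
import math

TARGET_UNIT = 100_000  # housing.target is expressed in units of $100,000

def mse_to_rmse_dollars(mse):
    """Convert an MSE in squared target units to an RMSE in dollars."""
    return math.sqrt(mse) * TARGET_UNIT

example_mse = 0.36  # hypothetical loss value, for illustration only
print(round(mse_to_rmse_dollars(example_mse)))
```

With this made-up loss, the RMSE would be roughly 60,000 dollars, which is the kind of number that can be compared with the chapter 2 results.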

In [None]:
# Make predictions for the first 3 instances in the test set.
X_new = X_test[:3]
y_pred = model.predict(X_new)
y_pred

In [None]:
# Compare with the corresponding target values.
y_test[:3]

<b>Observation:</b><br/>
The second prediction is very bad given the RMSE. The other two predictions are okay.

## Hyperparameter tuning
We can tune hyperparameters by using Scikit-Learn's <b>GridSearchCV</b> or <b>RandomizedSearchCV</b>. The former trains the network with every combination of the specified hyperparameters, while the latter randomly samples a fixed number of combinations. For this example, we will use RandomizedSearchCV to keep the number of combinations manageable.
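
To get a feel for the difference, here is a small sketch that counts the combinations using only the standard library. The hyperparameter values, the 20 iterations and the 3-fold cross-validation are the ones used further down in this notebook; the sampling is a simplified imitation, not Scikit-Learn's own code.

```python
import itertools
import random

param_grid = {
    "n_hidden": [1, 2, 3, 4],
    "n_neurons": list(range(10, 100)),
    "learning_rate": [5e-4, 5e-3, 5e-2, 5e-1],
}

# A grid search would try every combination.
all_combinations = list(itertools.product(*param_grid.values()))
print(len(all_combinations))  # 1440

# A randomized search only tries a fixed number of them (here 20, without repeats).
random.seed(42)
sampled = random.sample(all_combinations, k=20)

# With 3-fold cross-validation, every combination is trained 3 times.
cv = 3
print(len(all_combinations) * cv, len(sampled) * cv)  # 4320 60
```

Training 60 networks instead of 4320 is what makes the randomized search practical here.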

We need to wrap the compiled Keras model in an object that mimics a regular Scikit-Learn regressor.

### Create a compiled Keras model and wrap it in a Scikit-Learn KerasRegressor

In [None]:
# Create a function that will build and compile a Keras model.
def build_model(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=[8]):
    model = keras.models.Sequential()
    model.add(keras.layers.InputLayer(input_shape=input_shape))
    for layer in range(n_hidden):
        model.add(keras.layers.Dense(n_neurons, activation="relu"))
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate=learning_rate)
    model.compile(loss="mse", optimizer=optimizer)
    return model

In [None]:
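# Wrap the Keras model in a Scikit-Learn KerasRegressor.
# Note: keras.wrappers.scikit_learn is deprecated in later TensorFlow 2.x releases
# (the SciKeras package is the suggested replacement), so this assumes an older version.
keras_reg = keras.wrappers.scikit_learn.KerasRegressor(build_model)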

### Use RandomizedSearchCV to tune hyperparameters

In [None]:
from sklearn.model_selection import RandomizedSearchCV

# Define hyperparameter sets and ranges to explore
param_distribs = {
    "n_hidden": [1, 2, 3, 4],
    "n_neurons": list(range(10, 100)),
    "learning_rate": [5e-4, 5e-3, 5e-2, 5e-1],  # the default learning rate of SGD is 1e-2
}

# Create an instance of RandomizedSearchCV
rnd_search_cv = RandomizedSearchCV(keras_reg, param_distribs, n_iter=20, cv=3, n_jobs=-1, verbose=2)

# Search
rnd_search_cv.fit(X_train, y_train, epochs=100,
                  validation_data=(X_valid, y_valid),
                  callbacks=[keras.callbacks.EarlyStopping(patience=10)])
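
The `EarlyStopping` callback interrupts training once the validation loss has stopped improving for `patience` epochs, so `epochs=100` is only an upper bound. The following is a simplified pure-Python sketch of that rule, not Keras's implementation; `early_stopping_epoch` is a hypothetical helper and the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=10):
    """Return the 1-based epoch at which training would stop, or None if it runs to the end."""
    best = float("inf")
    wait = 0  # number of epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses: the best value so far is reached at epoch 3.
val_losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.65, 0.50]
print(early_stopping_epoch(val_losses, patience=3))   # 6
print(early_stopping_epoch(val_losses, patience=10))  # None
```

With `patience=3`, training stops at epoch 6 and never reaches the better loss at epoch 7, which is why a more generous patience such as 10 is often chosen. Note that, by default, the Keras callback keeps the weights from the last epoch rather than the best one unless `restore_best_weights=True` is passed.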

In [None]:
# Display the parameters of the best model.
rnd_search_cv.best_params_

In [None]:
# Display the score of the best model. The value is negative: Scikit-Learn treats
# scores as "higher is better", so the wrapper returns the negated loss (MSE).
rnd_search_cv.best_score_

In [None]:
# Get the model for the best estimator, and evaluate it on the test set.
rnd_search_model = rnd_search_cv.best_estimator_.model
rnd_search_model.evaluate(X_test, y_test)

### Note
There are many alternative techniques that can explore a search space more efficiently than RandomizedSearchCV. There is a list of libraries in the book on pages 322-333.

I have tried BayesSearchCV, but the results have been disappointing so far.

## TensorFlow Serving

You can deploy a model as a REST API using TensorFlow Serving (TF Serving).

In [None]:
# Before you start TF Serving, you should export the model to TensorFlow's SavedModel format.
model_version = "02"
model_name = "housing_model"
model_path = os.path.join(model_name, model_version)
tf.saved_model.save(model, model_path)

In this example, I will install TF Serving using a Docker image (there are also other options).

First, you should pull the Docker image by typing the following command from the command prompt:

docker pull tensorflow/serving

Then you can run the Docker image by typing a command similar to the following:

docker run -it --rm -p 8500:8500 -p 8501:8501 -v "/Users/hk/Documents/Undervisning/ML/Examples/housing_model:/models/housing_model" -e MODEL_NAME=housing_model tensorflow/serving

In [None]:
# You can make a prediction by querying the TF Serving REST API. A query must be a POST request,
# and the input data must be passed in the request body as a JSON object.
# (I will make the request in Postman.)
import json

input_data_json = json.dumps({
    "signature_name": "serving_default",
    "instances": X_new.tolist()
})

input_data_json
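
As an alternative to Postman, the request could also be sent from Python. The sketch below uses only the standard library and assumes that TF Serving is reachable on `localhost:8501` with the model name `housing_model`, as in the Docker command above. `build_predict_request` is a hypothetical helper and `example_instances` is a made-up input; the request is built but not sent, so the cell runs without a server.

```python
import json
import urllib.request

def build_predict_request(instances, host="localhost", port=8501, model_name="housing_model"):
    """Build (but do not send) a POST request for the TF Serving REST predict endpoint."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({
        "signature_name": "serving_default",
        "instances": instances,
    }).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )

# Made-up input: one instance with 8 scaled features (in the notebook: X_new.tolist()).
example_instances = [[0.0] * 8]
request = build_predict_request(example_instances)
print(request.get_method(), request.full_url)

# With the TF Serving container running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     predictions = json.loads(response.read())["predictions"]
```

The response body is a JSON object whose `predictions` field holds one prediction per instance.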