# This notebook is modified and shortened from the original version, which you can find here:
https://cloud.google.com/ai-platform/docs/getting-started-keras

* This will show you how to use the AI Platform in GCP to train, store, and serve predictions from a Keras model.

* This is best run in Colab, although you may also be able to use notebooks provided on the AI Platform. For this live assignment, please run it in Colab.

* For this to work, the Gmail account that you use to run the Colab notebook must have been added to the Fourthbrain GCP project (this has already been done for students in this cohort).

In [None]:
# Copyright 2019 Google LLC
# Redo for region= US-east1
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Getting started: Training and prediction with Keras in AI Platform

## Overview

This tutorial shows how to train a neural network on the GCP AI Platform
using the Keras sequential API and how to serve predictions from that
model.

Keras is a high-level API for building and training deep learning models.
[tf.keras](https://www.tensorflow.org/guide/keras) is TensorFlow’s
implementation of this API.

The first two parts of the tutorial walk through training a model on Cloud
AI Platform using prewritten Keras code, deploying the trained model to
AI Platform, and serving online predictions from the deployed model.

The last part of the tutorial digs into the training code used for this model and how to ensure it's compatible with AI Platform. To learn more about building
machine learning models in Keras more generally, read [TensorFlow's Keras
tutorials](https://www.tensorflow.org/tutorials/keras).

### Dataset

This tutorial uses the [United States Census Income
Dataset](https://archive.ics.uci.edu/ml/datasets/census+income) provided by the
[UC Irvine Machine Learning
Repository](https://archive.ics.uci.edu/ml/index.php). This dataset contains
information about people from a 1994 Census database, including age, education,
marital status, occupation, and whether they make more than $50,000 a year.

### Objective

The goal is to train a deep neural network (DNN) using Keras that predicts
whether a person makes more than $50,000 a year (target label) based on other
Census information about the person (features).

This tutorial focuses more on using this model with AI Platform than on
the design of the model itself. However, it's always important to think about
potential problems and unintended consequences when building machine learning
systems. See the [Machine Learning Crash Course exercise about
fairness](https://developers.google.com/machine-learning/crash-course/fairness/programming-exercise)
to learn about sources of bias in the Census dataset, as well as machine
learning fairness more generally.

### Mount Google Drive

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import os
# The path below should point to the directory containing this notebook in Google Drive. We'll also clone the cloudml-samples GitHub repository at this location
# Change it if necessary
os.chdir('/content/drive/MyDrive/Live_session_notebooks/week_13')
!ls

### Get some code and dependencies

First, download the training code and change the notebook's working directory. The code and data get downloaded to your GDrive:

In [None]:
# Clone the repository of AI Platform samples
! git clone --depth 1 https://github.com/GoogleCloudPlatform/cloudml-samples

In [None]:
# Set the working directory to the sample code directory
%cd cloudml-samples/census/tf-keras

Notice that the training code is structured as a Python package in the
`trainer/` subdirectory:

In [None]:
# `ls` shows the working directory's contents. The `p` flag adds trailing 
# slashes to subdirectory names. The `R` flag lists subdirectories recursively.
! ls -pR

Run the following cell to install Python dependencies needed to train the model locally. When you run the training job in AI Platform,
dependencies are preinstalled based on the [runtime
version](https://cloud.google.com/ml-engine/docs/tensorflow/runtime-version-list)
you choose.

In [None]:
! pip install -r requirements.txt

^ If pip tells you to restart the runtime for these installs to take effect, do so now. After restarting, re-run the cells that change the working directory.

## Setting up your GCP project

**Steps 1 and 2 below are required only if you're setting up a NEW "project" on the Google Cloud Platform. You don't have to do these because you've already been added to a Fourthbrain GCP project (it's called 'FB-MLE-Jul-21')**

1. [Select or create a GCP project.](https://console.cloud.google.com/cloud-resource-manager)

2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)

**👇 For the Live assignment start from here.**

3. [Enable the AI Platform ("Cloud Machine Learning Engine") and Compute Engine APIs.](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,compute_component)

4. Enter the project ID in the cell below. It should already be populated for you as "fb-mle-jul-21", which is the project we're using for the July 2021 MLE cohort. Run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands.

In [None]:
PROJECT_ID = "fb-mle-jul-21" #@param {type:"string"}
! gcloud config set project $PROJECT_ID

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

Console instructions for creating a bucket: https://cloud.google.com/storage/docs/creating-buckets#storage-create-bucket-console

When you submit a training job using the Cloud SDK, you upload a Python package
containing your training code to a Cloud Storage bucket. AI Platform runs
the code from this package. In this tutorial, AI Platform also saves the
trained model that results from your job in the same bucket. You can then
create an AI Platform model version based on this output in order to serve
online predictions.

Set the name of your Cloud Storage bucket below. It must be unique across all
Cloud Storage buckets. Pick something that doesn't conflict with other students' buckets. [Here is a guide on naming storage buckets](https://cloud.google.com/storage/docs/naming-buckets#:~:text=Bucket%20names%20reside%20in%20a,responds%20with%20an%20error%20message.). 

In [None]:
BUCKET_NAME = "sk-my-unique-bucket-name" #@param {type:"string"}
REGION = "us-east1" #@param {type:"string"}
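
Before creating the bucket, it can help to sanity-check the name you picked. The sketch below is a simplified, unofficial check based on the naming guide linked above: 3 to 63 characters, lowercase letters, digits, dashes, underscores, and dots, starting and ending with a letter or digit, and no `goog` prefix or `google` substring. The helper name `looks_like_valid_bucket_name` is hypothetical, the special rules for dotted names are ignored, and only Cloud Storage itself can tell you whether a name is globally unique.

```python
import re

# Simplified pattern: 3-63 characters, lowercase letters, digits, dashes,
# underscores, and dots, starting and ending with a letter or digit.
# This is an illustrative check, not an official validator.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_.-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name):
    """Return True if `name` passes a few basic bucket-naming checks."""
    if not _BUCKET_NAME_RE.fullmatch(name):
        return False
    # Names may not begin with "goog" or contain "google".
    if name.startswith("goog") or "google" in name:
        return False
    return True


print(looks_like_valid_bucket_name("sk-my-unique-bucket-name"))
print(looks_like_valid_bucket_name("My_Bucket"))
```

Passing this check does not guarantee that bucket creation succeeds; it only catches the most common naming mistakes early.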

# Click the link below to create your storage bucket. Give it the same name you chose in the cell above:
https://cloud.google.com/storage/docs/creating-buckets


In [None]:
# This checks the connection and lists the storage buckets in the project. Make sure your bucket has been created before running this.
from google.colab import auth
auth.authenticate_user()
project_id = PROJECT_ID
!gcloud config set project {project_id}
!gsutil ls

## Part 1. Quickstart for training in AI Platform

This section of the tutorial walks you through submitting a training job to Cloud
AI Platform. This job runs sample code that uses Keras to train a deep neural
network on the United States Census data. It outputs the trained model as a
[TensorFlow SavedModel
directory](https://www.tensorflow.org/guide/saved_model#save_and_restore_models)
in your Cloud Storage bucket.

### Train your model locally

Before training on AI Platform, train the job locally to verify that the file
structure and packaging are correct.

For a complex or resource-intensive job, you
may want to train locally on a small sample of your dataset to verify your code.
Then you can run the job on AI Platform to train on the whole dataset.

This sample runs a relatively quick job on a small dataset, so the local
training and the AI Platform job run the same code on the same data.

Run the following cell to train a model locally:

In [None]:
# Explicitly tell `gcloud ai-platform local train` to use Python 3 
! gcloud config set ml_engine/local_python $(which python3)

# This is similar to `python -m trainer.task --job-dir local-training-output`
# but it better replicates the AI Platform environment, especially for
# distributed training (not applicable here).
! gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir local-training-output

This should have saved the trained model in a directory called `local-training-output` (stored in your mounted Google Drive):

In [None]:
!ls local-training-output/

### Train your model using AI Platform

Next, compare this to submitting a training job to AI Platform. This runs the training module
in the cloud and exports the trained model to Cloud Storage.

First, give your training job a name and choose a directory within your Cloud
Storage bucket for saving intermediate and output files:

In [None]:
####ENTER HERE#############################################
JOB_NAME = 'sjk_gcp_tutorial_job'
JOB_DIR = 'gs://' + BUCKET_NAME + '/keras-job-dir'
print(JOB_DIR)
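
Job names have to be unique within a project, so resubmitting with the same `JOB_NAME` is likely to fail. One way around this, sketched below, is to append a timestamp to a fixed prefix. The helper `make_job_name` is hypothetical, and the character check (start with a letter, then only letters, digits, and underscores) reflects my understanding of the AI Platform job-name rules rather than anything stated in this notebook.

```python
import re
from datetime import datetime, timezone


def make_job_name(prefix, now=None):
    """Build a job name such as 'prefix_20210701_123000' from a prefix and a time."""
    now = now or datetime.now(timezone.utc)
    name = f"{prefix}_{now.strftime('%Y%m%d_%H%M%S')}"
    # Assumed rule: start with a letter; letters, digits, underscores only.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", name):
        raise ValueError(f"job name contains unsupported characters: {name!r}")
    return name


print(make_job_name("sjk_gcp_tutorial_job"))
```

In the notebook you could then set `JOB_NAME = make_job_name('sjk_gcp_tutorial_job')` before submitting the job.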

Run the following command to package the `trainer/` directory, upload it to the
specified `--job-dir`, and instruct AI Platform to run the
`trainer.task` module from that package.

The `--stream-logs` flag lets you view training logs in the cell below. You can
also see logs and other job details in the GCP Console.

Don't do this now, but you can optionally perform hyperparameter tuning by using the included
`hptuning_config.yaml` configuration file. This file tells AI Platform to tune the batch size and learning rate for training over multiple trials to maximize accuracy.

In this example, the training code uses a [TensorBoard
callback](https://www.tensorflow.org/api_docs/python/tf/keras/callbacks/TensorBoard),
which [creates TensorFlow `Summary`
`Event`s](https://www.tensorflow.org/api_docs/python/tf/summary/FileWriter#add_summary)
during training. AI Platform uses these events to track the metric you want to
optimize. Learn more about [hyperparameter tuning in
AI Platform Training](https://cloud.google.com/ml-engine/docs/tensorflow/hyperparameter-tuning-overview).

## While you wait for this step to complete, you can check in the GCP console that your job is running.

After you run the cell below, you can check on the progress of your job: https://console.cloud.google.com/ai-platform/jobs

In [None]:
! gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.7 \
  --runtime-version 1.15 \
  --job-dir $JOB_DIR \
  --stream-logs

## Part 2 (Cloud Deployment). Quickstart for online predictions in AI Platform

This section shows how to use AI Platform and your trained model from Part 1
to predict a person's income bracket from other Census information about them.

### Create model and version resources in AI Platform

To serve online predictions using the model you trained and exported in Part 1,
create a *model* resource in AI Platform and a *version* resource
within it. The version resource is what actually uses your trained model to
serve predictions. This structure lets you adjust and retrain your model many times and
organize all the versions together in AI Platform. Learn more about [models
and
versions](https://cloud.google.com/ml-engine/docs/tensorflow/projects-models-versions-jobs).

First, name and create the model resource:

In [None]:
############ENTER HERE#########################
MODEL_NAME = "sk_gcp_model_census" #<Lastname_automl_demo>

! gcloud ai-platform models create $MODEL_NAME \
  --region $REGION

# You can verify that your job completed and your model was created in the AI Platform dashboard:
https://console.cloud.google.com/ai-platform/models/

# Look under Jobs and Models in the left panel

Execute the following command to identify your SavedModel directory and use it to create a model version resource:

In [None]:
#ENTER HERE#####################
MODEL_VERSION = "v1"

# Get a list of directories in the `keras_export` parent directory
KERAS_EXPORT_DIRS = ! gsutil ls $JOB_DIR/keras_export/
print(KERAS_EXPORT_DIRS)

In [None]:
# Point to the folder under ../keras_export. Your index may differ from the one used here, depending on how many times you have run the training job.
idx = 0
SAVED_MODEL_PATH = KERAS_EXPORT_DIRS[idx]
print(SAVED_MODEL_PATH)
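
If you have trained several times, choosing the index by hand is error-prone. The sketch below picks the most recent export automatically. It assumes that each export lives in a subdirectory whose name is a numeric timestamp, which you should confirm against your own `gsutil ls` output. The helper `latest_export_dir` and the example paths are hypothetical.

```python
def latest_export_dir(paths):
    """Return the path whose last component is the largest integer timestamp."""
    candidates = []
    for path in paths:
        # Take the final path component, ignoring any trailing slash.
        last = path.rstrip("/").rsplit("/", 1)[-1]
        if last.isdigit():
            candidates.append((int(last), path))
    if not candidates:
        raise ValueError("no timestamped export directories found")
    # Tuples compare by timestamp first, so max() returns the newest export.
    return max(candidates)[1]


# Hypothetical listing, shaped like the output of `gsutil ls`.
listing = [
    "gs://my-bucket/keras-job-dir/keras_export/",
    "gs://my-bucket/keras-job-dir/keras_export/1626000000/",
    "gs://my-bucket/keras-job-dir/keras_export/1626999999/",
]
print(latest_export_dir(listing))
```

In the notebook this would be used as `SAVED_MODEL_PATH = latest_export_dir(KERAS_EXPORT_DIRS)`.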

In [None]:
print(MODEL_NAME)

In [None]:
# Create model version based on that SavedModel directory
! gcloud ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --region $REGION \
  --runtime-version 1.15 \
  --python-version 3.7 \
  --framework tensorflow \
  --origin $SAVED_MODEL_PATH

# Now you are ready to send prediction requests to your model on Google AI Platform

# Task 2: Data Preparation for Modeling

To receive valid and useful predictions, you must preprocess input for prediction in the same way that training data was preprocessed. In a production
system, you may want to create a preprocessing pipeline that can be used identically at training time and prediction time.

For this exercise, use the training package's data-loading code to select a random sample from the evaluation data. This data is in the form that was used to evaluate accuracy after each epoch of training, so it can be used to send test predictions without further preprocessing:

In [None]:
# First, let's review the data and features that are used for classification
from trainer import util

_, _, eval_x, eval_y = util.load_data()

prediction_input = eval_x.sample(20)
prediction_targets = eval_y[prediction_input.index]

prediction_input

Notice that categorical fields, like `occupation`, have already been converted to integers (with the same mapping that was used for training). Numerical fields, like `age`, have been scaled to a
[z-score](https://developers.google.com/machine-learning/crash-course/representation/cleaning-data). Some fields have been dropped from the original
data. Compare the prediction input with the raw data for the same examples:

In [None]:
# Now let's review the raw data with all features
import pandas as pd

_, eval_file_path = util.download(util.DATA_DIR)
raw_eval_data = pd.read_csv(eval_file_path,
                            names=util._CSV_COLUMNS,
                            na_values='?')

raw_eval_data.iloc[prediction_input.index]
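
To make the z-score scaling mentioned above concrete, here is a minimal sketch using only the standard library. It is not the code from the `trainer` package. The function names and the sample ages are made up for illustration, and it uses the sample standard deviation, which is an assumption. The point it illustrates is that the mean and standard deviation are computed once from the training data and then reused unchanged for prediction inputs.

```python
from statistics import mean, stdev


def fit_standardizer(train_values):
    """Compute the mean and sample standard deviation from training data."""
    return mean(train_values), stdev(train_values)


def standardize(values, mu, sigma):
    """Scale values to z-scores using previously fitted statistics."""
    return [(v - mu) / sigma for v in values]


# Hypothetical ages seen during training.
train_ages = [25, 35, 45]
mu, sigma = fit_standardizer(train_ages)

# New ages at prediction time are scaled with the training statistics.
print(standardize([35, 55], mu, sigma))
```

Recomputing the statistics from the prediction inputs instead would silently shift the features away from what the model saw during training.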

Export the prediction input (filtered features only) to a newline-delimited JSON file:

In [None]:
import json

with open('prediction_input.json', 'w') as json_file:
  for row in prediction_input.values.tolist():
    json.dump(row, json_file)
    json_file.write('\n')

! cat prediction_input.json

The `gcloud` command-line tool accepts newline-delimited JSON for online
prediction, and this particular Keras model expects a flat list of
numbers for each input example.

AI Platform requires a different format when you make online prediction requests to the REST API without using the `gcloud` tool. The way you structure
your model may also change how you must format data for prediction. Learn more
about [formatting data for online
prediction](https://cloud.google.com/ml-engine/docs/tensorflow/prediction-overview#prediction_input_data).
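
As a rough illustration of the difference, the REST API expects a single JSON object with an `instances` list rather than one JSON value per line. The sketch below converts newline-delimited JSON, like the contents of `prediction_input.json`, into that shape. The helper name `ndjson_to_request_body` is hypothetical, and whether each instance is a flat list or a dictionary of named inputs depends on how your model is defined.

```python
import json


def ndjson_to_request_body(ndjson_text):
    """Wrap newline-delimited JSON rows in an {"instances": [...]} object."""
    rows = [json.loads(line) for line in ndjson_text.splitlines() if line.strip()]
    return json.dumps({"instances": rows})


# Two made-up rows standing in for the contents of prediction_input.json.
example = "[0.1, 2, 3]\n[0.4, 5, 6]\n"
print(ndjson_to_request_body(example))
```

This only builds the request body; sending it also requires an authenticated HTTP call, which is out of scope here.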

### Submit the online prediction request

Use `gcloud` to submit your online prediction request.

In [None]:
! gcloud ai-platform predict \
  --model $MODEL_NAME \
  --region $REGION \
  --version $MODEL_VERSION \
  --json-instances prediction_input.json

Since the model's last layer uses a [sigmoid function](https://developers.google.com/machine-learning/glossary/#sigmoid_function) for its activation, outputs between 0 and 0.5 represent negative predictions ("<=50K") and outputs between 0.5 and 1 represent positive ones (">50K").
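
The thresholding described above can be sketched as a small helper. It takes a plain list of sigmoid outputs, so you would first need to extract the numbers from the prediction response yourself. The function name `to_income_labels` is hypothetical, and treating a score of exactly 0.5 as negative is an arbitrary choice made for this sketch.

```python
def to_income_labels(scores, threshold=0.5):
    """Map sigmoid outputs to income-bracket labels."""
    # Scores above the threshold are positive (">50K"); the rest are negative.
    return [">50K" if score > threshold else "<=50K" for score in scores]


# Made-up scores for illustration.
print(to_income_labels([0.1, 0.5, 0.93]))
```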

# ...and that's about it for this tutorial! We saw how to:
1. Use the Google Cloud AI Platform to **train a model locally** (using `gcloud ai-platform local train`)
2. **Train the model as an AI Platform "job"** (using `gcloud ai-platform jobs submit training`)
3. Create a "model" asset (using `gcloud ai-platform models create` and `gcloud ai-platform versions create`) that can then be queried with any properly formatted input data (using `gcloud ai-platform predict`)

Yay!

## Finally, cleaning up

You can clean up individual resources by running the following
commands:

In [None]:
# Delete model version resource (use the same region the model was created in)
! gcloud ai-platform versions delete $MODEL_VERSION --quiet --model $MODEL_NAME --region $REGION

# Delete model resource
! gcloud ai-platform models delete $MODEL_NAME --quiet --region $REGION

# Delete Cloud Storage objects that were created
! gsutil -m rm -r $JOB_DIR

# If the training job is still running, cancel it
! gcloud ai-platform jobs cancel $JOB_NAME --quiet --verbosity critical

In [None]:
!gsutil rm -r gs://$BUCKET_NAME

## What's next?

* View the [complete training
code](https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census/tf-keras) used in this guide, which structures the code to accept custom
hyperparameters as command-line flags.
* Read about [packaging
code](https://cloud.google.com/ml-engine/docs/tensorflow/packaging-trainer) for an AI Platform training job.
* Read about [deploying a
model](https://cloud.google.com/ml-engine/docs/tensorflow/deploying-models) to serve predictions.