<b>Define environment variables</b>

These variables are used in the training and deployment steps that follow. Note that the bucket named in BUCKET_NAME must already exist in the GCP project.

In [1]:
%env BUCKET_NAME=ross-keras
%env LOCAL_JOB_DIR=local-training-output
%env JOB_NAME=keras_1101_job14
%env REGION=us-central1
%env MODEL_NAME=keras_wnd_model
%env MODEL_VERSION=v1

env: BUCKET_NAME=ross-keras
env: LOCAL_JOB_DIR=local-training-output
env: JOB_NAME=keras_1101_job14
env: REGION=us-central1
env: MODEL_NAME=keras_wnd_model
env: MODEL_VERSION=v1
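
JOB_NAME is hard-coded above, but AI Platform rejects a job whose name has already been used in the project. As a minimal sketch (the helper name `make_job_name` and the timestamp format are assumptions, not part of the original notebook), a unique name could be generated like this:

```python
import re
from datetime import datetime, timezone

# AI Platform job IDs must start with a letter and contain only
# letters, digits, and underscores.
_JOB_ID_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def make_job_name(prefix, now=None):
    """Return the prefix followed by a UTC timestamp (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    name = "{}_{}".format(prefix, now.strftime("%Y%m%d_%H%M%S"))
    if not _JOB_ID_PATTERN.match(name):
        raise ValueError("Invalid AI Platform job name: {!r}".format(name))
    return name
```

In a notebook cell, `os.environ["JOB_NAME"] = make_job_name("keras")` would make the new name visible to the `!gcloud` commands that follow, since they inherit the kernel's environment.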


<b>Perform training locally with default parameters</b>

Training detail will be written locally to the folder referenced in the job-dir parameter.

Note: creating the data takes some time because the MinMax normalizer must be fit over more than 100 million training rows.

In [None]:
!gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir $LOCAL_JOB_DIR \
  -- \
  --create-data=False 
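
The trainer's normalizer code is not shown in this notebook. To illustrate why fitting a MinMax normalizer over that many rows is slow but still feasible, here is a dependency-free sketch that tracks per-column minima and maxima one chunk at a time. The class name `StreamingMinMax` is hypothetical; in practice scikit-learn's `MinMaxScaler.partial_fit` offers the same incremental pattern.

```python
class StreamingMinMax:
    """Fit per-column min/max chunk by chunk, then scale rows to [0, 1]."""

    def __init__(self):
        self.mins = None
        self.maxs = None

    def partial_fit(self, rows):
        # Update the running minimum and maximum of every column.
        for row in rows:
            if self.mins is None:
                self.mins = list(row)
                self.maxs = list(row)
            else:
                self.mins = [min(a, b) for a, b in zip(self.mins, row)]
                self.maxs = [max(a, b) for a, b in zip(self.maxs, row)]
        return self

    def transform(self, rows):
        # Constant columns (max == min) are mapped to 0.0 to avoid dividing by zero.
        return [
            [
                (x - lo) / (hi - lo) if hi > lo else 0.0
                for x, lo, hi in zip(row, self.mins, self.maxs)
            ]
            for row in rows
        ]
```

Only the running minima and maxima are kept in memory, so the full dataset never needs to be loaded at once, though every row still has to be read before any row can be scaled.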

<b>Perform training on AI Platform</b>

The training job can also be run on AI Platform. For AI Platform to complete the training job, the "Google Cloud ML Engine Service Agent" service account must be granted Cloud Storage and BigQuery admin roles.

Important: At least one training job (run either locally or on AI Platform) must complete with the create-data flag set to True before the remaining steps will work. The commands shown here pass --create-data=False, so set it to True for the first run.

In [None]:
!gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir gs://${BUCKET_NAME}/keras-job-dir \
  -- \
  --create-data=False 

<b>Perform hyperparameter tuning on AI Platform</b>

Training output is written to Cloud Storage in the folder specified by the job-dir parameter.

In [2]:
!gcloud ai-platform jobs submit training ${JOB_NAME}_hpt \
  --config hptuning_config.yaml \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir gs://${BUCKET_NAME}/keras-job-dir-hpt

Job [keras_1101_job14_hpt] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ai-platform jobs describe keras_1101_job14_hpt

or continue streaming the logs with the command

  $ gcloud ai-platform jobs stream-logs keras_1101_job14_hpt
jobId: keras_1101_job14_hpt
state: QUEUED
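
The hptuning_config.yaml file passed via --config is not shown in this notebook. The sketch below builds a comparable configuration in Python and writes it as JSON, which is a subset of YAML and should therefore be readable wherever a YAML config is expected. The parameter names mirror the trainer flags used later in this notebook; the ranges, the metric tag `val_loss`, the trial counts, and the output file name are all illustrative assumptions.

```python
import json


def int_param(name, low, high):
    """Integer hyperparameter searched on a linear scale."""
    return {"parameterName": name, "type": "INTEGER",
            "minValue": low, "maxValue": high,
            "scaleType": "UNIT_LINEAR_SCALE"}


def double_param(name, low, high, scale="UNIT_LINEAR_SCALE"):
    """Floating-point hyperparameter; use UNIT_LOG_SCALE for wide ranges."""
    return {"parameterName": name, "type": "DOUBLE",
            "minValue": low, "maxValue": high, "scaleType": scale}


# Assumed search space: every range and count here is a placeholder.
hptuning_config = {
    "trainingInput": {
        "hyperparameters": {
            "goal": "MINIMIZE",
            "hyperparameterMetricTag": "val_loss",  # assumed metric name
            "maxTrials": 20,
            "maxParallelTrials": 4,
            "params": [
                int_param("num-deep-layers", 1, 4),
                int_param("first-deep-layer-size", 5, 50),
                int_param("first-wide-layer-size", 500, 5000),
                double_param("learning-rate", 0.0001, 0.1, "UNIT_LOG_SCALE"),
                double_param("wide-scale-factor", 0.01, 0.5),
                int_param("train-batch-size", 8, 64),
            ],
        }
    }
}


def write_config(path="hptuning_config_example.json"):
    """Write the config to a separate example file and return its path."""
    with open(path, "w") as f:
        json.dump(hptuning_config, f, indent=2)
    return path
```

The default path is deliberately different from hptuning_config.yaml so that running the sketch does not overwrite the real configuration. The trainer must also report the metric named in `hyperparameterMetricTag` for the tuning service to compare trials.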


<b>Complete training on AI Platform</b>

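Now that the hyperparameters have been tuned, perform deeper training with the optimal hyperparameters in place. Note that the training size is increased by explicitly setting the train-steps and num-epochs parameters in addition to the tuned hyperparameters. AI Platform job names must be unique within a project, so update JOB_NAME first if a job with this name has already been submitted.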

In [None]:
!gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir gs://${BUCKET_NAME}/keras-job-dir \
  -- \
  --num-deep-layers=2 \
  --first-deep-layer-size=15 \
  --first-wide-layer-size=2262 \
  --learning-rate=0.005 \
  --wide-scale-factor=0.02 \
  --train-batch-size=12 \
  --train-steps=40 \
  --num-epochs=100 

<b>Host the trained model on AI Platform</b>

Because we're passing a list of numpy arrays and not a single numpy array as input for inference, we'll need to establish a custom prediction module.  
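
The predictor module itself is not shown in this notebook. As a rough sketch of what such a module might look like, the class below follows the AI Platform custom prediction routine interface (a `predict` method plus a `from_path` class method). The saved-model file name `model.h5` and the layout of each instance (one feature vector per model input) are assumptions.

```python
import os

import numpy as np


class MyPredictor(object):
    """Sketch of a custom prediction routine for a multi-input Keras model."""

    def __init__(self, model):
        self._model = model

    def predict(self, instances, **kwargs):
        # Assumption: each instance is a list holding one feature vector per
        # model input, e.g. [wide_features, deep_features].
        # Regroup from per-instance lists into one batch array per input.
        inputs = [np.asarray(column) for column in zip(*instances)]
        outputs = self._model.predict(inputs)
        # Return plain Python lists so the response is JSON-serializable.
        return np.asarray(outputs).tolist()

    @classmethod
    def from_path(cls, model_dir):
        # Imported lazily so the class can be exercised without TensorFlow.
        from tensorflow import keras

        # Assumed file name; it must match whatever the trainer exported.
        model = keras.models.load_model(os.path.join(model_dir, "model.h5"))
        return cls(model)
```

AI Platform calls `from_path` once with a local copy of the directory given in `--origin`, then calls `predict` for each request.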

First, execute the setup script to create a distribution tarball.

In [None]:
!python setup.py sdist --formats=gztar

Copy the tarball to Cloud Storage.

In [None]:
!gsutil cp dist/trainer-0.1.tar.gz gs://${BUCKET_NAME}/staging-dir/trainer-0.1.tar.gz

Next, create a new model on AI Platform.

In [None]:
!gcloud ai-platform models create $MODEL_NAME --regions $REGION

Next, create a new version using the trained model.

In [None]:
!gcloud beta ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --runtime-version 1.13 \
  --python-version 3.5 \
  --origin gs://${BUCKET_NAME}/keras-job-dir \
  --package-uris gs://${BUCKET_NAME}/staging-dir/trainer-0.1.tar.gz \
  --prediction-class predictor.MyPredictor

<b>Prepare a sample for inference</b>

Note that we are using the same preprocessing methods used for training.

In [None]:
!python create_sample.py
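
create_sample.py is not shown in this notebook. The `--json-instances` flag of `gcloud ai-platform predict` expects a newline-delimited file with one JSON instance per line, so a script of this kind presumably ends by writing something like the sketch below. The function name, the default file name, and the instance layout are assumptions.

```python
import json


def write_instances(instances, path="input_sample_example.json"):
    """Write one JSON-encoded instance per line (newline-delimited JSON)."""
    with open(path, "w") as f:
        for instance in instances:
            f.write(json.dumps(instance) + "\n")
    return path
```

For example, `write_instances([[[0.1, 0.0, 1.0], [0.5, 0.2]]])` would write a single instance made of two already-normalized feature vectors. The values are placeholders, not real preprocessed data.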

<b>Make an inference on a new sample</b>

Pass the sample object to the model hosted in AI Platform to return a prediction.

In [None]:
!gcloud ai-platform predict \
  --model $MODEL_NAME \
  --version $MODEL_VERSION \
  --json-instances input_sample.json