<b>Define environment variables</b>

These variables are used in the training and deployment steps that follow. Note that the bucket named in BUCKET_NAME must already exist in the GCP project.

In [None]:
%env BUCKET_NAME=ross-keras
%env LOCAL_JOB_DIR=local-training-output
%env JOB_NAME=keras_wnd_job
%env REGION=us-central1
%env MODEL_NAME=keras_wnd_model
%env MODEL_VERSION=v16
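
The shell expands an unset variable to an empty string, so a missing value can produce a confusing gcloud error several cells later. As a minimal sketch (not part of the original notebook; the helper name `missing_env_vars` is hypothetical), the values can be checked up front:

```python
import os

# Names set by the %env cell above.
REQUIRED_VARS = [
    "BUCKET_NAME", "LOCAL_JOB_DIR", "JOB_NAME",
    "REGION", "MODEL_NAME", "MODEL_VERSION",
]

def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]

missing = missing_env_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

This only checks that the variables are present; it does not verify that the bucket itself exists.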

<b>Perform training locally</b>

Training details will be written locally to the folder referenced by the job-dir parameter.

In [None]:
# train locally
!gcloud ai-platform local train \
  --package-path trainer \
  --module-name trainer.task \
  --job-dir $LOCAL_JOB_DIR 

# parameters can optionally be passed explicitly to the training module (otherwise defaults are used)
'''
  -- \
  --num-deep-layers=1 \
  --first-deep-layer-size=20 \
  --first-wide-layer-size=1168 \
  --learning-rate=0.026203671309666113 \
  --wide-scale-factor=0.37760157070726325 \
  --train-batch-size=27
'''
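
The trainer module itself is not shown here. As a rough sketch of how a module such as trainer.task might accept these flags, the following uses argparse with the flag names that appear in this notebook. The default values are placeholders chosen for illustration, not the trainer's actual defaults.

```python
import argparse

def build_parser():
    """Build a parser for the flags used in this notebook (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Wide-and-deep trainer (sketch)")
    # gcloud forwards --job-dir to the training module.
    parser.add_argument("--job-dir", required=True,
                        help="Folder (local or gs://) for training output")
    parser.add_argument("--num-deep-layers", type=int, default=1)
    parser.add_argument("--first-deep-layer-size", type=int, default=20)
    parser.add_argument("--first-wide-layer-size", type=int, default=1000)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    parser.add_argument("--wide-scale-factor", type=float, default=0.5)
    parser.add_argument("--train-batch-size", type=int, default=32)
    parser.add_argument("--train-steps", type=int, default=100)
    return parser

# Flags that are omitted fall back to their defaults.
args = build_parser().parse_args([
    "--job-dir", "local-training-output",
    "--train-batch-size=27",
])
print(args.train_batch_size, args.learning_rate)
```

argparse converts dashes in flag names to underscores, so `--train-batch-size` is read as `args.train_batch_size`.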

<b>Perform hyperparameter tuning on AI Platform</b>

Training details will be written to Cloud Storage, in the folder referenced by the job-dir parameter.

In [None]:
# train on AI Platform with hyperparameter tuning
#  --stream-logs
!gcloud ai-platform jobs submit training ${JOB_NAME}_hpt \
  --config hptuning_config.yaml \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir gs://${BUCKET_NAME}/keras-job-dir-hpt 
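
The contents of hptuning_config.yaml are not shown. To illustrate the general structure such a file follows (the `trainingInput.hyperparameters` section of a training job), the sketch below builds an equivalent dictionary in Python and prints it as JSON, which for practical purposes is also readable as YAML. The metric tag, trial counts, and value ranges are placeholder assumptions, as are the helper names `int_param` and `double_param`; only the parameter names come from this notebook.

```python
import json

def int_param(name, low, high):
    """Describe an integer hyperparameter searched on a linear scale."""
    return {"parameterName": name, "type": "INTEGER",
            "minValue": low, "maxValue": high,
            "scaleType": "UNIT_LINEAR_SCALE"}

def double_param(name, low, high, scale="UNIT_LINEAR_SCALE"):
    """Describe a floating-point hyperparameter."""
    return {"parameterName": name, "type": "DOUBLE",
            "minValue": low, "maxValue": high,
            "scaleType": scale}

config = {
    "trainingInput": {
        "hyperparameters": {
            "goal": "MAXIMIZE",                    # assumed objective
            "hyperparameterMetricTag": "val_acc",  # assumed metric name
            "maxTrials": 20,                       # assumed
            "maxParallelTrials": 4,                # assumed
            "params": [
                # Each parameterName is passed to the trainer as --<name>.
                int_param("num-deep-layers", 1, 3),
                int_param("first-deep-layer-size", 10, 100),
                int_param("first-wide-layer-size", 500, 3000),
                double_param("learning-rate", 0.001, 0.1, "UNIT_LOG_SCALE"),
                double_param("wide-scale-factor", 0.1, 1.0),
                int_param("train-batch-size", 16, 128),
            ],
        }
    }
}

print(json.dumps(config, indent=2))
```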

<b>Perform training on AI Platform</b>

Now that the hyperparameters have been tuned, train for longer with the optimal values in place. Note that, in addition to the tuned hyperparameters, we increase the amount of training by explicitly setting the train-steps and train-batch-size parameters.

In [None]:
# train on AI Platform

!gcloud ai-platform jobs submit training $JOB_NAME \
  --package-path trainer/ \
  --module-name trainer.task \
  --region $REGION \
  --python-version 3.5 \
  --runtime-version 1.13 \
  --job-dir gs://${BUCKET_NAME}/keras-job-dir \
  -- \
  --num-deep-layers=1 \
  --first-deep-layer-size=15 \
  --first-wide-layer-size=2085 \
  --learning-rate=0.04 \
  --wide-scale-factor=0.237 \
  --train-steps=500 \
  --train-batch-size=1000
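
When copying tuned values into the training command by hand, it is easy to mistype a flag or pass one twice. As one possible approach, the sketch below assembles the same command from a dictionary, whose keys are unique by construction. The helper names `to_flags` and `build_command` are hypothetical; the values are the ones used in the cell above.

```python
def to_flags(params):
    """Turn {"learning_rate": 0.04} into ["--learning-rate=0.04"]."""
    return ["--{}={}".format(name.replace("_", "-"), value)
            for name, value in params.items()]

def build_command(job_name, region, bucket, params):
    """Assemble the gcloud training command as an argument list."""
    cmd = [
        "gcloud", "ai-platform", "jobs", "submit", "training", job_name,
        "--package-path", "trainer/",
        "--module-name", "trainer.task",
        "--region", region,
        "--python-version", "3.5",
        "--runtime-version", "1.13",
        "--job-dir", "gs://{}/keras-job-dir".format(bucket),
        "--",  # everything after this is passed to the trainer module
    ]
    return cmd + to_flags(params)

tuned = {
    "num_deep_layers": 1,
    "first_deep_layer_size": 15,
    "first_wide_layer_size": 2085,
    "learning_rate": 0.04,
    "wide_scale_factor": 0.237,
    "train_steps": 500,
    "train_batch_size": 1000,
}

print(" ".join(build_command("keras_wnd_job", "us-central1", "ross-keras", tuned)))
```

The resulting list can be passed to `subprocess.run` instead of being printed.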

<b>Host the trained model on AI Platform</b>

Because we pass a list of numpy arrays, rather than a single numpy array, as input for inference, we need to supply a custom prediction routine.
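
The predictor module is not shown in this notebook. A custom prediction class on AI Platform is expected to provide a `predict(instances, **kwargs)` method and a `from_path(model_dir)` class method, so its general shape might look like the sketch below. The instance keys `wide` and `deep` are hypothetical, and `_StandInModel` replaces the real Keras model so that the example runs on its own.

```python
class _StandInModel(object):
    """Placeholder for the trained Keras model (illustration only)."""

    def predict(self, inputs):
        wide, deep = inputs
        return [sum(w) + sum(d) for w, d in zip(wide, deep)]


class MyPredictor(object):
    def __init__(self, model):
        self._model = model

    def predict(self, instances, **kwargs):
        # Split each instance into one batch per model input. With a real
        # multi-input Keras model these would be converted to numpy arrays.
        wide = [instance["wide"] for instance in instances]
        deep = [instance["deep"] for instance in instances]
        outputs = self._model.predict([wide, deep])
        # The return value must be JSON-serializable.
        return outputs.tolist() if hasattr(outputs, "tolist") else list(outputs)

    @classmethod
    def from_path(cls, model_dir):
        # A real implementation would load the saved model from model_dir.
        return cls(_StandInModel())


predictor = MyPredictor.from_path("unused-in-this-sketch")
print(predictor.predict([{"wide": [1, 2], "deep": [3]}]))
```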

First, run the setup script to create a source distribution tarball.

In [None]:
!python setup.py sdist --formats=gztar

Copy the tarball to Cloud Storage.

In [None]:
!gsutil cp dist/trainer-0.1.tar.gz gs://${BUCKET_NAME}/staging-dir/trainer-0.1.tar.gz

Next, create a new model on AI Platform.

In [None]:
# create model
!gcloud ai-platform models create $MODEL_NAME --regions $REGION

Next, create a new version of the model from the training output.

In [None]:
# create a version
!gcloud beta ai-platform versions create $MODEL_VERSION \
  --model $MODEL_NAME \
  --runtime-version 1.13 \
  --python-version 3.5 \
  --origin gs://${BUCKET_NAME}/keras-job-dir \
  --package-uris gs://${BUCKET_NAME}/staging-dir/trainer-0.1.tar.gz \
  --prediction-class predictor.MyPredictor
        
#  --framework tensorflow \

<b>Prepare a sample for inference</b>

Note that we are using the same preprocessing methods used for training.

In [None]:
# prepare data for inference
!python create_sample.py

<b>Make an inference on a new sample</b>

Pass the sample object to the model hosted in AI Platform to return a prediction.

In [None]:
# submit the prediction request
!gcloud ai-platform predict \
  --model $MODEL_NAME \
  --version $MODEL_VERSION \
  --json-instances input_sample.json
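
The file passed to --json-instances is read as newline-delimited JSON, with one instance per line. The script that produces input_sample.json is not shown, so the sketch below only illustrates the file format. The helper name `write_instances` and the feature keys are hypothetical, and the file is written to a temporary folder so that it does not overwrite the real sample.

```python
import json
import os
import tempfile

def write_instances(instances, path):
    """Write one JSON-encoded instance per line."""
    with open(path, "w") as f:
        for instance in instances:
            f.write(json.dumps(instance) + "\n")

# Hypothetical instances; the real feature layout depends on the model.
example_instances = [
    {"wide": [0, 1, 0], "deep": [0.5, 0.25]},
    {"wide": [1, 0, 0], "deep": [0.1, 0.9]},
]

example_path = os.path.join(tempfile.mkdtemp(), "example_instances.json")
write_instances(example_instances, example_path)

with open(example_path) as f:
    print(f.read())
```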