<b>Define environment variables</b>

These variables are used in the training and deployment steps that follow. Note that the bucket named by BUCKET_NAME must already exist in the GCP project.

In [5]:
%env BUCKET_NAME=ml-black-friday
%env JOB_NAME=rf_1107_train_job_13

%env TRAINING_PACKAGE_PATH=./trainer/
%env MAIN_TRAINER_MODULE=trainer.rf_trainer
%env REGION=us-central1
%env RUNTIME_VERSION=1.14
%env PYTHON_VERSION=3.5
%env SCALE_TIER=CUSTOM

%env MODEL_NAME=rf_mod
%env PROJECT_ID=mwpmltr
%env DATASET_ID=black_friday
%env VERSION_NAME=v1
%env FRAMEWORK=SCIKIT_LEARN

env: BUCKET_NAME=ml-black-friday
env: JOB_NAME=rf_1107_train_job_13
env: TRAINING_PACKAGE_PATH=./trainer/
env: MAIN_TRAINER_MODULE=trainer.rf_trainer
env: REGION=us-central1
env: RUNTIME_VERSION=1.14
env: PYTHON_VERSION=3.5
env: SCALE_TIER=CUSTOM
env: MODEL_NAME=rf_mod
env: PROJECT_ID=mwpmltr
env: DATASET_ID=black_friday
env: VERSION_NAME=v1
env: FRAMEWORK=SCIKIT_LEARN
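
Because every later cell depends on these variables, it can help to confirm that they are all set before submitting anything. The cell below is a minimal sketch of such a check. The variable names come from the cell above; the helper `missing_env_vars` is our own illustration and is not part of the trainer package.

```python
import os

# Names taken from the %env cell above.
REQUIRED_VARS = [
    "BUCKET_NAME", "JOB_NAME", "TRAINING_PACKAGE_PATH", "MAIN_TRAINER_MODULE",
    "REGION", "RUNTIME_VERSION", "PYTHON_VERSION", "SCALE_TIER",
    "MODEL_NAME", "PROJECT_ID", "DATASET_ID", "VERSION_NAME", "FRAMEWORK",
]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    environ = os.environ if environ is None else environ
    return [name for name in names if not environ.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required environment variables are set.")
```

This only checks that the variables are present. It does not verify that the bucket itself exists in the project.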


<b>Perform training locally with default parameters</b>

In [None]:
!gcloud ai-platform local train \
  --package-path $TRAINING_PACKAGE_PATH \
  --module-name $MAIN_TRAINER_MODULE 

'''
  -- \
  --create-data=False \
  --hp-tune=False
'''

<b>Perform training on AI Platform</b>

The training job can also be run on AI Platform. 

Important: One training job (run either locally or on AI Platform) must complete with the --create-data and --hp-tune flags set to True before the remaining steps will work.
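
The source of trainer.rf_trainer is not shown in this notebook, so the following is only a sketch of how flags such as --create-data=False could be parsed. The helper `str_to_bool`, the function `parse_args`, and the defaults of True are assumptions made for illustration. The point it demonstrates is a general one: argparse's `type=bool` treats any non-empty string, including "False", as True, so an explicit converter is needed.

```python
import argparse


def str_to_bool(value):
    """Convert strings such as 'True' or 'False' to a real boolean."""
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "1"):
        return True
    if text in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError(
        "expected True or False, got {!r}".format(value))


def parse_args(argv=None):
    """Parse the two flags; defaults of True are assumed, not confirmed."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--create-data", type=str_to_bool, default=True)
    parser.add_argument("--hp-tune", type=str_to_bool, default=True)
    # Other arguments (for example --job-id) are left for the caller.
    args, _unknown = parser.parse_known_args(argv)
    return args


args = parse_args(["--create-data=False", "--hp-tune=False"])
print(args.create_data, args.hp_tune)
```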

Note that this job sets SCALE_TIER to CUSTOM and requests an n1-highcpu-16 master machine to give training more compute.

In [6]:
!gcloud ai-platform jobs submit training $JOB_NAME \
  --job-dir gs://${BUCKET_NAME}/rf-job-dir \
  --package-path $TRAINING_PACKAGE_PATH \
  --module-name $MAIN_TRAINER_MODULE \
  --region $REGION \
  --runtime-version=$RUNTIME_VERSION \
  --python-version=$PYTHON_VERSION \
  --scale-tier $SCALE_TIER \
  --master-machine-type n1-highcpu-16 \
  -- \
  --job-id $JOB_NAME \
  --project-id $PROJECT_ID \
  --bucket-name $BUCKET_NAME \
  --dataset-id $DATASET_ID 
    
'''
  --create-data=False \
  --hp-tune=False
'''

Job [rf_1107_train_job_13] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ai-platform jobs describe rf_1107_train_job_13

or continue streaming the logs with the command

  $ gcloud ai-platform jobs stream-logs rf_1107_train_job_13
jobId: rf_1107_train_job_13
state: QUEUED


'\n  --create-data=False   --hp-tune=False\n'

<b>Host the trained model on AI Platform</b>

Because our raw prediction output from the model is a numpy array that needs to be converted into a product category, we'll need to implement a custom prediction module.
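
AI Platform custom prediction routines expect a class with a `predict(instances, **kwargs)` method and a `from_path(model_dir)` class method. The predictor.py used in this project is not shown here, so the sketch below is only an illustration of that interface. It assumes the model returns integer class indices and that a category lookup list is stored next to the model; the file names `model.pkl` and `categories.json` are hypothetical.

```python
import json
import os
import pickle


class MyPredictor(object):
    """Sketch of a custom predictor mapping raw outputs to categories."""

    def __init__(self, model, categories):
        self._model = model
        self._categories = categories  # assumed: index -> category name

    def predict(self, instances, **kwargs):
        raw = self._model.predict(instances)
        # numpy arrays expose tolist(); plain sequences are copied as-is.
        indices = raw.tolist() if hasattr(raw, "tolist") else list(raw)
        # Return plain Python values so the response is JSON-serializable.
        return [self._categories[int(i)] for i in indices]

    @classmethod
    def from_path(cls, model_dir):
        # Hypothetical artifact names; adjust to what the trainer exports.
        with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
            model = pickle.load(f)
        with open(os.path.join(model_dir, "categories.json")) as f:
            categories = json.load(f)
        return cls(model, categories)
```

AI Platform calls `from_path` once when the version is loaded and then calls `predict` for each request, which is why the model and lookup are loaded in the class method rather than in `predict`.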

First, execute the setup script to create a source distribution tarball.

In [None]:
!python setup.py sdist --formats=gztar

Next, copy the tarball to Cloud Storage.

In [None]:
!gsutil cp dist/trainer-0.1.tar.gz gs://${BUCKET_NAME}/staging-dir/trainer-0.1.tar.gz

Create a new model on AI Platform.  Note that this needs to be done just once, and future iterations are saved as "versions" of the model.

In [None]:
!gcloud ai-platform models create $MODEL_NAME --regions $REGION

Next, we create a new version from our trained model.

In [None]:
!gcloud beta ai-platform versions create $VERSION_NAME \
  --model $MODEL_NAME \
  --origin gs://${BUCKET_NAME}/black_friday_${JOB_NAME}/ \
  --runtime-version=$RUNTIME_VERSION \
  --python-version=$PYTHON_VERSION \
  --package-uris gs://${BUCKET_NAME}/staging-dir/trainer-0.1.tar.gz \
  --prediction-class predictor.MyPredictor

<b>Prepare a sample for inference</b>

In [None]:
!python generate_sample.py
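
The contents of generate_sample.py are not shown in this notebook. As a rough sketch of what such a script needs to produce: the `--json-instances` option of `gcloud ai-platform predict` reads a file containing one JSON instance per line. The helper `write_instances` and the feature values below are hypothetical placeholders, not the real Black Friday features.

```python
import json


def write_instances(path, instances):
    """Write one JSON-encoded instance per line, as --json-instances expects."""
    with open(path, "w") as f:
        for instance in instances:
            f.write(json.dumps(instance) + "\n")


if __name__ == "__main__":
    # Hypothetical feature vector; the real sample comes from the dataset.
    sample = [1, 0, 10, 2, 1, 0]
    write_instances("input.json", [sample])
```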

<b>Make an inference on a new sample</b>

Pass the sample object to the model hosted in AI Platform to return a prediction.

In [None]:
# make an online prediction
!gcloud ai-platform predict --model $MODEL_NAME \
  --version $VERSION_NAME --json-instances input.json