<h1> Hyperparameter tuning </h1>

This notebook is Lab4b of CPB 102, Google's course on Machine Learning using Cloud ML.

This notebook builds on Lab 4a, adding hyperparameter tuning to the feature engineering done in that lab.  To save time, we will start from the preprocessed output of Lab 4a.

In [1]:
import google.cloud.ml as ml
import tensorflow as tf
print(tf.__version__)
print(ml.sdk_location)

0.11.0rc0
gs://cloud-ml/sdk/cloudml-0.1.6-alpha.dataflow.tar.gz


<h1> Retrieving preprocessed data </h1>

To save time, let's start from my Lab 4a results (which I computed on a 10-million-row dataset).  Change the BUCKET below to be yours.

Tuning is carried out over a segment of the training data (you should not use the validation data for this).
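
The copy cell that follows selects shards with two wildcard patterns. As a rough sketch of which shards each pattern picks up, assuming 27 shards named like the files in that cell's output (the helper `select_shards` is a made-up name for illustration):

```python
import fnmatch

# Assumption: 27 shards, named like the files listed in the cell output.
NUM_SHARDS = 27
shards = [
    "features_train-%05d-of-%05d.tfrecord.gz" % (i, NUM_SHARDS)
    for i in range(NUM_SHARDS)
]

def select_shards(names, pattern):
    """Return the shard names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatch.fnmatch(name, pattern)]

# The same two patterns the copy loop uses.
train_shards = select_shards(shards, "features_train-0000*")
eval_shards = select_shards(shards, "features_train-0002*")

print(len(train_shards), "training shards")
print(len(eval_shards), "evaluation shards")
```

Both sets come from the features_train files, so the validation data stays untouched, and the two patterns never select the same shard.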

In [3]:
%bash
BUCKET=cloud-training-demos-ml

SOURCE=gs://cloud-training-demos/taxifare/taxi_preproc4a_full
SUFFIX="-of-00003.tfrecord.gz"  
gsutil -m rm -rf gs://$BUCKET/taxifare/taxi_preproc4b/
gsutil cp $SOURCE/metadata.yaml gs://$BUCKET/taxifare/taxi_preproc4b/metadata.yaml
# Quote the patterns so that gsutil, not the local shell, expands the wildcards.
for file in "features_train-0000*" "features_train-0002*"; do
    gsutil -m cp "$SOURCE/$file" gs://$BUCKET/taxifare/taxi_preproc4b/
done

Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00000-of-00027.tfrecord.gz#1475538388169694...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00001-of-00027.tfrecord.gz#1475538388359963...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00002-of-00027.tfrecord.gz#1475538388470900...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00003-of-00027.tfrecord.gz#1475538388859838...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00004-of-00027.tfrecord.gz#1475538388514496...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00005-of-00027.tfrecord.gz#1475538389003590...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00006-of-00027.tfrecord.gz#1475538388887113...
Removing gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00007-of-00027.tfrecord.gz#1475538388846837...
Removing

<h2> Modify TensorFlow code </h2>

We want to make the number of buckets and the number of hidden nodes optimizable parameters.
To do this, the trainer has to read them from the command line.

The grep below shows every place in the trainer code that now references the number_buckets hyperparameter.

In [4]:
%bash
grep -3 number_buckets taxifare/trainer/*.py

taxifare/trainer/task.py-  parser.add_argument('--metadata_path', type=str)
taxifare/trainer/task.py-  parser.add_argument('--output_path', type=str)
taxifare/trainer/task.py-  parser.add_argument('--max_steps', type=int, default=2000)
taxifare/trainer/task.py:  parser.add_argument('--number_buckets', type=int, default=5)
taxifare/trainer/task.py-  parser.add_argument('--hidden_layer1_size', type=int, default=256)
taxifare/trainer/task.py-
taxifare/trainer/task.py-  args = parser.parse_args()
taxifare/trainer/task.py-  HYPERPARAMS['hidden_layer1_size'] = args.hidden_layer1_size
taxifare/trainer/task.py-  HYPERPARAMS['hidden_layer2_size'] = args.hidden_layer1_size / 2
taxifare/trainer/task.py-  HYPERPARAMS['hidden_layer3_size'] = args.hidden_layer1_size / 4
taxifare/trainer/task.py:  HYPERPARAMS['number_buckets'] = args.number_buckets
taxifare/trainer/task.py-  
taxifare/trainer/task.py-  args.output_path = os.path.join(args.output_path, trial_id)
taxifare/trainer/task.py-  logging.info
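
The flag handling in the grep output can be tried out on its own. This is a minimal sketch that mirrors only the two tunable flags and the derived layer sizes; the function name `parse_hyperparams` is hypothetical, and the real task.py does more (paths, max_steps, trial output directory).

```python
import argparse

def parse_hyperparams(argv):
    """Parse the tunable flags and derive the smaller hidden layers."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--number_buckets', type=int, default=5)
    parser.add_argument('--hidden_layer1_size', type=int, default=256)
    args = parser.parse_args(argv)
    return {
        'hidden_layer1_size': args.hidden_layer1_size,
        # Integer division keeps layer sizes whole numbers on Python 2 and 3.
        'hidden_layer2_size': args.hidden_layer1_size // 2,
        'hidden_layer3_size': args.hidden_layer1_size // 4,
        'number_buckets': args.number_buckets,
    }

print(parse_hyperparams(['--hidden_layer1_size', '128', '--number_buckets', '10']))
```

Because the second and third layers are derived from the first, tuning hidden_layer1_size alone scales the whole network.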

We also have to add a summary metric named <b>training/hptuning/metric</b> to the TensorFlow graph.

In [5]:
%bash
grep -3 hptuning taxifare/trainer/task.py

      global_step = tf.Variable(0, name='global_step', trainable=False)

    tf.scalar_summary('rmse', rmse_op)
    tf.scalar_summary('training/hptuning/metric', rmse_op)
    summary = tf.merge_all_summaries() # make sure all scalar summaries are produced

    saver = tf.train.Saver()
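
The value written under training/hptuning/metric is rmse_op. How the trainer builds that op is not shown in the grep output, so the following is only the standard definition of root-mean-square error in plain Python, not the lab's TensorFlow code:

```python
import math

def rmse(predictions, labels):
    """Root-mean-square error between two equal-length sequences."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - y) ** 2 for p, y in zip(predictions, labels)]
    return math.sqrt(sum(squared) / float(len(squared)))

# Two made-up fare predictions against two made-up true fares.
print(rmse([10.0, 12.0], [13.0, 8.0]))
```

Lower is better for this metric, which is why the tuning specification later in the notebook sets its goal to MINIMIZE.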


<h2> Train once </h2>

Here, we package up the code and train once locally with small settings, to check that the new flags work before tuning.

In [6]:
%bash
rm -rf taxifare.tar.gz taxi_trained
tar cvfz taxifare.tar.gz taxifare
gsutil cp taxifare.tar.gz gs://cloud-training-demos-ml/taxifare/source4b/taxifare.tar.gz

taxifare/
taxifare/PKG-INFO
taxifare/setup.cfg
taxifare/trainer.egg-info/
taxifare/trainer.egg-info/PKG-INFO
taxifare/trainer.egg-info/top_level.txt
taxifare/trainer.egg-info/dependency_links.txt
taxifare/trainer.egg-info/SOURCES.txt
taxifare/trainer/
taxifare/trainer/task.py
taxifare/trainer/taxifare.py
taxifare/trainer/__init__.py
taxifare/setup.py


Copying file://taxifare.tar.gz [Content-Type=application/x-tar]...
/ [1 files][  7.2 KiB/  7.2 KiB]
Operation completed over 1 objects/7.2 KiB.                                      


In [7]:
%bash
gsutil cp -R gs://cloud-training-demos-ml/taxifare/taxi_preproc4b /content/training-data-analyst/CPB102/lab4b

Copying gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00000-of-00027.tfrecord.gz...
/ [1 files][645.6 KiB/645.6 KiB]
Copying gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00001-of-00027.tfrecord.gz...
/ [2 files][  8.2 MiB/  8.2 MiB]
Copying gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00002-of-00027.tfrecord.gz...
- [3 files][ 16.9 MiB/ 16.9 MiB]
Copying gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-00003-of-00027.tfrecord.gz...
- [4 fi

In [None]:
%%mlalpha train
package_uris: /content/training-data-analyst/CPB102/lab4b/taxifare.tar.gz
python_module: trainer.task
scale_tier: BASIC
region: us-central1
args:
  train_data_paths: /content/training-data-analyst/CPB102/lab4b/taxi_preproc4b/features_train-0000*
  eval_data_paths:  /content/training-data-analyst/CPB102/lab4b/taxi_preproc4b/features_train-0002*
  metadata_path: /content/training-data-analyst/CPB102/lab4b/taxi_preproc4b/metadata.yaml
  output_path: /content/training-data-analyst/CPB102/lab4b/taxi_trained
  max_steps: 200
  hidden_layer1_size: 8
  number_buckets: 2
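
The keys under args line up with the flags that task.py declares with argparse. As a sketch of that correspondence, assuming each entry reaches the trainer as a `--name value` pair (an assumption about the alpha SDK, not something this notebook shows), with `args_to_flags` as a hypothetical helper:

```python
def args_to_flags(args):
    """Turn an args mapping into a ['--key', 'value', ...] list."""
    flags = []
    for key, value in args.items():
        flags.extend(['--' + key, str(value)])
    return flags

# The three numeric settings from the local training cell.
local_args = {
    'max_steps': 200,
    'hidden_layer1_size': 8,
    'number_buckets': 2,
}
print(args_to_flags(local_args))
```

A key with no matching `add_argument` call in task.py would make argparse reject the command line, so the names have to agree exactly.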

In [1]:
%mlalpha summary --dir /content/training-data-analyst/CPB102/lab4b/taxi_trained/eval --name training/hptuning/metric accuracy --step

<h2> Hyperparameter training </h2>

Now we carry out the training again, this time on the cloud and with a hyperparameter tuning specification.

In [1]:
!gsutil -m -q rm -r gs://cloud-training-demos-ml/taxifare/taxi_trained4b

In [2]:
%%mlalpha train --cloud
package_uris: gs://cloud-training-demos-ml/taxifare/source4b/taxifare.tar.gz
python_module: trainer.task
scale_tier: BASIC
region: us-central1
args:
  train_data_paths: gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-0000*
  eval_data_paths: gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/features_train-0002*
  metadata_path: gs://cloud-training-demos-ml/taxifare/taxi_preproc4b/metadata.yaml
  output_path: gs://cloud-training-demos-ml/taxifare/taxi_trained4b
  max_steps: 1000
hyperparameters:
  goal: MINIMIZE
  max_trials: 20
  max_parallel_trials: 3
  params:
    - parameter_name: hidden_layer1_size
      type: INTEGER
      min_value: 128
      max_value: 256
      scale_type: UNIT_LINEAR_SCALE  
    - parameter_name: number_buckets
      type: INTEGER
      min_value: 10
      max_value: 25
      scale_type: UNIT_LINEAR_SCALE  
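
To get a feel for how small the trial budget is relative to the search space, here is a back-of-the-envelope sketch. It treats both bounds of each INTEGER range as inclusive, which is an assumption about the service, and it says nothing about how the service actually chooses trial values.

```python
import math

# Ranges copied from the tuning spec; both bounds are treated as inclusive,
# which is an assumption about how the service handles INTEGER parameters.
SEARCH_SPACE = {
    'hidden_layer1_size': (128, 256),
    'number_buckets': (10, 25),
}
MAX_TRIALS = 20
MAX_PARALLEL_TRIALS = 3

def grid_size(space):
    """Count the distinct integer combinations in the search space."""
    total = 1
    for low, high in space.values():
        total *= high - low + 1
    return total

combinations = grid_size(SEARCH_SPACE)
# With 3 trials at a time, 20 trials need at least this many rounds.
min_rounds = int(math.ceil(MAX_TRIALS / float(MAX_PARALLEL_TRIALS)))

print(combinations, "possible combinations")
print(min_rounds, "rounds of parallel trials at minimum")
```

Under these assumptions the 20 trials cover under 1% of the 129 x 16 combinations, so an exhaustive grid search is not what is happening here.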

In [3]:
%mlalpha jobs --name  trainer_task_161003_235158

In [35]:
!gsutil ls gs://cloud-training-demos-ml/taxifare/taxi_trained4b

gs://cloud-training-demos-ml/taxifare/taxi_trained4b/1/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/10/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/11/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/12/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/13/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/14/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/15/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/16/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/17/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/18/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/19/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/2/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/20/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/3/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/4/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/5/
gs://cloud-training-demos-ml/taxifare/taxi_trained4b/6/
gs://cloud-training-
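
Each trial writes to its own subdirectory because task.py joins the trial id onto output_path. The listing comes back in lexicographic order (1, 10, 11, ...), so a small sketch for ordering the directories numerically may help when inspecting them; `trial_number` and `sort_trials` are names invented for this example.

```python
def trial_number(path):
    """Extract the integer trial id from a path such as '.../taxi_trained4b/12/'."""
    return int(path.rstrip('/').rsplit('/', 1)[-1])

def sort_trials(paths):
    """Order trial directories numerically instead of lexicographically."""
    return sorted(paths, key=trial_number)

base = 'gs://cloud-training-demos-ml/taxifare/taxi_trained4b/'
# A handful of the directory names from the listing, in listing order.
listing = [base + name + '/' for name in ['1', '10', '11', '2', '20', '3']]
for path in sort_trials(listing):
    print(path)
```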

In [36]:
%tensorboard start --logdir gs://cloud-training-demos-ml/taxifare/taxi_trained4b

In [37]:
%tensorboard stop --pid 18341

In [None]:
%mlalpha summary --dir gs://cloud-training-demos-ml/taxifare/taxi_trained4b/*/summaries  gs://cloud-training-demos-ml/taxifare/taxi_trained4b/*/eval  --name training/hptuning/metric --step
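
Once the final metric of each trial is known, choosing the winner is a one-line comparison. This sketch assumes the metrics have already been collected into a dictionary keyed by trial id; `best_trial` is a hypothetical helper, and the numbers are placeholders, not results from this lab.

```python
def best_trial(final_metrics, goal='MINIMIZE'):
    """Return the (trial_id, metric) pair that best meets the goal."""
    if not final_metrics:
        raise ValueError("no trials to compare")
    pick = min if goal == 'MINIMIZE' else max
    return pick(final_metrics.items(), key=lambda item: item[1])

# Placeholder numbers for illustration only; they are not results from this lab.
example_metrics = {'1': 9.5, '2': 7.25, '3': 8.0}
print(best_trial(example_metrics))
```

The hidden_layer1_size and number_buckets values used by the winning trial are then the ones to carry into a full training run.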

Copyright 2016 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.