<h2> Hyperparameter tuning with Cloud MLEngine </h2>

** Objective**
- Instead of hand tuning of hyperparameters to improve model accuracy, setup cloud ML to do tuining  

In [32]:
# Load liberaries and setup compute environment
import os
PROJECT = 'qwiklabs-gcp-5720bb7433a520e9'   # project name
BUCKET = 'qwiklabs-gcp-5720bb7433a520e9'    # bucket name
REGION = 'us-central1'   # should be consisten with the BUCKET zone
os.environ['TFVERSION'] = '1.8'   # latest Tensorflow version


In [33]:
# setup environment for bash
os.environ['PROJECT'] = PROJECT
os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = REGION

In [34]:
%%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

Updated property [core/project].
Updated property [compute/region].


<h2>Create command-line program </h2>

**Step 1**: In order for CloudML to d automatic hyperparameter tuning, we need to create command-line argument for those hyperparameters we desire to fine-tune

**Step 2**: To submit jobs in jobs in parallel, we need to create a distributed training program. **tf.Estimator** will just do that 

In [35]:
%%bash
rm -rf trainer_2
mkdir trainer_2   # create a folder for trainer for packaging-up the modle
touch trainer_2/__init__.py    # create __init__.py 

Create reate model house.py for predicting median_house_value

In [36]:
%%writefile trainer_2/house.py   
import os
import math
import json
import shutil
import argparse
import numpy as np
import pandas as pd
import tensorflow as tf

# create train
def train(output_dir, batch_size, learning_rate):
  tf.loging.set_verbosity(tf.logging.INFO)
  
  # read dataset and split into train and eval
  df = pd.read_csv("https://storage.googleapis.com/ml_universities/california_housing_train.csv", sep=",")
  df["num_rooms"] = df['total_rooms'] / df['households']
  # create train and validation set
  msk = np.random.rand(len(df)) < 0.8
  train_df = df[msk]
  eval_df = df[~msk]
  
  # input pipeline for train and evaluation
  train_input_fn = tf.estimator.inputs.pandas_input_fn(x= train_df[["num_rooms"]],
                                                      y = train_df["median_house_value"]/ SCALE,
                                                      num_epochs = 1,
                                                      batch_size = batch_size, 
                                                      shutil = True)
  
  eval_input_fn = tf.estimator.inputs.pandas_input_fn(x= train_df[["num_rooms"]],
                                                      y = eval_df["median_house_value"]/ SCALE,
                                                      num_epochs = 1,
                                                      batch_size = len(eval_df), 
                                                      shutil = False)
  # define feature columns
  features = [tf.feature_column.numeric_column('num_rooms')]
  
  def train_and_evaluate(out_dir):
    # get number of steps, since tf igonore epochs
    num_steps = (len(train_df) / batch_size) / learning_rate
    
    # create custom optimzer
    my_opt = tf.train.FtrlOptimizer(learning_rate=learning_rate)
    
    # the rest of the estimator is as usual
    estimator = tf.estimator.LinearRegressor(model_dir = output_dir,
                                            feature_column = features,
                                            optimizer = my_opt)
    train_spec = tf.estimator.TrainSpec(input_fn = train_input_fn, 
                                       max_steps = num_steps)
    eval_spec = tf.estimator.EvalSpec(input_fn = eval_input_fn, 
                                       steps = None)
    tf.estimator.train_and_evaluate(estimator. train_spec, eval_spec)
    
    # run the training 
    shutil.rmtree(output_dir, ignore_errors=True)    # start fresh
    train_and_evaluate(output_dir)
if __name__ == '__main__' and "get_ipython" not in dir():
  parser = argparse.ArgumentParser()
  parser.add_argument(
    '--learning_rate',    # tuneable hyperparameter passed as command-line arg 
    type = float,
    defualt = 0.01
  )
  
  parser.add_argument(
    '--batch_size',    # tuneable hyperparameter passed as command-line arg
    type = int,
    defualy = 30
  ),
  parser.add_argument(
    '--job-dir',
    help = 'GCS location to write checkpoints and export models.',
    required = True
  )
  args = parser.parse_args()
  print("Writing checkpoints to {}".format(args.job_dir))
  train(args.job_dir, args.batch_size, args.learning_rate)
                                                      

Writing trainer_2/house.py


In [37]:
%%bash
rm -rf house_trained_2
gcloud ml-engine local train \
  --module-name=trainer_2.house \
  --job-dir=house_trained_2 \
  --package-path=$(pwd)/trainer_2 \
  -- \
  --batch_size=30 \
  --learning_rate=0.02

  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "/usr/local/envs/py2env/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/local/envs/py2env/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/content/datalab/training-data-analyst/courses/machine_learning/deepdive/05_artandscience/trainer_2/house.py", line 62, in <module>
    defualt = 0.01
  File "/usr/local/envs/py2env/lib/python2.7/argparse.py", line 1294, in add_argument
    action = action_class(**kwargs)
TypeError: __init__() got an unexpected keyword argument 'defualt'


<h2> Create hyperparam.yaml</h2>

In [38]:
%%writefile hyperparam.yaml
trainingInput:
  hyperparameters:
    goal: MINIMIZE
    maxTrials: 5
    maxParallelTrails: 1
    hyperparameterMetricTag: average_loss
    params:
      - parameterName: batch_size
        type: INTEGER
        miniValue: 8
        maxValue: 64
        scaleType: UNIT_LINEAR_SCALE
      - parameterName: learning_rate
        type: DOUBLE
        minValue: 0.01
        maxValue: 0.1
        scaleType: UNIT_LOG_SCALE

Overwriting hyperparam.yaml


In [39]:
%%bash
OUTDIR=gs://$BUCKET/house_trained_2
gsutil rm -rf $OUTDIR
gcloud ml-engine jobs submit training house_$(date -u +%y%m%d_%H%M%S) \
  --config=hyperparam.yaml \
  --module-name=trainer.house \
  --package-path=$(pwd)/trainer_2 \
  --job-dir=$OUTDIR \
  --runtime-version=$TFVERSION \

Removing gs://qwiklabs-gcp-5720bb7433a520e9/house_trained_2/packages/22c3dbe05a95fee0e4706a187ceca1b06cd9b79ee345ed20e1bb9d4714ae9bc6/trainer_2-0.0.0.tar.gz#1527303630035075...
/ [1 objects]                                                                   
Operation completed over 1 objects.                                              
ERROR: (gcloud.ml-engine.jobs.submit.training) INVALID_ARGUMENT: Invalid JSON payload received. Unknown name "max_parallel_trails" at 'job.training_input.hyperparameters': Cannot find field.
Invalid JSON payload received. Unknown name "mini_value" at 'job.training_input.hyperparameters.params[0]': Cannot find field.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: "Invalid JSON payload received. Unknown name \"max_parallel_trails\"\
      \ at 'job.training_input.hyperparameters': Cannot find field."
    field: job.training_input.hyperparameters
  - description: "Invalid JSON payload received. Unknown name \"min

In [40]:
# !gcloud ml-engine jobs describe job_ID