# Enable a Virtual Environment for This Notebook

### Activate Conda Environment

<b>`$ conda activate`</b>

### Install or upgrade the software needed for the virtual environment

<b>`$ sudo apt-get install --upgrade python3-pip`</b>

<b>`$ sudo pip3 install --upgrade virtualenv`</b>

<b>`$ sudo pip3 install --upgrade setuptools`</b>

Now go to the directory where the virtual environment will be created.

<b>`$ cd /media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/05.Art_And_Science_Of_Machine_Learning/WEEK_1/02.Improve_Model_Accuracy_By_Hyperparameter_Tuning_With_AI_Platform/Practice`</b>

### Deactivate conda environment

<b>`$ conda deactivate`</b>

### Create Virtual Environment

<b>`$ virtualenv Venv`</b>

### Activate newly created virtual environment

<b>`$ source Venv/bin/activate`</b>

<b>`(Venv) $ which python`</b>

<b>`(Venv) $ pip list`</b>

<b>`(Venv) $ pip3 install jupyter`</b>
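
To confirm from inside the notebook that the kernel really uses the new environment, a check along the following lines can help. This is a minimal sketch, and the helper name `in_virtualenv` is hypothetical. It assumes that older `virtualenv` releases set `sys.real_prefix`, while newer releases and the standard `venv` module make `sys.prefix` differ from `sys.base_prefix`.

```python
import sys

def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases record the original interpreter in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep the original location in sys.base_prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix

print("Interpreter:", sys.executable)
print("Inside a virtual environment:", in_virtualenv())
```

If the second line prints `False`, the notebook server was probably started outside `Venv`.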

In [1]:
%%writefile requirements.txt
numpy
pandas
tensorflow==1.8.0

Writing requirements.txt


In [2]:
%%bash
pip3 install -r requirements.txt

Collecting numpy
  Using cached numpy-1.18.2-cp36-cp36m-manylinux1_x86_64.whl (20.2 MB)
Collecting pandas
  Using cached pandas-1.0.3-cp36-cp36m-manylinux1_x86_64.whl (10.0 MB)
Collecting tensorflow==1.8.0
  Using cached tensorflow-1.8.0-cp36-cp36m-manylinux1_x86_64.whl (49.1 MB)
Collecting pytz>=2017.2
  Using cached pytz-2019.3-py2.py3-none-any.whl (509 kB)
Collecting astor>=0.6.0
  Using cached astor-0.8.1-py2.py3-none-any.whl (27 kB)
Processing /home/mujahid7292/.cache/pip/wheels/c3/af/84/3962a6af7b4ab336e951b7877dcfb758cf94548bb1771e0679/absl_py-0.9.0-py3-none-any.whl
Processing /home/mujahid7292/.cache/pip/wheels/93/2a/eb/e58dbcbc963549ee4f065ff80a59f274cc7210b6eab962acdc/termcolor-1.1.0-py3-none-any.whl
Collecting grpcio>=1.8.6
  Using cached grpcio-1.27.2-cp36-cp36m-manylinux2010_x86_64.whl (2.7 MB)
Collecting tensorboard<1.9.0,>=1.8.0
  Using cached tensorboard-1.8.0-py3-none-any.whl (3.1 MB)
Collecting gast>=0.2.0
  Using cached gast-0.3.3-py2.py3-none-any.whl (9.7 kB)
Collec

In [3]:
%%bash
pip3 list

Package            Version  
------------------ ---------
absl-py            0.9.0    
astor              0.8.1    
attrs              19.3.0   
backcall           0.1.0    
bleach             1.5.0    
decorator          4.4.2    
defusedxml         0.6.0    
entrypoints        0.3      
gast               0.3.3    
grpcio             1.27.2   
html5lib           0.9999999
importlib-metadata 1.5.0    
ipykernel          5.2.0    
ipython            7.13.0   
ipython-genutils   0.2.0    
ipywidgets         7.5.1    
jedi               0.16.0   
Jinja2             2.11.1   
jsonschema         3.2.0    
jupyter            1.0.0    
jupyter-client     6.1.0    
jupyter-console    6.1.0    
jupyter-core       4.6.3    
Markdown           3.2.1    
MarkupSafe         1.1.1    
mistune            0.8.4    
nbconvert          5.6.1    
nbformat           5.0.4    
notebook           6.0.3    
numpy              1.18.2   
pandas             1.0.3    
pandocfilters      1.4.2    
parso         

In [21]:
%%bash
which python

/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/05.Art_And_Science_Of_Machine_Learning/WEEK_1/02.Improve_Model_Accuracy_By_Hyperparameter_Tuning_With_AI_Platform/Practice/Venv/bin/python


In [23]:
%%bash
python --version

Python 3.6.9


<a>https://github.com/GoogleCloudPlatform/training-data-analyst/blob/master/courses/machine_learning/deepdive/05_artandscience/b_hyperparam.ipynb</a>

# Hyperparameter tuning with Cloud AI Platform

**Learning Objectives:**
  *  Improve the accuracy of the model by hyperparameter tuning

In [4]:
import os
PROJECT = 'ml-practice-260405'
BUCKET = 'buck-ml-practice-260405'
REGION = 'us-central1'

In [5]:
# For bash
os.environ['PROJECT'] = PROJECT
os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = REGION
os.environ['TFVERSION'] = '1.8' # Tensorflow version

In [6]:
%%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

Updated property [core/project].
Updated property [compute/region].


# Create command line programme

To submit a training job to `Cloud AI Platform`, we need a command-line programme, packaged as a Python module, that supports distributed training. Let's convert our housing example to fit that paradigm, using the Estimator API.

In [7]:
%%bash
rm -rf house_prediction_module
mkdir house_prediction_module
mkdir house_prediction_module/trainer
touch house_prediction_module/trainer/__init__.py

In [13]:
%%writefile house_prediction_module/trainer/task.py
import argparse
import os
import json
import shutil

from . import model

if __name__ == '__main__' and "get_ipython" not in dir():
    # Create a parser object
    parser = argparse.ArgumentParser()
    
    # Add training argument to the parser object
    parser.add_argument(
        '--learning_rate',
        type=float,
        default=0.01
    )
    parser.add_argument(
        '--batch_size',
        type=int,
        default=30
    )
    parser.add_argument(
        '--output_dir',
        help="GCS location to write checkpoints and export models",
        required=True
    )
    parser.add_argument(
        '--job-dir',
        help="This model ignores this field, but gcloud requires it",
        default='junk'
    )
    
    args = parser.parse_args()
    arguments = args.__dict__
    
    # Unused args provided by service
    arguments.pop('job_dir',None)
    arguments.pop('job-dir', None)
    
    # Append trial_id to path if we are doing hyperparameter tuning
    # This code can be removed if you are not doing any hyperparameter tuning
    arguments['output_dir'] = os.path.join(
        arguments['output_dir'],
        json.loads(
            os.environ.get('TF_CONFIG', '{}')
        ).get('task', {}).get('trial', '')
    )
    
    # Now run the training
    shutil.rmtree(arguments['output_dir'], ignore_errors=True) # Start fresh each time
    
    # Pass the command line arguments to our model's train_and_evaluate function
    model.train_and_evaluate(arguments)

Overwriting house_prediction_module/trainer/task.py
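
The argument handling in `task.py` can be tried out without TensorFlow. The sketch below is illustrative only: `build_output_dir` is a hypothetical helper, the argument list is made up, and it assumes that the service exposes the trial number under the `task` → `trial` key of the `TF_CONFIG` environment variable.

```python
import argparse
import json
import os

def build_output_dir(base_dir, tf_config_json):
    """Append the trial id found in a TF_CONFIG-style JSON string to base_dir."""
    task = json.loads(tf_config_json or "{}").get("task", {})
    # With no trial id, os.path.join(base_dir, "") only adds a trailing separator.
    return os.path.join(base_dir, str(task.get("trial", "")))

parser = argparse.ArgumentParser()
parser.add_argument("--output_dir", required=True)
parser.add_argument("--job-dir", default="junk")

# A made-up command line standing in for what gcloud would pass.
args = vars(parser.parse_args(["--output_dir", "out", "--job-dir", "ignored"]))

# argparse turns the dash in --job-dir into an underscore, so the key is job_dir.
args.pop("job_dir", None)

print(sorted(args))
print(build_output_dir(args["output_dir"], '{"task": {"trial": "3"}}'))
print(build_output_dir(args["output_dir"], "{}"))
```

Giving each trial its own subdirectory keeps the checkpoints of one trial from being picked up by the next.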


In [14]:
import tensorflow as tf
print(tf.__version__)

1.8.0


In [9]:
%%writefile house_prediction_module/trainer/model.py

import numpy as np
import pandas as pd
import tensorflow as tf

tf.logging.set_verbosity(tf.logging.INFO)

# Read the dataset from GCS
df = pd.read_csv(
    "https://storage.googleapis.com/ml_universities/california_housing_train.csv",
    sep=','
)

# Create new feature
df['num_rooms'] = df['total_rooms'] / df['households']

# Now split the whole data into training and evaluation
np.random.seed(seed=1) # Makes dataset split reproducible
msk = np.random.rand(len(df)) < 0.8 # Uniform draws in [0, 1), so about 80% of rows are True

# Now create training dataframe by keeping 80% of the total data
df_train = df[msk]
# Now create evaluation dataframe by keeping rest 20% of the total data
df_eval = df[~msk]

# Constant for our training
SCALE = 100000 # 100,000 (1 lakh); brings house values into a small numeric range

# Now create our training input function
def train_input_fn(df_train, batch_size):
    """Return an input function that feeds shuffled training batches indefinitely."""
    return tf.estimator.inputs.pandas_input_fn(
        x=df_train[['num_rooms']],
        y=df_train['median_house_value'] / SCALE, # Note the scaling
        num_epochs=None, # Repeat forever; max_steps in TrainSpec stops training
        batch_size=batch_size, # Note the batch size
        shuffle=True
    )

# Now create our evaluation input function
def eval_input_fn(df_eval, batch_size):
    """Return an input function that makes one ordered pass over the evaluation data."""
    return tf.estimator.inputs.pandas_input_fn(
        x=df_eval[['num_rooms']],
        y=df_eval['median_house_value'] / SCALE, # Note the scaling
        num_epochs=1,
        batch_size=batch_size,
        shuffle=False
    )

# Define feature's column
features = [tf.feature_column.numeric_column('num_rooms')]

# Now create our train_and_evaluate() function
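def train_and_evaluate(args):
    """Train and evaluate a linear regressor using the hyperparameters in args.

    args is a dict with the keys 'batch_size', 'learning_rate' and 'output_dir'.
    """
    # Compute appropriate number of steps
    # (steps per epoch) / learning_rate: with learning rate = 0.01 this is 100 epochs
    num_steps = int((len(df_train) / args['batch_size']) / args['learning_rate'])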
    
    # Create custom optimizer
    myopt = tf.train.FtrlOptimizer(learning_rate=args['learning_rate']) # Note the learning rate
    
    # Create Linear Regressor Estimator Object
    estimator = tf.estimator.LinearRegressor(
        feature_columns=features,
        model_dir=args['output_dir'],
        optimizer=myopt
    )
    
    # Add RMSE evaluation metric
    def rmse(labels, predictions):
        """Return RMSE in the original units of median_house_value."""
        pred_values = tf.cast(predictions['predictions'], tf.float64)
        return {'rmse' : tf.metrics.root_mean_squared_error(labels*SCALE, pred_values*SCALE)}
    
    # Attach custom evaluation metric to the estimator object
    estimator = tf.contrib.estimator.add_metrics(estimator, rmse)
    
    # Now create our training specification
    train_spec = tf.estimator.TrainSpec(
        input_fn=train_input_fn(df_train,args['batch_size']),
        max_steps=num_steps
    )
    
    # Now create our evaluation specification
    eval_spec = tf.estimator.EvalSpec(
        input_fn=eval_input_fn(df_eval, len(df_eval)),
        steps=None
    )
    
    # Now finish our function
    tf.estimator.train_and_evaluate(estimator,train_spec,eval_spec)

Writing house_prediction_module/trainer/model.py
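
The data split and the step-count rule used in `model.py` can be checked on a toy DataFrame before any training runs. This is a minimal sketch under stated assumptions: `split_frame` and `steps_for` are hypothetical helpers, the data is synthetic, the mask uses uniform random draws, and the step count is truncated to an integer.

```python
import numpy as np
import pandas as pd

def split_frame(df, train_fraction=0.8, seed=1):
    """Split df into train/eval parts with a reproducible uniform random mask."""
    rng = np.random.RandomState(seed)
    mask = rng.rand(len(df)) < train_fraction  # uniform draws in [0, 1)
    return df[mask], df[~mask]

def steps_for(num_rows, batch_size, learning_rate):
    """Steps per epoch divided by the learning rate, truncated to an integer."""
    return int((num_rows / batch_size) / learning_rate)

# Synthetic stand-in for the housing data.
toy = pd.DataFrame({"num_rooms": np.arange(1000, dtype=float)})
train, evaluation = split_frame(toy)

print(len(train), len(evaluation))
print(steps_for(num_rows=len(train), batch_size=30, learning_rate=0.01))
```

Because the step count is divided by the learning rate, a trial with a smaller learning rate is given proportionally more steps to converge.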


In [11]:
%%bash
# First delete the `house_trained` directory
rm -rf house_trained
# Export the python path
export PYTHONPATH=${PYTHONPATH}:${PWD}/house_prediction_module
# Now run the gcloud local training
gcloud ai-platform local train \
    --module-name=trainer.task \
    --job-dir=house_trained \
    --package-path=$(pwd)/trainer \
    -- \
    --batch_size=30 \
    --learning_rate=0.02 \
    --output_dir=house_trained

2020-03-23 15:55:08.970469: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-03-23 15:55:08.970475: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2.1.0
, using the default primary node name, aka "chief" for cluster settings
<subprocess.Popen object at 0x7ff300bf3c50>


CalledProcessError: Command 'b'rm -rf house_trained\nexport PYTHONPATH=${PYTHONPATH}:${PWD}/house_prediction_module\ngcloud ai-platform local train \\\n    --module-name=trainer.task \\\n    --job-dir=house_trained \\\n    --package-path=$(pwd)/trainer \\\n    -- \\\n    --batch_size=30 \\\n    --learning_rate=0.02 \\\n    --output_dir=house_trained\n'' returned non-zero exit status 1.

# Create hyperparam.yaml

In [17]:
%%writefile hyperparam.yaml
trainingInput:
  hyperparameters:
    goal: MINIMIZE
    maxTrials: 5
    maxParallelTrials: 1
    hyperparameterMetricTag: rmse
    params:
    - parameterName: batch_size
      type: INTEGER
      minValue: 8
      maxValue: 64
      scaleType: UNIT_LINEAR_SCALE
    - parameterName: learning_rate
      type: DOUBLE
      minValue: 0.01
      maxValue: 0.1
      scaleType: UNIT_LOG_SCALE

Overwriting hyperparam.yaml
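
To get a feel for the two `scaleType` values above, it helps to see how a position in the unit interval maps back to a parameter value. The sketch below is illustrative only and is not the service's search algorithm; `from_unit_linear` and `from_unit_log` are hypothetical helpers that assume a linear scale spaces values evenly and a log scale spaces them evenly in log space.

```python
import math

def from_unit_linear(u, min_value, max_value):
    """Map u in [0, 1] to [min_value, max_value] with even spacing."""
    return min_value + u * (max_value - min_value)

def from_unit_log(u, min_value, max_value):
    """Map u in [0, 1] to [min_value, max_value] with even spacing in log space."""
    log_min, log_max = math.log(min_value), math.log(max_value)
    return math.exp(log_min + u * (log_max - log_min))

for u in (0.0, 0.5, 1.0):
    batch_size = round(from_unit_linear(u, 8, 64))   # range of batch_size above
    learning_rate = from_unit_log(u, 0.01, 0.1)      # range of learning_rate above
    print(u, batch_size, round(learning_rate, 4))
```

Under this mapping the midpoint of the learning-rate range is about 0.0316 rather than 0.055, so a log scale spends more of the search on small learning rates.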


Create GCS bucket if it does not exist

In [18]:
%%bash
if ! gsutil ls | grep -q gs://${BUCKET}/; then
    gsutil mb -l ${REGION} gs://${BUCKET}
fi

Creating gs://buck-ml-practice-260405/...


In [19]:
%%bash
OUTDIR=gs://${BUCKET}/house_trained   # CHANGE bucket name appropriately
gsutil rm -rf $OUTDIR
export PYTHONPATH=${PYTHONPATH}:${PWD}/house_prediction_module
gcloud ai-platform jobs submit training house_$(date -u +%y%m%d_%H%M%S) \
   --config=hyperparam.yaml \
   --module-name=trainer.task \
   --package-path=$(pwd)/house_prediction_module/trainer \
   --job-dir=$OUTDIR \
   --runtime-version=$TFVERSION \
   --\
   --output_dir=$OUTDIR \

CommandException: 1 files/objects could not be removed.
ERROR: (gcloud.ai-platform.jobs.submit.training) INVALID_ARGUMENT: Field: runtime_version Error: The specified runtime version '1.8' with the Python version '' is not supported or is deprecated.  Please specify a different runtime version. See https://cloud.google.com/ml-engine/docs/runtime-version-list for a list of supported versions
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: The specified runtime version '1.8' with the Python version '' is
      not supported or is deprecated.  Please specify a different runtime version.
      See https://cloud.google.com/ml-engine/docs/runtime-version-list for a list
      of supported versions
    field: runtime_version


CalledProcessError: Command 'b'OUTDIR=gs://${BUCKET}/house_trained   # CHANGE bucket name appropriately\ngsutil rm -rf $OUTDIR\nexport PYTHONPATH=${PYTHONPATH}:${PWD}/house_prediction_module\ngcloud ai-platform jobs submit training house_$(date -u +%y%m%d_%H%M%S) \\\n   --config=hyperparam.yaml \\\n   --module-name=trainer.task \\\n   --package-path=$(pwd)/house_prediction_module/trainer \\\n   --job-dir=$OUTDIR \\\n   --runtime-version=$TFVERSION \\\n   --\\\n   --output_dir=$OUTDIR \\\n'' returned non-zero exit status 1.