## List Of Imports

In [1]:
import os
# Load the TensorBoard notebook extension
%load_ext tensorboard

In [2]:
PROJECT = 'ml-practice-260405'
BUCKET = 'bucket_ml-practice-260405'
REGION = 'us-central1'

Model settings used by the Python cells:

In [3]:
MODEL_NAME = 'taxifare_practice'
MODEL_VERSION = 'v1'
TRAINING_DIR = 'taxi_trained_practice'

Export the same settings as environment variables so that the `%%bash` cells can read them:

In [24]:
os.environ['PROJECT'] = PROJECT
os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = REGION
os.environ['MODEL_NAME'] = MODEL_NAME
os.environ['MODEL_VERSION'] = MODEL_VERSION
os.environ['TRAINING_DIR'] = TRAINING_DIR
os.environ['TFVERSION'] = '1.14'  # same value as the --runtime-version flags used below
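
The `%%bash` cells see these values because a child process inherits a copy of the parent's environment. The following is a minimal sketch of that mechanism outside a notebook; the helper name `read_in_child` is hypothetical, and it launches a Python child rather than bash only so that it runs anywhere.

```python
import os
import subprocess
import sys


def read_in_child(name, value):
    """Set an environment variable here and read it back from a child process."""
    os.environ[name] = value
    # The child inherits a copy of os.environ, which is how a %%bash cell
    # ends up seeing $PROJECT, $BUCKET and the other exported settings.
    result = subprocess.run(
        [sys.executable, "-c", f"import os; print(os.environ[{name!r}])"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(read_in_child("MODEL_NAME", "taxifare_practice"))
```

Note that the copy is one-way: a variable exported inside a `%%bash` cell does not flow back into the Python kernel.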

In [6]:
%%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

Updated property [core/project].
Updated property [compute/region].


### Create the bucket that stores the model and training data for Cloud AI Platform

In [7]:
%%bash
gsutil mb -p ${PROJECT} gs://${BUCKET}

Creating gs://bucket_ml-practice-260405/...


### Enable the Cloud Machine Learning Engine API

The next command calls the Cloud AI Platform API, so the API must be enabled first. Open the [API library](https://console.cloud.google.com/project/_/apis/library) in the Cloud Console, search the API list for Cloud Machine Learning, and enable the API before executing the next cell.

### Allow the Cloud AI Platform service account to read from and write to the bucket containing training data

In [8]:
%%bash
PROJECT_ID=$PROJECT
AUTH_TOKEN=$(gcloud auth print-access-token)
SVC_ACCOUNT=$(curl -X GET -H "Content-Type: application/json" \
    -H "Authorization: Bearer $AUTH_TOKEN" \
    https://ml.googleapis.com/v1/projects/${PROJECT_ID}:getConfig \
    | python -c "import json; import sys; response = json.load(sys.stdin); \
    print(response['serviceAccount'])")

echo "Authorizing the Cloud AI Platform account $SVC_ACCOUNT to access files in $BUCKET"
gsutil -m defacl ch -u $SVC_ACCOUNT:R gs://$BUCKET
gsutil -m acl ch -u $SVC_ACCOUNT:R -r gs://$BUCKET  # error message (if bucket is empty) can be ignored
gsutil -m acl ch -u $SVC_ACCOUNT:W gs://$BUCKET

Authorizing the Cloud AI Platform account service-229327834475@cloud-ml.google.com.iam.gserviceaccount.com to access files in bucket_ml-practice-260405


Updated default ACL on gs://bucket_ml-practice-260405/
Encountered a problem: CommandException: No URLs matched: gs://bucket_ml-practice-260405/*
Updated ACL on gs://bucket_ml-practice-260405/
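
The inline `python -c` snippet in the cell above pulls the `serviceAccount` field out of the JSON that `getConfig` returns. As a rough standalone sketch of the same step, with a clearer error when the field is missing (the function name and the sample response body are hypothetical):

```python
import json


def extract_service_account(config_json):
    """Return the 'serviceAccount' field from a getConfig response body."""
    response = json.loads(config_json)
    try:
        return response["serviceAccount"]
    except KeyError:
        # Typically means the API call failed and returned an error object.
        raise ValueError("response has no 'serviceAccount' field") from None


# Hypothetical response body; a real one may carry additional fields.
sample = '{"serviceAccount": "service-123@example.iam.gserviceaccount.com"}'
print(extract_service_account(sample))
```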


## Packaging up the code

Put the code into a standard Python package structure. <a href="taxifare_practice/trainer/model.py">model.py</a> and <a href="taxifare_practice/trainer/task.py">task.py</a> contain the TensorFlow code from earlier (explore the <a href="taxifare_practice/trainer/">directory structure</a>).

In [9]:
%%bash
find ${MODEL_NAME}

taxifare_practice
taxifare_practice/trainer
taxifare_practice/trainer/model.py
taxifare_practice/trainer/task.py
taxifare_practice/trainer/__init__.py
taxifare_practice/trainer/__init__.pyc
taxifare_practice/trainer/__pycache__
taxifare_practice/trainer/__pycache__/model.cpython-37.pyc
taxifare_practice/trainer/__pycache__/task.cpython-37.pyc
taxifare_practice/trainer/__pycache__/__init__.cpython-37.pyc
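
In this layout, `task.py` is the entry point that receives the command-line flags and hands them to `model.py`. The sketch below is not the notebook's actual `task.py`; it only illustrates, under that assumption, how the flags passed by the later cells (`--train_data_paths`, `--eval_data_paths`, `--output_dir`, `--train_steps`, `--job-dir`) could be parsed with `argparse`.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the cells in this notebook pass to trainer.task."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--train_data_paths", required=True)
    parser.add_argument("--eval_data_paths", required=True)
    parser.add_argument("--output_dir", required=True)
    parser.add_argument("--train_steps", type=int, default=1000)
    # gcloud forwards --job-dir to the module, so accept it even if unused.
    # argparse exposes it as args.job_dir.
    parser.add_argument("--job-dir", default=None)
    return parser.parse_args(argv)


args = parse_args([
    "--train_data_paths=taxi-train*",
    "--eval_data_paths=taxi-valid.csv",
    "--output_dir=taxi_trained_practice",
    "--job-dir=./tmp",
])
print(args.train_steps, args.job_dir)
```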


In [None]:
%%bash
cat ${MODEL_NAME}/trainer/model.py

## Find absolute paths to your data

In [10]:
%%bash
echo "Working Directory: ${PWD}"

Working Directory: /media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice


### View Training & Validation Datasets Head

In [11]:
%%bash
echo "Head Of taxi-train.csv"
head -1 $PWD/taxi-train.csv
echo "Head Of taxi-valid.csv"
head -1 $PWD/taxi-valid.csv

Head Of taxi-train.csv
12.0,-73.987625,40.750617,-73.971163,40.78518,1,0
Head Of taxi-valid.csv
6.0,-74.013667,40.713935,-74.007627,40.702992,2,0
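
The CSV files have no header row, so column names have to be supplied when reading them. The sketch below assumes a column order: the five feature names are taken from `test.json` later in this notebook, while `fare_amount` (first column) and `key` (last column) are guesses that should be checked against `model.py`.

```python
import csv
import io

# Assumed column order; 'fare_amount' and 'key' are not confirmed by the notebook.
CSV_COLUMNS = [
    "fare_amount", "pickuplon", "pickuplat",
    "dropofflon", "dropofflat", "passengers", "key",
]


def parse_rows(text):
    """Turn header-less CSV text into a list of dicts keyed by CSV_COLUMNS."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        named = dict(zip(CSV_COLUMNS, record))
        # Keep the key as a string; everything else is numeric.
        rows.append({k: (v if k == "key" else float(v)) for k, v in named.items()})
    return rows


first = parse_rows("12.0,-73.987625,40.750617,-73.971163,40.78518,1,0\n")[0]
print(first["fare_amount"], first["passengers"])
```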


## Running the Python module from the command-line

#### Clean model training dir/output dir

In [12]:
%%bash
# This is so that the trained model is started fresh each time. However, 
# this needs to be done before tensorboard is started
rm -rf $PWD/${TRAINING_DIR}

### Run TensorBoard

In [None]:
%tensorboard --logdir ./taxi_trained_practice

### Run Training & Monitor Using TensorBoard

In [13]:
%%bash
# Add the package directory to PYTHONPATH so that Python can find the
# trainer.task module, which drives model.py
export PYTHONPATH=${PYTHONPATH}:${PWD}/${MODEL_NAME}
# Run training with Python 3.
# trainer/task.py must use an import style that works under Python 3.
python3 -m trainer.task \
    --train_data_paths="${PWD}/taxi-train*" \
    --eval_data_paths="${PWD}/taxi-valid.csv" \
    --output_dir=${PWD}/${TRAINING_DIR} \
    --train_steps=1000 \
    --job-dir=./tmp

INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi_trained_practice', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7fb5ffbc2e50>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster

In [14]:
%%bash
ls $PWD/${TRAINING_DIR}/export/exporter/

1575952670


In [15]:
%%writefile ./test.json
{"pickuplon": -73.885262,"pickuplat": 40.773008,"dropofflon": -73.987232,"dropofflat": 40.732403,"passengers": 2}

Overwriting ./test.json


## Local predict using AI Platform

In [16]:
%%bash
# This model dir is the model exported after training and is used for prediction
#
model_dir=$(ls ${PWD}/${TRAINING_DIR}/export/exporter | tail -1)
# predict using the trained model
gcloud ai-platform local predict  \
    --model-dir=${PWD}/${TRAINING_DIR}/export/exporter/${model_dir} \
    --json-instances=./test.json

If the signature defined in the model is not serving_default then you must specify it via --signature-name flag, otherwise the command may fail.
ERROR: (gcloud.ai-platform.local.predict) You must be running an installed Cloud SDK to perform local prediction.


CalledProcessError: Command 'b'# This model dir is the model exported after training and is used for prediction\n#\nmodel_dir=$(ls ${PWD}/${TRAINING_DIR}/export/exporter | tail -1)\n# predict using the trained model\ngcloud ai-platform local predict  \\\n    --model-dir=${PWD}/${TRAINING_DIR}/export/exporter/${model_dir} \\\n    --json-instances=./test.json\n'' returned non-zero exit status 1.
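
The cell above picks the export to serve with `ls ... | tail -1`, which relies on the timestamp directory names sorting in order. A small sketch of the same selection in Python, comparing the names as integers rather than as strings (the helper name `latest_export` and the second timestamp are made up for illustration):

```python
import os
import tempfile


def latest_export(exporter_dir):
    """Return the path of the numerically largest (most recent) export directory."""
    versions = [name for name in os.listdir(exporter_dir) if name.isdigit()]
    if not versions:
        raise FileNotFoundError(f"no exported model under {exporter_dir}")
    return os.path.join(exporter_dir, max(versions, key=int))


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for stamp in ("1575952670", "1575959999"):
        os.mkdir(os.path.join(root, stamp))
    print(os.path.basename(latest_export(root)))
```

For timestamps of equal length the two approaches agree; the integer comparison only matters if the names ever differ in length.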

## Running locally using gcloud

#### Clean model training dir/output dir

In [17]:
%%bash
# This is so that the trained model is started fresh each time. However, this needs to be done before 
# tensorboard is started
rm -rf $PWD/${TRAINING_DIR}

In [18]:
%%bash
# Use gcloud ai-platform to train the model on the local file system
gcloud ai-platform local train \
   --module-name=trainer.task \
   --package-path=${PWD}/${MODEL_NAME}/trainer \
   -- \
   --train_data_paths=${PWD}/taxi-train.csv \
   --eval_data_paths=${PWD}/taxi-valid.csv  \
   --train_steps=1000 \
   --output_dir=${PWD}/${TRAINING_DIR} 

INFO:tensorflow:TF_CONFIG environment variable: {'job': {'job_name': 'trainer.task', 'args': ['--train_data_paths=/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi-train.csv', '--eval_data_paths=/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi-valid.csv', '--train_steps=1000', '--output_dir=/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi_trained_practice']}, 'task': {}, 'cluster': {}, 'environment': 'cloud'}
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_model_dir': '/media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_

Use TensorBoard to examine the results. When I ran it, the `average_loss` (mean squared error) on the evaluation dataset was 187, meaning that the RMSE was around 13.7. Because of random seeds, your results will differ.
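
The RMSE is simply the square root of the mean squared error, so the conversion from the reported `average_loss` can be checked in one line. A minimal sketch (the function name is illustrative):

```python
import math


def rmse_from_mse(mse):
    """Convert a mean squared error into a root mean squared error."""
    if mse < 0:
        raise ValueError("mean squared error cannot be negative")
    return math.sqrt(mse)


print(round(rmse_from_mse(187), 2))  # prints 13.67
```

The RMSE is in the same units as the label, which makes it easier to interpret than the raw loss.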

In [19]:
%%bash
ls $PWD/${TRAINING_DIR}

checkpoint
eval
events.out.tfevents.1575952979.mujahid7292-HP-ENVY-Notebook
export
graph.pbtxt
model.ckpt-0.data-00000-of-00001
model.ckpt-0.index
model.ckpt-0.meta
model.ckpt-1000.data-00000-of-00001
model.ckpt-1000.index
model.ckpt-1000.meta


## Submit training job using gcloud

First copy the training data to the cloud.  Then, launch a training job.

After you submit the job, go to the cloud console (http://console.cloud.google.com) and select <b>AI Platform | Jobs</b> to monitor progress.  

<b>Note:</b> Don't be concerned if the notebook stalls (with a blue progress bar) or returns with an error about being unable to refresh auth tokens. This is a long-lived Cloud job and work is going on in the cloud.  Use the Cloud Console link (above) to monitor the job.

#### Clear Cloud Storage bucket

In [20]:
%%bash
echo "Clearing Bucket: " $BUCKET
# The command below returns an error if the bucket is already empty
gsutil -m rm -rf gs://${BUCKET}/${MODEL_NAME}/smallinput/

Clearing Bucket:  bucket_ml-practice-260405


CommandException: 1 files/objects could not be removed.


CalledProcessError: Command 'b'echo "Clearing Bucket: " $BUCKET\n# The command below returns an error if the bucket is already empty\ngsutil -m rm -rf gs://${BUCKET}/${MODEL_NAME}/smallinput/\n'' returned non-zero exit status 1.

### Copy the training data to cloud

In [21]:
%%bash
#            FROM         TO
gsutil -m cp ${PWD}/*.csv gs://${BUCKET}/${MODEL_NAME}/smallinput/

Copying file:///media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi-test.csv [Content-Type=text/csv]...
Copying file:///media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi-valid.csv [Content-Type=text/csv]...
Copying file:///media/mujahid7292/Data/GoogleDriveSandCorp2014/ML_With_TensorFlow_On_GCP/03.Intro_To_TensorFlow/Week_3/01.Qwiklabs_Scaling_TensorFlow_With_Cloud_AI_Platform/Practice/taxi-train.csv [Content-Type=text/csv]...

### Submit training job

In [25]:
%%bash
OUTDIR=gs://${BUCKET}/${MODEL_NAME}/smallinput/${TRAINING_DIR}
JOBNAME=${MODEL_NAME}_$(date -u +%y%m%d_%H%M%S)
echo "Region Of Computation: " $REGION
echo "Model Output Directory: " $OUTDIR
echo "Job name: " $JOBNAME
# Clear the Cloud Storage Bucket used for the training job
echo "Clearing Output Directory"
gsutil -m rm -rf $OUTDIR
echo "Submitting The Job"
gcloud ai-platform jobs submit training $JOBNAME \
    --region=$REGION \
    --module-name=trainer.task \
    --package-path=${PWD}/${MODEL_NAME}/trainer \
    --job-dir=$OUTDIR \
    --staging-bucket=gs://$BUCKET \
    --scale-tier=BASIC \
    --runtime-version=1.14 \
    -- \
    --train_data_paths="gs://${BUCKET}/${MODEL_NAME}/smallinput/taxi-train*" \
    --eval_data_paths="gs://${BUCKET}/${MODEL_NAME}/smallinput/taxi-valid*" \
    --output_dir=$OUTDIR \
    --train_steps=10000

Region Of Computation:  us-central1
Model Output Directory:  gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice
Job name:  taxifare_practice_191210_044603
Clearing Output Directory
Submitting The Job
jobId: taxifare_practice_191210_044603
state: QUEUED


CommandException: 1 files/objects could not be removed.
Job [taxifare_practice_191210_044603] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ai-platform jobs describe taxifare_practice_191210_044603

or continue streaming the logs with the command

  $ gcloud ai-platform jobs stream-logs taxifare_practice_191210_044603
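
The job name in the cell above is made unique by appending a UTC timestamp (`date -u +%y%m%d_%H%M%S`) to the model name. For anyone submitting jobs from Python instead of bash, an equivalent sketch (the helper name `make_job_name` is hypothetical):

```python
from datetime import datetime, timezone


def make_job_name(model_name, now=None):
    """Build '<model>_<yymmdd>_<HHMMSS>' from a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%y%m%d_%H%M%S")
    return f"{model_name}_{stamp}"


# Reproduces the job name shown in the output above.
submitted_at = datetime(2019, 12, 10, 4, 46, 3, tzinfo=timezone.utc)
print(make_job_name("taxifare_practice", submitted_at))  # taxifare_practice_191210_044603
```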


In [27]:
%%bash
gcloud ai-platform jobs describe taxifare_practice_191210_044603

createTime: '2019-12-10T04:46:08Z'
etag: martRm19yH4=
jobId: taxifare_practice_191210_044603
startTime: '2019-12-10T04:47:03Z'
state: RUNNING
trainingInput:
  args:
  - --train_data_paths=gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi-train*
  - --eval_data_paths=gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi-valid*
  - --output_dir=gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice
  - --train_steps=10000
  jobDir: gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice
  packageUris:
  - gs://bucket_ml-practice-260405/taxifare_practice_191210_044603/39fb9851741bfca3038f628736d8b6c596ab906a29fc18ce9fa73ce287573ee8/trainer-0.0.0.tar.gz
  pythonModule: trainer.task
  region: us-central1
  runtimeVersion: '1.14'
trainingOutput:
  consumedMLUnits: 0.02



View job in the Cloud Console at:
https://console.cloud.google.com/mlengine/jobs/taxifare_practice_191210_044603?project=ml-practice-260405

View logs at:
https://console.cloud.google.com/logs?resource=ml.googleapis.com%2Fjob_id%2Ftaxifare_practice_191210_044603&project=ml-practice-260405
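
The `describe` output is plain YAML-style text, so the `state:` line is enough to tell whether a job is still running. The sketch below extracts it with simple string handling; the set of terminal states is an assumption based on the AI Platform job states and should be checked against the current documentation.

```python
# Assumed terminal states; QUEUED and RUNNING (seen above) are non-terminal.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def job_state(describe_output):
    """Return the value of the top-level 'state:' line, or None if absent."""
    for line in describe_output.splitlines():
        if line.startswith("state:"):
            return line.split(":", 1)[1].strip()
    return None


def is_finished(describe_output):
    """True once the job has reached a state it will not leave."""
    return job_state(describe_output) in TERMINAL_STATES


excerpt = "jobId: taxifare_practice_191210_044603\nstate: RUNNING\n"
print(job_state(excerpt), is_finished(excerpt))
```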


## Deploy model

Find the name of the subdirectory where the model is stored and use it to deploy the model. Deploying the model takes up to <b>5 minutes</b>.

In [28]:
%%bash
gsutil ls gs://${BUCKET}/${MODEL_NAME}/smallinput/${TRAINING_DIR}/export/exporter

gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice/export/exporter/
gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice/export/exporter/1575953662/


#### Deploy model: step 1 - remove version info
Before an existing cloud model can be removed, all of its versions must be deleted. If the model does not exist, this command returns an error, which can be ignored.

In [32]:
%%bash
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/${MODEL_NAME}/smallinput/${TRAINING_DIR}/export/exporter | tail -1)
echo "Model Location: " $MODEL_LOCATION
echo ""
echo "Deleting "$MODEL_NAME" model's version "$MODEL_VERSION"'s info'"
gcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}

Model Location:  gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice/export/exporter/1575953662/

Deleting taxifare_practice model's version v1's info'


This will delete version [v1]...

Do you want to continue (Y/n)?  
ERROR: (gcloud.ai-platform.versions.delete) NOT_FOUND: Field: name Error: The model resource: "taxifare_practice" was not found. Please create the Cloud ML model resource first by using 'gcloud ml-engine models create taxifare_practice'.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: "The model resource: \"taxifare_practice\" was not found. Please\
      \ create the Cloud ML model resource first by using 'gcloud ml-engine models\
      \ create taxifare_practice'."
    field: name


CalledProcessError: Command 'b'MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/${MODEL_NAME}/smallinput/${TRAINING_DIR}/export/exporter | tail -1)\necho "Model Location: " $MODEL_LOCATION\necho ""\necho "Deleting "$MODEL_NAME" model\'s version "$MODEL_VERSION"\'s info\'"\ngcloud ai-platform versions delete ${MODEL_VERSION} --model ${MODEL_NAME}\n'' returned non-zero exit status 1.

#### Deploy model: step 2 - remove existing model
Now that the version info has been removed, the model itself can be deleted. If no model with the given name is deployed, this command returns an error, which can be ignored.

In [33]:
%%bash
gcloud ai-platform models delete ${MODEL_NAME}

This will delete model [taxifare_practice]...

Do you want to continue (Y/n)?  
ERROR: (gcloud.ai-platform.models.delete) NOT_FOUND: Field: name Error: The model resource: "taxifare_practice" was not found. Please create the Cloud ML model resource first by using 'gcloud ml-engine models create taxifare_practice'.
- '@type': type.googleapis.com/google.rpc.BadRequest
  fieldViolations:
  - description: "The model resource: \"taxifare_practice\" was not found. Please\
      \ create the Cloud ML model resource first by using 'gcloud ml-engine models\
      \ create taxifare_practice'."
    field: name


CalledProcessError: Command 'b'gcloud ai-platform models delete ${MODEL_NAME}\n'' returned non-zero exit status 1.
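
Steps 1 and 2 are cleanup commands whose failure is expected on a first run, yet a `%%bash` cell turns any non-zero exit status into a `CalledProcessError`. One way to make "failure is fine here" explicit is to run such a command without raising, as in this sketch. The helper name is hypothetical, and a small Python child process stands in for the `gcloud` delete command so the example runs anywhere.

```python
import subprocess
import sys


def run_allowing_failure(command):
    """Run a command and return its exit code instead of raising on failure."""
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    if completed.returncode != 0:
        print(f"ignored failure (exit {completed.returncode}): {completed.stderr.strip()}")
    return completed.returncode


# Stand-in for a cleanup command that fails because there is nothing to delete.
failing = [sys.executable, "-c", "import sys; sys.exit('nothing to delete')"]
print(run_allowing_failure(failing))
```

Only cleanup steps should be wrapped this way; the create and deploy steps should still fail loudly.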

#### Deploy model: step 3 - deploy new model

In [34]:
%%bash
gcloud ai-platform models create ${MODEL_NAME} --regions $REGION

Created ml engine model [projects/ml-practice-260405/models/taxifare_practice].


#### Deploy model: step 4 - add version info to the new model

In [38]:
%%bash
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/${MODEL_NAME}/smallinput/${TRAINING_DIR}/export/exporter | tail -1)

echo "Model Location: "$MODEL_LOCATION
echo ""
echo "Now We will add version info to our model "$MODEL_NAME

gcloud ai-platform versions create ${MODEL_VERSION} \
--model ${MODEL_NAME} \
--origin ${MODEL_LOCATION} \
--runtime-version 1.14

Model Location: gs://bucket_ml-practice-260405/taxifare_practice/smallinput/taxi_trained_practice/export/exporter/1575953662/

Now We will add version info to our model taxifare_practice


Creating version (this might take a few minutes)......
...........................................................................................................................................................................................................................................................................................................................................................................................................done.


## Prediction

In [39]:
%%bash
gcloud ai-platform predict \
--model ${MODEL_NAME} \
--version ${MODEL_VERSION} \
--json-instances=./test.json

PREDICTIONS
[7.865230083465576]
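
The `--json-instances` file holds one JSON object per line, whereas the REST API used in the next cell expects all instances wrapped in a single `{'instances': [...]}` body. A minimal sketch of that conversion (the function name is illustrative):

```python
import json


def instances_to_request(json_lines):
    """Wrap newline-delimited JSON instances in a predict request body."""
    instances = [json.loads(line) for line in json_lines.splitlines() if line.strip()]
    return {"instances": instances}


# Same content as test.json in this notebook.
line = ('{"pickuplon": -73.885262,"pickuplat": 40.773008,'
        '"dropofflon": -73.987232,"dropofflat": 40.732403,"passengers": 2}')
body = instances_to_request(line)
print(json.dumps(body, indent=2))
```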


In [40]:
from googleapiclient import discovery
from oauth2client.client import GoogleCredentials
import json

credentials = GoogleCredentials.get_application_default()
api = discovery.build('ml', 'v1', credentials=credentials,
            discoveryServiceUrl='https://storage.googleapis.com/cloud-ml/discovery/ml_v1_discovery.json')

request_data = {'instances':
  [
      {
        'pickuplon': -73.885262,
        'pickuplat': 40.773008,
        'dropofflon': -73.987232,
        'dropofflat': 40.732403,
        'passengers': 2,
      }
  ]
}

# Fully qualified resource name of the deployed model version
parent = 'projects/%s/models/%s/versions/%s' % (PROJECT, MODEL_NAME, MODEL_VERSION)
response = api.projects().predict(body=request_data, name=parent).execute()
print("response={0}".format(response))

ApplicationDefaultCredentialsError: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
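
As the error message says, outside Google Compute Engine the `GOOGLE_APPLICATION_CREDENTIALS` environment variable must point to a credentials file. A quick pre-flight check along these lines can fail early with a clearer message. This is only a sketch: the function name is hypothetical, and it checks the environment-variable route alone, not any other way credentials might be supplied.

```python
import os


def adc_file_configured(environ=None):
    """True if GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    return bool(path) and os.path.isfile(path)


if not adc_file_configured():
    print("GOOGLE_APPLICATION_CREDENTIALS is unset or does not point to a file")
```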