# Cloud Machine Learning Engine

In [1]:
import os
# Replace with the name of your own bucket
BUCKET = 'telemar-flights'


os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = 'europe-west1'

## Authorize CMLE
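Cloud Machine Learning Engine (CMLE) needs access to the training and test CSV files stored in your bucket. The cell below looks up the CMLE service account for your project and grants it read access to the bucket's objects and write access to the bucket itself.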

In [2]:
%%bash
PROJECT_ID=$(gcloud config get-value core/project)

AUTH_TOKEN=$(gcloud auth print-access-token)
SVC_ACCOUNT=$(curl -X GET -H "Content-Type: application/json" \
    -H "Authorization: Bearer $AUTH_TOKEN" \
    https://ml.googleapis.com/v1/projects/${PROJECT_ID}:getConfig \
    | python -c "import json; import sys; response = json.load(sys.stdin); \
    print(response['serviceAccount'])")

echo "Authorizing the Cloud ML Service account $SVC_ACCOUNT to access files in $BUCKET"
gsutil -m defacl ch -u $SVC_ACCOUNT:R gs://$BUCKET
gsutil -m acl ch -u $SVC_ACCOUNT:R -r gs://$BUCKET  # error message (if bucket is empty) can be ignored
gsutil -m acl ch -u $SVC_ACCOUNT:W gs://$BUCKET

Authorizing the Cloud ML Service account cloud-ml-service@injenia-ricerca-48dd4.iam.gserviceaccount.com to access files in telemar-flights


No changes to gs://telemar-flights/
No changes to gs://telemar-flights/flights/chapter8/output/testFlights-00000-of-00007.csv
No changes to gs://telemar-flights/flights/chapter8/output/testFlights-00001-of-00007.csv
No changes to gs://telemar-flights/flights/chapter8/output/testFlights-00002-of-00007.csv
No changes to gs://telemar-flights/flights/chapter8/output/testFlights-00003-of-00007.csv
No changes to gs://telemar-flights/flights/chapter8/output/testFlights-00004-of-00007.csv
No changes to gs://telemar-flights/flights/chapter8/output/testFlights-00005-of-00007.csv
No changes to gs://telemar-f
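
The service-account lookup in the cell above can also be sketched in plain Python. This is only an illustration of the parsing step: the helper names `get_config_url` and `extract_service_account` and the sample response body are hypothetical, and the only field assumed to exist in the response is `serviceAccount`, which is the one the bash cell reads.

```python
import json


def get_config_url(project_id):
    """Build the getConfig endpoint that the bash cell queries with curl."""
    return 'https://ml.googleapis.com/v1/projects/{}:getConfig'.format(project_id)


def extract_service_account(response_text):
    """Return the 'serviceAccount' field of a getConfig JSON response."""
    response = json.loads(response_text)
    return response['serviceAccount']


# Hypothetical response body, for illustration only (not real output).
sample = '{"serviceAccount": "cloud-ml-service@example-project.iam.gserviceaccount.com"}'

print(get_config_url('example-project'))
print(extract_service_account(sample))
```

The actual HTTP request still needs a valid access token in the `Authorization` header, as in the `curl` call above; the sketch deliberately leaves that part out.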

## Run a full training session from Datalab
The code below launches a training job on Cloud ML Engine (CMLE).
Note that:
- the Python package is contained in the sub-folder flights: you can browse the source code using Datalab;
- the JOBNAME variable is built from the current UTC date and time: find the line that sets it in the cell;
- OUTPUT_DIR points to a specific folder in your bucket: any previous content is deleted before the job is submitted, and the folder will then contain the training output (checkpoints, event files and the exported model).
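
As a rough Python equivalent of the `date -u +%y%m%d_%H%M%S` expression used for JOBNAME, the sketch below builds the same kind of name. The function name `make_job_name` and its `when` parameter are hypothetical and exist only for this illustration.

```python
import time


def make_job_name(prefix, when=None):
    """Return '<prefix>_<yymmdd>_<HHMMSS>' using UTC.

    `when` is a time.struct_time interpreted as UTC; it defaults to the
    current UTC time, as `date -u` does in the shell.
    """
    if when is None:
        when = time.gmtime()
    return '{}_{}'.format(prefix, time.strftime('%y%m%d_%H%M%S', when))


# A fixed timestamp makes the result reproducible.
fixed = time.strptime('2018-06-11 08:59:39', '%Y-%m-%d %H:%M:%S')
print(make_job_name('flights', fixed))  # flights_180611_085939
```

Because the name includes the time down to the second, each submission gets a distinct job name, which matters since job names have to be unique within a project.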


In [7]:
%%bash

OUTPUT_DIR=gs://${BUCKET}/flights/chapter9/output
DATA_DIR=gs://${BUCKET}/flights/chapter8/output
JOBNAME=flights_$(date -u +%y%m%d_%H%M%S)

PATTERN="Flights*"

echo "Launching training job ... trained model will be in $OUTPUT_DIR"

gsutil -m rm -rf $OUTPUT_DIR
gcloud ml-engine jobs submit training $JOBNAME \
  --region=$REGION \
  --module-name=trainer.task \
  --package-path=$(pwd)/flights/trainer \
  --job-dir=$OUTPUT_DIR \
  --staging-bucket=gs://$BUCKET \
  --runtime-version="1.4" \
  --scale-tier=STANDARD_1 \
  -- \
   --output_dir=$OUTPUT_DIR \
   --traindata $DATA_DIR/train$PATTERN \
   --evaldata $DATA_DIR/test$PATTERN   \
   --num_training_epochs=5

Launching training job ... trained model will be in gs://telemar-flights/flights/chapter9/output
jobId: flights_180611_085939
state: QUEUED


Removing gs://telemar-flights/flights/chapter9/output/#1528278708751410...
Removing gs://telemar-flights/flights/chapter9/output/checkpoint#1528278711261406...
Removing gs://telemar-flights/flights/chapter9/output/eval/#1528273757205120...
Removing gs://telemar-flights/flights/chapter9/output/eval/events.out.tfevents.1528273757.cmle-training-master-112f22993c-0-cftwj#1528278556796826...
Removing gs://telemar-flights/flights/chapter9/output/eval/events.out.tfevents.1528278719.cmle-training-master-112f22993c-0-cftwj#1528278720416855...
Removing gs://telemar-flights/flights/chapter9/output/events.out.tfevents.1528273711.cmle-training-master-112f22993c-0-cftwj#1528278712880264...
Removing gs://telemar-flights/flights/chapter9/output/export/#1528278725019621...
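
To see which files the `--traindata` and `--evaldata` arguments are meant to select, the sketch below rebuilds the two patterns from the cell above and checks them against one of the file names listed in the authorization output. It uses Python's `fnmatch` as a local stand-in; how the trainer itself expands the pattern is not shown in this notebook, so treat the matching as an approximation.

```python
import fnmatch

BUCKET = 'telemar-flights'
DATA_DIR = 'gs://{}/flights/chapter8/output'.format(BUCKET)
PATTERN = 'Flights*'

# Same string concatenation the shell performs for $DATA_DIR/train$PATTERN
# and $DATA_DIR/test$PATTERN.
train_glob = '{}/train{}'.format(DATA_DIR, PATTERN)
eval_glob = '{}/test{}'.format(DATA_DIR, PATTERN)

# One of the shards listed in the gsutil output earlier in the notebook.
sample = DATA_DIR + '/testFlights-00000-of-00007.csv'

print(eval_glob)
print(fnmatch.fnmatchcase(sample, eval_glob))   # True
print(fnmatch.fnmatchcase(sample, train_glob))  # False
```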

## Check CMLE job status and logs
Browse to [https://console.cloud.google.com/mlengine](https://console.cloud.google.com/mlengine) and select your job.
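
Besides the web console, the job can be inspected from the command line with `gcloud ml-engine jobs describe` and `gcloud ml-engine jobs stream-logs`. The sketch below only assembles those command lines in Python; the helper names are hypothetical, and actually running the commands assumes `gcloud` is installed and authenticated in your environment.

```python
def describe_job_command(job_name):
    """Argument list that prints the current state of a CMLE job."""
    return ['gcloud', 'ml-engine', 'jobs', 'describe', job_name]


def stream_logs_command(job_name):
    """Argument list that streams the logs of a CMLE job."""
    return ['gcloud', 'ml-engine', 'jobs', 'stream-logs', job_name]


# Job name taken from the submission output above.
job = 'flights_180611_085939'
print(' '.join(describe_job_command(job)))
print(' '.join(stream_logs_command(job)))
```

Each list can be passed as-is to `subprocess.check_output` when you want to run the command from a notebook cell.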

## Launch TensorBoard: visualize graph and metrics
With TensorBoard we can track how accuracy (on the test set) and the other logged metrics evolve during training.

In [17]:
from google.datalab.ml import TensorBoard
TensorBoard().start('gs://'+BUCKET+'/flights/chapter9/output')
TensorBoard().list()

|   | logdir | pid | port |
|---|--------|-----|------|
| 0 | gs://telemar-flights/flights/chapter9/output | 5781 | 43707 |


In [18]:
# Stop every running TensorBoard instance
for pid in TensorBoard.list()['pid']:
    TensorBoard().stop(pid)
    print('Stopped TensorBoard with pid {}'.format(pid))

Stopped TensorBoard with pid 5781
