# Classifying your own images using transfer learning and Google Cloud ML Engine
---
## Introduction
This notebook classifies a new dataset of images using *transfer learning* on *Google Cloud Machine Learning Engine*.

It is based on the following GitHub repository: https://github.com/amygdala/tensorflow-workshop.git

The notebook is intended to be executed from inside the *__tensorflow-workshop/workshop_sections/transfer_learning/cloudml/__* directory.

## Setup

In [None]:
project_name = "hugs"
user_name = "bardi"
model_version = "v1"

In [None]:
# Helper function for printing out streaming subprocess output
import subprocess
import sys
def exec_subprocess(cmd):
  # universal_newlines=True decodes the output to str, so it can be written to sys.stdout on Python 3 as well
  proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
                          shell=True, universal_newlines=True)
  # readline() returns '' only at end of stream, so this also picks up output left after the process exits
  for line in iter(proc.stdout.readline, ''):
    sys.stdout.write(line)
  proc.stdout.close()
  proc.wait()
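
The helper above streams output but does not report the exit status, so a failing script can go unnoticed. A minimal sketch of a variant that also returns the exit code is shown below. The name `run_and_stream` is hypothetical and not part of the workshop repository.

```python
import subprocess
import sys


def run_and_stream(cmd):
    """Run a shell command, echo its combined stdout/stderr, and return the exit code.

    Hypothetical helper for illustration; not part of the workshop repository.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        shell=True,
        universal_newlines=True,  # decode output to str instead of bytes
    )
    # readline() returns '' only at end of stream, so this loop drains all output.
    for line in iter(proc.stdout.readline, ''):
        sys.stdout.write(line)
    proc.stdout.close()
    return proc.wait()
```

A cell could then stop early on failure, for example by raising an error when `run_and_stream(...)` returns a non-zero value.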

In [None]:
# List the hugs label definition
!gsutil cat gs://oscon-tf-workshop-materials/transfer_learning/cloudml/hugs_photos/dict.txt

In [None]:
# Retrieve the Project ID
project_id_rd = !gcloud config list project --format "value(core.project)"
project_id = project_id_rd.fields()[0][0]
print ("Project ID: %s" % project_id)

In [None]:
# Define the Google Storage bucket
bucket = "gs://%s-%s-ml" % (project_id, project_name)
print ("Bucket name: %s" % bucket)
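
Because the bucket name is derived from the project ID, it can end up violating Cloud Storage naming rules (for instance if the project ID contains a colon). The following is a rough sketch of a pre-flight check, assuming the simplified rule of 3 to 63 characters made of lowercase letters, digits, dashes, underscores, and dots, starting and ending with a letter or digit. The function name `make_bucket_url` is hypothetical.

```python
import re

# Simplified version of the Cloud Storage bucket naming rules (assumption:
# the special cases for dotted, domain-style names are ignored here).
_BUCKET_NAME_RE = re.compile(r'^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$')


def make_bucket_url(project_id, project_name):
    """Build the gs:// URL used by this notebook and validate the bucket name."""
    name = "%s-%s-ml" % (project_id, project_name)
    if not _BUCKET_NAME_RE.match(name):
        raise ValueError("Invalid bucket name: %r" % name)
    return "gs://" + name
```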

In [None]:
# Define a timestamp, later used for our Job ID
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
print("Time stamp: %s" % timestamp)

## Pre-processing

In [None]:
# Execute the pre-processing
exec_subprocess("chmod a+x ./%s_preproc.sh" % project_name)
exec_subprocess("USER=%s DATE=%s ./%s_preproc.sh %s" % (user_name, timestamp, project_name, bucket))
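
The call above passes `USER` and `DATE` to the script by prefixing them in the shell command string. As an alternative sketch, the same variables can be supplied through the `env` argument of `subprocess`, which avoids interpolating values into a shell command. The helper name `build_preproc_env` is hypothetical.

```python
import os


def build_preproc_env(user_name, timestamp):
    """Return a copy of the current environment with USER and DATE overridden."""
    env = dict(os.environ)  # copy, so the notebook's own environment is untouched
    env["USER"] = user_name
    env["DATE"] = timestamp
    return env
```

With `env = build_preproc_env(user_name, timestamp)`, the script could then be started with `subprocess.call(["./%s_preproc.sh" % project_name, bucket], env=env)`.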

## Training

In [None]:
# Define where the pre-processed data is located
gcs_path = "%s/%s/%s_%s_%s" % (bucket, user_name, project_name, user_name, timestamp)
print ("Google Storage path: %s" % gcs_path)

In [None]:
# Define Job ID (dashes are replaced, since job IDs may only contain letters, numbers, and underscores)
job_id = ("%s_%s_%s" % (project_name, user_name, timestamp)).replace('-', "_")
print ("Job ID: %s" % job_id)
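
The `replace` call is needed because a Cloud ML Engine job ID must start with a letter and may contain only letters, numbers, and underscores. A minimal sketch that builds the ID the same way and checks it before submission is given below. The function name `make_job_id` is hypothetical.

```python
import re

# Job ID rule: starts with a letter, then letters, digits, or underscores only.
_JOB_ID_RE = re.compile(r'^[A-Za-z][A-Za-z0-9_]*$')


def make_job_id(project_name, user_name, timestamp):
    """Build a job ID as in the cell above and fail early if it is not valid."""
    job_id = ("%s_%s_%s" % (project_name, user_name, timestamp)).replace('-', '_')
    if not _JOB_ID_RE.match(job_id):
        raise ValueError("Invalid job ID: %r" % job_id)
    return job_id
```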

In [None]:
# Run the training script
# This script will output summary and model checkpoint information under <gcs_path>/training
exec_subprocess("chmod a+x ./%s_train.sh" % project_name)
exec_subprocess("./%s_train.sh %s %s %s" % (project_name, bucket, gcs_path, job_id))

In [None]:
# Monitor the training
exec_subprocess("gcloud ml-engine jobs stream-logs %s" % (job_id))

In [None]:
# See the results in TensorBoard
from google.datalab.ml import TensorBoard
pid = TensorBoard.start("%s/training" % gcs_path)

In [None]:
# List the running TensorBoard instances
TensorBoard.list()

In [None]:
# Execute this cell to stop the previously started TensorBoard process
TensorBoard.stop(pid)

## Deployment

In [None]:
# Deploy the model
exec_subprocess("chmod a+x ./model.sh")
exec_subprocess("./model.sh %s %s %s" % (gcs_path, model_version, project_name))

In [None]:
# Get a list of deployed models
!gcloud ml-engine models list

## Inference

In [None]:
# Run predictions on a number of images
!python images_to_json.py -o request.json ./prediction_images/hedgehog.jpg ./prediction_images/puppy1.jpg ./prediction_images/puppy2.jpg
exec_subprocess("gcloud ml-engine predict --model %s --json-instances request.json " % (project_name))
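
The `--json-instances` flag expects a file with one JSON object per line, and binary values are passed base64-encoded under a `b64` key. The sketch below shows roughly what a script such as `images_to_json.py` might produce. The field names `key` and `image_bytes` are assumptions that would have to match the inputs of the deployed model, and the actual script in the repository may differ.

```python
import base64
import json


def images_to_json_lines(image_paths):
    """Return newline-delimited JSON with one base64-encoded image per line.

    The 'key' and 'image_bytes' field names are assumptions for illustration.
    """
    lines = []
    for index, path in enumerate(image_paths):
        with open(path, 'rb') as f:
            encoded = base64.b64encode(f.read()).decode('ascii')
        instance = {'key': str(index), 'image_bytes': {'b64': encoded}}
        lines.append(json.dumps(instance))
    return '\n'.join(lines) + '\n'
```

Writing the returned string to `request.json` would yield a file in the line-per-instance layout that the predict command reads.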

In [None]:
# If needed, run the following to update gcloud
#!yes | gcloud components update