# Flowers Image Classification with TensorFlow on Cloud ML Engine

This notebook demonstrates how to do image classification from scratch on a flowers dataset using the Estimator API.

In [10]:
import os
PROJECT = 'qwiklabs-gcp-bb61496856b796ca' # REPLACE WITH YOUR PROJECT ID
BUCKET = 'qwiklabs-gcp-bb61496856b796ca' # REPLACE WITH YOUR BUCKET NAME
REGION = 'us-central1' # REPLACE WITH YOUR BUCKET REGION e.g. us-central1
MODEL_TYPE = 'cnn'

# do not change these
os.environ['PROJECT'] = PROJECT
os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = REGION
os.environ['MODEL_TYPE'] = MODEL_TYPE
os.environ['TFVERSION'] = '1.8'  # TensorFlow version

In [11]:
%%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

Updated property [core/project].
Updated property [compute/region].


## Input functions to read JPEG images

The key difference between this notebook and [the MNIST one](./mnist_models.ipynb) is in the input function.
In the input function here, we are doing the following:
* Reading JPEG images, rather than 2D integer arrays.
* Reading images in batches of `batch_size`, rather than slicing an in-memory array into chunks of `batch_size`.
* Resizing the images to the expected `HEIGHT` and `WIDTH`. Because this is a real-world dataset, the images come in different sizes, so at the very least we need to resize them to a constant size.
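
The batching idea in the list above can be sketched without TensorFlow. This is only an illustration, not the code in `model.py`: it assumes each CSV row has the form `image_path,label` and that the label vocabulary is the five names in `LABELS` (both are assumptions; only `sunflowers` at index 3 is confirmed by the prediction output at the end of this notebook). The helper names `parse_row` and `batches` are hypothetical.

```python
import csv
import io

# Assumed label vocabulary; the order is an assumption.
LABELS = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']


def parse_row(row):
    """Turn ['gs://.../img.jpg', 'roses'] into (path, class index)."""
    path, label = row
    return path, LABELS.index(label)


def batches(csv_file, batch_size):
    """Yield lists of up to batch_size (path, class index) pairs, reading lazily."""
    batch = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        batch.append(parse_row(row))
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller batch
        yield batch


# Hypothetical rows standing in for train_set.csv.
sample = io.StringIO(
    'gs://example-bucket/a.jpg,daisy\n'
    'gs://example-bucket/b.jpg,sunflowers\n'
    'gs://example-bucket/c.jpg,tulips\n'
)
for batch in batches(sample, batch_size=2):
    print(batch)
```

In the actual input function, each path would additionally be read, decoded as JPEG, and resized to `HEIGHT` by `WIDTH` before being batched.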

## Run as a Python module

Since we want to run our code on Cloud ML Engine, we've packaged it as a Python module.

The model code is in `model.py` and `task.py`, both in <a href="flowersmodel">flowersmodel</a>.

**Complete the TODOs in `model.py` before proceeding!**

Once you've completed the TODOs, run it locally for a few steps to test the code.

In [8]:
%%bash
rm -rf flowersmodel.tar.gz flowers_trained
gcloud ml-engine local train \
   --module-name=flowersmodel.task \
   --package-path=${PWD}/flowersmodel \
   -- \
   --output_dir=${PWD}/flowers_trained \
   --train_steps=5 \
   --learning_rate=0.01 \
   --batch_size=2 \
   --model=$MODEL_TYPE \
   --augment \
   --train_data_path=gs://cloud-ml-data/img/flower_photos/train_set.csv \
   --eval_data_path=gs://cloud-ml-data/img/flower_photos/eval_set.csv

  from ._conv import register_converters as _register_converters
INFO:tensorflow:TF_CONFIG environment variable: {u'environment': u'cloud', u'cluster': {}, u'job': {u'args': [u'--output_dir=/content/datalab/training-data-analyst/courses/machine_learning/deepdive/08_image/labs/flowers_trained', u'--train_steps=5', u'--learning_rate=0.01', u'--batch_size=2', u'--model=cnn', u'--augment', u'--train_data_path=gs://cloud-ml-data/img/flower_photos/train_set.csv', u'--eval_data_path=gs://cloud-ml-data/img/flower_photos/eval_set.csv'], u'job_name': u'flowersmodel.task'}, u'task': {}}
INFO:tensorflow:Using config: {'_save_checkpoints_secs': 300, '_session_config': None, '_keep_checkpoint_max': 5, '_task_type': 'worker', '_train_distribute': None, '_is_chief': True, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4ceca2bcd0>, '_evaluation_master': '', '_save_checkpoints_steps': None, '_keep_checkpoint_every_n_hours': 10000, '_service': None, '_num_ps_replicas': 
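
The flags after the bare `--` in the local training command above are passed through to `flowersmodel.task`. As a rough sketch of how such an entry point could read them, assuming `argparse` and purely illustrative defaults (the real `task.py` may differ):

```python
import argparse


def build_parser():
    """Parser for the flags used in this notebook; defaults are illustrative."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_dir', required=True)
    parser.add_argument('--train_data_path', required=True)
    parser.add_argument('--eval_data_path', required=True)
    parser.add_argument('--train_steps', type=int, default=100)
    parser.add_argument('--learning_rate', type=float, default=0.01)
    parser.add_argument('--batch_size', type=int, default=100)
    parser.add_argument('--model', default='cnn')
    # Boolean switches: present means True, absent means False.
    parser.add_argument('--augment', action='store_true')
    parser.add_argument('--batch_norm', action='store_true')
    return parser


# The same flags as the local run, with a relative output directory.
args = build_parser().parse_args([
    '--output_dir=flowers_trained',
    '--train_steps=5',
    '--learning_rate=0.01',
    '--batch_size=2',
    '--model=cnn',
    '--augment',
    '--train_data_path=gs://cloud-ml-data/img/flower_photos/train_set.csv',
    '--eval_data_path=gs://cloud-ml-data/img/flower_photos/eval_set.csv',
])
print(vars(args))
```

Using `store_true` for `--augment` and `--batch_norm` matches how they appear on the command line: as bare switches with no value.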

In [15]:
%%writefile config.yaml
trainingInput:
  scaleTier: CUSTOM
  masterType: complex_model_m_gpu
  workerType: complex_model_m_gpu
  parameterServerType: large_model
  workerCount: 6
  parameterServerCount: 3

Writing config.yaml


Now, let's run the training on ML Engine. Note the `--model` parameter.

In [20]:
%%bash
OUTDIR=gs://${BUCKET}/flowers/trained_${MODEL_TYPE}
JOBNAME=flowers_${MODEL_TYPE}_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
   --region=$REGION \
   --module-name=flowersmodel.task \
   --package-path=${PWD}/flowersmodel \
   --job-dir=$OUTDIR \
   --staging-bucket=gs://$BUCKET \
   --scale-tier=BASIC_TPU \
   --runtime-version=$TFVERSION \
   -- \
   --output_dir=$OUTDIR \
   --train_steps=1000 \
   --learning_rate=0.01 \
   --batch_size=40 \
   --model=$MODEL_TYPE \
   --augment \
   --batch_norm \
   --train_data_path=gs://cloud-ml-data/img/flower_photos/train_set.csv \
   --eval_data_path=gs://cloud-ml-data/img/flower_photos/eval_set.csv

gs://qwiklabs-gcp-bb61496856b796ca/flowers/trained_cnn us-central1 flowers_cnn_180928_152600
jobId: flowers_cnn_180928_152600
state: QUEUED


Removing gs://qwiklabs-gcp-bb61496856b796ca/flowers/trained_cnn/events.out.tfevents.1538146279.cmle-training-16505190885097379417#1538148360520210...
/ [1/1 objects] 100% Done                                                       
Operation completed over 1 objects.                                              
Job [flowers_cnn_180928_152600] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ml-engine jobs describe flowers_cnn_180928_152600

or continue streaming the logs with the command

  $ gcloud ml-engine jobs stream-logs flowers_cnn_180928_152600
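
The job name above is the model type plus a UTC timestamp, which keeps repeated submissions unique. A Python equivalent of the shell expression, as a sketch (the function name `make_job_name` is hypothetical):

```python
from datetime import datetime, timezone


def make_job_name(model_type, now=None):
    """Mirror flowers_${MODEL_TYPE}_$(date -u +%y%m%d_%H%M%S)."""
    now = now or datetime.now(timezone.utc)
    return 'flowers_{}_{}'.format(model_type, now.strftime('%y%m%d_%H%M%S'))


print(make_job_name('cnn'))
```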


In [21]:
%%bash
gcloud ml-engine jobs describe flowers_cnn_180928_152600


createTime: '2018-09-28T15:26:07Z'
etag: VDvklPUMQCI=
jobId: flowers_cnn_180928_152600
state: PREPARING
trainingInput:
  args:
  - --output_dir=gs://qwiklabs-gcp-bb61496856b796ca/flowers/trained_cnn
  - --train_steps=1000
  - --learning_rate=0.01
  - --batch_size=40
  - --model=cnn
  - --augment
  - --batch_norm
  - --train_data_path=gs://cloud-ml-data/img/flower_photos/train_set.csv
  - --eval_data_path=gs://cloud-ml-data/img/flower_photos/eval_set.csv
  jobDir: gs://qwiklabs-gcp-bb61496856b796ca/flowers/trained_cnn
  packageUris:
  - gs://qwiklabs-gcp-bb61496856b796ca/flowers_cnn_180928_152600/3e56e705f9d6ff260b3fda01a5ce6ae73ae9a70b27f67f28d3cb7aa1d167c751/flowersmodel-0.0.0.tar.gz
  pythonModule: flowersmodel.task
  region: us-central1
  runtimeVersion: '1.8'
  scaleTier: BASIC_TPU
trainingOutput: {}



View job in the Cloud Console at:
https://console.cloud.google.com/ml/jobs/flowers_cnn_180928_152600?project=qwiklabs-gcp-bb61496856b796ca

View logs at:
https://console.cloud.google.com/logs?resource=ml.googleapis.com%2Fjob_id%2Fflowers_cnn_180928_152600&project=qwiklabs-gcp-bb61496856b796ca
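
The `state` field in the describe output changes as the job progresses (`QUEUED` and `PREPARING` both appear above). A small sketch for pulling that field out of the text and checking whether the job has finished; the helper names are hypothetical, and the set of terminal states is an assumption rather than something shown in this notebook:

```python
# Assumed terminal values for the job state.
TERMINAL_STATES = {'SUCCEEDED', 'FAILED', 'CANCELLED'}


def job_state(describe_output):
    """Return the value of the top-level 'state:' line, or None if absent."""
    for line in describe_output.splitlines():
        if line.startswith('state:'):
            return line.split(':', 1)[1].strip()
    return None


def is_finished(describe_output):
    """True once the job has reached one of the assumed terminal states."""
    return job_state(describe_output) in TERMINAL_STATES


# Abbreviated copy of the describe output shown above.
sample = """createTime: '2018-09-28T15:26:07Z'
jobId: flowers_cnn_180928_152600
state: PREPARING
trainingOutput: {}
"""
print(job_state(sample), is_finished(sample))
```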


## Monitoring training with TensorBoard

Use this cell to launch TensorBoard:

In [14]:
from google.datalab.ml import TensorBoard
TensorBoard().start('gs://{}/flowers/trained_{}'.format(BUCKET, MODEL_TYPE))

  from ._conv import register_converters as _register_converters


4483

In [None]:
for pid in TensorBoard.list()['pid']:
  TensorBoard().stop(pid)
  print('Stopped TensorBoard with pid {}'.format(pid))

Here are my results:

Model | Accuracy | Time taken | Run time parameters
--- | :---: | --- | ---
cnn with batch-norm | 0.582 | 47 min | 1000 steps, LR=0.01, Batch=40
as above, plus augment | 0.615 | 3 hr | 5000 steps, LR=0.01, Batch=40

What was your accuracy?

## Deploying and predicting with model

Deploy the model:

In [22]:
%%bash
MODEL_NAME="flowers"
MODEL_VERSION=${MODEL_TYPE}
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/flowers/trained_${MODEL_TYPE}/export/exporter | tail -1)
echo "Deleting and deploying $MODEL_NAME $MODEL_VERSION from $MODEL_LOCATION ... this will take a few minutes"
#gcloud ml-engine versions delete --quiet ${MODEL_VERSION} --model ${MODEL_NAME}
#gcloud ml-engine models delete ${MODEL_NAME}
gcloud ml-engine models create ${MODEL_NAME} --regions $REGION
gcloud ml-engine versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --runtime-version=$TFVERSION

Deleting and deploying flowers cnn from gs://qwiklabs-gcp-bb61496856b796ca/flowers/trained_cnn/export/exporter/1538148976/ ... this will take a few minutes


Created ml engine model [projects/qwiklabs-gcp-bb61496856b796ca/models/flowers].
Creating version (this might take a few minutes)......
........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................done.


To predict with the model, let's take one of the example images that is available on Google Cloud Storage <img src="http://storage.googleapis.com/cloud-ml-data/img/flower_photos/sunflowers/1022552002_2b93faf9e7_n.jpg" />

The online prediction service expects images to be base64 encoded as described [here](https://cloud.google.com/ml-engine/docs/tensorflow/online-predict#binary_data_in_prediction_input).

In [23]:
%%bash
IMAGE_URL=gs://cloud-ml-data/img/flower_photos/sunflowers/1022552002_2b93faf9e7_n.jpg

# Copy the image to local disk.
gsutil cp $IMAGE_URL flower.jpg

# Base64 encode and create request message in json format.
python -c 'import base64, json; img = base64.b64encode(open("flower.jpg", "rb").read()).decode("utf-8"); print(json.dumps({"image_bytes": {"b64": img}}))' > request.json

Copying gs://cloud-ml-data/img/flower_photos/sunflowers/1022552002_2b93faf9e7_n.jpg...
/ [0 files][    0.0 B/ 41.7 KiB]                                                -- [1 files][ 41.7 KiB/ 41.7 KiB]                                                
Operation completed over 1 objects/41.7 KiB.                                     
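
The request file holds one JSON object whose `image_bytes` field wraps the base64 text under a `b64` key. The same step written as a small Python function, as a sketch (`make_request` is a hypothetical name; the `image_bytes` key is the one used in the cell above):

```python
import base64
import json


def make_request(jpeg_bytes):
    """Return the JSON text for one image: {"image_bytes": {"b64": ...}}."""
    encoded = base64.b64encode(jpeg_bytes).decode('utf-8')
    return json.dumps({'image_bytes': {'b64': encoded}})


# Stand-in bytes; in the notebook these would come from
# open('flower.jpg', 'rb').read().
fake_jpeg = b'\xff\xd8\xff\xe0 not a real image'
line = make_request(fake_jpeg)
print(line[:40])
```

Decoding the base64 bytes to text before calling `json.dumps` matters on Python 3, where `b64encode` returns `bytes` and `json` refuses to serialize them.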


Send it to the prediction service:

In [24]:
%%bash
gcloud ml-engine predict \
  --model=flowers \
  --version=${MODEL_TYPE} \
  --json-instances=./request.json

CLASS       CLASSID  PROBABILITIES
sunflowers  3        [0.01260288991034031, 0.0620645172894001, 0.047199297696352005, 0.8369098901748657, 0.041223421692848206]
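
The predicted class is the index of the largest probability. A quick check against the response above, using a label list whose order is assumed except for `sunflowers` at index 3:

```python
# Assumed label order; only index 3 ('sunflowers') is confirmed by the response.
LABELS = ['daisy', 'dandelion', 'roses', 'sunflowers', 'tulips']

# Probabilities copied from the prediction output above.
probabilities = [0.01260288991034031, 0.0620645172894001,
                 0.047199297696352005, 0.8369098901748657,
                 0.041223421692848206]

# Index of the largest probability (argmax).
class_id = max(range(len(probabilities)), key=probabilities.__getitem__)
print(LABELS[class_id], class_id)
```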


<pre>
# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
</pre>