# PCB Placement Training with TensorFlow on Cloud ML Engine

This notebook demonstrates how to train a PCB placement model from scratch on a placement dataset, using the TensorFlow Estimator/Experiment API.

In [None]:
import os
PROJECT = 'gcp-spb-magestic' # REPLACE WITH YOUR PROJECT ID
BUCKET = 'gcp-spb-magestic' # REPLACE WITH YOUR BUCKET NAME
REGION = 'us-central1' # REPLACE WITH YOUR BUCKET REGION e.g. us-central1
MODEL_TYPE = 'cnn'

# do not change these
os.environ['PROJECT'] = PROJECT
os.environ['BUCKET'] = BUCKET
os.environ['REGION'] = REGION
os.environ['MODEL_TYPE'] = MODEL_TYPE

In [None]:
%bash
gcloud config set project $PROJECT
gcloud config set compute/region $REGION

## Data Preprocessing: Bin the scores
Bucketize the scores into 10 equal-width bins between 0 and 1. The bin edges are 0.0, 0.1, ..., 0.9, so the resulting labels run from 1 through 10.

In [None]:
%bash
export DATALAB_OUTDIR=${PWD}/export_model
export OUTDIR=gs://${BUCKET}/export_model
gsutil cp -r $DATALAB_OUTDIR $OUTDIR

In [None]:
import csv
import numpy as np
import pandas as pd
from io import BytesIO
import StringIO


def transformFileName(imageName):
  return("gs://" + BUCKET + "/images/" + imageName)

labelFileName = "gs://" + BUCKET + "/images/labelmap.txt"
trainFileName = "gs://" + BUCKET + "/images/train.csv"
evalFileName = "gs://" + BUCKET + "/images/eval.csv"

%gcs read --object $labelFileName --variable csv_as_bytes

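# Load the label map: each row holds the split ('train' or 'test'), image filename, and score.
labeldf = pd.read_csv(BytesIO(csv_as_bytes), names=['type', 'filename', 'score'])

# Bucketize scores with bin edges 0.0, 0.1, ..., 0.9; np.digitize yields labels 1..10.
bins = np.linspace(0, 1, 10, endpoint=False)
labeldf['score'] = np.digitize(labeldf['score'], bins)

# Split by type and keep only the filename and score columns.
# .copy() avoids pandas' SettingWithCopyWarning when the filename column is rewritten.
traindf = labeldf[labeldf['type'] == 'train'][['filename', 'score']].copy()
evaldf = labeldf[labeldf['type'] == 'test'][['filename', 'score']].copy()

# Turn bare image names into full gs:// paths.
traindf['filename'] = traindf['filename'].apply(transformFileName)
evaldf['filename'] = evaldf['filename'].apply(transformFileName)

# Serialize both sets to header-less CSV strings for upload.
traincsv = StringIO.StringIO()
evalcsv = StringIO.StringIO()
traindf.to_csv(path_or_buf=traincsv, header=None, index=False)
evaldf.to_csv(path_or_buf=evalcsv, header=None, index=False)
trainstr = traincsv.getvalue()
evalstr = evalcsv.getvalue()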

%gcs write --object $trainFileName --variable trainstr
%gcs write --object $evalFileName --variable evalstr
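
To make the binning concrete, here is a minimal, self-contained sketch of the same `np.digitize` call applied to a few made-up scores. The helper name `bucketize` and the sample values are illustrative assumptions, not part of the notebook's pipeline.

```python
import numpy as np


def bucketize(scores, n_bins=10):
    """Map scores in [0, 1] to integer labels 1..n_bins (hypothetical helper)."""
    # Left bin edges: 0.0, 0.1, ..., 0.9 for n_bins=10 (the endpoint 1.0 is excluded).
    edges = np.linspace(0, 1, n_bins, endpoint=False)
    # np.digitize returns i such that edges[i-1] <= score < edges[i],
    # so labels start at 1 and a score of 1.0 falls in the last bin.
    return np.digitize(scores, edges)


# Made-up sample scores, chosen away from bin boundaries.
sample_scores = [0.0, 0.05, 0.55, 0.95, 1.0]
labels = bucketize(sample_scores)
print(labels.tolist())
```

Because the labels are 1-based, a model that expects class indices starting at 0 would need to subtract 1. Whether `trainer.task` does this is not shown in this notebook.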


## Invoke Tensorboard on output directory

In [None]:
evalcsv.getvalue()

In [None]:
from google.datalab.ml import TensorBoard
OUTDIR = 'gs://' +BUCKET + '/export_model'
TensorBoard().start(OUTDIR)

In [None]:
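import shutil
# Note: shutil.rmtree only removes local directories. With a gs:// OUTDIR this call
# does nothing, and ignore_errors=True hides the failure. Use `gsutil -m rm -rf`
# to clear a Cloud Storage path.
shutil.rmtree(OUTDIR, ignore_errors = True)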

## Run as a Python module

Let's run it as a Python module. Note the `--model` parameter.

In [None]:
%bash
export OUTDIR=gs://${BUCKET}/export_model
export DATADIR=gs://${BUCKET}/images
export PYTHONPATH=${PYTHONPATH}:${PWD}/placermodel
python -m trainer.task --output_dir=$OUTDIR --dataset_dir=$DATADIR \
   --train_steps=6000 --learning_rate=0.01 --train_batch_size=40 \
   --model=$MODEL_TYPE --batch_norm
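
The source of `trainer.task` is not shown in this notebook, so the following is only a sketch of how a module like it might parse the flags used above. The function name `build_parser`, the defaults, and the help strings are assumptions. Only the flag names come from the commands in this notebook.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags passed to trainer.task."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_dir', required=True,
                        help='Where checkpoints and exported models are written')
    parser.add_argument('--dataset_dir', default=None,
                        help='Directory holding train.csv and eval.csv')
    # The defaults below are placeholders, not the real trainer defaults.
    parser.add_argument('--train_steps', type=int, default=1000)
    parser.add_argument('--learning_rate', type=float, default=0.01)
    parser.add_argument('--train_batch_size', type=int, default=40)
    parser.add_argument('--model', default='cnn',
                        help='Model type to build, e.g. cnn')
    # Boolean switches: present means True, absent means False.
    parser.add_argument('--batch_norm', action='store_true')
    parser.add_argument('--augment', action='store_true')
    return parser


# Parse the same flags as the local run; the bucket name here is a placeholder.
args = build_parser().parse_args([
    '--output_dir=gs://my-bucket/export_model',
    '--dataset_dir=gs://my-bucket/images',
    '--train_steps=6000', '--learning_rate=0.01', '--train_batch_size=40',
    '--model=cnn', '--batch_norm',
])
print(vars(args))
```

Modeling `--batch_norm` and `--augment` as `store_true` switches matches how they appear on the command line, with no value attached.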

Now, let's run it on ML Engine. Note the `--model` parameter.

In [None]:
%bash
OUTDIR=gs://${BUCKET}/placer/trained_${MODEL_TYPE}
JOBNAME=placer_${MODEL_TYPE}_$(date -u +%y%m%d_%H%M%S)
echo $OUTDIR $REGION $JOBNAME
gsutil -m rm -rf $OUTDIR
gcloud ml-engine jobs submit training $JOBNAME \
   --region=$REGION \
   --module-name=trainer.task \
   --package-path=${PWD}/placermodel/trainer \
   --job-dir=$OUTDIR \
   --staging-bucket=gs://$BUCKET \
   --scale-tier=BASIC_GPU \
   --runtime-version=1.2 \
   -- \
   --output_dir=$OUTDIR \
   --train_steps=5000 --learning_rate=0.01 --train_batch_size=40 \
   --model=$MODEL_TYPE --batch_norm --augment

Here are my results:

Model | Accuracy | Time taken | Run time parameters
--- | :---: | --- | ---
cnn with batch-norm | 0.582 | 47 min | 1000 steps, LR=0.01, Batch=40
as above, plus augment | 0.615 | 3 hr | 5000 steps, LR=0.01, Batch=40

## Deploying and predicting with model

Deploy the model:

In [None]:
%bash
MODEL_NAME="placermodel"
MODEL_VERSION=${MODEL_TYPE}
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/placer/trained_${MODEL_TYPE}/export/Servo | tail -1)
echo "Deleting and deploying $MODEL_NAME $MODEL_VERSION from $MODEL_LOCATION ... this will take a few minutes"
gcloud ml-engine versions delete --quiet ${MODEL_VERSION} --model ${MODEL_NAME}
#gcloud ml-engine models delete ${MODEL_NAME}
#gcloud ml-engine models create ${MODEL_NAME} --regions $REGION
gcloud ml-engine versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin ${MODEL_LOCATION}

To predict with the model, let's take one of the example images that is available on Google Cloud Storage <img src="http://storage.googleapis.com/cloud-ml-data/img/flower_photos/sunflowers/1022552002_2b93faf9e7_n.jpg" />

In [None]:
%writefile test.json
{"imageurl": "gs://cloud-ml-data/img/flower_photos/sunflowers/1022552002_2b93faf9e7_n.jpg"}

Send it to the prediction service:

In [None]:
%bash
gcloud ml-engine predict --model=placermodel --version=${MODEL_TYPE} --json-instances=./test.json
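
The file passed to `--json-instances` holds one JSON object per line, so several images can be scored in a single call. Below is a minimal sketch that writes such a file from a list of image URLs. The helper name `write_instances` and the output filename `instances.json` are assumptions. The `imageurl` key is the one used in `test.json` above.

```python
import json


def write_instances(path, image_urls):
    """Write one {"imageurl": ...} JSON object per line (hypothetical helper)."""
    with open(path, 'w') as f:
        for url in image_urls:
            f.write(json.dumps({'imageurl': url}) + '\n')


# The same example image as in test.json; append more gs:// URLs to batch requests.
urls = ['gs://cloud-ml-data/img/flower_photos/sunflowers/1022552002_2b93faf9e7_n.jpg']
write_instances('instances.json', urls)

# Read the file back to confirm that each line is a standalone JSON object.
with open('instances.json') as f:
    lines = f.read().splitlines()
print(lines)
```

The resulting file can be used in place of `./test.json` in the prediction command.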

<pre>
# Copyright 2017 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
</pre>