<a href="https://colab.research.google.com/github/hailusong/colab-god-idclass/blob/master/god_idclass_mltrain.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training: Custom-Train the Google Object Detection API to Detect ID Bounding Boxes

**FIRST OF ALL: SET THE RUNTIME TYPE TO GPU**<br>
Set up the environment variables.<br>
The **TensorFlow runtime version list** is available [here](https://cloud.google.com/ml-engine/docs/tensorflow/runtime-version-list).

In [0]:
DEFAULT_HOME='/content'
TF_RT_VERSION='1.13'
PYTHON_VERSION='3.5'

YOUR_GCS_BUCKET='id-norm'
YOUR_PROJECT='orbital-purpose-130316'

## Session and Environment Verification (Destination - Local)

Establish an authenticated session with Google Cloud.

In [0]:
from google.colab import auth
auth.authenticate_user()


################# RE-RUN THE CELLS ABOVE IF YOU NEED TO RESTART THE RUNTIME #################

Verify versions: TensorFlow, Python, IPython and prompt_toolkit (these two must have compatible versions), and protoc.

In [0]:
import tensorflow as tf
print(tf.__version__)
assert(tf.__version__.startswith(TF_RT_VERSION + '.')), f'tf.__version__ {tf.__version__} not matching with specified TF runtime version env variable {TF_RT_VERSION}'

1.13.1
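
The assertion above appends `'.'` to `TF_RT_VERSION` before the prefix test, so that `'1.13'` does not accidentally match a version such as `'1.130.0'`. A minimal sketch of the same idea, comparing whole dot-separated components instead of characters (the helper name `matches_runtime` is hypothetical and not part of the notebook):

```python
def matches_runtime(installed, runtime):
    """Return True when `installed` (e.g. '1.13.1') belongs to `runtime` (e.g. '1.13')."""
    # Compare whole dot-separated components rather than raw characters.
    runtime_parts = runtime.split('.')
    return installed.split('.')[:len(runtime_parts)] == runtime_parts


print(matches_runtime('1.13.1', '1.13'))   # True
print(matches_runtime('1.130.0', '1.13'))  # False
print(matches_runtime('1.12.0', '1.13'))   # False
```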


In [0]:
!python -V
!ipython --version
!pip show prompt_toolkit
!protoc --version

Python 3.6.7
5.5.0
Name: prompt-toolkit
Version: 1.0.15
Summary: Library for building powerful interactive command lines in Python
Home-page: https://github.com/jonathanslenders/python-prompt-toolkit
Author: Jonathan Slenders
Author-email: UNKNOWN
License: UNKNOWN
Location: /usr/local/lib/python3.6/dist-packages
Requires: six, wcwidth
Required-by: jupyter-console, ipython
libprotoc 3.0.0


## Install Google Object Detection API in Colab
Reference is https://colab.research.google.com/drive/1kHEQK2uk35xXZ_bzMUgLkoysJIWwznYr


### Downgrade prompt-toolkit to 1.0.15 (Destination - Local)
Run this **ONLY** if the installation is not working.

In [0]:
# !pip install 'prompt-toolkit==1.0.15'

### Google Object Detection API Installation (Destination - Local)

In [0]:
!apt-get install -y -qq protobuf-compiler python-pil python-lxml
![ ! -e {DEFAULT_HOME}/models ] && git clone --depth=1 --quiet https://github.com/tensorflow/models.git {DEFAULT_HOME}/models
!ls {DEFAULT_HOME}/models

AUTHORS     CONTRIBUTING.md    LICENSE	 README.md  samples    WORKSPACE
CODEOWNERS  ISSUE_TEMPLATE.md  official  research   tutorials


In [0]:
import os
os.chdir(f'{DEFAULT_HOME}/models/research')
!pwd

/content/models/research


*From Wikipedia ...*: 

**protocol buffers** are a language-neutral, platform-neutral extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. 

You define how you want your data to be structured once, then you can **use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages**.

Remember: **.proto defines the structured data** and **protoc generates the source code** to serialize and deserialize it.
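
To make "serializing structured data" concrete, here is a hand-written sketch of how a single integer field is laid out in the protocol buffer wire format. It is only an illustration and is not the code that protoc generates; the helper names are hypothetical. It assumes a message with one integer field numbered 1 that holds the value 150.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def decode_varint(data, pos=0):
    """Decode one varint starting at `pos`; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def encode_int_field(field_number, value):
    """Serialize one integer field: the tag first, then the value."""
    tag = (field_number << 3) | 0  # wire type 0 means varint
    return encode_varint(tag) + encode_varint(value)


payload = encode_int_field(1, 150)
print(payload.hex())  # 089601

tag, pos = decode_varint(payload)
value, _ = decode_varint(payload, pos)
print(tag >> 3, value)  # 1 150
```

Three bytes carry both the field number and the value, which is why the format is more compact than an equivalent XML element.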

In [0]:
!protoc object_detection/protos/*.proto --python_out=.
# !ls object_detection/protos/*.proto
# !cat object_detection/protos/anchor_generator.proto
!ls {DEFAULT_HOME}/models/research/object_detection/builders/anchor*

/content/models/research/object_detection/builders/anchor_generator_builder.py
/content/models/research/object_detection/builders/anchor_generator_builder_test.py


#### Add Google Object Detection API into System Path

In [0]:
import sys
sys.path.append(f'{DEFAULT_HOME}/models/research')
sys.path.append(f'{DEFAULT_HOME}/models/research/slim')

Note that ! calls out to a shell (in a **NEW** process), while % affects the **SAME** process associated with the notebook.

Since we append paths to sys.path, we **HAVE TO** use a % command to run the Python code.

It is also **IMPORTANT** to run **%matplotlib inline** first; otherwise %run model_builder_test.py will **raise a function attribute error** when accessing matplotlib.pyplot attributes from **IPython's run_line_magic**.
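
The difference between ! and % can be checked with plain Python. This sketch assumes a made-up path and uses `subprocess` to stand in for the new process that ! starts:

```python
import subprocess
import sys

# Hypothetical path used only for this demonstration.
extra_path = '/tmp/hypothetical_models_research'
sys.path.append(extra_path)

# The notebook's own process (the one % magics such as %run use) sees the new entry.
print(extra_path in sys.path)  # True

# A child process (comparable to what ! starts) builds its own sys.path from scratch.
child = subprocess.run(
    [sys.executable, '-c',
     'import sys; print(sys.argv[1] in sys.path)', extra_path],
    stdout=subprocess.PIPE, universal_newlines=True, check=True)
print(child.stdout.strip())  # False
```

This is why the test script is started with %run rather than !python: only the notebook's own process carries the appended paths.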

In [0]:
# !find . -name 'inception*' -print
%matplotlib inline

In [0]:
# If you see the error "'function' object has no attribute 'called'", re-run the %matplotlib cell and then this cell.
%run object_detection/builders/model_builder_test.py

import os
os.chdir(f'{DEFAULT_HOME}')

............s...
----------------------------------------------------------------------
Ran 16 tests in 0.154s

OK (skipped=1)


## Git Sync for any Change in colab-god-idclass 

In [0]:
![ -e {DEFAULT_HOME}/colab-god-idclass ] && git -C {DEFAULT_HOME}/colab-god-idclass pull
![ ! -e {DEFAULT_HOME}/colab-god-idclass ] && git clone --depth=1 https://github.com/hailusong/colab-god-idclass.git {DEFAULT_HOME}/colab-god-idclass

Cloning into '/content/colab-god-idclass'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects:  63% (7/11) ...
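
The two shell lines above implement "pull if the checkout exists, otherwise clone". The same decision can be sketched in Python; the helper name `sync_command` is hypothetical, and the command is only built here, not executed:

```python
import os


def sync_command(repo_dir, repo_url):
    """Return the git command that brings `repo_dir` up to date."""
    if os.path.exists(repo_dir):
        # Checkout already present: fetch and merge upstream changes.
        return ['git', '-C', repo_dir, 'pull']
    # No checkout yet: make a shallow clone.
    return ['git', 'clone', '--depth=1', repo_url, repo_dir]


cmd = sync_command('/content/colab-god-idclass',
                   'https://github.com/hailusong/colab-god-idclass.git')
print(' '.join(cmd))
# To run it: import subprocess; subprocess.run(cmd, check=True)
```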

### Checking Your Google Cloud Storage Bucket

In [0]:
!gsutil ls gs://{YOUR_GCS_BUCKET}/data
!gsutil ls gs://{YOUR_GCS_BUCKET}/generated

gs://id-norm/data/faster_rcnn_resnet101_processed.config
gs://id-norm/data/label_map.pbtxt
gs://id-norm/data/model.ckpt.data-00000-of-00001
gs://id-norm/data/model.ckpt.index
gs://id-norm/data/model.ckpt.meta
gs://id-norm/data/test.record
gs://id-norm/data/train.record
CommandException: One or more URLs matched no objects.
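
A listing like the one above can also be checked programmatically before a job is submitted. In this sketch the set of required file names is an assumption based on that listing, and `missing_objects` is a hypothetical helper:

```python
# File names assumed to be needed for training; adjust to your own pipeline.
REQUIRED = (
    'faster_rcnn_resnet101_processed.config',
    'label_map.pbtxt',
    'model.ckpt.index',
    'model.ckpt.meta',
    'test.record',
    'train.record',
)


def missing_objects(listing, required=REQUIRED):
    """Return the required file names that do not appear in a `gsutil ls` listing."""
    present = set()
    for line in listing.splitlines():
        line = line.strip()
        if line.startswith('gs://'):
            # Keep only the final path component of the object URL.
            present.add(line.rsplit('/', 1)[-1])
    return [name for name in required if name not in present]


# Listing copied from the cell output above.
listing = """\
gs://id-norm/data/faster_rcnn_resnet101_processed.config
gs://id-norm/data/label_map.pbtxt
gs://id-norm/data/model.ckpt.data-00000-of-00001
gs://id-norm/data/model.ckpt.index
gs://id-norm/data/model.ckpt.meta
gs://id-norm/data/test.record
gs://id-norm/data/train.record
CommandException: One or more URLs matched no objects.
"""

print(missing_objects(listing))  # []
```

The `CommandException` line comes from the second command: nothing exists under `generated/` yet, and the helper ignores it because it does not start with `gs://`.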


## Start the Training and Evaluation Jobs on Google Cloud ML Engine

### Option 1: Start the Training Job on Google Cloud ML (Destination - Cloud ML)

#### Package the Tensorflow Object Detection code (Destination - Local but to be submitted as package to Cloud ML)
Before you can run on GCP, you must first **package the TensorFlow Object Detection API and TF Slim**.

In [0]:
import os
os.chdir(f'{DEFAULT_HOME}/models/research')

# From tensorflow/models/research/
!bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
!python setup.py sdist
!(cd slim && python setup.py sdist)

Three Python packages are created:
* dist/object_detection-0.1.tar.gz
* slim/dist/slim-0.1.tar.gz
* /tmp/pycocotools/pycocotools-2.0.tar.gz

In [0]:
!ls -l dist/object_detection-0.1.tar.gz
!ls -l slim/dist/slim-0.1.tar.gz
!ls -l /tmp/pycocotools/pycocotools-2.0.tar.gz

-rw-r--r-- 1 root root 439299460 Mar 23 22:11 dist/object_detection-0.1.tar.gz
-rw-r--r-- 1 root root 973963 Mar 23 22:11 slim/dist/slim-0.1.tar.gz
-rw-r--r-- 1 root root 1376450 Mar 23 22:10 /tmp/pycocotools/pycocotools-2.0.tar.gz


If you get a 'Permission denied on resource project ...' error, make sure the Cloud ML Engine API is enabled for the project. Use this [link](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,compute_component) to check, and enable it if it is not.

In [0]:
# From tensorflow/models/research/
!gcloud config set project {YOUR_PROJECT}

Updated property [core/project].


In [0]:
!cat object_detection/samples/cloud/cloud.yml

trainingInput:
  runtimeVersion: "1.12"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 5
  workerType: standard_gpu
  parameterServerCount: 3
  parameterServerType: standard
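
The sample file pins `runtimeVersion` to "1.12", while this notebook passes `--runtime-version` from `TF_RT_VERSION` ('1.13') on the command line. According to the gcloud documentation for `--config`, a command-line argument overrides the same option in the configuration file. The sketch below only models that precedence with plain dictionaries and is not how gcloud is implemented; it also counts the machines the file requests.

```python
# Values from cloud.yml as printed above.
config_file = {
    'runtimeVersion': '1.12',
    'scaleTier': 'CUSTOM',
    'masterType': 'standard_gpu',
    'workerCount': 5,
    'workerType': 'standard_gpu',
    'parameterServerCount': 3,
    'parameterServerType': 'standard',
}

# Values this notebook passes as flags (TF_RT_VERSION and PYTHON_VERSION).
cli_flags = {'runtimeVersion': '1.13', 'pythonVersion': '3.5'}

# Later keys win, mirroring "command line overrides the configuration file".
effective = {**config_file, **cli_flags}
print(effective['runtimeVersion'])  # 1.13

# One master plus the workers and parameter servers.
machines = 1 + effective['workerCount'] + effective['parameterServerCount']
print(machines)  # 9
```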





#### Start the Training Job on Google Cloud ML (Destination - Cloud ML)

In [0]:
!gcloud ml-engine jobs submit training `whoami`_object_detection_ids_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version {TF_RT_VERSION} \
    --job-dir=gs://{YOUR_GCS_BUCKET}/model_dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    --python-version {PYTHON_VERSION} \
    -- \
    --model_dir=gs://{YOUR_GCS_BUCKET}/model_dir \
    --pipeline_config_path=gs://{YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_processed.config

Job [root_object_detection_ids_03_23_2019_22_25_20] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ml-engine jobs describe root_object_detection_ids_03_23_2019_22_25_20

or continue streaming the logs with the command

  $ gcloud ml-engine jobs stream-logs root_object_detection_ids_03_23_2019_22_25_20
jobId: root_object_detection_ids_03_23_2019_22_25_20
state: QUEUED


### Option 1 - Monitor the Cloud ML Training Job in Tensorboard
To monitor the job in the ML Engine dashboard, open [this link](https://console.cloud.google.com/mlengine/jobs).<br>