<a href="https://colab.research.google.com/github/hailusong/colab-god-idclass/blob/master/god_idclass.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Custom Train Google Object Detection to Detect ID BBox

Set up the environment variables.<br>
The **TensorFlow runtime version list** is available [here](https://cloud.google.com/ml-engine/docs/tensorflow/runtime-version-list).

In [0]:
DEFAULT_HOME='/content'
TF_RT_VERSION='1.13'
PYTHON_VERSION='3.5'

YOUR_GCS_BUCKET='id-norm'
YOUR_PROJECT='orbital-purpose-130316'

## Session and Environment Verification (Destination - Local)

Establish an authenticated session with Google Cloud.

In [0]:
from google.colab import auth
auth.authenticate_user()


################# RE-RUN ABOVE CELLS IF NEED TO RESTART RUNTIME #################

Verify versions: TensorFlow, Python, IPython and prompt_toolkit (these two must have compatible versions), and protoc.

In [4]:
import tensorflow as tf
print(tf.__version__)
assert(tf.__version__.startswith(TF_RT_VERSION + '.')), f'tf.__version__ {tf.__version__} not matching with specified TF runtime version env variable {TF_RT_VERSION}'

1.13.1
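The assertion above matches on the dotted prefix `TF_RT_VERSION + '.'` rather than the bare string, so a runtime of `1.13` accepts `1.13.1` but would not accept something like `1.130.0`. A minimal sketch of the same idea as a standalone helper, comparing whole version components (the name `matches_runtime` is an illustrative assumption, not a TensorFlow API):

```python
def matches_runtime(installed_version, runtime_version):
    """Return True when installed_version belongs to the runtime_version series.

    Hypothetical helper: compares whole dotted components, so '1.13' matches
    '1.13.1' but not '1.130.0'.
    """
    installed_parts = installed_version.split('.')
    runtime_parts = runtime_version.split('.')
    return installed_parts[:len(runtime_parts)] == runtime_parts


print(matches_runtime('1.13.1', '1.13'))   # True
print(matches_runtime('1.130.0', '1.13'))  # False
```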


In [5]:
!python -V
!ipython --version
!pip show prompt_toolkit
!protoc --version

Python 3.6.7
5.5.0
Name: prompt-toolkit
Version: 1.0.15
Summary: Library for building powerful interactive command lines in Python
Home-page: https://github.com/jonathanslenders/python-prompt-toolkit
Author: Jonathan Slenders
Author-email: UNKNOWN
License: UNKNOWN
Location: /usr/local/lib/python3.6/dist-packages
Requires: six, wcwidth
Required-by: jupyter-console, ipython
libprotoc 3.0.0


## Install Google Object Detection API in Colab
Reference is https://colab.research.google.com/drive/1kHEQK2uk35xXZ_bzMUgLkoysJIWwznYr


### Downgrade prompt-toolkit to 1.0.15 (Destination - Local)
Run this **ONLY** if the installation below does not work.

In [0]:
!pip install 'prompt-toolkit==1.0.15'

### Google Object Detection API Installation (Destination - Local)

In [4]:
!apt-get install -y -qq protobuf-compiler python-pil python-lxml
![ ! -e {DEFAULT_HOME}/models ] && git clone --depth=1 --quiet https://github.com/tensorflow/models.git {DEFAULT_HOME}/models
!ls {DEFAULT_HOME}/models

AUTHORS     CONTRIBUTING.md    LICENSE	 README.md  samples    WORKSPACE
CODEOWNERS  ISSUE_TEMPLATE.md  official  research   tutorials


In [5]:
import os
os.chdir(f'{DEFAULT_HOME}/models/research')
!pwd

/content/models/research


*From Wikipedia ...*: 

**protocol buffers** are a language-neutral, platform-neutral extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. 

You define how you want your data to be structured once, then you can **use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages**.

Remember: **.proto defines the structured data** and **protoc generates the source code** that serializes and deserializes it.

In [6]:
!protoc object_detection/protos/*.proto --python_out=.
# !ls object_detection/protos/*.proto
# !cat object_detection/protos/anchor_generator.proto
!ls {DEFAULT_HOME}/models/research/object_detection/builders/anchor*

/content/models/research/object_detection/builders/anchor_generator_builder.py
/content/models/research/object_detection/builders/anchor_generator_builder_test.py
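With `--python_out=.`, protoc writes one `<name>_pb2.py` module next to each `<name>.proto` it compiles. A small sketch for checking that the compilation step above left nothing behind; the helper name `missing_pb2_modules` is hypothetical and the directory argument is whatever folder holds the `.proto` files:

```python
from pathlib import Path


def missing_pb2_modules(proto_dir):
    """List the .proto files in proto_dir that have no generated *_pb2.py sibling."""
    proto_dir = Path(proto_dir)
    return sorted(
        proto.name
        for proto in proto_dir.glob('*.proto')
        if not (proto_dir / f'{proto.stem}_pb2.py').exists()
    )
```

Run from `models/research`, `missing_pb2_modules('object_detection/protos')` should return an empty list once the protoc cell has succeeded.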


#### Add Google Object Detection API into System Path

In [0]:
import sys
sys.path.append(f'{DEFAULT_HOME}/models/research')
sys.path.append(f'{DEFAULT_HOME}/models/research/slim')

Note that `!` calls out to a shell (in a **NEW** process), while `%` affects the **SAME** process associated with the notebook.

Since we appended paths to sys.path, we **HAVE TO** use a % command to run the Python scripts.
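The difference can be illustrated outside the notebook with a short sketch: a path appended to `sys.path` is visible in the current process but not in a freshly started Python process, which is what a `!python ...` call would launch. The path `/tmp/hypothetical_models_research` is a placeholder chosen for the demonstration.

```python
import subprocess
import sys

EXTRA_PATH = '/tmp/hypothetical_models_research'  # placeholder path for the demo


def child_sees_path(path):
    """Start a fresh Python process and report whether path is on its sys.path."""
    code = 'import sys; print(sys.argv[1] in sys.path)'
    result = subprocess.run(
        [sys.executable, '-c', code, path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip() == 'True'


sys.path.append(EXTRA_PATH)          # changes only the current process
print(EXTRA_PATH in sys.path)        # True
print(child_sees_path(EXTRA_PATH))   # False
```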

It is also **IMPORTANT** to run **%matplotlib inline** first; otherwise %run model_builder_test.py will **raise a function attribute error** when accessing matplotlib.pyplot attributes from **IPython's run_line_magic**.

In [0]:
# !find . -name 'inception*' -print
%matplotlib inline

In [11]:
# If you see the error "'function' object has no attribute 'called'", just run the %matplotlib cell and this cell AGAIN
%run object_detection/builders/model_builder_test.py

import os
os.chdir(f'{DEFAULT_HOME}')

............s...
----------------------------------------------------------------------
Ran 16 tests in 0.154s

OK (skipped=1)


### Pre-trained Data Preparation (Destination - GCS)
e.g. pre-trained model weights

Download, unzip and move COCO-pretrained weights data to GCS<br>
Other possible pretrained models:<br>
* ssd_mobilenet_v1_coco_11_06_2017
* ssd_inception_v2_coco_11_06_2017
* rfcn_resnet101_coco_11_06_2017
* faster_rcnn_resnet101_coco_11_06_2017
* faster_rcnn_inception_resnet_v2_atrous_coco_11_06_2017

In [0]:
import os
os.chdir(f'{DEFAULT_HOME}')
!wget http://storage.googleapis.com/download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_11_06_2017.tar.gz

--2019-03-19 14:19:45--  http://storage.googleapis.com/download.tensorflow.org/models/object_detection/faster_rcnn_resnet101_coco_11_06_2017.tar.gz
Resolving storage.googleapis.com (storage.googleapis.com)... 74.125.141.128, 2607:f8b0:400c:c06::80
Connecting to storage.googleapis.com (storage.googleapis.com)|74.125.141.128|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 595490113 (568M) [application/x-tar]
Saving to: ‘faster_rcnn_resnet101_coco_11_06_2017.tar.gz’


2019-03-19 14:19:49 (156 MB/s) - ‘faster_rcnn_resnet101_coco_11_06_2017.tar.gz’ saved [595490113/595490113]



In [0]:
!ls {DEFAULT_HOME}/faster_rcnn_resnet101_coco_11_06_2017

frozen_inference_graph.pb  model.ckpt.data-00000-of-00001  model.ckpt.meta
graph.pbtxt		   model.ckpt.index


In [0]:
![ ! -e faster_rcnn_resnet101_coco_11_06_2017 ] && tar -xvf faster_rcnn_resnet101_coco_11_06_2017.tar.gz
!gsutil cp faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.* gs://{YOUR_GCS_BUCKET}/data/

faster_rcnn_resnet101_coco_11_06_2017/
faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.index
faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.meta
faster_rcnn_resnet101_coco_11_06_2017/frozen_inference_graph.pb
faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.data-00000-of-00001
faster_rcnn_resnet101_coco_11_06_2017/graph.pbtxt
Copying file://faster_rcnn_resnet101_coco_11_06_2017/model.ckpt.data-00000-of-00001 [Content-Type=application/octet-stream]...
==> NOTE: You are uploading one or more large file(s), which would run
significantly faster if you enable parallel composite uploads. This
feature can be enabled by editing the
"parallel_composite_upload_threshold" value in your .boto
configuration file. However, note that if you do this large files will
be uploaded as `composite objects
<https://cloud.google.com/storage/docs/composite-objects>`_,which
means that any user who downloads such objects will need to have a
compiled crcmod installed (see "gsutil help crcmod"). This is because

## Configuring the Object Detection Pipeline (Destination - GCS)

In [12]:
![ -e {DEFAULT_HOME}/colab-god-idclass ] && git -C {DEFAULT_HOME}/colab-god-idclass pull
![ ! -e {DEFAULT_HOME}/colab-god-idclass ] && git clone --depth=1 https://github.com/hailusong/colab-god-idclass.git {DEFAULT_HOME}/colab-god-idclass

Cloning into '/content/colab-god-idclass'...
remote: Enumerating objects: 12, done.
remote: Counting objects: 100% (12/12), done.
remote: Compressing objects:  63% (7/11)

In [15]:
!ls -al {DEFAULT_HOME}/colab-god-idclass/configs/faster_rcnn_resnet101.config
!sed 's/..YOUR_GCS_BUCKET./{YOUR_GCS_BUCKET}/g' < {DEFAULT_HOME}/colab-god-idclass/configs/faster_rcnn_resnet101.config > {DEFAULT_HOME}/colab-god-idclass/configs/faster_rcnn_resnet101_processed.config
!gsutil cp {DEFAULT_HOME}/colab-god-idclass/configs/faster_rcnn_resnet101_processed.config \
           {DEFAULT_HOME}/colab-god-idclass/configs/label_map.pbtxt \
           gs://{YOUR_GCS_BUCKET}/data

-rw-r--r-- 1 root root 3673 Mar 26 01:46 /content/colab-god-idclass/configs/faster_rcnn_resnet101.config
Copying file:///content/colab-god-idclass/configs/faster_rcnn_resnet101_processed.config [Content-Type=application/octet-stream]...
Copying file:///content/colab-god-idclass/configs/label_map.pbtxt [Content-Type=application/octet-stream]...
/ [2 files][  3.6 KiB/  3.6 KiB]                                                
Operation completed over 2 objects/3.6 KiB.                                      
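The `sed` call above rewrites a bucket placeholder in the pipeline config before uploading it. A minimal Python sketch of the same substitution, assuming the placeholder appears literally as `${YOUR_GCS_BUCKET}` in the config (the `sed` pattern `..YOUR_GCS_BUCKET.` matches any two characters before the name and one after it, which is consistent with that form but does not confirm it); `fill_bucket` and the sample line are illustrative:

```python
PLACEHOLDER = '${YOUR_GCS_BUCKET}'  # assumed form of the marker in the config


def fill_bucket(config_text, bucket):
    """Replace every placeholder occurrence with the bucket name."""
    return config_text.replace(PLACEHOLDER, bucket)


sample = 'input_path: "gs://${YOUR_GCS_BUCKET}/data/train.record"'
print(fill_bucket(sample, 'id-norm'))
# input_path: "gs://id-norm/data/train.record"
```

Unlike the wildcard-based `sed` pattern, a literal replacement cannot accidentally match other text that merely contains `YOUR_GCS_BUCKET`.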


### Checking Your Google Cloud Storage Bucket

In [16]:
!gsutil ls gs://{YOUR_GCS_BUCKET}/data
!gsutil ls gs://{YOUR_GCS_BUCKET}/generated

gs://id-norm/data/faster_rcnn_resnet101_processed.config
gs://id-norm/data/label_map.pbtxt
gs://id-norm/data/model.ckpt.data-00000-of-00001
gs://id-norm/data/model.ckpt.index
gs://id-norm/data/model.ckpt.meta
gs://id-norm/data/test.record
gs://id-norm/data/train.record
CommandException: One or more URLs matched no objects.


## Prepare Our Own Data: Download, Convert and Upload (Destination - GCS)

Use the Google Cloud SDK tool gsutil to download the data file **generated.tar.gz**.<br>
Note that **generated.tar.gz** must already have been uploaded to the GCS bucket, produced as follows:<br>
* Run the BB project idaug to generate the images, bbox CSV files and key-point CSV files in the folder **generated**
* Tar and gzip the whole **generated** folder into **generated.tar.gz**
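The packing step can be done with `tar czf` on the command line; as a sketch, the same archive can be produced from Python with the standard `tarfile` module. The function name `pack_folder` is an illustrative assumption.

```python
import os
import tarfile


def pack_folder(folder, archive_path):
    """Write folder (and everything under it) into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, 'w:gz') as archive:
        # arcname keeps only the folder's own name, so the archive
        # unpacks to ./generated rather than to an absolute path.
        archive.add(folder, arcname=os.path.basename(os.path.normpath(folder)))
    return archive_path
```

For example, `pack_folder('generated', 'generated.tar.gz')` would create the archive that this notebook expects to find in the bucket.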

In [17]:
# Download the file.
!gsutil cp gs://{YOUR_GCS_BUCKET}/generated.tar.gz /tmp/generated.tar.gz
!ls /tmp/*gz

Copying gs://id-norm/generated.tar.gz...
\ [1 files][131.2 MiB/131.2 MiB]                                                
Operation completed over 1 objects/131.2 MiB.                                    
/tmp/generated.tar.gz


Prepare the data file (unzip, untar)

In [18]:
import os
os.chdir(f'{DEFAULT_HOME}')

![[ ! -f /tmp/generated.tar && -f /tmp/generated.tar.gz ]] && gunzip /tmp/generated.tar.gz
![[ ! -e ./generated && -f /tmp/generated.tar ]] && tar xf /tmp/generated.tar
!pwd
!ls {DEFAULT_HOME}/generated

/content
bbox-train-non-id1.csv	bbox-valid-on-dl.csv	pnts-valid-non-id2.csv
bbox-train-non-id2.csv	bbox-valid-on-hc.csv	pnts-valid-non-id3.csv
bbox-train-non-id3.csv	pnts-train-non-id1.csv	pnts-valid-on-dl.csv
bbox-train-on-dl.csv	pnts-train-non-id2.csv	pnts-valid-on-hc.csv
bbox-train-on-hc.csv	pnts-train-non-id3.csv	Train
bbox-valid-non-id1.csv	pnts-train-on-dl.csv	Valid
bbox-valid-non-id2.csv	pnts-train-on-hc.csv
bbox-valid-non-id3.csv	pnts-valid-non-id1.csv


In [0]:
# Copy the unpacked generated folder back to GCS
!gsutil cp -R {DEFAULT_HOME}/generated gs://{YOUR_GCS_BUCKET}

Concatenate all training CSV files, keeping only one header row and naming the first column `filename` (it has no name in the input because the BB project idaug writes it as an index column). Backslashes in the paths are converted to forward slashes.<br>
Apply the same processing to the validation data.
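The merge described above can also be written in Python, which makes each step explicit. This is a sketch of the same logic using the standard `csv` module, not the code the notebook runs; the function name `merge_bbox_csvs` is hypothetical.

```python
import csv


def merge_bbox_csvs(paths, out_path):
    """Concatenate CSV files into out_path with a single header row."""
    with open(out_path, 'w', newline='') as out:
        writer = csv.writer(out)
        for index, path in enumerate(paths):
            with open(path, newline='') as src:
                reader = csv.reader(src)
                header = next(reader)
                if index == 0:
                    # The first column is an unnamed index column in the input.
                    header[0] = header[0] or 'filename'
                    writer.writerow(header)
                for row in reader:
                    if row:
                        # Normalize Windows-style path separators.
                        writer.writerow([field.replace('\\', '/') for field in row])
```

A call such as `merge_bbox_csvs(sorted(glob.glob('generated/bbox-train-*.csv')), 'train-merged.csv')` would mirror the shell cell for the training split (with `glob` imported).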

In [20]:
!head -1 {DEFAULT_HOME}/generated/bbox-train-on-dl.csv | sed 's/^,/filename,/' > {DEFAULT_HOME}/train-merged.csv
!head -1 {DEFAULT_HOME}/generated/bbox-valid-on-dl.csv | sed 's/^,/filename,/' > {DEFAULT_HOME}/valid-merged.csv
!tail -q --lines=+2 {DEFAULT_HOME}/generated/bbox-train-*.csv | sed 's/\\/\//g' >> {DEFAULT_HOME}/train-merged.csv
!tail -q --lines=+2 {DEFAULT_HOME}/generated/bbox-valid-*.csv | sed 's/\\/\//g' >> {DEFAULT_HOME}/valid-merged.csv
!ls {DEFAULT_HOME}/generated
!head {DEFAULT_HOME}/train-merged.csv {DEFAULT_HOME}/valid-merged.csv

bbox-train-non-id1.csv	bbox-valid-on-dl.csv	pnts-valid-non-id2.csv
bbox-train-non-id2.csv	bbox-valid-on-hc.csv	pnts-valid-non-id3.csv
bbox-train-non-id3.csv	pnts-train-non-id1.csv	pnts-valid-on-dl.csv
bbox-train-on-dl.csv	pnts-train-non-id2.csv	pnts-valid-on-hc.csv
bbox-train-on-hc.csv	pnts-train-non-id3.csv	Train
bbox-valid-non-id1.csv	pnts-train-on-dl.csv	Valid
bbox-valid-non-id2.csv	pnts-train-on-hc.csv
bbox-valid-non-id3.csv	pnts-valid-non-id1.csv
==> /content/train-merged.csv <==
filename,bbox1_x1,bbox1_y1,bbox1_x2,bbox1_y2,label
generated/Train/non-id1/0.png,10,5,143,93,UNKNOWN
generated/Train/non-id1/1.png,15,0,126,74,UNKNOWN
generated/Train/non-id1/2.png,40,23,119,76,UNKNOWN
generated/Train/non-id1/3.png,20,51,246,202,UNKNOWN
generated/Train/non-id1/4.png,15,33,129,109,UNKNOWN
generated/Train/non-id1/5.png,38,43,114,94,UNKNOWN
generated/Train/non-id1/6.png,51,10,223,125,UNKNOWN
generated/Train/non-id1/7.png,38,48,198,155,UNKNOWN
generated/Train/non-id1/8.png,38,33,255,178,UNKNO

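Upload the merged CSV files to the GCS bucket. The cell below copies them sequentially; `gsutil -m cp` would run the copy in parallel, which matters only for larger batches of files.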

In [21]:
!gsutil cp {DEFAULT_HOME}/train-merged.csv {DEFAULT_HOME}/valid-merged.csv gs://{YOUR_GCS_BUCKET}

Copying file:///content/train-merged.csv [Content-Type=text/csv]...
Copying file:///content/valid-merged.csv [Content-Type=text/csv]...
/ [2 files][ 60.0 KiB/ 60.0 KiB]                                                
Operation completed over 2 objects/60.0 KiB.                                     


### Convert Our Label CSV Data to TFRecord
Source code is based on https://github.com/datitran/raccoon_dataset/blob/master/generate_tfrecord.py
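The merged CSV stores box corners in pixels, while the Object Detection API expects box coordinates in a TFRecord as fractions of the image width and height. A minimal sketch of that conversion, assuming the image size is known; it illustrates the idea only and is not the code in `generate_tfrecord.py`. The function name and the 200x100 image are made up for the example.

```python
def normalize_box(x1, y1, x2, y2, width, height):
    """Convert pixel corner coordinates to (xmin, ymin, xmax, ymax) fractions in [0, 1]."""
    return (x1 / width, y1 / height, x2 / width, y2 / height)


# A box spanning x 20..100 and y 10..50 on a hypothetical 200x100 image.
print(normalize_box(20, 10, 100, 50, width=200, height=100))
# (0.1, 0.1, 0.5, 0.5)
```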

In [22]:
%pdb

Automatic pdb calling has been turned ON


In [0]:
import os
os.chdir(f'{DEFAULT_HOME}')

!head {DEFAULT_HOME}/train-merged.csv
!mkdir -p {DEFAULT_HOME}/coversion
!git -C {DEFAULT_HOME}/colab-god-idclass pull

# Train records first
%run {DEFAULT_HOME}/colab-god-idclass/src/generate_tfrecord.py --csv_input={DEFAULT_HOME}/train-merged.csv --output_path={DEFAULT_HOME}/coversion/train.record

In [0]:
# Validation records second
!head {DEFAULT_HOME}/valid-merged.csv
%run {DEFAULT_HOME}/colab-god-idclass/src/generate_tfrecord.py --csv_input={DEFAULT_HOME}/valid-merged.csv --output_path={DEFAULT_HOME}/coversion/test.record

In [26]:
!gsutil cp {DEFAULT_HOME}/coversion/train.record {DEFAULT_HOME}/coversion/test.record gs://{YOUR_GCS_BUCKET}/data

Copying file:///content/coversion/train.record [Content-Type=application/octet-stream]...
Copying file:///content/coversion/test.record [Content-Type=application/octet-stream]...
|
Operation completed over 2 objects/131.8 MiB.                                    


## Start the Training and Evaluation Jobs on Google Cloud ML Engine

### Option 1: Start the Training Job on Google Cloud ML (Destination - Cloud ML)

#### Package the Tensorflow Object Detection code (Destination - Local but to be submitted as package to Cloud ML)
Before you can run on GCP, you must first **package the TensorFlow Object Detection API and TF Slim**.

In [0]:
import os
os.chdir(f'{DEFAULT_HOME}/models/research')

# From tensorflow/models/research/
!bash object_detection/dataset_tools/create_pycocotools_package.sh /tmp/pycocotools
!python setup.py sdist
!(cd slim && python setup.py sdist)

Three Python packages will be created:
* dist/object_detection-0.1.tar.gz
* slim/dist/slim-0.1.tar.gz
* /tmp/pycocotools/pycocotools-2.0.tar.gz
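These three archives are later passed to `gcloud` as one comma-separated `--packages` value. A small sketch that builds that value and fails early when an archive is missing, which is easier to diagnose than a failed job submission; the helper name `packages_arg` is an illustrative assumption.

```python
import os


def packages_arg(paths):
    """Join package paths for --packages, failing early if any file is missing."""
    missing = [path for path in paths if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError('missing packages: ' + ', '.join(missing))
    return ','.join(paths)
```

Called with the three paths listed above from `models/research`, it returns the string used after `--packages` in the submit command.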

In [0]:
!ls -l dist/object_detection-0.1.tar.gz
!ls -l slim/dist/slim-0.1.tar.gz
!ls -l /tmp/pycocotools/pycocotools-2.0.tar.gz

-rw-r--r-- 1 root root 439299460 Mar 23 22:11 dist/object_detection-0.1.tar.gz
-rw-r--r-- 1 root root 973963 Mar 23 22:11 slim/dist/slim-0.1.tar.gz
-rw-r--r-- 1 root root 1376450 Mar 23 22:10 /tmp/pycocotools/pycocotools-2.0.tar.gz


If you see a 'Permission denied on resource project ...' error, make sure the Cloud ML Engine API has been enabled for the project. Use this [link](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,compute_component) to check, and enable it if it is not yet enabled.

In [28]:
# From tensorflow/models/research/
!gcloud config set project {YOUR_PROJECT}

Updated property [core/project].


In [0]:
!cat object_detection/samples/cloud/cloud.yml

trainingInput:
  runtimeVersion: "1.12"
  scaleTier: CUSTOM
  masterType: standard_gpu
  workerCount: 5
  workerType: standard_gpu
  parameterServerCount: 3
  parameterServerType: standard





#### Start the Training Job on Google Cloud ML (Destination - Cloud ML)

In [0]:
!gcloud ml-engine jobs submit training `whoami`_object_detection_ids_`date +%m_%d_%Y_%H_%M_%S` \
    --runtime-version {TF_RT_VERSION} \
    --job-dir=gs://{YOUR_GCS_BUCKET}/model_dir \
    --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz,/tmp/pycocotools/pycocotools-2.0.tar.gz \
    --module-name object_detection.model_main \
    --region us-central1 \
    --config object_detection/samples/cloud/cloud.yml \
    --python-version {PYTHON_VERSION} \
    -- \
    --model_dir=gs://{YOUR_GCS_BUCKET}/model_dir \
    --pipeline_config_path=gs://{YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_processed.config

Job [root_object_detection_ids_03_23_2019_22_25_20] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ml-engine jobs describe root_object_detection_ids_03_23_2019_22_25_20

or continue streaming the logs with the command

  $ gcloud ml-engine jobs stream-logs root_object_detection_ids_03_23_2019_22_25_20
jobId: root_object_detection_ids_03_23_2019_22_25_20
state: QUEUED
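The job name in the submit command is assembled from the shell's `whoami` and a `date` format string, which keeps each submission unique. A sketch of the same naming scheme in Python, useful if the job is submitted from a script; the function name `job_name` is an illustrative assumption.

```python
from datetime import datetime


def job_name(user, when):
    """Build a job name like <user>_object_detection_ids_<MM_DD_YYYY_HH_MM_SS>."""
    return f'{user}_object_detection_ids_{when:%m_%d_%Y_%H_%M_%S}'


print(job_name('root', datetime(2019, 3, 23, 22, 25, 20)))
# root_object_detection_ids_03_23_2019_22_25_20
```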


### Option 2: Start the Training Job on CoLab

**Clean up the tf.app.flags used by the Object Detection API** (so that we don't need to restart the runtime if we want to %run the Google Object Detection API again in the same session).

In [56]:
def del_all_flags(FLAGS, excls):
    """Delete every defined flag except those named in excls."""
    # Copy the flag names first so the dict is not modified while iterating.
    for key in list(FLAGS._flags()):
        if key in excls:
            print(f'SKIPPING exclusion attribute {key}')
            continue

        print(f'removing attribute {key}')
        FLAGS.__delattr__(key)


# When running inside an IPython notebook, the Python session is kept alive
# across cells, and so are the tf.app.flags. Running the app with %run more
# than once then fails with a "flag defined twice" error. The workaround is
# to always clean up the flags before they are defined again.
flags = tf.app.flags
del_all_flags(flags.FLAGS, ['logtostderr'])

# flags.DEFINE_string('logtostderr', '', '')


SKIPPING exclusion attribute logtostderr


MAKE SURE YOU SET RUNTIME TYPE TO **GPU or TPU**

In [0]:
import os
os.chdir(f'{DEFAULT_HOME}/models/research')

import sys
sys.path.append(f'{DEFAULT_HOME}/models/research')
sys.path.append(f'{DEFAULT_HOME}/models/research/slim')

# Start the training
%run object_detection/model_main.py \
     --logtostderr \
     --model_dir=gs://{YOUR_GCS_BUCKET}/model_dir \
     --pipeline_config_path=gs://{YOUR_GCS_BUCKET}/data/faster_rcnn_resnet101_processed.config

W0326 02:30:38.783686 139713059559296 model_lib.py:598] Forced number of epochs for all eval validations to be 1.
W0326 02:30:38.785445 139713059559296 model_lib.py:614] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0326 02:30:38.788498 139713059559296 estimator.py:1924] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f11026b8730>) includes params argument, but params are not passed to Estimator.
W0326 02:30:39.114752 139713059559296 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
W0326 02:30:39.843899 139713059559296 dataset_builder.py:66] num_readers has been reduced to 1 to match input file shards.
W0326 02:30:39.854527 

### Option 1 - Monitor the Cloud ML Training Job in Tensorboard
To monitor in the ML Engine dashboard, click on [this link](https://console.cloud.google.com/mlengine/jobs).<br>

### Option 2 - Monitor CoLab Training using Tensorboard running on Colab
**YOU CANNOT BOTH TRAIN AND MONITOR ON COLAB AT THE SAME TIME. ONE SESSION WILL BE STOPPED.**<br>
You will need to install ngrok to tunnel to TensorBoard.

In [0]:
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip ngrok-stable-linux-amd64.zip

--2019-03-23 20:19:20--  https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
Resolving bin.equinox.io (bin.equinox.io)... 52.45.248.161, 52.22.145.207, 34.226.180.131, ...
Connecting to bin.equinox.io (bin.equinox.io)|52.45.248.161|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13584026 (13M) [application/octet-stream]
Saving to: ‘ngrok-stable-linux-amd64.zip’


2019-03-23 20:19:21 (32.9 MB/s) - ‘ngrok-stable-linux-amd64.zip’ saved [13584026/13584026]

Archive:  ngrok-stable-linux-amd64.zip
  inflating: ngrok                   


In [0]:
get_ipython().system_raw('./ngrok http 6006 &')
!curl -s http://localhost:4040/api/tunnels | python3 -c \
   "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

https://c2e3a421.ngrok.io
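The one-liner above prints the first tunnel's `public_url` from ngrok's local API. A slightly more defensive sketch parses the same JSON, prefers an `https` tunnel when several are listed and returns `None` when none exist yet. The payload shape (a `tunnels` list whose entries carry `proto` and `public_url`) is assumed from the fields the cell reads plus ngrok's usual response; `pick_public_url` and the sample payload are made up for illustration.

```python
import json


def pick_public_url(tunnels_json, prefer_proto='https'):
    """Return the public URL of the preferred tunnel, or None if there is none."""
    tunnels = json.loads(tunnels_json).get('tunnels', [])
    if not tunnels:
        return None
    for tunnel in tunnels:
        if tunnel.get('proto') == prefer_proto:
            return tunnel['public_url']
    # Fall back to the first tunnel, as the shell one-liner does.
    return tunnels[0]['public_url']


# Made-up payload in the assumed shape.
sample_payload = json.dumps({'tunnels': [
    {'proto': 'http', 'public_url': 'http://example.ngrok.io'},
    {'proto': 'https', 'public_url': 'https://example.ngrok.io'},
]})
print(pick_public_url(sample_payload))
# https://example.ngrok.io
```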


Start a local TensorBoard that reads its log data from your GCS bucket, then **click on the link above**.

First, we need to do the authorization

In [0]:
# This command needs to be run once to allow your local machine to access your
# GCS bucket.
!gcloud auth application-default login


The environment variable [GOOGLE_APPLICATION_CREDENTIALS] is set to:
  [/content/adc.json]
Credentials will still be generated to the default location:
  [/content/.config/application_default_credentials.json]
To use these credentials, unset this environment variable before
running your application.

Do you want to continue (Y/n)?  

Go to the following link in your browser:

    https://accounts.google.com/o/oauth2/auth?redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&prompt=select_account&response_type=code&client_id=764086051850-6qr4p6gpi6hn506pt8ejuq83di341hur.apps.googleusercontent.com&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&access_type=offline


Enter verification code: 4/EwF9jcyTzUNaKSsCYeYC004t59soSUGh-N_tXuenfdWAr91XFFlUIEQ

Credentials saved to file: [/content/.config/application_default_credentials.json]

These credentials will be used by any library that requests
Application Default Credentials

Now it is time to start **TensorBoard**.

In [0]:
!tensorboard --logdir=gs://{YOUR_GCS_BUCKET}/model_dir

TensorBoard 1.13.1 at http://66f6bb7f3a12:6006 (Press CTRL+C to quit)
W0323 22:35:54.594110 139803632195328 plugin_event_accumulator.py:294] Found more than one graph event per run, or there was a metagraph containing a graph_def, as well as one or more graph events.  Overwriting the graph with the newest event.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
W0323 22:36:30.326428 139803600250624 deprecation.py:323] From /usr/local/lib/python2.7/dist-packages/tensorboard/plugins/projector/projector_plugin.py:410: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-03-23 23:02:30.570712: E tensorflow/core/platform/cloud/curl_http_request.cc:596] The transmission  of request 0x561d07d6f9e0 (URI: https://storage.googleapis.com/id-norm/model_dir%2Fevents.out.tfevents.1553380551.cmle-tr

### Option 3 - Monitor CoLab Training Job on Google Cloud Shell
* Log in to Google Cloud and open Cloud Shell
* In the shell, run:
```
export YOUR_GCS_BUCKET='id-norm'
tensorboard --logdir=gs://$YOUR_GCS_BUCKET/model_dir
```
