Based on:

https://www.dlology.com/blog/how-to-train-an-object-detection-model-easy-for-free/

https://towardsdatascience.com/deeppicar-part-6-963334b2abe0


Pretrained model from the TensorFlow detection model zoo:

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

# Section 0: Actions done before running this notebook
1- Took around 300 photos of cars with labels

2- Ran a script to scale image dimensions to 800x600 and split the images into 80% train, 10% test, 10% validation

3- Use [labelImg](https://github.com/tzutalin/labelImg) for annotating "plate no" and "bus run" objects on each of train & test images
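
The 80/10/10 split in step 2 can be sketched as follows. The preprocessing script itself is not part of this notebook, so the function name, the fixed seed, and the file names are assumptions for illustration only.

```python
import random

def split_dataset(paths, train_frac=0.8, test_frac=0.1, seed=42):
    """Shuffle paths reproducibly and split them into train/test/validation lists."""
    shuffled = sorted(paths)                 # sort first so input order does not matter
    random.Random(seed).shuffle(shuffled)    # fixed seed (an assumption) keeps the split repeatable
    n_train = int(len(shuffled) * train_frac)
    n_test = int(len(shuffled) * test_frac)
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    val = shuffled[n_train + n_test:]        # whatever remains goes to validation
    return train, test, val

# Hypothetical file names standing in for the ~300 photos.
photos = ['car_%03d.jpg' % i for i in range(300)]
train, test, val = split_dataset(photos)
print(len(train), len(test), len(val))
```

Splitting before annotation means only the train and test folders need bounding boxes, which matches step 3.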

In [None]:
# These versions are required to work with the scripts available in https://github.com/tensorflow/models/tree/master/research/object_detection
# for transfer learning
!pip install numpy==1.17.4
%tensorflow_version 1.x
!pip install --user gast==0.2.2

# Section 1: Mount Google drive
Mount Google Drive and save the modeling output files there, so that they are not wiped out when the Colab virtual machine restarts.

In [None]:
import os
from google.colab import drive
drive.mount('/content/gdrive')
model_dir = '/content/gdrive/My Drive/UniMelb/Semester1_2020/Internship/SilverPond/FinalSource'
#!rm -rf '{model_dir}'
os.makedirs(model_dir, exist_ok=True)
!ls -ltra '{model_dir}'/..

# Section 2: Configs and Hyperparameters

This notebook supports a variety of models. You can find more pretrained models in the [Tensorflow detection model zoo: COCO-trained models](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models), and their pipeline config files in [object_detection/samples/configs/](https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs).

In [None]:
# Number of training steps
num_steps = 5000  # 200000

# Number of evaluation steps
num_eval_steps = 50


# model name and configs are from Model Zoo github: 
# https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md#coco-trained-models
# https://github.com/tensorflow/models/tree/master/research/object_detection/samples/configs
MODELS_CONFIG = {
    'ssdlite_mobilenet_v2': {
        'model_name': 'ssdlite_mobilenet_v2_coco_2018_05_09',
        'pipeline_file': 'ssdlite_mobilenet_v2_coco.config',
        'batch_size': 12
    },
    'ssd_mobilenet_v2_quantized': {
        'model_name': 'ssd_mobilenet_v2_quantized_300x300_coco_2019_01_03',
        'pipeline_file': 'ssd_mobilenet_v2_quantized_300x300_coco.config',
        'batch_size': 12
    },
    #http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v3_small_coco_2020_01_14.tar.gz
    'ssd_mobilenet_v3_small_coco': {
        'model_name': 'ssd_mobilenet_v3_small_coco_2020_01_14',
        'pipeline_file': 'ssdlite_mobilenet_v3_small_320x320_coco.config',
        'batch_size': 12
    },
    #https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssdlite_mobilenet_v3_large_320x320_coco.config
    #http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v3_large_coco_2020_01_14.tar.gz
    'ssd_mobilenet_v3_large_coco': {
        'model_name': 'ssd_mobilenet_v3_large_coco_2020_01_14',
        'pipeline_file': 'ssdlite_mobilenet_v3_large_320x320_coco.config',
        'batch_size': 12
    },
    #https://storage.cloud.google.com/mobilenet_edgetpu/checkpoints/ssdlite_mobilenet_edgetpu_coco_quant.tar.gz
    'ssdlite_mobilenet_edgetpu_coco_quant': {
        'model_name': 'ssdlite_mobilenet_edgetpu_coco_quant',
        'pipeline_file': 'ssdlite_mobilenet_edgetpu_320x320_coco_quant.config',
        'batch_size': 12
    },
    'faster_rcnn_inception_v2': {
        'model_name': 'faster_rcnn_inception_v2_coco_2018_01_28',
        'pipeline_file': 'faster_rcnn_inception_v2_pets.config',
        'batch_size': 12
    },
    'rfcn_resnet101': {
        'model_name': 'rfcn_resnet101_coco_2018_01_28',
        'pipeline_file': 'rfcn_resnet101_pets.config',
        'batch_size': 12
    }
}

# Select a model in MODELS_CONFIG
# Note: Must be a quantized model, which reduces the model size significantly for mobile/edge use
selected_model = 'ssd_mobilenet_v2_quantized'
#selected_model = 'ssd_mobilenet_v3_large_coco'  # used 15K training steps to get better results
#selected_model = 'ssd_mobilenet_v3_small_coco'  # didn't give good results
#selected_model = 'ssdlite_mobilenet_v2'  # false positives for plate numbers appeared in almost every image

# Name of the object detection model to use.
MODEL = MODELS_CONFIG[selected_model]['model_name']

# Name of the pipeline file in tensorflow object detection API.
pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']

# Training batch size that fits in Colab's Tesla K80 GPU memory for the selected model.
batch_size = MODELS_CONFIG[selected_model]['batch_size']
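
To get a feel for what `num_steps` and `batch_size` mean together, the number of passes over the training set can be estimated. This is a rough sketch; the figure of 240 training images is an assumption derived from 80% of roughly 300 photos.

```python
def approx_epochs(num_steps, batch_size, num_train_images):
    """Each training step consumes one batch, so epochs = steps * batch size / dataset size."""
    return num_steps * batch_size / num_train_images

# Assumption: 80% of ~300 photos gives roughly 240 training images.
epochs = approx_epochs(num_steps=5000, batch_size=12, num_train_images=240)
print(epochs)
```

With these numbers the model sees each training image about 250 times, which is worth keeping in mind when watching for overfitting in TensorBoard.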

# Section 3: Set up Training Environment

In [None]:
# Where the new trained model will be saved
from datetime import datetime
curdate = datetime.now().strftime("%Y_%m_%d")

trainedmodel_dir = os.path.join(model_dir, 'trainedmodel', selected_model, curdate)
os.makedirs(trainedmodel_dir, exist_ok = True)

print(trainedmodel_dir)

## Install required packages

In [None]:
%cd '{model_dir}'
#!git clone --quiet https://github.com/tensorflow/models.git
print('installing protobuf')
!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk
print('installing Cython')
!pip install -q Cython contextlib2 pillow lxml matplotlib
print('installing pycocotools')
!pip install -q pycocotools
print('run protoc')
%cd '{model_dir}/models/research'
!protoc object_detection/protos/*.proto --python_out=.
print('setting environment variable')
import os
# Append the Object Detection API folders to PYTHONPATH; .get() avoids a KeyError when the variable is unset
research_dir = os.path.join(model_dir, 'models/research')
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':' + research_dir + '/:' + research_dir + '/slim/'

print('run model_builder_test')
# To verify all dependencies are successfully installed
#!python object_detection/builders/model_builder_test.py

## Prepare `tfrecord` files

Use the following scripts to generate the `tfrecord` files, which are used for model training and evaluation.

In [None]:
%cd {model_dir}
print("train_cars xml_to_csv")
# Convert train folder annotation xml files to a single csv file,
# and generate the `label_map.pbtxt` file in the `Data/annotations` directory as well.
!python code/xml_to_csv.py -i Data/train_cars -o Data/annotations/train_labels.csv -l Data/annotations
print("test_cars xml_to_csv")
# Convert test folder annotation xml files to a single csv.
!python code/xml_to_csv.py -i Data/test_cars -o Data/annotations/test_labels.csv

print("train_labels generate_tfrecord")
# Generate `train.record`
!python code/generate_tfrecord.py --csv_input=Data/annotations/train_labels.csv --output_path=Data/annotations/train.record --img_path=Data/train_cars --label_map Data/annotations/label_map.pbtxt
print("test_labels generate_tfrecord")
# Generate `test.record`
!python code/generate_tfrecord.py --csv_input=Data/annotations/test_labels.csv --output_path=Data/annotations/test.record --img_path=Data/test_cars --label_map Data/annotations/label_map.pbtxt
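
The `code/xml_to_csv.py` script is not shown in this notebook. As a rough sketch of the kind of conversion it performs, the snippet below flattens one Pascal VOC annotation (the XML format labelImg writes) into CSV-style rows. The function name and the sample annotation are hypothetical.

```python
import xml.etree.ElementTree as ET

def voc_xml_to_rows(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) tuple per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    rows = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        rows.append((filename, width, height, obj.findtext('name'),
                     int(box.findtext('xmin')), int(box.findtext('ymin')),
                     int(box.findtext('xmax')), int(box.findtext('ymax'))))
    return rows

# Hypothetical annotation for a single image with one labelled object.
SAMPLE_XML = """<annotation>
  <filename>car_001.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>plate no</name>
    <bndbox><xmin>310</xmin><ymin>420</ymin><xmax>480</xmax><ymax>470</ymax></bndbox>
  </object>
</annotation>"""

rows = voc_xml_to_rows(SAMPLE_XML)
print(rows)
```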

In [None]:
test_record_fname = model_dir + '/Data/annotations/test.record'
train_record_fname = model_dir + '/Data/annotations/train.record'
label_map_pbtxt_fname = model_dir + '/Data/annotations/label_map.pbtxt'

In [None]:
!cat Data/annotations/test_labels.csv
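
For reference, a label map is a small text file with one `item` block per class. The sketch below shows how such a file could be produced for the two classes annotated in Section 0; the helper name and the class order are assumptions, and the real file is written by `xml_to_csv.py`.

```python
def build_label_map(class_names):
    """Return label map text; ids start at 1 because id 0 is reserved for the background class."""
    items = []
    for idx, name in enumerate(class_names, start=1):
        items.append("item {\n  id: %d\n  name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)

# Class order here is an assumption for illustration.
label_map_text = build_label_map(['plate no', 'bus run'])
print(label_map_text)
```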

## Download base model

In [None]:
%cd '{model_dir}/models/research'

In [None]:
import os
import shutil
import glob
import urllib.request
import tarfile
MODEL_FILE = MODEL + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
DEST_DIR = model_dir+'/models/research/pretrained_model'

# commented if already done
#'''
print('MODEL_FILE=', MODEL_FILE)
if not (os.path.exists(MODEL_FILE)):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

print('extracting model file')
tar = tarfile.open(MODEL_FILE)
tar.extractall()
tar.close()

os.remove(MODEL_FILE)
if (os.path.exists(DEST_DIR)):
    shutil.rmtree(DEST_DIR)

os.rename(MODEL, DEST_DIR)
#'''

In [None]:
!pwd

In [None]:
!echo '{DEST_DIR}'
!ls -alh '{DEST_DIR}'

In [None]:
fine_tune_checkpoint = os.path.join(DEST_DIR, "model.ckpt")
fine_tune_checkpoint
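
Note that `model.ckpt` is a checkpoint prefix, not a file: TensorFlow 1.x stores a checkpoint as several files sharing that prefix (`.index`, `.meta`, and one or more `.data-*` shards). A minimal sketch of a sanity check, using a hypothetical helper name and empty stand-in files in a temporary directory:

```python
import glob
import os
import tempfile

def checkpoint_prefix_exists(prefix):
    """A checkpoint prefix is usable when its .index file and at least one .data shard exist."""
    has_index = os.path.isfile(prefix + '.index')
    # glob.escape protects special characters that may appear in the directory name
    has_data = len(glob.glob(glob.escape(prefix) + '.data-*')) > 0
    return has_index and has_data

# Demonstration with empty stand-in files; real checkpoints come from the extracted archive.
with tempfile.TemporaryDirectory() as tmp:
    prefix = os.path.join(tmp, 'model.ckpt')
    before = checkpoint_prefix_exists(prefix)
    for suffix in ('.index', '.meta', '.data-00000-of-00001'):
        open(prefix + suffix, 'w').close()
    after = checkpoint_prefix_exists(prefix)

print(before, after)
```

Calling such a check on `fine_tune_checkpoint` before training gives a clearer error than a failure deep inside `model_main.py`.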

# Section 4: Transfer Learning Training

## Configuring a Training Pipeline

In [None]:
import os
pipeline_fname = os.path.join(model_dir+'/models/research/object_detection/samples/configs/', pipeline_file)
print(pipeline_fname)
assert os.path.isfile(pipeline_fname), '`{}` does not exist'.format(pipeline_fname)

In [None]:
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())

In [None]:
# The model's .config file is a protocol buffer text file, so it can be edited via google.protobuf
import sys
import tensorflow as tf
from google.protobuf import text_format
from object_detection.protos import pipeline_pb2

pipeline_config = pipeline_pb2.TrainEvalPipelineConfig()

# read pipeline config file
with tf.gfile.GFile(pipeline_fname, "r") as f:
  proto_str = f.read()
  text_format.Merge(proto_str, pipeline_config)

# pipeline_config will have all config parameters that can be overwritten
pipeline_config.train_input_reader.tf_record_input_reader.input_path[0] = train_record_fname
pipeline_config.eval_input_reader[0].tf_record_input_reader.input_path[0] = test_record_fname

pipeline_config.train_input_reader.label_map_path = label_map_pbtxt_fname
pipeline_config.eval_input_reader[0].label_map_path = label_map_pbtxt_fname

if pipeline_config.train_config.fine_tune_checkpoint:
  pipeline_config.train_config.fine_tune_checkpoint = fine_tune_checkpoint
pipeline_config.train_config.batch_size = batch_size
pipeline_config.train_config.num_steps = num_steps

# Depending on the base model type, the model is currently either ssd or faster_rcnn.
# HasField reports which member of the `model` oneof is actually set in the config.
num_classes = get_num_classes(label_map_pbtxt_fname)
if pipeline_config.model.HasField('ssd'):
  pipeline_config.model.ssd.num_classes = num_classes
elif pipeline_config.model.HasField('faster_rcnn'):
  pipeline_config.model.faster_rcnn.num_classes = num_classes

# Save updated config
config_text = text_format.MessageToString(pipeline_config)
with tf.gfile.Open(pipeline_fname, "wb") as f:
    f.write(config_text)


In [None]:
!cat '{label_map_pbtxt_fname}'

In [None]:
# look for num_classes: 2, since we annotated two object types ("plate no" and "bus run")
!cat '{pipeline_fname}'
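
Rather than reading the whole config by eye, the overwritten values can be pulled out with a few regular expressions. This is only a sketch: the helper name and the sample text are hypothetical, and it reports the first occurrence of each key, which may not be the one you care about if a key appears in several sections.

```python
import re

def summarize_config(config_text):
    """Extract a few integer fields from pipeline config text for a quick sanity check."""
    summary = {}
    for key in ('num_classes', 'batch_size', 'num_steps'):
        match = re.search(r'\b%s:\s*(\d+)' % key, config_text)
        summary[key] = int(match.group(1)) if match else None
    return summary

# Hypothetical, heavily shortened config text.
SAMPLE_CONFIG = """
model { ssd { num_classes: 2 } }
train_config { batch_size: 12 num_steps: 5000 }
"""

summary = summarize_config(SAMPLE_CONFIG)
print(summary)
```

In the notebook, the same function could be applied to the text read from `pipeline_fname`.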

## Run Tensorboard

In [None]:
!wget https://bin.equinox.io/c/4VmDzA7iaHb/ngrok-stable-linux-amd64.zip
!unzip -o ngrok-stable-linux-amd64.zip

In [None]:
!pwd

In [None]:
LOG_DIR = model_dir#+'Rerun'
get_ipython().system_raw(
    'tensorboard --logdir "{}" --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)

In [None]:
LOG_DIR

In [None]:
get_ipython().system_raw('./ngrok http 6006 &')

### Get Tensorboard link

In [None]:
! curl -s http://localhost:4040/api/tunnels | python3 -c \
    "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"

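The one-liner above prints the first tunnel that ngrok reports. A slightly more defensive sketch of the same parsing is shown below; it prefers an https URL when one is listed and fails with a readable message when no tunnel is up yet. The helper name and the sample payload are assumptions, and only the `tunnels` and `public_url` keys used by the original command are relied on.

```python
import json

def pick_public_url(tunnels_json):
    """Return an https public URL if ngrok lists one, otherwise the first URL available."""
    tunnels = json.loads(tunnels_json).get('tunnels', [])
    urls = [tunnel['public_url'] for tunnel in tunnels]
    if not urls:
        raise RuntimeError('ngrok reports no tunnels yet; wait a moment and retry')
    https_urls = [url for url in urls if url.startswith('https://')]
    return (https_urls or urls)[0]

# Hypothetical payload shaped like the response of http://localhost:4040/api/tunnels
SAMPLE_PAYLOAD = json.dumps({'tunnels': [
    {'public_url': 'http://abc123.ngrok.io'},
    {'public_url': 'https://abc123.ngrok.io'},
]})

tensorboard_url = pick_public_url(SAMPLE_PAYLOAD)
print(tensorboard_url)
```
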
## Train the model

Now that all inputs are set up, train the model. This process may take a few hours. Because the training results (model.ckpt-* files) are saved to Google Drive (persistent storage that survives a restart of the Colab VM instance), we can safely leave and return a few hours later.

In [None]:
#################### SEND ALERT EMAIL AT FINISH WITH GMAIL OPTIONAL #####################
# To send email from Python from your google account, MUST 
# 1) Enable less secure app
# https://myaccount.google.com/lesssecureapps
# 2) Disable Unlock Captcha
# https://accounts.google.com/b/0/DisplayUnlockCaptcha

import smtplib

def SendEmail(msg):
    with open('/content/gdrive/My Drive/Colab Notebooks/pw.txt') as file:
        data = file.readlines()
        
    gmail_user = 'gehmaid@student.unimelb.edu.au'  
    gmail_password = data[0].strip()  # strip the trailing newline read from the file


    sent_from = gmail_user  
    to = ['gehmaid@student.unimelb.edu.au']  
    subject = msg  
    body = '%s\n\n- Ghawady' % msg

    email_text = \
"""From: %s
To: %s
Subject: %s

%s
""" % (sent_from, ", ".join(to), subject, body)

    server = smtplib.SMTP("smtp.gmail.com", 587)
    server.ehlo()
    server.starttls()
    server.login(gmail_user, gmail_password)
    server.sendmail(sent_from, to, email_text)
    server.quit()

    print(f'Email: \n{email_text}')
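
As an alternative to assembling the message by hand, the standard library's `email.message.EmailMessage` can build the headers and body. The sketch below only constructs the message and sends nothing; the helper name, the placeholder addresses, and the signature text are assumptions.

```python
from email.message import EmailMessage

def build_alert_email(sender, recipients, subject, signature='Colab'):
    """Build a plain-text alert whose body repeats the subject followed by a signature line."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = ', '.join(recipients)
    msg['Subject'] = subject
    msg.set_content('%s\n\n- %s' % (subject, signature))
    return msg

# Placeholder addresses for illustration only.
alert = build_alert_email('me@example.com', ['me@example.com'], 'Colab train finished')
print(alert)
```

A message built this way can be handed to `smtplib.SMTP.send_message` after `starttls()` and `login()`, in place of the `sendmail` call.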
    

In [None]:
num_steps = 5000
SendEmail("Colab train started")
!python '{model_dir}'/models/research/object_detection/model_main.py \
    --pipeline_config_path='{pipeline_fname}' \
    --model_dir='{trainedmodel_dir}' \
    --alsologtostderr \
    --num_train_steps='{num_steps}' \
    --num_eval_steps='{num_eval_steps}'
SendEmail("Colab train finished")

In [None]:
!ls -ltra '{trainedmodel_dir}'

# Section 5: Save and Convert Model Output

## Exporting a Trained Inference Graph
Once your training job is complete, you need to extract the newly trained inference graph, which will be later used to perform the object detection. This can be done as follows:

In [None]:
import os
import re
import numpy as np

output_directory = trainedmodel_dir + '/fine_tuned_model'
os.makedirs(output_directory, exist_ok=True)
!ls '{output_directory}'

In [None]:
lst = os.listdir(trainedmodel_dir)
# find the last model checkpoint file, e.g. model.ckpt-1000.meta
lst = [name for name in lst if 'model.ckpt-' in name and '.meta' in name]
steps = np.array([int(re.findall(r'\d+', name)[0]) for name in lst])
last_model = lst[steps.argmax()].replace('.meta', '')

last_model_path = os.path.join(trainedmodel_dir, last_model)
print(last_model_path)
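
The same lookup can be written without numpy and with a stricter pattern, so that unrelated files containing digits are ignored. This is a sketch with a hypothetical helper name and made-up file names.

```python
import re

# Matches names such as model.ckpt-5000.meta and captures the step number.
CKPT_META = re.compile(r'^model\.ckpt-(\d+)\.meta$')

def latest_checkpoint_prefix(filenames):
    """Return the checkpoint prefix with the highest step, or None if there is none."""
    best_step, best_prefix = -1, None
    for name in filenames:
        match = CKPT_META.match(name)
        if match and int(match.group(1)) > best_step:
            best_step = int(match.group(1))
            best_prefix = name[:-len('.meta')]
    return best_prefix

# Made-up directory listing for illustration.
files = ['checkpoint', 'model.ckpt-0.meta', 'model.ckpt-998.meta',
         'model.ckpt-5000.meta', 'model.ckpt-5000.index', 'graph.pbtxt']
latest = latest_checkpoint_prefix(files)
print(latest)
```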

In [None]:
!echo creates the frozen inference graph in fine_tuned_model
# there is an "Incomplete shape" message.  but we can safely ignore that. 

!python '{model_dir}'/models/research/object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path='{pipeline_fname}' \
    --output_directory='{output_directory}' \
    --trained_checkpoint_prefix='{last_model_path}'

In [None]:
# https://medium.com/tensorflow/training-and-serving-a-realtime-mobile-object-detector-in-30-minutes-with-cloud-tpus-b78971cf1193
# create the tensorflow lite graph
!python '{model_dir}'/models/research/object_detection/export_tflite_ssd_graph.py \
    --pipeline_config_path='{pipeline_fname}' \
    --trained_checkpoint_prefix='{last_model_path}' \
    --output_directory='{output_directory}' \
    --add_postprocessing_op=true

In [None]:
!echo "CONVERTING frozen graph to quantized TF Lite file..."
!tflite_convert \
  --output_file='{output_directory}/detect.tflite' \
  --graph_def_file='{output_directory}/tflite_graph.pb' \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays='normalized_input_image_tensor' \
  --output_arrays='TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3' \
  --mean_values=128 \
  --std_dev_values=128 \
  --input_shapes=1,300,300,3 \
  --change_concat_input_ranges=false \
  --allow_nudging_weights_to_use_fast_gemm_kernel=true \
  --allow_custom_ops
  #--output_arrays represent four arrays: detection_boxes, detection_classes, detection_scores, and num_detections.
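
The `--mean_values=128` and `--std_dev_values=128` flags describe how the quantized uint8 input maps to real values: real = (quantized - mean) / std_dev. A small worked example of that arithmetic, as a sketch only (the function name is hypothetical):

```python
def dequantize(quantized, mean=128, std_dev=128):
    """Map a uint8 input value to the real value the model sees."""
    return (quantized - mean) / std_dev

low, mid, high = dequantize(0), dequantize(128), dequantize(255)
print(low, mid, high)
```

So pixel values 0..255 are interpreted as roughly the range [-1, 1), which is why raw uint8 images can be fed to the quantized model directly.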

In [None]:
'''
!echo "CONVERTING frozen graph to quantized TF Lite file..."
!tflite_convert \
  --output_file='{output_directory}/bus_labels_quantized.tflite' \
  --graph_def_file='{output_directory}/tflite_graph.pb' \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays='normalized_input_image_tensor' \
  --output_arrays='TFLite_Detection_PostProcess,TFLite_Detection_PostProcess:1,TFLite_Detection_PostProcess:2,TFLite_Detection_PostProcess:3' \
  --mean_values=128 \
  --std_dev_values=128 \
  --input_shapes=1,300,300,3 \
  --change_concat_input_ranges=false \
  --allow_nudging_weights_to_use_fast_gemm_kernel=true \
  --allow_custom_ops
'''

In [None]:
print(output_directory)
!ls -ltra '{output_directory}'
pb_fname = os.path.join(os.path.abspath(output_directory), "frozen_inference_graph.pb")  # frozen TensorFlow graph (the TFLite graph is tflite_graph.pb)
!cp '{label_map_pbtxt_fname}' '{output_directory}'

# Section 6: Run inference test
Test with the held-out validation images in the `Data/val_cars` directory.

### Reinitialize required variables in case this notebook is not run from the beginning

In [None]:
%matplotlib inline
%tensorflow_version 1.x
!pip install numpy==1.17.4
!pip install --user gast==0.2.2
#!pip install imutils

In [None]:
'''
from google.colab import drive
drive.mount('/content/gdrive')
model_dir = '/content/gdrive/My Drive/UniMelb/Semester1_2020/Internship/SilverPond/FinalSource'

test_record_fname = model_dir + '/Data/annotations/test.record'
train_record_fname = model_dir + '/Data/annotations/train.record'
label_map_pbtxt_fname = model_dir + '/Data/annotations/label_map.pbtxt'
num_classes = 2

selected_model = 'ssd_mobilenet_v3_small_coco' #'ssd_mobilenet_v2_quantized'
curdate = '2020_04_29'
trainedmodel_dir = os.path.join(model_dir, 'trainedmodel', selected_model, curdate)
os.makedirs(trainedmodel_dir, exist_ok = True)

'''

### Load model & test images for inference

In [None]:
import os
import glob

output_directory = trainedmodel_dir+'/fine_tuned_model'
pb_fname = os.path.join(os.path.abspath(output_directory), "frozen_inference_graph.pb")  # frozen TensorFlow graph (the TFLite graph is tflite_graph.pb)

# Path to frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = pb_fname
print(PATH_TO_CKPT)

# Label map file: the strings used to add the correct label to each box.
PATH_TO_LABELS = label_map_pbtxt_fname

# If you want to test the code with your images, just add images files to the PATH_TO_TEST_IMAGES_DIR.
PATH_TO_TEST_IMAGES_DIR =  os.path.join(model_dir, "Data/val_cars")
print(PATH_TO_TEST_IMAGES_DIR)

assert os.path.isfile(pb_fname)
assert os.path.isfile(PATH_TO_LABELS)
TEST_IMAGE_PATHS = glob.glob(os.path.join(PATH_TO_TEST_IMAGES_DIR, "*.jpg"))
assert len(TEST_IMAGE_PATHS) > 0, 'No image found in `{}`.'.format(PATH_TO_TEST_IMAGES_DIR)
print(TEST_IMAGE_PATHS)

### Inference

In [None]:
%cd '{model_dir}'/models/research/object_detection

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops


# This is needed to display the images.
%matplotlib inline


from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util


detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')


label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)


def run_inference_for_single_image(image, graph):
    with graph.as_default():
        with tf.Session() as sess:
            # Get handles to input and output tensors
            ops = tf.get_default_graph().get_operations()
            all_tensor_names = {
                output.name for op in ops for output in op.outputs}
            tensor_dict = {}
            for key in [
                'num_detections', 'detection_boxes', 'detection_scores',
                'detection_classes', 'detection_masks'
            ]:
                tensor_name = key + ':0'
                if tensor_name in all_tensor_names:
                    tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
                        tensor_name)
            if 'detection_masks' in tensor_dict:
                # The following processing is only for single image
                detection_boxes = tf.squeeze(
                    tensor_dict['detection_boxes'], [0])
                detection_masks = tf.squeeze(
                    tensor_dict['detection_masks'], [0])
                # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
                real_num_detection = tf.cast(
                    tensor_dict['num_detections'][0], tf.int32)
                detection_boxes = tf.slice(detection_boxes, [0, 0], [
                                           real_num_detection, -1])
                detection_masks = tf.slice(detection_masks, [0, 0, 0], [
                                           real_num_detection, -1, -1])
                detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
                    detection_masks, detection_boxes, image.shape[0], image.shape[1])
                detection_masks_reframed = tf.cast(
                    tf.greater(detection_masks_reframed, 0.5), tf.uint8)
                # Follow the convention by adding back the batch dimension
                tensor_dict['detection_masks'] = tf.expand_dims(
                    detection_masks_reframed, 0)
            image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

            # Run inference
            output_dict = sess.run(tensor_dict,
                                   feed_dict={image_tensor: np.expand_dims(image, 0)})
            #print('output_dict',output_dict)
            # all outputs are float32 numpy arrays, so convert types as appropriate
            output_dict['num_detections'] = int(
                output_dict['num_detections'][0])
            output_dict['detection_classes'] = output_dict[
                'detection_classes'][0].astype(np.uint8)
            output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
            output_dict['detection_scores'] = output_dict['detection_scores'][0]
            if 'detection_masks' in output_dict:
                output_dict['detection_masks'] = output_dict['detection_masks'][0]
            
    return output_dict
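
The returned dictionary holds parallel arrays of boxes, classes, and scores, usually padded with low-confidence entries. A minimal sketch of keeping only confident detections is shown below; the helper name, the 0.5 threshold, and the sample values are assumptions, and plain lists stand in for the numpy arrays.

```python
def filter_detections(boxes, classes, scores, min_score=0.5):
    """Keep detections whose score is at least min_score, as a list of dictionaries."""
    kept = []
    for box, class_id, score in zip(boxes, classes, scores):
        if score >= min_score:
            kept.append({'box': tuple(box), 'class_id': int(class_id), 'score': float(score)})
    return kept

# Made-up values in the same layout as output_dict (boxes are ymin, xmin, ymax, xmax).
sample_output = {
    'detection_boxes': [[0.70, 0.40, 0.78, 0.60], [0.10, 0.45, 0.18, 0.55], [0.0, 0.0, 1.0, 1.0]],
    'detection_classes': [1, 2, 1],
    'detection_scores': [0.93, 0.61, 0.04],
}

kept = filter_detections(sample_output['detection_boxes'],
                         sample_output['detection_classes'],
                         sample_output['detection_scores'])
print(kept)
```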

#### Draw bounding boxes on all test images

In [None]:
# running inferences.  This should show images with bounding boxes
%matplotlib inline

print('Running inferences on %s' % TEST_IMAGE_PATHS)

for image_path in TEST_IMAGE_PATHS:
    image = Image.open(image_path)
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)
    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)
    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    #print('Returned output_dict',output_dict)
    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=2)
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image_np)


### Bounding box image crop

In [None]:
def get_bounding_box_coordinates(image,
                               ymin,
                               xmin,
                               ymax,
                               xmax,
                               use_normalized_coordinates=True):
  
  im_width, im_height = image.size
  if use_normalized_coordinates:
    (left, right, top, bottom) = (int(xmin * im_width), int(xmax * im_width),
                                  int(ymin * im_height), int(ymax * im_height))
  else:
    (left, right, top, bottom) = (xmin, xmax, ymin, ymax)

  return (left, top, right, bottom)  # tuple order matters for cropping
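
A worked example of the conversion, written as a standalone sketch that takes the image size directly instead of a PIL image (the function name and the sample box are hypothetical):

```python
def to_pixel_box(ymin, xmin, ymax, xmax, im_width, im_height):
    """Convert a normalized (ymin, xmin, ymax, xmax) box to pixel (left, top, right, bottom)."""
    left, right = int(xmin * im_width), int(xmax * im_width)
    top, bottom = int(ymin * im_height), int(ymax * im_height)
    return (left, top, right, bottom)

# A made-up detection on an 800x600 image.
pixel_box = to_pixel_box(0.25, 0.125, 0.75, 0.875, im_width=800, im_height=600)
print(pixel_box)
```

The (left, top, right, bottom) order is the one `PIL.Image.crop` expects, so the tuple can be passed to it unchanged.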


### Read Text from cropped image

#### Use pytesseract for Text Read (Bad)

config reference:

https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/

In [None]:
!apt install tesseract-ocr
!apt install libtesseract-dev
!pip install pytesseract

In [None]:
import pytesseract
from pytesseract import Output
import cv2

import re

def read_image_text(img):
  print('image', type(img))
  img_np = load_image_into_numpy_array(img)
  crop = cv2.cvtColor(img_np, cv2.COLOR_RGB2GRAY)
  
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(crop)
  crop = cv2.GaussianBlur(crop, (5, 5), 0)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(crop)
  #_, crop = cv2.threshold(crop, 100, 150, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
  #plt.figure(figsize=IMAGE_SIZE)
  #plt.imshow(crop)

  print('1=',pytesseract.image_to_string(crop, config='-l eng --oem 1 --psm 7'))
  print('2=',pytesseract.image_to_string(crop, config='--oem 3 --psm 3'))
  print('3=',pytesseract.image_to_string(crop))
  print('org 3=',pytesseract.image_to_data(crop, output_type=Output.DICT))
  print('org 1=',pytesseract.image_to_string(image_np, config='-l eng --oem 1 --psm 7'))
  print('org 2=',pytesseract.image_to_string(image_np, config='--oem 3 --psm 3'))
  print('org 3=',pytesseract.image_to_string(image_np))
  print('org 3=',pytesseract.image_to_data(image_np, output_type=Output.DICT))

  custom_config = r'-l eng --oem 1 --psm 7' #r'--oem 3 --psm 3'
  text = pytesseract.image_to_string(img, config=custom_config)
  print('text from image', text)
  return text


def read_image_text(img):
  (left, top, right, bottom) = (0,25,img.size[0],img.size[1])   
  # Grayscale, Gaussian blur, Otsu's threshold
  image = img #.crop((left, top, right, bottom))
  image = load_image_into_numpy_array(image)
  cv2.imwrite(os.path.join(model_dir,'orgcrop.jpg'),image)
  image = cv2.resize(image,(0,0),fx=7,fy=7)
  plt.imshow(image)
  cv2.imwrite(os.path.join(model_dir,'resized.jpg'),image)
  gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)  # the array comes from PIL, so channels are in RGB order
  cv2.imwrite(os.path.join(model_dir,'grayscale.jpg'),gray)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(gray)
  blur = cv2.GaussianBlur(gray, (3,3), 0)
  cv2.imwrite(os.path.join(model_dir,'GaussianBlur.jpg'),blur)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(blur)
  #thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
  #plt.figure(figsize=IMAGE_SIZE)
  #plt.imshow(thresh)

  # Morph open to remove noise and invert image
  kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
  opening = cv2.morphologyEx(blur, cv2.MORPH_OPEN, kernel, iterations=1)
  cv2.imwrite(os.path.join(model_dir,'morphologyEx.jpg'),opening)
  invert = 255 - opening
  cv2.imwrite(os.path.join(model_dir,'invert.jpg'),invert)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(invert)
  print('first trial')
  #oem: OCR Engine modes => 1 - Neural nets LSTM engine only
  #psm: Page Segmentation Mode ==> 
  #     3    Fully automatic page segmentation, but no OSD. (Default)
  #     6    Assume a single uniform block of text.
  #     7    Treat the image as a single text line.
  print(pytesseract.image_to_string(invert, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789'))
  print('second trial',pytesseract.image_to_string(invert, config="-c tessedit_char_whitelist=0123456789 -oem 0"))
  print('try different read config')


  # in order to apply Tesseract v4 to OCR text we must supply
  # (1) a language,
  # (2) an OEM flag of 1, indicating that we
  # wish to use the LSTM neural net model for OCR,
  # (3) a PSM value, in this case, 7, which implies that we are
  # treating the ROI as a single line of text
  config = ("-l eng --oem 1 --psm 7")
  text = pytesseract.image_to_string(image, config=config)

  # strip out non-ASCII text so we can draw the text on the image
  # using OpenCV, then draw the text and a bounding box surrounding
  # the text region of the input image
  #text = "".join([c if ord(c) < 128 else "" for c in text]).strip()
  
  print('latest article::::',text)
  text = pytesseract.image_to_string(image, config=config)

  # Perform text extraction
  #for i in [1,3,4,5,6,7,8,9,10,11,12,13]:
  #  data = pytesseract.image_to_string(invert, lang='eng', config='--psm '+str(i))
  #  print('--psm '+str(i), data)
  
  print('another tuning')
  retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY)
  cv2.imwrite(os.path.join(model_dir,'THRESH_BINARY.jpg'),image)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image)
  image = cv2.GaussianBlur(image,(11,11),0)
  cv2.imwrite(os.path.join(model_dir,'GaussianBlur2.jpg'),image)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image)
  image = cv2.medianBlur(image,9)
  cv2.imwrite(os.path.join(model_dir,'medianBlur.jpg'),image)
  plt.figure(figsize=IMAGE_SIZE)
  plt.imshow(image)
  data = pytesseract.image_to_string(image, lang='eng')
  print('last trial',pytesseract.image_to_string(image, config="-c tessedit_char_whitelist=0123456789 -oem 0"))
  

  return data


def read_image_text(img, plot=False):
  readings = []
  config = ("-l eng --oem 1 --psm 7")
  #(left, top, right, bottom) = (0,25,img.size[0],img.size[1])   
  # Grayscale, Gaussian blur, Otsu's threshold
  image = img #.crop((left, top, right, bottom))
  image = load_image_into_numpy_array(image)
  #cv2.imwrite(os.path.join(model_dir,'orgcrop.jpg'),image)
  #print('original image',pytesseract.image_to_string(image, config=config))
  readings.append(pytesseract.image_to_string(image, config=config))
  
  image = cv2.resize(image,(0,0),fx=7,fy=7)
  if plot:
    plt.imshow(image)
  #cv2.imwrite(os.path.join(model_dir,'resized.jpg'),image)
  #print('rescale image',pytesseract.image_to_string(image, config=config))
  readings.append(pytesseract.image_to_string(image, config=config))
  
  gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)  # the array comes from PIL, so channels are in RGB order
  #cv2.imwrite(os.path.join(model_dir,'grayscale.jpg'),gray)
  if plot:
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(gray)
  #print('gray scale',pytesseract.image_to_string(gray, config=config))
  readings.append(pytesseract.image_to_string(gray, config=config))

  blur = cv2.GaussianBlur(gray, (3,3), 0)
  #cv2.imwrite(os.path.join(model_dir,'GaussianBlur.jpg'),blur)
  if plot:
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(blur)
  #print('blur',pytesseract.image_to_string(blur, config=config))
  readings.append(pytesseract.image_to_string(blur, config=config))

  #thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
  #plt.figure(figsize=IMAGE_SIZE)
  #plt.imshow(thresh)

  # Morph open to remove noise and invert image
  kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
  opening = cv2.morphologyEx(blur, cv2.MORPH_OPEN, kernel, iterations=1)
  #cv2.imwrite(os.path.join(model_dir,'morphologyEx.jpg'),opening)
  #print('morphologyEx',pytesseract.image_to_string(opening, config=config))
  # OCR the morphologically opened image (not the 3x3 structuring element)
  readings.append(pytesseract.image_to_string(opening, config=config))

  invert = 255 - opening
  #cv2.imwrite(os.path.join(model_dir,'invert.jpg'),invert)
  if plot:
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(invert)
  #print('invert',pytesseract.image_to_string(invert, config=config))
  readings.append(pytesseract.image_to_string(invert, config=config))
  
  #print('another tuning')
  retval, image = cv2.threshold(image,200,255, cv2.THRESH_BINARY)
  #cv2.imwrite(os.path.join(model_dir,'THRESH_BINARY.jpg'),image)
  if plot:
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image)
  #print('THRESH_BINARY',pytesseract.image_to_string(image, config=config))
  readings.append(pytesseract.image_to_string(image, config=config))

  image = cv2.GaussianBlur(image,(11,11),0)
  #cv2.imwrite(os.path.join(model_dir,'GaussianBlur2.jpg'),image)
  if plot:
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image)
  #print('GaussianBlur2',pytesseract.image_to_string(image, config=config))
  readings.append(pytesseract.image_to_string(image, config=config))

  image = cv2.medianBlur(image,9)
  #cv2.imwrite(os.path.join(model_dir,'medianBlur.jpg'),image)
  if plot:
    plt.figure(figsize=IMAGE_SIZE)
    plt.imshow(image)
  #print('medianBlur',pytesseract.image_to_string(image, config=config))
  readings.append(pytesseract.image_to_string(image, config=config))

  data = pytesseract.image_to_string(image, lang='eng')
  #print('last trial',data)
  #print('last trial',pytesseract.image_to_string(image, config="-c tessedit_char_whitelist=0123456789 --oem 1"))
  # Digits-only reading; the engine mode flag takes two dashes (--oem)
  readings.append(pytesseract.image_to_string(image, config="-c tessedit_char_whitelist=0123456789 --oem 1"))
  

  return readings
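
`read_image_text` returns one reading per preprocessing variant, so a single answer still has to be chosen from the list. A minimal sketch of one option, a majority vote over the non-empty readings, is shown here. The helper name `pick_consensus_reading` and the sample strings are hypothetical and not part of the original pipeline.

```python
from collections import Counter


def pick_consensus_reading(readings):
    """Return the most frequent non-empty reading, or '' if all are empty."""
    # Strip whitespace so 'ABC\n' and 'ABC' count as the same reading.
    cleaned = [r.strip() for r in readings if r and r.strip()]
    if not cleaned:
        return ''
    # most_common(1) returns [(value, count)]; on a tie the value seen first wins.
    return Counter(cleaned).most_common(1)[0][0]


# Made-up readings standing in for the output of read_image_text
sample = ['1OV 4VX\n', '', '10V 4VX', '1OV 4VX', '  ']
print(pick_consensus_reading(sample))
```

A vote only helps when several variants agree. If every variant gives a different string, the result is simply the first non-empty reading.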

In [None]:
# Run inference on the test images and read each detected crop with Tesseract.
# Uncomment the plt.imshow lines below to display the images with bounding boxes.
%matplotlib inline

import pandas

consolidated_results = []
print('Running inferences on %s' % TEST_IMAGE_PATHS)

# Set the default threshold for accepting the bounding box
min_score_thresh=.5

for image_path in TEST_IMAGE_PATHS:
    print('processing image: ',image_path)
    image = Image.open(image_path)
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)

    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)

    # Actual detection
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    
    # Read detection Results:
    # Predicted boxes for crop coordinates 
    # box coordinates (ymin, xmin, ymax, xmax) are relative to the image
    boxes = output_dict['detection_boxes']
    # Scores are needed to filter the accepted objects by threshold
    scores = output_dict['detection_scores']

    # Visualization of the results of a detection.
    vis_util.visualize_boxes_and_labels_on_image_array(
        image_np,
        output_dict['detection_boxes'],
        output_dict['detection_classes'],
        output_dict['detection_scores'],
        category_index,
        instance_masks=output_dict.get('detection_masks'),
        use_normalized_coordinates=True,
        line_thickness=2)
    
    #plt.figure(figsize=IMAGE_SIZE)
    #plt.imshow(image_np)

    # Reset image array for content reading
    image_np = load_image_into_numpy_array(image)

    for i in range(boxes.shape[0]):
        # Filter based on threshold
        if scores is None or scores[i] > min_score_thresh:
            # boxes[i] is the box which will be drawn
            ymin, xmin, ymax, xmax = tuple(boxes[i].tolist())
            class_name = category_index[output_dict['detection_classes'][i]]['name']

            bounding_box_coordinates = get_bounding_box_coordinates(image, ymin, xmin, ymax, xmax)

            # Crop image based on bounding box
            cropped_img = image.crop(bounding_box_coordinates)
            #plt.figure(figsize=IMAGE_SIZE)
            #plt.imshow(cropped_img)

            #read text from cropped image
            readings = read_image_text(cropped_img)
            print('row',[os.path.basename(image_path),class_name,scores[i]]+readings)
            consolidated_results.append([os.path.basename(image_path),class_name,scores[i]]+readings)

            #print ("This box is gonna get used", ymin, xmin, ymax, xmax ,boxes[i], output_dict['detection_classes'][i])
            #im_width, im_height = image.size

df = pandas.DataFrame(consolidated_results,columns=['image','object','Score','original','resized','grayscale','GaussianBlur','morphologyEx','invert','THRESH_BINARY','GaussianBlur2','medianBlur','numOnly'])


In [None]:
df

#### Use Google Vision API

https://cloud.google.com/vision/docs/ocr

In [None]:
!pip install --upgrade google-api-python-client

**Getting a Google API Credential**

Visit the API console and choose "Credentials" in the left-hand menu. Choose "Create Credentials" and generate an API key for your application. Ideally, restrict the key by IP/domain; for now it is left unrestricted.
Enter the key when the following cell prompts for it:

In [None]:
import getpass
# API Key 
APIKEY = getpass.getpass()
#Need to 

In [None]:
# Import the base64 encoding library.
import base64
from io import BytesIO

# Get base64 encoding of image data
def encode_image(image):
  buffered = BytesIO()
  image.save(buffered, format="JPEG")
  img_str = base64.b64encode(buffered.getvalue()).decode('utf-8')
  return img_str

# Run Vision API
def gvision_detect_text(img):
  from googleapiclient.discovery import build
  vservice = build('vision', 'v1', developerKey=APIKEY)
  request = vservice.images().annotate(body={
          'requests': [{
                  'image': {
                      'content': encode_image(img)
                  },
                  'features': [{
                      'type': 'TEXT_DETECTION',
                      'maxResults': 3
                  }]
              }],
          })
  responses = request.execute(num_retries=3)
  try:
    return responses['responses'][0]['textAnnotations'][0]['description']
  except (KeyError, IndexError):
    # The response has no textAnnotations entry, e.g. no text was detected
    print('Error occurred while fetching results from API')
    return ''
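
The lookup at the end of `gvision_detect_text` relies on the response being nested as `responses[0].textAnnotations[0].description`. A minimal sketch of that lookup as a standalone function follows, so it can be tried without calling the API. It assumes that an image with no detected text yields an entry without a `textAnnotations` key. The helper name `extract_first_description` and the two sample dictionaries are hypothetical.

```python
def extract_first_description(responses):
    """Return the first textAnnotations description, or '' if there is none."""
    # Hypothetical helper: the nesting mirrors what gvision_detect_text indexes.
    results = responses.get('responses') or [{}]
    annotations = results[0].get('textAnnotations') or []
    if not annotations:
        return ''
    return annotations[0].get('description', '')


# Hand-written dictionaries shaped like the response used above
with_text = {'responses': [{'textAnnotations': [{'description': '10V 4VX\n'}]}]}
no_text = {'responses': [{}]}
print(repr(extract_first_description(with_text)))
print(repr(extract_first_description(no_text)))
```

Using `.get` with defaults avoids the exception path entirely, which keeps an empty detection distinct from a real request failure.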

In [None]:
# Process text read to fetch only the standard plate number without surrounding text
# https://en.wikipedia.org/wiki/Vehicle_registration_plates_of_Australia
import re

def process_plateno_text(text):
  # TODO: Needs enhancement; for now it only identifies two alphanumeric groups separated by a space
  m = re.search('( ?[a-zA-Z0-9]){1,9} ( ?[a-zA-Z0-9]){1,9}', text)
  if m:
      return m.group()
  return ''

def process_busrun_text(text):
  # Filter only numbers
  m = re.search('( ?[0-9]){1,9}', text)
  if m:
      return m.group()
  return ''

print('plateno',process_plateno_text('ngscars\n10V 4VX\nVIC\nVICTORIA THE EDUCATION STATE\n'))
print('busrun',process_busrun_text('10V\n'))
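
`process_plateno_text` returns the first two space-separated alphanumeric runs found anywhere in the text, so a slogan line that precedes the plate would be returned instead of the plate. One possible refinement, sketched below, checks each OCR line on its own against a list of candidate patterns. The single pattern here (two groups of three characters) is an illustrative assumption, not a complete list of Australian plate formats, and the names `CANDIDATE_PLATE_PATTERNS` and `find_plate_candidate` are hypothetical.

```python
import re

# Illustrative pattern only: an assumption for this sketch, to be extended
# with the formats that actually occur in the data.
CANDIDATE_PLATE_PATTERNS = [
    re.compile(r'^[A-Z0-9]{3} ?[A-Z0-9]{3}$'),  # two groups of three, e.g. '10V 4VX'
]


def find_plate_candidate(text):
    """Return the first OCR line matching a candidate pattern, else ''."""
    for line in text.splitlines():
        # Upper-case the line and collapse runs of whitespace to single spaces
        candidate = ' '.join(line.upper().split())
        if any(p.match(candidate) for p in CANDIDATE_PLATE_PATTERNS):
            return candidate
    return ''


print(find_plate_candidate('ngscars\n10V 4VX\nVIC\nVICTORIA THE EDUCATION STATE\n'))
```

Any six-character alphanumeric word on its own line would also match this pattern, so tighter patterns are still needed before it could replace the regex above.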


In [None]:
# Run inference on the test images and read each detected crop with the Google Vision API
%matplotlib inline

import pandas

consolidated_results = []
print('Running inferences on %s' % TEST_IMAGE_PATHS)

# Set the default threshold for accepting the bounding box
min_score_thresh=.5

for image_path in TEST_IMAGE_PATHS:
    print('processing image: ',image_path)
    image = Image.open(image_path)
    # the array based representation of the image will be used later in order to prepare the
    # result image with boxes and labels on it.
    image_np = load_image_into_numpy_array(image)

    # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
    image_np_expanded = np.expand_dims(image_np, axis=0)

    # Actual detection.
    output_dict = run_inference_for_single_image(image_np, detection_graph)
    
    # Read detection Results:
    # Predicted boxes for crop coordinates 
    # box coordinates (ymin, xmin, ymax, xmax) are relative to the image
    boxes = output_dict['detection_boxes']
    # Scores are needed to filter the accepted objects by threshold
    scores = output_dict['detection_scores']

    for i in range(boxes.shape[0]):
        # Filter based on threshold
        if scores is None or scores[i] > min_score_thresh:
            # boxes[i] is the box which will be drawn
            ymin, xmin, ymax, xmax = tuple(boxes[i].tolist())
            class_name = category_index[output_dict['detection_classes'][i]]['name']

            bounding_box_coordinates = get_bounding_box_coordinates(image, ymin, xmin, ymax, xmax)

            # Crop image based on bounding box
            cropped_img = image.crop(bounding_box_coordinates)
            #plt.figure(figsize=IMAGE_SIZE)
            #plt.imshow(cropped_img)

            #read text from cropped image
            readings = gvision_detect_text(cropped_img)
            #fine tune reading with appropriate regex
            if class_name == 'PlateNo':
              readings = process_plateno_text(readings)
            elif class_name == 'BusRun':
              readings = process_busrun_text(readings)

            if readings:
              #print('row',[os.path.basename(image_path),class_name,scores[i],readings])
              consolidated_results.append([os.path.basename(image_path),class_name,scores[i],readings])
            else:
              print('nothing returned',readings)

            #print ("This box is gonna get used", ymin, xmin, ymax, xmax ,boxes[i], output_dict['detection_classes'][i])
            #im_width, im_height = image.size

df2 = pandas.DataFrame(consolidated_results,columns=['image','object','Score','reading'])


In [None]:
df2
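
An image can produce several accepted boxes of the same class, in which case `consolidated_results` holds more than one row for that image and object. A minimal sketch of keeping only the highest-scoring reading per image and object is given here. The helper name `best_reading_per_object` is hypothetical, and the rows are made-up values in the same column order as `df2`.

```python
def best_reading_per_object(rows):
    """Map (image, object) to the (score, reading) pair with the highest score."""
    best = {}
    for image, obj, score, reading in rows:
        key = (image, obj)
        # Keep the row with the highest detection score for each key
        if key not in best or score > best[key][0]:
            best[key] = (score, reading)
    return best


# Made-up rows: [image, object, Score, reading]
rows = [
    ['bus1.jpg', 'PlateNo', 0.91, '10V 4VX'],
    ['bus1.jpg', 'PlateNo', 0.62, '10V 4V'],
    ['bus1.jpg', 'BusRun', 0.88, '10'],
]
for key, (score, reading) in best_reading_per_object(rows).items():
    print(key, score, reading)
```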

# Section 7: Further Enhancements

- Try other object detection pretrained models from the available Zoo list
- Adjust training hyperparameters
- Train with more labeled images
- Augment training images (different zooms, rotations, contrast, lighting). Currently the pipeline uses only the default augmentations: horizontal flip and random crop.
- Explore other multi-digit models such as SVHN https://arxiv.org/pdf/1312.6082.pdf https://github.com/penny4860/SVHN-deep-digit-detector
- Build our own multi-digit model and train it on a public dataset (unlikely to be as good as Google Vision)
- Improve text post-processing (ensure the regex covers all standard Australian plates)