# Notebook Description

This notebook is adapted from the [Roboflow Tensorflow Object Detection notebook](https://blog.roboflow.com/training-a-tensorflow-object-detection-model-with-a-custom-dataset/), with modifications to fit the assignment.







##Selection of Framework
`TensorFlow` is used here instead of PyTorch or other frameworks as I am more familiar with TensorFlow. However, picking up PyTorch shouldn't be a problem.

##Selection of TensorFlow version
`TensorFlow 1` is used here for the following reason:
- Roboflow has a fantastic tutorial on using MobileNet v2 SSD for object detection

##Selection of platform
`Google Colab` is used here as I do not have a GPU-enabled machine.

##Selection of Model
Model selected: `MobileNet v2 SSD` 
MobileNet v2 SSD was selected because it is in line with what I have learnt about object detection models such as YOLO, SSD and Faster R-CNN. Among these, SSD on MobileNet v2 is efficient enough to run on mobile devices while remaining reasonably accurate. In this notebook, however, the metrics are not satisfactory. The Future work section lists several things that could improve the model's performance, which also explain why it currently underperforms.


##Dataset
Custom dataset from Roboflow [here](https://public.roboflow.com/object-detection/license-plates-us-eu)

The dataset consists of:
- 245 training images
- 70 validation images
- 35 test images

All images are annotated with two classes:
- vehicle
- license_plate


##Test video
One of the test videos is a personal video of license plates and cars, so it is not uploaded for public use.

The other test video is from YouTube and can be found [here](https://www.youtube.com/watch?v=Z4eOnPTp2Aw).

##Task Accomplishment and Performance of Model




The trained model was tested on two test videos. The first is a video of my own and the second is a trimmed version of the video at the given link. Due to limitations in Google Colab, I could only run inference for a limited amount of time, so the test videos are kept as short as possible.

Overall, license plate detection performed fairly well, with considerable room for improvement.

License plate recognition, however, is subpar because no image processing is applied after the ROI is extracted from the bounding boxes. Image processing techniques could be employed to improve the recognition task, which should reduce the number of false recognitions in the .txt file.


*Average precision: 0.323*

*Average recall: 0.345*

*Mean average precision (mAP): 0.3227*



##Future work
1. The number of training steps could be increased to improve precision, recall and mAP.
2. The number of images in the dataset could be increased:
  - by finding bigger and better datasets
  - by performing augmentation (though augmenting a small dataset only goes so far before the model overfits anyway)
3. Other detection models such as EfficientDet could be explored.
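
As a minimal sketch of the augmentation idea in item 2, a horizontal flip is one transform where the box labels must be updated together with the pixels. The helper names `hflip_box` and `hflip_image`, the toy list-of-rows image format, and the normalized `[ymin, xmin, ymax, xmax]` box layout are assumptions for illustration; this is not how the Roboflow export was produced.

```python
def hflip_box(box):
    """Mirror a normalized [ymin, xmin, ymax, xmax] box about the vertical axis."""
    ymin, xmin, ymax, xmax = box
    # After a flip, the old right edge becomes the new left edge.
    return [ymin, 1.0 - xmax, ymax, 1.0 - xmin]


def hflip_image(rows):
    """Flip an image given as a list of pixel rows (hypothetical toy format)."""
    return [list(reversed(row)) for row in rows]


print(hflip_box([0.25, 0.5, 0.5, 0.75]))
print(hflip_image([[1, 2, 3], [4, 5, 6]]))
```

The vertical coordinates are unchanged, so only `xmin` and `xmax` need to be recomputed.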



---



#Initializing several variables

In [None]:
# If you forked the repo, you can replace the link.
repo_url = 'https://github.com/roboflow-ai/tensorflow-object-detection-faster-rcnn'

# Number of training steps is set to 30000.
# This number is limited by what a Google Colab session allows.
num_steps = 30000

# Number of evaluation steps.
num_eval_steps = 50

MODELS_CONFIG = {
    'ssd_mobilenet_v2': {
        'model_name': 'ssd_mobilenet_v2_coco_2018_03_29',
        'pipeline_file': 'ssd_mobilenet_v2_coco.config',
        'batch_size': 12
    },
    'faster_rcnn_inception_v2': {
        'model_name': 'faster_rcnn_inception_v2_coco_2018_01_28',
        'pipeline_file': 'faster_rcnn_inception_v2_pets.config',
        'batch_size': 12
    },
    'rfcn_resnet101': {
        'model_name': 'rfcn_resnet101_coco_2018_01_28',
        'pipeline_file': 'rfcn_resnet101_pets.config',
        'batch_size': 8
    },    
}

# From the list of models, MobileNet v2 SSD was chosen due to its speed especially in mobile devices
selected_model = 'ssd_mobilenet_v2'

# Name of the object detection model to use.
MODEL = MODELS_CONFIG[selected_model]['model_name']

# Name of the pipeline file in the TensorFlow Object Detection API.
pipeline_file = MODELS_CONFIG[selected_model]['pipeline_file']

# Training batch size that fits in the memory of Colab's Tesla K80 GPU for the selected model.
batch_size = MODELS_CONFIG[selected_model]['batch_size']

In [None]:
# use TF 1.x
%tensorflow_version 1.x

TensorFlow 1.x selected.


# Clone the `tensorflow-object-detection` repository

In [None]:
import os

%cd /content

repo_dir_path = os.path.abspath(os.path.join('.', os.path.basename(repo_url)))

!git clone {repo_url}
%cd {repo_dir_path}
!git pull

/content
Cloning into 'tensorflow-object-detection-faster-rcnn'...
remote: Enumerating objects: 885, done.
remote: Total 885 (delta 0), reused 0 (delta 0), pack-reused 885
Receiving objects: 100% (885/885), 24.83 MiB | 33.82 MiB/s, done.
Resolving deltas: 100% (428/428), done.
/content/tensorflow-object-detection-faster-rcnn
Already up to date.


# Install required packages

In [None]:
%cd /content
!git clone --quiet https://github.com/tensorflow/models.git

!pip install tf_slim

!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

!pip install lvis

%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

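import os
# Append the research and slim directories; fall back to an empty value if PYTHONPATH is unset.
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':/content/models/research/:/content/models/research/slim/'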

!python object_detection/builders/model_builder_test.py

/content
Collecting tf_slim
  Downloading tf_slim-1.1.0-py2.py3-none-any.whl (352 kB)
[K     |████████████████████████████████| 352 kB 27.8 MB/s 
Installing collected packages: tf-slim
Successfully installed tf-slim-1.1.0
Selecting previously unselected package python-bs4.
(Reading database ... 155160 files and directories currently installed.)
Preparing to unpack .../0-python-bs4_4.6.0-1_all.deb ...
Unpacking python-bs4 (4.6.0-1) ...
Selecting previously unselected package python-pkg-resources.
Preparing to unpack .../1-python-pkg-resources_39.0.1-2_all.deb ...
Unpacking python-pkg-resources (39.0.1-2) ...
Selecting previously unselected package python-chardet.
Preparing to unpack .../2-python-chardet_3.0.4-1_all.deb ...
Unpacking python-chardet (3.0.4-1) ...
Selecting previously unselected package python-six.
Preparing to unpack .../3-python-six_1.11.0-2_all.deb ...
Unpacking python-six (1.11.0-2) ...
Selecting previously unselected package python-webencodings.
Preparing to unpack .

# Prepare `tfrecord` files

The TFRecord files are generated through Roboflow's API, which saves a lot of time in this step.

The downloaded archive contains three TFRecord files (train, valid and test); this notebook uses the train and test files.

In [None]:
%cd /content/tensorflow-object-detection-faster-rcnn/data

/content/tensorflow-object-detection-faster-rcnn/data


In [None]:
!curl -L "https://public.roboflow.com/ds/Rzi0cL7n3X?key=IoQpsWuYzr" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   891  100   891    0     0   1760      0 --:--:-- --:--:-- --:--:--  1760
100 10.1M  100 10.1M    0     0  13.5M      0 --:--:-- --:--:-- --:--:--  127M
Archive:  roboflow.zip
 extracting: README.dataset.txt      
 extracting: README.roboflow.txt     
   creating: test/
 extracting: test/Plates.tfrecord    
 extracting: test/Plates_label_map.pbtxt  
   creating: train/
 extracting: train/Plates.tfrecord   
 extracting: train/Plates_label_map.pbtxt  
   creating: valid/
 extracting: valid/Plates.tfrecord   
 extracting: valid/Plates_label_map.pbtxt  


In [None]:
# training set
%ls train

Plates_label_map.pbtxt  Plates.tfrecord


In [None]:
# test set
%ls test

Plates_label_map.pbtxt  Plates.tfrecord


In [None]:
test_record_fname = '/content/tensorflow-object-detection-faster-rcnn/data/test/Plates.tfrecord'
train_record_fname = '/content/tensorflow-object-detection-faster-rcnn/data/train/Plates.tfrecord'
label_map_pbtxt_fname = '/content/tensorflow-object-detection-faster-rcnn/data/train/Plates_label_map.pbtxt'

# Download base model

The model selected above is downloaded here from the TensorFlow model zoo.

In [None]:
%cd /content/models/research

import os
import shutil
import glob
import urllib.request
import tarfile
MODEL_FILE = MODEL + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
DEST_DIR = '/content/models/research/pretrained_model'

if not (os.path.exists(MODEL_FILE)):
    urllib.request.urlretrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)

tar = tarfile.open(MODEL_FILE)
tar.extractall()
tar.close()

os.remove(MODEL_FILE)
if (os.path.exists(DEST_DIR)):
    shutil.rmtree(DEST_DIR)
os.rename(MODEL, DEST_DIR)

/content/models/research


In [None]:
!echo {DEST_DIR}
!ls -alh {DEST_DIR}

/content/models/research/pretrained_model
total 135M
drwxr-xr-x  3 345018 89939 4.0K Mar 30  2018 .
drwxr-xr-x 23 root   root  4.0K Feb 10 05:22 ..
-rw-r--r--  1 345018 89939   77 Mar 30  2018 checkpoint
-rw-r--r--  1 345018 89939  67M Mar 30  2018 frozen_inference_graph.pb
-rw-r--r--  1 345018 89939  65M Mar 30  2018 model.ckpt.data-00000-of-00001
-rw-r--r--  1 345018 89939  15K Mar 30  2018 model.ckpt.index
-rw-r--r--  1 345018 89939 3.4M Mar 30  2018 model.ckpt.meta
-rw-r--r--  1 345018 89939 4.2K Mar 30  2018 pipeline.config
drwxr-xr-x  3 345018 89939 4.0K Mar 30  2018 saved_model


In [None]:
fine_tune_checkpoint = os.path.join(DEST_DIR, "model.ckpt")
fine_tune_checkpoint

'/content/models/research/pretrained_model/model.ckpt'

# Configuring a Training Pipeline

In [None]:
import os
pipeline_fname = os.path.join('/content/models/research/object_detection/samples/configs/', pipeline_file)

assert os.path.isfile(pipeline_fname), '`{}` does not exist'.format(pipeline_fname)

In [None]:
# Helper function to get the total number of classes for the detection task
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())
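
To make concrete what the helper above counts, here is a rough standalone sketch that parses a label map with regular expressions instead of `label_map_util`. The sample text, the ids and the function name `parse_label_map` are hypothetical; the real `Plates_label_map.pbtxt` may differ in field order and naming.

```python
import re

# Hypothetical label map in the usual `item { ... }` layout.
sample_label_map = '''
item {
  id: 1
  name: "license_plate"
}
item {
  id: 2
  name: "vehicle"
}
'''


def parse_label_map(text):
    """Return {id: name} for each item block in a label-map string."""
    entries = {}
    for block in re.findall(r'item\s*\{(.*?)\}', text, flags=re.DOTALL):
        id_match = re.search(r'\bid:\s*(\d+)', block)
        name_match = re.search(r'\bname:\s*["\'](.*?)["\']', block)
        if id_match and name_match:
            entries[int(id_match.group(1))] = name_match.group(1)
    return entries


classes = parse_label_map(sample_label_map)
print(len(classes), classes)
```

The number of entries plays the same role as `num_classes` in the pipeline configuration.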

In [None]:
import re

num_classes = get_num_classes(label_map_pbtxt_fname)
with open(pipeline_fname) as f:
    s = f.read()
with open(pipeline_fname, 'w') as f:
    
    # fine_tune_checkpoint
    s = re.sub('fine_tune_checkpoint: ".*?"',
               'fine_tune_checkpoint: "{}"'.format(fine_tune_checkpoint), s)
    
    # tfrecord files train and test.
    s = re.sub(
        '(input_path: ".*?)(train.record)(.*?")', 'input_path: "{}"'.format(train_record_fname), s)
    s = re.sub(
        '(input_path: ".*?)(val.record)(.*?")', 'input_path: "{}"'.format(test_record_fname), s)

    # label_map_path
    s = re.sub(
        'label_map_path: ".*?"', 'label_map_path: "{}"'.format(label_map_pbtxt_fname), s)

    # Set training batch_size.
    s = re.sub('batch_size: [0-9]+',
               'batch_size: {}'.format(batch_size), s)

    # Set training steps, num_steps
    s = re.sub('num_steps: [0-9]+',
               'num_steps: {}'.format(num_steps), s)
    
    # Set number of classes num_classes.
    s = re.sub('num_classes: [0-9]+',
               'num_classes: {}'.format(num_classes), s)
    f.write(s)
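
The substitutions above can be tried on a small string before touching the real file. This is a sketch under stated assumptions: the config fragment, the `/tmp/...` paths and the function name `patch_config` are made up for illustration, while the regular expressions mirror the ones used in the cell.

```python
import re

# Hypothetical fragment in the style of a pipeline config file.
sample = '''
num_classes: 90
batch_size: 24
fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record-?????-of-00100"
'''


def patch_config(text, num_classes, batch_size, checkpoint, train_record):
    """Apply the same style of regex edits to a config string and return it."""
    text = re.sub(r'num_classes: [0-9]+',
                  'num_classes: {}'.format(num_classes), text)
    text = re.sub(r'batch_size: [0-9]+',
                  'batch_size: {}'.format(batch_size), text)
    text = re.sub(r'fine_tune_checkpoint: ".*?"',
                  'fine_tune_checkpoint: "{}"'.format(checkpoint), text)
    text = re.sub(r'(input_path: ".*?)(train.record)(.*?")',
                  'input_path: "{}"'.format(train_record), text)
    return text


patched = patch_config(sample, 2, 12, '/tmp/model.ckpt', '/tmp/train.tfrecord')
print(patched)
```

Note that `re.sub` replaces every match by default, so a pattern such as `batch_size: [0-9]+` changes every `batch_size` field in the file, not only the training one.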

In [None]:
!cat {pipeline_fname}

# SSD with Mobilenet v2 configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  ssd {
    num_classes: 2
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_

In [None]:
model_dir = 'training/'
# Optionally remove the contents of the output model directory for a fresh start.
!rm -rf {model_dir}
os.makedirs(model_dir, exist_ok=True)

# Train the model

The model is trained for 30000 steps.

A higher number of steps could be used, but due to the limitations of Google Colab, 30000 seems to be the upper limit for which I could train the model.
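
Note that `num_steps` counts training steps (one batch each), not passes over the dataset. A rough conversion, assuming each step consumes exactly one batch and that the training set holds the 245 images listed in the dataset description (the helper name `steps_to_epochs` is hypothetical):

```python
def steps_to_epochs(num_steps, batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return num_steps * batch_size / num_train_images


# Values taken from this notebook: 30000 steps, batch size 12, 245 training images.
epochs = steps_to_epochs(30000, 12, 245)
print(round(epochs))  # 1469
```

With a dataset this small, 30000 steps therefore corresponds to well over a thousand passes over the same images.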

In [None]:
!python /content/models/research/object_detection/model_main.py \
    --pipeline_config_path={pipeline_fname} \
    --model_dir={model_dir} \
    --alsologtostderr \
    --num_train_steps={num_steps} \
    --num_eval_steps={num_eval_steps}

Using TensorFlow backend.
W0210 05:23:11.894076 140036665313152 model_lib.py:841] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 30000
I0210 05:23:11.894387 140036665313152 config_util.py:552] Maybe overwriting train_steps: 30000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0210 05:23:11.894603 140036665313152 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0210 05:23:11.894768 140036665313152 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0210 05:23:11.894925 140036665313152 config_util.py:552] Maybe overwriting eval_num_epochs: 1
W0210 05:23:11.895116 140036665313152 model_lib.py:857] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs

In [None]:
# The final model checkpoint name shows the number of training steps completed
!ls {model_dir}

checkpoint
eval_0
events.out.tfevents.1644470612.77315630fa79
export
graph.pbtxt
model.ckpt-23985.data-00000-of-00001
model.ckpt-23985.index
model.ckpt-23985.meta
model.ckpt-25498.data-00000-of-00001
model.ckpt-25498.index
model.ckpt-25498.meta
model.ckpt-27016.data-00000-of-00001
model.ckpt-27016.index
model.ckpt-27016.meta
model.ckpt-28529.data-00000-of-00001
model.ckpt-28529.index
model.ckpt-28529.meta
model.ckpt-30000.data-00000-of-00001
model.ckpt-30000.index
model.ckpt-30000.meta
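
Given a listing like the one above, the newest checkpoint can be picked by parsing the step number out of each `.meta` filename. The sketch below is standalone and uses a hypothetical helper name, `latest_checkpoint_prefix`; it only illustrates the idea.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return the checkpoint prefix with the highest step number, or None."""
    best_step, best_prefix = -1, None
    for name in filenames:
        match = re.fullmatch(r'(model\.ckpt-(\d+))\.meta', name)
        if match and int(match.group(2)) > best_step:
            best_step, best_prefix = int(match.group(2)), match.group(1)
    return best_prefix


listing = ['checkpoint', 'model.ckpt-23985.meta', 'model.ckpt-30000.index',
           'model.ckpt-30000.meta', 'model.ckpt-28529.meta']
print(latest_checkpoint_prefix(listing))  # model.ckpt-30000
```

TensorFlow 1.x also offers `tf.train.latest_checkpoint(model_dir)`, which reads the `checkpoint` file instead of parsing filenames.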


# Exporting a Trained Inference Graph


The trained inference graph for the object detection task is exported here. It is then downloaded to my local system and uploaded to a Google Drive folder, so that the graph can be imported directly from Google Drive. The label map is uploaded to Google Drive as well.

In [None]:
import re
import numpy as np

output_directory = './fine_tuned_model'

lst = os.listdir(model_dir)
lst = [name for name in lst if 'model.ckpt-' in name and '.meta' in name]
# Raw string avoids an invalid-escape warning for \d
steps = np.array([int(re.findall(r'\d+', name)[0]) for name in lst])
last_model = lst[steps.argmax()].replace('.meta', '')

last_model_path = os.path.join(model_dir, last_model)
print(last_model_path)
!python /content/models/research/object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path={pipeline_fname} \
    --output_directory={output_directory} \
    --trained_checkpoint_prefix={last_model_path}

training/model.ckpt-30000
Using TensorFlow backend.
Instructions for updating:
Please use `layer.__call__` method instead.
W0210 08:46:14.592627 140160693356416 deprecation.py:323] From /usr/local/lib/python3.7/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0210 08:46:17.238635 140160693356416 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0210 08:46:17.410387 140160693356416 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0210 08:46:17.455281 140160693356416 convolutional_box_predictor.py:156] depth of additional conv before box predic

In [None]:
!ls {output_directory}

checkpoint			model.ckpt.index  saved_model
frozen_inference_graph.pb	model.ckpt.meta
model.ckpt.data-00000-of-00001	pipeline.config


# Download the model `.pb` file

In [None]:
import os

pb_fname = os.path.join(os.path.abspath(output_directory), "frozen_inference_graph.pb")
assert os.path.isfile(pb_fname), '`{}` does not exist'.format(pb_fname)

In [None]:
!ls -alh {pb_fname}

-rw-r--r-- 1 root root 19M Feb 10 08:46 /content/models/research/fine_tuned_model/frozen_inference_graph.pb


Downloading the model file

In [None]:
from google.colab import files
files.download(pb_fname)


Downloading the label map file

In [None]:
from google.colab import files
files.download(label_map_pbtxt_fname)


# Inference on video

Before running inference, run the cells under the following headings to install the necessary packages:
- Install required packages
- Download base model

##Mounting GDrive, installing and importing necessary packages

In [None]:
# Mounting my Google Drive to be able to import the model graph and the label map
from google.colab import drive
drive.mount("/content/drive")

Mounted at /content/drive


In [None]:
# Select TensorFlow 1 again if notebook is used for inferencing 
%tensorflow_version 1.x

PyTesseract is needed for Optical Character Recognition (OCR).

In [None]:
!pip install pytesseract

Collecting pytesseract
  Downloading pytesseract-0.3.8.tar.gz (14 kB)
Building wheels for collected packages: pytesseract
  Building wheel for pytesseract (setup.py) ... done
  Created wheel for pytesseract: filename=pytesseract-0.3.8-py2.py3-none-any.whl size=14070 sha256=28e5361e4daa0b4d0859ae989f8db56d63dcaca3a12ddd1b226de738768461f7
  Stored in directory: /root/.cache/pip/wheels/a4/89/b9/3f11250225d0f90e5454fcc30fd1b7208db226850715aa9ace
Successfully built pytesseract
Installing collected packages: pytesseract
Successfully installed pytesseract-0.3.8


In [None]:
!sudo apt install tesseract-ocr

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  cuda-command-line-tools-10-0 cuda-command-line-tools-10-1
  cuda-command-line-tools-11-0 cuda-compiler-10-0 cuda-compiler-10-1
  cuda-compiler-11-0 cuda-cuobjdump-10-0 cuda-cuobjdump-10-1
  cuda-cuobjdump-11-0 cuda-cupti-10-0 cuda-cupti-10-1 cuda-cupti-11-0
  cuda-cupti-dev-11-0 cuda-documentation-10-0 cuda-documentation-10-1
  cuda-documentation-11-0 cuda-documentation-11-1 cuda-gdb-10-0 cuda-gdb-10-1
  cuda-gdb-11-0 cuda-gpu-library-advisor-10-0 cuda-gpu-library-advisor-10-1
  cuda-libraries-10-0 cuda-libraries-10-1 cuda-libraries-11-0
  cuda-memcheck-10-0 cuda-memcheck-10-1 cuda-memcheck-11-0 cuda-nsight-10-0
  cuda-nsight-10-1 cuda-nsight-11-0 cuda-nsight-11-1 cuda-nsight-compute-10-0
  cuda-nsight-compute-10-1 cuda-nsight-compute-11-0 cuda-nsight-compute-11-1
  cuda-nsight-systems-10-1 cuda-nsight-systems-

In [None]:
%cd /content/models/research/object_detection

import numpy as np
import os
import cv2
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import time

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops


# This is needed to display the images.
%matplotlib inline


from object_detection.utils import label_map_util

from object_detection.utils import visualization_utils as vis_util

# This is needed to recognize and extract text from images
import pytesseract

/content/models/research/object_detection


##Cells containing helper functions to be run if the notebook is used for inference and not training

From the label map, get the number of classes to be detected in the task

In [None]:
def get_num_classes(pbtxt_fname):
    from object_detection.utils import label_map_util
    label_map = label_map_util.load_labelmap(pbtxt_fname)
    categories = label_map_util.convert_label_map_to_categories(
        label_map, max_num_classes=90, use_display_name=True)
    category_index = label_map_util.create_category_index(categories)
    return len(category_index.keys())

Basic function to load image into a Numpy array

In [None]:
def load_image_into_numpy_array(image):
  # OpenCV already returns frames as NumPy arrays, so a uint8 copy is enough.
  # The PIL round trip through getdata() is very slow for video frames.
  return np.array(image, dtype=np.uint8)

From the video frames, obtain the license plate ROI and perform Optical Character Recognition (OCR) to extract the license plate numbers.
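
The ROI step relies on the detection boxes being normalized `[ymin, xmin, ymax, xmax]` values that are scaled by the frame size. A minimal standalone sketch of that conversion follows; the helper name `box_to_pixels` and the clamping to the frame bounds are additions for illustration and are not part of the function below.

```python
def box_to_pixels(box, height, width):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to integer pixel bounds."""
    ymin, xmin, ymax, xmax = box
    # Clamp to the frame so a slightly out-of-range box cannot produce bad slices.
    top = max(0, int(ymin * height))
    left = max(0, int(xmin * width))
    bottom = min(height, int(ymax * height))
    right = min(width, int(xmax * width))
    return top, left, bottom, right


print(box_to_pixels([0.25, 0.5, 0.5, 0.75], height=720, width=1280))
# (180, 640, 360, 960)
```

With these bounds, the region of interest is the slice `image[top:bottom, left:right]`.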

In [None]:
def ocr_it(image, detections, detection_threshold=0.7, region_threshold=0.6):
    
    # Scores, boxes and classes above the detection threshold
    # (detections are assumed to be sorted by descending score)
    scores = list(filter(lambda x: x> detection_threshold, detections['detection_scores']))
    boxes = detections['detection_boxes'][:len(scores)]
    classes = detections['detection_classes'][:len(scores)]
    
    # Full image dimensions
    width = image.shape[1]
    height = image.shape[0]
    
    # Apply ROI filtering and OCR
    for idx, box in enumerate(boxes):
        roi = box*[height, width, height, width]
        region = image[int(roi[0]):int(roi[2]),int(roi[1]):int(roi[3])]
        
        # Display the ROI
        plt.imshow(region, cmap='gray')
        plt.title("Detected box")
        plt.show()

        ocr_result = pytesseract.image_to_string(region, lang ='eng', config = '--oem 3 --psm 7 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
        
        print("Detected license plate: ", ocr_result)

        # Return after the first (highest-scoring) box; any remaining boxes are skipped
        return ocr_result
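
Raw OCR output often contains whitespace, punctuation or empty strings, which would otherwise end up in the log file. As one possible complement to image preprocessing, the text can be filtered before it is written. This is a sketch only: the helper name `clean_plate_text` and the length bounds `min_len` and `max_len` are assumptions, not values derived from the dataset.

```python
import re


def clean_plate_text(raw, min_len=4, max_len=8):
    """Normalize OCR output; return None when it does not look like a plate."""
    # Keep only uppercase letters and digits, dropping whitespace and punctuation.
    text = re.sub(r'[^A-Z0-9]', '', raw.upper())
    if min_len <= len(text) <= max_len:
        return text
    return None


print(clean_plate_text(' abc 1234\n'))     # ABC1234
print(clean_plate_text('. & - \u2019\n'))  # None
```

A caller could skip the write whenever the helper returns `None`, so that only plausible plate strings reach the .txt file.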

##Running inference on my own downloaded test video

Initialize a text file in Google Drive to store the recognized license plate numbers

In [None]:
license_plate_detection_output_logfile = "/content/drive/My Drive/own_test_video_detected_license_plate_numbers_tf1.txt"
# The with-block closes the file automatically, so no explicit close() is needed
with open(license_plate_detection_output_logfile, 'w') as writefile:
    writefile.write("Detected license plate numbers: \n")

In [None]:
# Initialize the input video file path
video_file = "/content/drive/My Drive/own_test_video.mp4"

# Initialize path to store the output video
video_path_out = "/content/drive/My Drive/own_test_video_detection_output_tf1.mp4"

# Initialize the path to the label map
PATH_TO_LABELS = "/content/drive/My Drive/Plates_label_map.pbtxt"

# Initialize the path to the uploaded model file
PATH_TO_CKPT = "/content/drive/My Drive/frozen_inference_graph.pb"

# To keep track of the number of frames run
frame_number = 0
video_out = None

# Load checkpoint and initialize a new graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

graph = detection_graph
num_classes = get_num_classes(PATH_TO_LABELS)
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with graph.as_default():
  with tf.Session() as sess:
    # Get handles to input and output tensors
    ops = tf.get_default_graph().get_operations()
    all_tensor_names = {output.name for op in ops for output in op.outputs}
    tensor_dict = {}
    for key in [
        'num_detections', 'detection_boxes', 'detection_scores',
        'detection_classes', 'detection_masks'
    ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
            tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)

    # Read video file
    video_capture = cv2.VideoCapture(video_file)
    
    # Get the total number of frames in the video
    total = int(video_capture.get(cv2.CAP_PROP_FRAME_COUNT))
    print("Total number of frames: ", total)

    while frame_number < total:
      
      # Start time to measure FPS
      start_time = time.time()

      # Get current frame; stop if the capture cannot deliver one
      ok, image = video_capture.read()
      if not ok:
        break
      frame_number += 1
      print("Current frame: ", frame_number)

      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)

      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict, feed_dict={image_tensor: np.expand_dims(image_np, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]

      # Visualization of the results of a detection.
      output_frame = vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          output_dict['detection_boxes'],
          output_dict['detection_classes'],
          output_dict['detection_scores'],
          category_index,
          instance_masks=output_dict.get('detection_masks'),
          use_normalized_coordinates=True,
          line_thickness=8)
      
      try:
        print("Detecting license plate")
        # Convert the unannotated frame to grayscale and binarize it (Otsu) so OCR has cleaner input;
        # image_np already has boxes and labels drawn on it by the visualization step.
        gray_image_np = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        _, binary_gray_image_np = cv2.threshold(gray_image_np, 0, 255, cv2.THRESH_OTSU)
        license_plate_text = ocr_it(binary_gray_image_np, output_dict)

        # Append the recognized text to the detection output log file
        if license_plate_text is not None:
          with open(license_plate_detection_output_logfile, "a") as writefile:
            writefile.write(license_plate_text)
          print("Written to file")

      except Exception as error:
        print("Unable to read license plate:", error)
      
      # Write each frame to an output video file
      if video_out is None:
        fourcc = cv2.VideoWriter_fourcc(*"MJPG")
        video_out = cv2.VideoWriter(video_path_out, fourcc, 30, (output_frame.shape[1], output_frame.shape[0]), True)

      video_out.write(output_frame)

      # End time to measure how long this frame took
      end_time = time.time()

      # Per-frame FPS is the reciprocal of the time spent on this frame
      seconds = end_time - start_time
      fps = 1.0 / seconds if seconds > 0 else float("inf")

      print("FPS: ", str(fps))

      cv2.waitKey(1)

video_out.release()
video_capture.release()

Streaming output truncated to the last 5000 lines.
Detecting license plate
Detected license plate:  
Written to file
FPS:  2283.563544992424
Current frame:  2319
Detecting license plate
Detected license plate:  
Written to file
FPS:  2287.6448539967164
Current frame:  2320
Detecting license plate
Detected license plate:  
Written to file
FPS:  2294.9944468135473
Current frame:  2321
Detecting license plate
Detected license plate:  
Written to file
FPS:  2285.9258197547997
Current frame:  2322
Detecting license plate
Detected license plate:  
Written to file
FPS:  2297.135208986493
Current frame:  2323
Detecting license plate
Detected license plate:  . & - ’

Written to file
FPS:  2257.7656019149495
Current frame:  2324
Detecting license plate
Detected license plate:  
Written to file
FPS:  2301.7233290189665
Current frame:  2325
Detecting license plate
Detected license plate:  
Written to file
FPS:  2291.655591090628
Current frame:  2326
Detecting license plate
De
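
Throughput for a loop like the one above can be reported per frame or averaged over the whole run. In the minimal sketch below (hypothetical helper names), per-frame FPS is the reciprocal of one frame's processing time, and average FPS divides the frames processed by the total elapsed time. Dividing the video's total frame count by a single frame's time instead gives values in the thousands, like those in the log above, rather than a true rate.

```python
def frame_fps(start_time, end_time):
    """Frames per second implied by the time spent on a single frame."""
    seconds = end_time - start_time
    return 1.0 / seconds if seconds > 0 else float('inf')


def average_fps(frames_processed, total_seconds):
    """Average frames per second over a whole run."""
    return frames_processed / total_seconds if total_seconds > 0 else 0.0


print(frame_fps(10.0, 10.5))    # 2.0
print(average_fps(300, 120.0))  # 2.5
```

The average is usually the more stable figure to quote, since single-frame timings fluctuate.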

##Running inference on given test video (trimmed)

In [None]:
license_plate_detection_output_logfile = "/content/drive/My Drive/test_video_detected_license_plate_numbers_tf1.txt"
# The with-block closes the file automatically, so no explicit close() is needed
with open(license_plate_detection_output_logfile, 'w') as writefile:
    writefile.write("Detected license plate numbers: \n")

In [None]:
# Initialize the input video file path
video_file = "/content/drive/My Drive/test_video_trimmed.mp4"

# Initialize path to store the output video
video_path_out = "/content/drive/My Drive/test_video_detection_output_tf1.mp4"

# Initialize the path to the label map
PATH_TO_LABELS = "/content/drive/My Drive/Plates_label_map.pbtxt"

# Initialize the path to the uploaded model file
PATH_TO_CKPT = "/content/drive/My Drive/frozen_inference_graph.pb"

# To keep track of the number of frames run
frame_number = 0
video_out = None

# Load checkpoint and initialize a new graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')

graph = detection_graph
num_classes = get_num_classes(PATH_TO_LABELS)
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=num_classes, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with graph.as_default():
  with tf.Session() as sess:
    # Get handles to input and output tensors
    ops = tf.get_default_graph().get_operations()
    all_tensor_names = {output.name for op in ops for output in op.outputs}
    tensor_dict = {}
    for key in [
        'num_detections', 'detection_boxes', 'detection_scores',
        'detection_classes', 'detection_masks'
    ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
            tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(tensor_name)

    # Read video file
    video_capture = cv2.VideoCapture(video_file)
    
    # Get the total number of frames in the video
    total = int(video_capture.get(cv2.CAP_PROP_FRAME_COUNT))
    print("Total number of frames: ", total)

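    while frame_number < total:
      
      # Start time to measure FPS
      start_time = time.time()

      # Get current frame; stop early if it could not be read
      # (the frame count reported by the container is not always exact)
      success, image = video_capture.read()
      if not success:
        break
      frame_number += 1
      print("Current frame: ", frame_number)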

      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)

      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference on the batched frame prepared above
      output_dict = sess.run(tensor_dict, feed_dict={image_tensor: image_np_expanded})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict['detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]

      # Visualization of the results of a detection.
      output_frame = vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          output_dict['detection_boxes'],
          output_dict['detection_classes'],
          output_dict['detection_scores'],
          category_index,
          instance_masks=output_dict.get('detection_masks'),
          use_normalized_coordinates=True,
          line_thickness=8)
      
      try:
        print("Detecting license plate")
        # Convert the frame to grayscale and binarize it with Otsu's threshold to make OCR easier
        gray_image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2GRAY)
        _, binary_gray_image_np = cv2.threshold(gray_image_np, 0, 255, cv2.THRESH_OTSU)
        license_plate_text = ocr_it(binary_gray_image_np, output_dict)

        # Append any recognized text to the detection output log file
        if license_plate_text is not None:
          with open(license_plate_detection_output_logfile, "a") as writefile:
            writefile.write(license_plate_text)
          print("Written to file")

      except Exception as e:
        # OCR is best-effort: report the failure and continue with the next frame
        print("Unable to read license plate:", e)
      
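      # Write each frame to an output video file.
      # The writer is created lazily so that its size matches the first output frame.
      # "mp4v" is used because the output path has an .mp4 extension; the frame rate is fixed at 30.
      if video_out is None:
        fourcc = cv2.VideoWriter_fourcc(*"mp4v")
        video_out = cv2.VideoWriter(video_path_out, fourcc, 30, (output_frame.shape[1], output_frame.shape[0]), True)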

      video_out.write(output_frame)

      # End time to measure how long this frame took to process
      end_time = time.time()
 
      # FPS for this frame is the reciprocal of its processing time
      seconds = end_time - start_time
      fps = 1 / seconds

      print("FPS: ", str(fps))

      cv2.waitKey(1)

# Release the writer (only created once a frame has been processed) and the capture
if video_out is not None:
    video_out.release()
video_capture.release()

Streaming output truncated to the last 5000 lines.
Current frame:  953
Detecting license plate
Detected license plate:  WBE

Written to file
FPS:  1145.9376145072108
Current frame:  954
Detecting license plate
Detected license plate:  Ket MW

Written to file
FPS:  1134.616618567097
Current frame:  955
Detecting license plate
Detected license plate:  Ha

Written to file
FPS:  1135.4956357390358
Current frame:  956
Detecting license plate
Detected license plate:  Ram.

Written to file
FPS:  1135.8498863187258
Current frame:  957
Detecting license plate
FPS:  1286.3156551000704
Current frame:  958
Detecting license plate
FPS:  1318.3839660678404
Current frame:  959
Detecting license plate
FPS:  1304.020522992297
Current frame:  960
Detecting license plate
FPS:  1317.3328736529354
Current frame:  961
Detecting license plate
FPS:  1307.0144311630206
Current frame:  962
Detecting license plate
FPS:  1295.7692269085157
Current frame:  963
Detecting license plate
Detected lic