# Object Detection with Faster R-CNN Transfer Learning
## Initial document from Adrian Meyer and further processing by Elia Ferrari and Marius Hürzeler



# Download Tensorflow Repo and Python Modules
Executing the first code snippet initializes your virtual Linux-style machine. Use the little arrow ">" in the top left corner to view the file system of your hosted system.
You can run UNIX-style shell commands (for example installations) with the prefix ! and notebook magic commands such as %cd with the prefix %.

In [0]:
#make sure numpy is downgraded for compatibility reasons.
!pip install numpy==1.17.4

In [0]:
%cd
%tensorflow_version 1.x

#make sure to be in /root and that tensorflow is running in version 1.15.2
#%load_ext tensorboard# Load the TensorBoard notebook extension 
import tensorflow as tf
print(tf.__version__)
#!rm -rf ./logs/#remove logs from previous runs

In [0]:
"""
This allows you to check which GPU you have been allocated. Google offers free
Tesla T4, Tesla K80 and Tesla P100 GPUs (the P100 has 1.6x more GFLOPs and 3x the memory bandwidth of the K80; the T4 is fairly slow).
In theory you can restart the environment until you get the fast one.
For testing and learning it doesn't really matter.
"""
  
#We have to work with Tensorflow 1.15.2 for code compatibility reasons; by now TF v2 is available.
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
!nvidia-smi

In [0]:
%cd
%tensorflow_version 1.x

#make sure to be in /root and that tensorflow is running in version 1.15.2
import tensorflow as tf
print(tf.__version__)

"""
This repository contains a number of different models implemented in TensorFlow. The official models are a collection of example models that use TensorFlow's high-level APIs. They are intended to be well-maintained, tested, and kept up to date with the latest stable TensorFlow API. They should also be reasonably optimized for fast performance while still being easy to read. We especially recommend that newer TensorFlow users start here.
The research models are a large collection of models implemented in TensorFlow by researchers. They are not officially supported or available in release branches; it is up to the individual researchers to maintain the models and/or provide support on issues and pull requests.
The samples folder contains code snippets and smaller models that demonstrate features of TensorFlow, including code presented in various blog posts.
"""
!git clone https://github.com/tensorflow/models.git

!apt-get install protobuf-compiler python-tk

"""
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data; similar to JSON or XML.
"""
!pip install Cython contextlib2 pillow lxml matplotlib PyDrive
"""
These modules are necessary Python packages. Cython is especially important: it allows calling native C or C++ code from within Python.
"""

!pip install pycocotools
"""
COCO is a large image dataset designed for object detection, segmentation, person keypoints detection, stuff segmentation, and caption generation. 
"""

%cd ~/models/research
!protoc object_detection/protos/*.proto --python_out=. 
#This compiles the Protobuf definitions of the TensorFlow Object Detection API into Python modules.

import os
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ':/root/models/research/:/root/models/research/slim/'
#This sets the module search path for the Python interpreter; the repository was cloned into /root/models.


# Install Tensorflow on Virtual Machine

In [0]:
!python setup.py build
!python setup.py install > /dev/null
"""
This snippet builds and installs the TensorFlow Object Detection API from the cloned git source.
"""

In [0]:
%cd slim
!pip install -e .

%cd ..
!python object_detection/builders/model_builder_test.py
"""
This tests whether the installation was successful. Every test should finish with the output [ OK ].
"""

# Upload and Import Dataset

In [0]:
#Download your dataset here
%cd /datalab
!wget https://drive.switch.ch/index.php/s/tqRSOcRs0FxUzvF/download 

In [0]:
%cd /datalab
!mv download datensatz.zip
#In case you have unwanted folders remaining in your file system use this command: !rm -r FOLDERNAME
!unzip datensatz.zip #Scroll through the unzip output to get an idea of the datalab folder content.

# Data and Model Preparation
The dataset has to be transformed into a readable and trainable format. This step includes reading the XML information, generating bounding boxes and producing mathematical tensors as input for the network architecture.
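
To make "reading the XML information" concrete, here is a minimal sketch of how one annotation file could be turned into table rows. It assumes the annotations follow the Pascal VOC layout; the sample XML, the file name `street_001.jpg` and the helper `parse_voc_annotation` are hypothetical and are not taken from the `xml_to_csv.py` script used further down.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in Pascal VOC layout (assumed format, for illustration only).
SAMPLE_XML = """
<annotation>
  <filename>street_001.jpg</filename>
  <size><width>1000</width><height>800</height><depth>3</depth></size>
  <object>
    <name>car</name>
    <bndbox><xmin>120</xmin><ymin>300</ymin><xmax>420</xmax><ymax>520</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>600</xmin><ymin>250</ymin><xmax>680</xmax><ymax>560</ymax></bndbox>
  </object>
</annotation>
"""

def parse_voc_annotation(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    rows = []
    for obj in root.iter('object'):
        box = obj.find('bndbox')
        rows.append((
            filename, width, height, obj.findtext('name'),
            int(box.findtext('xmin')), int(box.findtext('ymin')),
            int(box.findtext('xmax')), int(box.findtext('ymax')),
        ))
    return rows

rows = parse_voc_annotation(SAMPLE_XML)
for row in rows:
    print(row)
```

Each labelled object becomes one row, so an image with several objects contributes several rows to the table.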

In [0]:
%cd ..
%cd /datalab

!echo "item { id: 1 name: 'car'}" > label_map.pbtxt
!echo "item { id: 2 name: 'bike'}" >> label_map.pbtxt
!echo "item { id: 3 name: 'person'}" >> label_map.pbtxt
!echo "item { id: 4 name: 'tram'}" >> label_map.pbtxt
!echo "item { id: 5 name: 'motorbike'}" >> label_map.pbtxt

image_files=os.listdir('images')
im_files=[x.split('.')[0] for x in image_files]
with open('annotations/trainval.txt', 'w') as text_file:
  for row in im_files:
    text_file.write(row + '\n')
    print(row)

We need to write our label names (for example "car") into a config file that defines all detectable classes.
It can contain one or multiple classes. To start a new file, use the ">" redirection operator; to append a line, use ">>".

Then we iterate through all image files to extract the file names (paths are not relevant) which we want to use for training and validation.
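
The label map can also be generated from a Python list instead of one `echo` line per class. This is only a sketch: `build_label_map` is a hypothetical helper, and the class list simply repeats the five names used above.

```python
def build_label_map(class_names):
    """Return label map text with one item per class; ids start at 1."""
    entries = []
    for class_id, name in enumerate(class_names, start=1):
        entries.append("item { id: %d name: '%s' }" % (class_id, name))
    return '\n'.join(entries) + '\n'

classes = ['car', 'bike', 'person', 'tram', 'motorbike']
label_map_text = build_label_map(classes)
print(label_map_text)

# To write the file, one could use:
# with open('label_map.pbtxt', 'w') as f:
#     f.write(label_map_text)
```

Generating the file from a single list keeps the ids consecutive and makes it easy to check that the class count matches the `num_classes` value in the training config.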

In [0]:
%cd /datalab
!python xml_to_csv.py
#This script takes the XML annotations from the 'train' and 'test' folders and writes them as a list into 
#two CSV table files.

## Generate Bounding Boxes on Images for RPN Network Training
The same process needs to be performed with the XML annotation files.
Additionally, we write one PNG file per annotation as a mask (trimap) for the labelled areas; the cell below creates blank 1000 x 800 pixel placeholder images.

In [0]:
%cd /datalab/annotations
!rm -r trimaps
!mkdir trimaps

from PIL import Image
image = Image.new('RGB', (1000, 800))

for filename in os.listdir('xmls'):
  filename = os.path.splitext(filename)[0]
  image.save('trimaps/' + filename + '.png')

## Generate Labelled Tensor Matrices (tf_records)
The TensorFlow Record files contain the actual input data for the machine learning process in binary format. An API-specific script does the job for us. We use a model pretrained on the well-known COCO dataset in our transfer learning process. The dataset needs to be split into training and validation data at this point: 80% of our data should be used for training, the remaining 20% for validation (testing).
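
A reproducible 80/20 split of image names could look like the sketch below. The function `split_train_val`, the seed and the example names are assumptions for illustration; the notebook itself reads an already prepared split from the `splitted` folder.

```python
import random

def split_train_val(names, train_fraction=0.8, seed=42):
    """Shuffle the names with a fixed seed and cut them into two lists."""
    shuffled = sorted(names)               # sort first so the result is independent of input order
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]

# Hypothetical image names, for illustration only.
image_names = ['img_%03d' % i for i in range(10)]
train_names, val_names = split_train_val(image_names)
print(len(train_names), len(val_names))
```

Fixing the seed means the same images end up in the validation set on every run, which keeps evaluation results comparable between trainings.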

In [0]:
%cd /datalab
!python generate_tfrecord.py --csv_input=splitted/train_labels.csv --image_dir=splitted/train --output_path=tf_train.record
!python generate_tfrecord.py --csv_input=splitted/test_labels.csv --image_dir=splitted/test --output_path=tf_val.record



## Download the Model Checkpoint you want to use for Transfer Learning
Many different COCO-pretrained neural models can be used for bounding-box-based object detection with TensorFlow.
They all have different advantages and disadvantages (e.g. inference speed, accuracy, ease of training).

An overview can be found with the [TF Model Zoo](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md).



In [0]:
%cd /datalab
!wget http://download.tensorflow.org/models/object_detection/faster_rcnn_inception_v2_coco_2018_01_28.tar.gz
  
%cd /datalab
!tar -xvzf faster_rcnn_inception_v2_coco_2018_01_28.tar.gz

%cd /datalab
!mv faster_rcnn_inception_v2_coco_2018_01_28 pretrained_model

## Configure the Paths and Training Parameters
This specifies which files and model checkpoints should be used for the training process.

In [0]:
%cd /datalab

import re

#filename = '/datalab/pretrained_model/pipeline.config'
filename = '/root/models/research/object_detection/samples/configs/faster_rcnn_inception_v2_coco.config'
with open(filename) as f:
  s = f.read()
# num_classes must match the number of entries in label_map.pbtxt (5 classes).
s = re.sub(r'    num_classes: 90', '    num_classes: 5', s)
s = re.sub(r'PATH_TO_BE_CONFIGURED/model.ckpt', '/datalab/pretrained_model/model.ckpt', s)
s = re.sub(r'PATH_TO_BE_CONFIGURED/mscoco_train.record-\?\?\?\?\?-of-00100', '/datalab/tf_train.record', s)
s = re.sub(r'PATH_TO_BE_CONFIGURED/mscoco_val.record-\?\?\?\?\?-of-00010', '/datalab/tf_val.record', s)
s = re.sub(r'PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt', '/datalab/label_map.pbtxt', s)
with open(filename, 'w') as f:
  f.write(s)
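
After rewriting the config it can help to confirm that no placeholder was missed. The following sketch checks a config string for leftover `PATH_TO_BE_CONFIGURED` entries and for the expected class count. `check_pipeline_config` is a hypothetical helper and the sample text is a heavily shortened stand-in for the real config file.

```python
import re

def check_pipeline_config(config_text, expected_num_classes):
    """Return a list of human-readable problems found in a pipeline config string."""
    problems = []
    if 'PATH_TO_BE_CONFIGURED' in config_text:
        problems.append('unreplaced PATH_TO_BE_CONFIGURED placeholder')
    match = re.search(r'num_classes:\s*(\d+)', config_text)
    if match is None:
        problems.append('num_classes not found')
    elif int(match.group(1)) != expected_num_classes:
        problems.append('num_classes is %s, expected %d'
                        % (match.group(1), expected_num_classes))
    return problems

# Hypothetical, shortened config fragment for demonstration only.
sample = '''
model { faster_rcnn { num_classes: 90 } }
train_config { fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt" }
'''
problems = check_pipeline_config(sample, expected_num_classes=5)
print(problems)
```

An empty list means both checks passed; in the notebook one would pass the content of the edited config file instead of `sample`.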

# Training on GPU

As a rough estimate, the loss value of Faster R-CNN models should fall below 0.05 over a few thousand steps; after that the training can be aborted.

We configure automatic termination with `--num_train_steps` (3,000 steps for a short test run; the cell below uses 121,000 to continue from the 120,000-step checkpoint). In production training as many as 100,000 to 200,000 steps can be necessary.
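
Because the loss fluctuates from step to step, a single value below 0.05 is a weak stopping signal. One possible approach, sketched here under that assumption, is to look at the mean of the last few logged values. `first_step_below` and the loss numbers are made up for illustration and are not measured results.

```python
def first_step_below(losses, threshold=0.05, window=3):
    """Return the index of the first entry whose trailing mean loss is below threshold, or None."""
    for i in range(window - 1, len(losses)):
        recent = losses[i - window + 1:i + 1]
        if sum(recent) / window < threshold:
            return i
    return None

# Made-up loss values, one per logging interval (illustration only).
example_losses = [0.90, 0.40, 0.20, 0.08, 0.06, 0.04, 0.03, 0.05, 0.03]
stop_index = first_step_below(example_losses)
print(stop_index)
```

A larger `window` makes the criterion more conservative, at the cost of training slightly longer than strictly needed.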

In [0]:
import tensorflow as tf
print(tf.__version__)

In [0]:
# here you can download the last checkpoint
%cd /datalab
#!wget https://drive.switch.ch/index.php/s/Y7bPYdtf2C8FBYq/download #120000 steps with cars, people, bikes and motorbike

In [0]:
#unzip the data for the last checkpoint
%cd /datalab
!mv download chkpt.zip
#In case you have unwanted folders remaining in your file system use this command: !rm -r FOLDERNAME
!unzip chkpt.zip

In [0]:
%cd /datalab
%cp -R /datalab /content
#make a temporary copy of the dataset
#training for a higher number of steps increases the later achievable accuracy.

!python ~/models/research/object_detection/model_main.py \
    --pipeline_config_path=/root/models/research/object_detection/samples/configs/faster_rcnn_inception_v2_coco.config \
    --model_dir=/datalab/trained \
    --train_dir=/datalab/trained \
    --logtostderr \
    --logdir=/datalab/trained \
    --num_train_steps=121000 \
    --num_eval_steps=1000 \
    --max_evals=0

In [0]:
# zip the last checkpoint
%cd /datalab
!zip -r /checkpoint.zip trained

In [0]:
# download the last checkpoint as *.zip for further use
from google.colab import files
files.download("/checkpoint.zip")

# Export Inference Graph
Inference means applying the model to images that haven't been used for training.

We reserved a few images to check if our model performs correctly.

The frozen inference graph is generated from the last model checkpoint and contains all elements of the model necessary to perform inference (also on weaker hardware), but it cannot be used to continue training the model.

In [0]:
%cd /datalab

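import re

# Pick the checkpoint with the highest step number. Sorting the names as strings
# would rank model.ckpt-9999 above model.ckpt-10000.
meta_files = [k for k in os.listdir('trained') if re.match(r'model\.ckpt-\d+\.meta$', k)]
last_model = max(meta_files, key=lambda k: int(k.split('-')[1].split('.')[0])).replace('.meta', '')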

!python ~/models/research/object_detection/export_inference_graph.py \
    --input_type=image_tensor \
    --pipeline_config_path=/root/models/research/object_detection/samples/configs/faster_rcnn_inception_v2_coco.config \
    --output_directory=fine_tuned_model \
    --trained_checkpoint_prefix=trained/$last_model

# Run Inference

In [0]:
%cd /root/models/research/object_detection




import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile

from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image

# This is needed since the notebook is stored in the object_detection folder.
sys.path.append("..")
from object_detection.utils import ops as utils_ops

#if tf.__version__ < '1.4.0':
#  raise ImportError('Please upgrade your tensorflow installation to v1.4.* or later!')
  

  
  
# This is needed to display the images.
%matplotlib inline




from utils import label_map_util

from utils import visualization_utils as vis_util




# Path to the frozen detection graph. This is the actual model that is used for the object detection.
PATH_TO_CKPT = '/datalab/fine_tuned_model' + '/frozen_inference_graph.pb'

# Label map used to assign the correct class name to each box.
PATH_TO_LABELS = os.path.join('/content/datalab', 'label_map.pbtxt')

NUM_CLASSES = 5




detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph = fid.read()
    od_graph_def.ParseFromString(serialized_graph)
    tf.import_graph_def(od_graph_def, name='')
    
    
    
    
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)




def load_image_into_numpy_array(image):
  # Convert to RGB first so grayscale or RGBA images also yield a (height, width, 3) uint8 array.
  return np.array(image.convert('RGB'), dtype=np.uint8)




# Path to the images to be classified
path_to_images = '/datalab/images/'
images = os.listdir(path_to_images)
images_path = [os.path.join(path_to_images,i ) for i in images]

# Size, in inches, of the output images.
IMAGE_SIZE = (18, 12)




def run_inference_for_single_image(image, graph):
  with graph.as_default():
    with tf.Session() as sess:
      # Get handles to input and output tensors
      ops = tf.get_default_graph().get_operations()
      all_tensor_names = {output.name for op in ops for output in op.outputs}
      tensor_dict = {}
      for key in [
          'num_detections', 'detection_boxes', 'detection_scores',
          'detection_classes', 'detection_masks'
      ]:
        tensor_name = key + ':0'
        if tensor_name in all_tensor_names:
          tensor_dict[key] = tf.get_default_graph().get_tensor_by_name(
              tensor_name)
      if 'detection_masks' in tensor_dict:
        # The following processing is only for single image
        detection_boxes = tf.squeeze(tensor_dict['detection_boxes'], [0])
        detection_masks = tf.squeeze(tensor_dict['detection_masks'], [0])
        # Reframe is required to translate mask from box coordinates to image coordinates and fit the image size.
        real_num_detection = tf.cast(tensor_dict['num_detections'][0], tf.int32)
        detection_boxes = tf.slice(detection_boxes, [0, 0], [real_num_detection, -1])
        detection_masks = tf.slice(detection_masks, [0, 0, 0], [real_num_detection, -1, -1])
        detection_masks_reframed = utils_ops.reframe_box_masks_to_image_masks(
            detection_masks, detection_boxes, image.shape[0], image.shape[1])
        detection_masks_reframed = tf.cast(
            tf.greater(detection_masks_reframed, 0.5), tf.uint8)
        # Follow the convention by adding back the batch dimension
        tensor_dict['detection_masks'] = tf.expand_dims(
            detection_masks_reframed, 0)
      image_tensor = tf.get_default_graph().get_tensor_by_name('image_tensor:0')

      # Run inference
      output_dict = sess.run(tensor_dict,
                             feed_dict={image_tensor: np.expand_dims(image, 0)})

      # all outputs are float32 numpy arrays, so convert types as appropriate
      output_dict['num_detections'] = int(output_dict['num_detections'][0])
      output_dict['detection_classes'] = output_dict[
          'detection_classes'][0].astype(np.uint8)
      output_dict['detection_boxes'] = output_dict['detection_boxes'][0]
      output_dict['detection_scores'] = output_dict['detection_scores'][0]
      if 'detection_masks' in output_dict:
        output_dict['detection_masks'] = output_dict['detection_masks'][0]
  return output_dict

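# Create both output folders; exist_ok avoids an error (and a skipped folder) on reruns.
os.makedirs("/datalab/results", exist_ok=True)
os.makedirs("/datalab/label", exist_ok=True)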

counter = 0
for image_path in images_path:
  filename = image_path.split("/")[-1]
  image = Image.open(image_path)
  imname = os.path.basename(image_path).split('.')[0]
  # the array based representation of the image will be used later in order to prepare the
  # result image with boxes and labels on it.
  image_np = load_image_into_numpy_array(image)
  # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
  image_np_expanded = np.expand_dims(image_np, axis=0)
  # Actual detection.
  output_dict = run_inference_for_single_image(image_np, detection_graph)
  # Visualization of the results of a detection.
  '''vis_util.visualize_boxes_and_labels_on_image_array(
      image_np,
      output_dict['detection_boxes'],
      output_dict['detection_classes'],
      output_dict['detection_scores'],
      category_index,
      instance_masks=output_dict.get('detection_masks'),
      use_normalized_coordinates=True,
      line_thickness=4)
  plt.imsave( "/datalab/results/{}".format(filename),image_np)'''
  # Save the detections (score, class, box centroid) for this image.
  scores = list(output_dict['detection_scores'])
  classes = list(output_dict['detection_classes'])
  centroids = []
  for box in output_dict['detection_boxes']:
    # Boxes are normalized [ymin, xmin, ymax, xmax].
    centroidx = (box[1] + box[3]) / 2
    centroidy = (box[0] + box[2]) / 2
    centroids += [(centroidx, centroidy)]
  import csv
  csv_columns = ['detection_scores', 'detection_classes', 'detection_boxes']

  with open('/datalab/label/' + imname + '.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=',')
    writer.writerow(csv_columns)
    for i in range(output_dict['num_detections']):
      # Keep only detections with a confidence score above 0.70.
      if scores[i] > 0.70:
        writer.writerow([scores[i], classes[i], centroids[i]])
  if counter%10==0:
    print(counter)
  counter += 1
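
The detection boxes returned by the Object Detection API are normalized to the range 0..1 and ordered as [ymin, xmin, ymax, xmax]. To place a centroid on the original image it has to be scaled by the image size. The helper `box_centroid_px` below is a hypothetical sketch of that conversion, with made-up box values.

```python
def box_centroid_px(box, image_width, image_height):
    """Convert a normalized [ymin, xmin, ymax, xmax] box to an (x, y) pixel centroid."""
    ymin, xmin, ymax, xmax = box
    x = (xmin + xmax) / 2.0 * image_width
    y = (ymin + ymax) / 2.0 * image_height
    return x, y

# Made-up box on a 1000 x 800 pixel image, for illustration only.
centroid = box_centroid_px([0.25, 0.125, 0.75, 0.375], image_width=1000, image_height=800)
print(centroid)
```

In the loop above, the image size is available as `image.size`, which PIL returns as (width, height).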

In [0]:
# play sound when finished
from google.colab import output
output.eval_js('new Audio("https://upload.wikimedia.org/wikipedia/commons/0/05/Beep-09.ogg").play()')

In [0]:
# zip the classified images and the centroids data
!zip -r /datalab/label.zip /datalab/label/
!zip -r /datalab/results.zip /datalab/results/

In [0]:
# download the results (absolute paths, because the working directory is no longer /)
from google.colab import files
files.download("/datalab/label.zip")
files.download("/datalab/results.zip")