# 0. Introduction

This notebook provides the pipeline I used to detect and track vehicles using [R-FCN](https://arxiv.org/abs/1605.06409) and [MDP](http://cvgl.stanford.edu/projects/MDP_tracking/).  
It works the same way with other models: just use a different [config file](https://github.com/josueBulle/detection-and-tracking-from-uav/tree/master/tf_configs).
## Detection
Model: R-FCN.  
Uses: Tensorflow object_detection api.
### Dataset preparation
The provided custom dataset is formatted for Darknet (YOLO). Tensorflow's implementation uses its own format, known as *tf records*. The conversion can be done with the script [createRecords.py](https://github.com/josueBulle/detection-and-tracking-from-uav/blob/master/helpers/createRecords.py).   
To skip this step, the whole dataset, already converted for Tensorflow, is available [here](https://drive.google.com/file/d/1gSmlXO3VUulRp_-CxN-H9Gxgu_YOR2qP/view?usp=sharing).
(For better handling and parallelization, the dataset is sharded into multiple parts.)

In either case, the path to the dataset must be set in the *rfcn.config* file.
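
To give an idea of what the conversion involves, here is a minimal sketch of the box transformation only. It is not taken from createRecords.py: the function name `darknet_to_tf_box` is hypothetical, and it assumes the usual Darknet label layout (`<class> <x_center> <y_center> <width> <height>`, all normalized) and the normalized corner coordinates used by the object_detection API.

```python
def darknet_to_tf_box(line):
    """Convert one Darknet label line to (class_id, ymin, xmin, ymax, xmax).

    Darknet stores "<class> <x_center> <y_center> <width> <height>", all
    normalized to [0, 1]. The Tensorflow object_detection API expects
    normalized corner coordinates and class ids starting at 1.
    """
    parts = line.split()
    class_id = int(parts[0]) + 1  # Darknet ids are 0-based; id 0 is reserved in the label map
    xc, yc, w, h = (float(v) for v in parts[1:5])

    def clamp(v):
        # Keep coordinates inside the image
        return min(max(v, 0.0), 1.0)

    return (class_id,
            clamp(yc - h / 2), clamp(xc - w / 2),
            clamp(yc + h / 2), clamp(xc + w / 2))


print(darknet_to_tf_box("0 0.5 0.5 0.2 0.4"))
```

A real conversion also has to encode the image and write everything into a `tf.train.Example`; the script linked above takes care of that part.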
### Training
The config file *rfcn.config* contains all the parameters and paths to the training/evaluation datasets.  
The file *dataset/label_map.pbtxt*  contains the class names (car, truck, pedestrian etc.).   
Tensorboard can be used to monitor the training.
### Inference
To skip training, inference can be run with the [trained model](https://drive.google.com/open?id=1S_dpNJe9bFQU4jTA_YGuAC0QUpsvVUll), which was trained on this custom dataset.
## Tracking
Algorithm: MDP  
Uses: Matlab  
The official source code is available at [MDP_tracking](https://github.com/yuxng/MDP_Tracking).

# 1. Installation

Clone my GitHub repo.

In [0]:
%%shell
# Clone the git project which includes tensorflow's object_detection, slim, pycocotools into /content/detection/

cd /content/
git clone https://github.com/josueBulle/detection-and-tracking-from-uav.git detection

In [0]:
import os, sys, subprocess
from google.colab import drive

# connect to gDrive
drive.mount('/content/drive')
!ln -s "/content/drive/My Drive/" "/MyDrive"

# set working dir and export path
ROOT = '/content/detection/'
os.chdir(ROOT)
sys.path.append(ROOT + "tensorflow")
sys.path.append(ROOT + "tensorflow/object_detection")
sys.path.append(ROOT + "tensorflow/slim")
%set_env PYTHONPATH=/env/python:/content/detection/tensorflow:/content/detection/tensorflow/slim:/content/detection/tensorflow/object_detection

# 2. Training

## 2.1 Retrieve the dataset and pre-trained weights

Tensorflow needs a dataset converted into tf_records (see [createRecords.py](https://github.com/josueBulle/detection-and-tracking-from-uav/blob/master/helpers/createRecords.py)).


This cell can take a few minutes: the connection between gDrive and gColab is sometimes slow and the dataset weighs ~3 GB.  
1. Copy the tf_records from gDrive (assuming the dataset is at the gDrive root directory)
2. Download weights pre-trained on the COCO dataset (to be used as the initial checkpoint)

In [0]:
%%shell
# Copy the dataset from gDrive to gColab
PATH_TO_TF_RECORDS="/MyDrive/aerial_dataset"
cp -a "$PATH_TO_TF_RECORDS" /content/detection/dataset/

# Download the pre-trained weights
mkdir -p /content/detection/models
cd /content/detection/models
wget -O tmp.tar.gz http://download.tensorflow.org/models/object_detection/rfcn_resnet101_coco_2018_01_28.tar.gz
tar -xzvf tmp.tar.gz
rm tmp.tar.gz

## 2.2 Training
The following cell is optional and allows us to monitor the training using tensorboard. (Requires an evaluation dataset).

In [0]:
# Tensorboard runs locally on gColab. "ngrok" allows us to access tensorboard through the internet.

LOG_DIR = "/MyDrive/rfcn-output/"
if not os.path.isdir(LOG_DIR):
  os.mkdir(LOG_DIR)

get_ipython().system_raw(
    'tensorboard --logdir {} --host 0.0.0.0 --port 6006 &'
    .format(LOG_DIR)
)
!chmod u+x /content/detection/tensorflow/ngrok
get_ipython().system_raw('/content/detection/tensorflow/ngrok authtoken 7aXuMYYxjUHa4p2afyRBB_6W9N2KsJQzXmXsqWni7MB')
get_ipython().system_raw('/content/detection/tensorflow/ngrok http 6006 &')
!sleep 4
!curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'] + '/#scalars')"
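
The last line of the cell above is rather dense, so here is the same parsing step written out as a small sketch. The `sample_response` string is a hypothetical, trimmed-down payload used for illustration only, and `tensorboard_url` is a name I made up; the real response from `http://localhost:4040/api/tunnels` contains more fields.

```python
import json

# Hypothetical, trimmed-down response from http://localhost:4040/api/tunnels
sample_response = '{"tunnels": [{"public_url": "https://example.ngrok.io", "proto": "https"}]}'


def tensorboard_url(response_text):
    """Return the public URL of the first tunnel, pointing at the scalars tab."""
    tunnels = json.loads(response_text)["tunnels"]
    if not tunnels:
        # Happens when ngrok has not finished starting; waiting longer usually helps
        raise RuntimeError("ngrok reported no open tunnel yet; wait and retry")
    return tunnels[0]["public_url"] + "/#scalars"


print(tensorboard_url(sample_response))
```

Checking for an empty tunnel list gives a clearer message than the `IndexError` the one-liner would raise if ngrok needs more than the 4 seconds of sleep.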

Start the training using the rfcn.config. SSD and Faster R-CNN configs can be found [here](https://github.com/josueBulle/detection-and-tracking-from-uav/tree/master/tf_configs).  
You might need to change the paths to the dataset and weights in the config file.

In [0]:
%%shell
# The checkpoints and outputs will be saved in the folder designated by the MODEL_DIR variable

cd /content/detection/tensorflow/object_detection

PIPELINE_CONFIG_PATH=/content/detection/rfcn.config
MODEL_DIR=/MyDrive/rfcn-output/
NUM_TRAIN_STEPS=80000
SAMPLE_1_OF_N_EVAL_EXAMPLES=1

python model_main.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --num_train_steps=${NUM_TRAIN_STEPS} \
    --sample_1_of_n_eval_examples=$SAMPLE_1_OF_N_EVAL_EXAMPLES \
    --alsologtostderr

# 3. Export inference graph

Finally, we need to export the trained model for inference (frozen inference graph).

In [0]:
def export_inference_graph(model_dir, config_path, output_dir):
  assert os.path.exists( config_path ), "File not found : {}".format(config_path)
  assert os.path.isdir( model_dir ), "Directory does not exist : {}".format(model_dir)
  
  if not os.path.isdir( output_dir ):
    os.mkdir( output_dir )

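  # Find the latest checkpoint: pick the highest step number
  # (a plain string sort would rank "model.ckpt-9999" after "model.ckpt-80000")
  ls = [ f for f in os.listdir( model_dir ) if f.startswith("model.ckpt-") ]
  ls.sort( key=lambda f: int(f.split('.')[1].split('-')[1]) )
  best_checkpoint = '.'.join( ls[-1].split('.')[0:2] )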
  
  args = ["python", "/content/detection/tensorflow/object_detection/export_inference_graph.py"]
  args.extend( ["--input_type", "image_tensor"] )
  args.extend( ["--pipeline_config_path", config_path] )
  args.extend( ["--trained_checkpoint_prefix", os.path.join(model_dir, best_checkpoint)] )
  args.extend( ["--output_directory", output_dir] )
  p = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
  output, err = p.communicate()
  print(err.decode())
  print(output.decode())


path_to_checkpoint = "/MyDrive/rfcn-output/"
path_to_config = "/content/detection/rfcn.config"
path_to_output = "/MyDrive/rfcn-output/exported/"

export_inference_graph(path_to_checkpoint, path_to_config, path_to_output)

### Just a local copy
!mkdir /content/detection/exported/
!cp -a /MyDrive/rfcn-output/exported/. /content/detection/exported/

# 4. Inference

If we skipped the training, we need to download the trained weights (from [here](https://drive.google.com/open?id=1S_dpNJe9bFQU4jTA_YGuAC0QUpsvVUll)). Otherwise skip the next cell.

In [0]:
%%shell
# This cell downloads the trained weights from the gDrive link

gfileid="1S_dpNJe9bFQU4jTA_YGuAC0QUpsvVUll"
destination_dir="/content/detection/"
destination_path="${destination_dir}rfcn.zip"

curl -c ./cookie -s -L "https://drive.google.com/uc?export=download&id=${gfileid}" > /dev/null
curl -Lb ./cookie "https://drive.google.com/uc?export=download&confirm=`awk '/download/ {print $NF}' ./cookie`&id=${gfileid}" -o ${destination_path}
unzip ${destination_path}
rm ${destination_path}

In [0]:
from helpers.inference import detect

# path to the exported frozen graph
graph_path = "/content/detection/exported"

# list of images to infer on
images_path = ["/content/detection/dataset/test-img-1.jpg", "/content/detection/dataset/test-img-2.jpg"]

# path to the label_map file containing the class names ("car", "truck", "pedestrian" etc.)
label_path = "/content/detection/dataset/label_map.pbtxt"

detections = detect( graph_path, label_path, images_path )
#(boxes, scores, classes, num) = detections[0]

Let's visualize the predicted boxes on the first image

In [0]:
import cv2
from utils import visualization_utils as vis_util
import numpy as np
import matplotlib.pyplot as plt

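# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors
image = cv2.cvtColor( cv2.imread( images_path[0] ), cv2.COLOR_BGR2RGB )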
(boxes, scores, classes, num) = detections[0]


category_index = {1: {'id': 1, 'name': 'car'},
                 2: {'id': 2, 'name': 'class 2'},
                 3: {'id': 3, 'name': 'class 3'},
                 4: {'id': 4, 'name': 'class 4'},
                 5: {'id': 5, 'name': 'class 5'},
                 6: {'id': 6, 'name': 'class 6'}}

image = vis_util.visualize_boxes_and_labels_on_image_array(
    image,
    np.squeeze(boxes),
    np.squeeze(classes).astype(np.int32),
    np.squeeze(scores),
    category_index,
    use_normalized_coordinates=True,
    line_thickness=4,
    min_score_thresh=0.50,
    max_boxes_to_draw=100)

plt.figure(figsize=(18,18))
plt.axis('off')
img = plt.imshow( image )
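
The `category_index` above is written by hand with placeholder names. As a rough sketch, it could instead be built from the text of *label_map.pbtxt*. The parser below is a simplified stand-in that I wrote for illustration (the name `parse_label_map` and the sample content are hypothetical); it assumes the usual `item { id: ... name: '...' }` layout and is not the object_detection API's own label map loader.

```python
import re


def parse_label_map(text):
    """Build a category_index dict from the text of a label_map.pbtxt file."""
    category_index = {}
    # Each entry looks like: item { id: 1  name: 'car' }
    for block in re.findall(r"item\s*\{(.*?)\}", text, flags=re.DOTALL):
        id_match = re.search(r"\bid\s*:\s*(\d+)", block)
        name_match = re.search(r"\bname\s*:\s*['\"]([^'\"]+)['\"]", block)
        if id_match and name_match:
            class_id = int(id_match.group(1))
            category_index[class_id] = {"id": class_id, "name": name_match.group(1)}
    return category_index


# Hypothetical label map content, for illustration only
sample = """
item {
  id: 1
  name: 'car'
}
item {
  id: 2
  name: 'truck'
}
"""
print(parse_label_map(sample))
```

To use it on the real file, read it first, for example `parse_label_map(open(label_path).read())`.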

The next cell shows how to generate the MDP input file from a directory containing all the frames of a video as images.  
Detection runs at ~3.3 fps.

In [0]:
from helpers.inference import detect
import cv2, numpy as np

# Only boxes with confidence score > THRESHOLD are kept
THRESHOLD = 0.4

# Path to the video directory
input_path = "/MyDrive/PATH_TO_VIDEO/"
output_path = "/MyDrive/rfcn-mot.txt"
graph_path = "/content/detection/exported"
label_path = "/content/detection/dataset/label_map.pbtxt"

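# List images, sorted by file name so that frame numbers follow the video order
# (os.listdir returns entries in arbitrary order; assumes zero-padded frame names).
# Only the first 100 frames are used.
images_path = sorted( os.path.join(input_path, f) for f in os.listdir(input_path) if not f.startswith(".") )[0:100]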

print( "Detecting on {} images".format(len(images_path)) )
detections = detect( graph_path, label_path, images_path )

print( "Done.\nExporting to {} in MOT format".format(output_path) )
with open( output_path, 'w' ) as f:
  # for each image
  for frame, dets in enumerate(detections):
    (boxes, scores, classes, num) = dets
    
    image = cv2.imread( images_path[frame] ) # read the image to get its size in pixels
    h, w, c = image.shape
    
    indices = np.where( scores > THRESHOLD )
    boxes = boxes[indices]
    scores = scores[indices]
    classes = classes[indices]

    # for each box
    for i in range(len(boxes)):
      box = boxes[i]
      # box is [ymin, xmin, ymax, xmax] (normalized); MOT wants left, top, width, height in pixels
      entry = [frame+1, -1, int(box[1]*w), int(box[0]*h), int( (box[3]-box[1])*w ), int( (box[2]-box[0])*h ), scores[i], -1, -1, -1 ]
      f.write( ",".join(map(str, entry)) + '\n' )
print("Done")
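
To make the box conversion easier to check, here is the same step as a standalone sketch. The helper name `to_mot_line` is hypothetical; it assumes boxes come as normalized `(ymin, xmin, ymax, xmax)`, as in the cell above, and writes the same ten comma-separated fields.

```python
def to_mot_line(frame, box, score, img_w, img_h):
    """Format one detection as a MOT-style line.

    box is (ymin, xmin, ymax, xmax), normalized to [0, 1].
    """
    ymin, xmin, ymax, xmax = box
    left, top = int(xmin * img_w), int(ymin * img_h)
    width, height = int((xmax - xmin) * img_w), int((ymax - ymin) * img_h)
    # id is -1 (unknown before tracking); the last three fields are unused here
    entry = [frame, -1, left, top, width, height, score, -1, -1, -1]
    return ",".join(map(str, entry))


# A box covering the lower-right part of a 1920x1080 frame
print(to_mot_line(1, (0.25, 0.5, 0.75, 1.0), 0.9, 1920, 1080))
# prints 1,-1,960,270,960,540,0.9,-1,-1,-1
```

Note that the width is scaled by the image width and the height by the image height; mixing them up only goes unnoticed on square images.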