### Setup environment

Start by generating the protobuf sources and adding the required directories to `PYTHONPATH`.

In [None]:
%%bash
cd models/research
# Generate .proto messages
protoc object_detection/protos/*.proto --python_out=.
# Add both research and research/slim to python path
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
cd ../..
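
An `export` inside a `%%bash` cell only affects that cell's shell, so the variable has to be set again in every later cell that needs it. If the generated modules are also needed from Python cells, one option is to extend `sys.path` directly. The following is a minimal sketch, assuming the repository is cloned into `models/` under the current directory; `add_research_paths` is a hypothetical helper, not part of this repository.

```python
import os
import sys


def add_research_paths(root="models/research"):
    """Prepend research/ and research/slim to sys.path (hypothetical helper)."""
    research = os.path.abspath(root)
    paths = [research, os.path.join(research, "slim")]
    # Insert in reverse so that research/ ends up ahead of research/slim.
    for path in reversed(paths):
        if path not in sys.path:
            sys.path.insert(0, path)
    return paths


added = add_research_paths()
print(added)
```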

### Real dataset

Our dataset of real images is hosted on our lab's [webpage][vislab].
We provide a convenient script for downloading it to your workspace.
Running it should copy the set of raw images and annotations to `workspace/data/raw` and the processed binaries to `workspace/data`.

[vislab]: http://vislab.isr.ist.utl.pt/datasets/

In [None]:
!bash download_real_dataset.sh

### Synthetic dataset

You can generate your own synthetic datasets using [our Domain Randomization plugin for Gazebo][gap].
We provide documentation on how to do so [here][TODO].
After this step, the dataset should be split into `images` and `annotations` folders.
The images themselves are split into folders, each containing 100 Full-HD `.jpg` files.

[gap]: https://github.com/jsbruglie/gap
[TODO]: A

#### Pre-processing

The synthetic dataset has to undergo some processing, such as rescaling the images to 960x540 and condensing the `.XML` annotations into a single `.CSV` file.
Assuming you have your synthetic dataset in `./workspace/data/raw` and want the output to be stored in `./workspace/data/proc`, run:

In [None]:
%%bash
python preproc.py \
    --img_ext jpg \
    --in_xml_dir workspace/data/raw/annotations \
    --in_img_dir workspace/data/raw/images \
    --out_csv workspace/data/proc/annotations.csv \
    --out_img_dir workspace/data/proc/images_resize
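
To illustrate what this step involves, here is a minimal sketch of condensing one annotation into CSV rows while rescaling its bounding boxes from Full-HD to 960x540. It is not the code in `preproc.py`: the Pascal VOC-style XML layout, the CSV column names, and the `xml_to_rows` helper are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed Pascal VOC-style layout; the real annotation schema may differ.
SAMPLE_XML = """<annotation>
  <filename>0.jpg</filename>
  <size><width>1920</width><height>1080</height></size>
  <object>
    <name>box</name>
    <bndbox><xmin>100</xmin><ymin>200</ymin><xmax>300</xmax><ymax>400</ymax></bndbox>
  </object>
</annotation>"""

OUT_WIDTH, OUT_HEIGHT = 960, 540  # target resolution stated above


def xml_to_rows(xml_text):
    """Return one row per object, with boxes rescaled to the output size."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    sx = OUT_WIDTH / float(size.findtext("width"))
    sy = OUT_HEIGHT / float(size.findtext("height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": root.findtext("filename"),
            "width": OUT_WIDTH,
            "height": OUT_HEIGHT,
            "class": obj.findtext("name"),
            "xmin": round(float(box.findtext("xmin")) * sx),
            "ymin": round(float(box.findtext("ymin")) * sy),
            "xmax": round(float(box.findtext("xmax")) * sx),
            "ymax": round(float(box.findtext("ymax")) * sy),
        })
    return rows


# Write the rows to an in-memory CSV to show the resulting layout.
rows = xml_to_rows(SAMPLE_XML)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Scaling the boxes together with the images keeps the annotations aligned with the resized files; here both axes shrink by a factor of 0.5.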

### Generate TF records

Both real and synthetic datasets must be converted to TF records so they can be used by the network.
For each processed subdataset, run:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
python generate_tfrecord.py \
    --data_path workspace/data/ \
    --filename data \
    --csv_path workspace/data/proc/annotations.csv \
    --images_path workspace/data/proc/images_resize

For batch processing of several subdatasets we ran:

```bash
./run_preproc.sh workspace/data/raw workspace/data/proc
./generate_tf_records.sh workspace/data/proc
```
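
The Object Detection API expects one record per image, with bounding-box coordinates normalized to the [0, 1] range. The sketch below shows only that grouping and normalization step in plain Python, without TensorFlow. The CSV columns, the class names, and the `group_and_normalize` helper are assumptions for illustration and may not match what `generate_tfrecord.py` does.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV layout; the columns written by preproc.py may differ.
SAMPLE_CSV = """filename,width,height,class,xmin,ymin,xmax,ymax
0.jpg,960,540,box,48,108,480,270
0.jpg,960,540,cylinder,240,135,720,405
1.jpg,960,540,sphere,0,0,96,54
"""


def group_and_normalize(csv_text):
    """Group rows per image and scale box coordinates to the [0, 1] range."""
    per_image = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        width, height = float(row["width"]), float(row["height"])
        per_image[row["filename"]].append({
            "class": row["class"],
            "xmin": float(row["xmin"]) / width,
            "ymin": float(row["ymin"]) / height,
            "xmax": float(row["xmax"]) / width,
            "ymax": float(row["ymax"]) / height,
        })
    return dict(per_image)


examples = group_and_normalize(SAMPLE_CSV)
for filename, boxes in examples.items():
    print(filename, boxes)
```

Each entry of `examples` corresponds to the contents of one record; serializing it would additionally require the encoded image bytes and integer class ids.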

### Download pre-trained SSD

As our baseline, we used the Single Shot Detector ([SSD][ssd]) with [MobileNet][mobilenet] as the feature extractor (trained on [ImageNet][imagenet]), pre-trained on the [COCO][coco] dataset.
You can download and unpack a snapshot of the network in the form of a TF `checkpoint` as follows:

[coco]: http://cocodataset.org/#home
[imagenet]: https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[mobilenet]: https://arxiv.org/abs/1704.04861
[ssd]: http://www.cs.unc.edu/%7Ewliu/papers/ssd.pdf

In [None]:
%%bash
wget http://download.tensorflow.org/models/object_detection/ssd_mobilenet_v1_coco_11_06_2017.tar.gz \
    -q -O ssd_mobilenet_v1_coco_11_06_2017.tar.gz
tar xvzf ssd_mobilenet_v1_coco_11_06_2017.tar.gz

### Train network

We now provide an example of how to fine-tune the network on our real image dataset.
The procedure is similar for other datasets and run configurations.
This assumes you have the `workspace/data/train.record` and `workspace/data/eval.record` files (the training and validation partitions, respectively) and the configuration of the desired run in `workspace/config/example.config`, and that you want the training output stored in `workspace/train/example/`.
Adjust the `num_clones` parameter to match the number of GPU devices in your machine (ours was 2) and run:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
python models/research/object_detection/train.py \
    --logtostderr \
    --train_dir=workspace/train/example \
    --pipeline_config_path=workspace/config/example.config \
    --num_clones=2 \
    --ps_tasks=1

### Evaluating performance on validation set

While training, we ran an evaluation process on the CPU in parallel to measure **mAP** on the validation set.
Simply run:

In [None]:
%%bash
# Hide all GPUs from this process so that evaluation runs on the CPU
export CUDA_VISIBLE_DEVICES=""
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
python models/research/object_detection/eval.py \
    --logtostderr \
    --eval_dir=workspace/eval/example/ \
    --pipeline_config_path=workspace/config/example.config \
    --checkpoint_dir=workspace/train/example/

### Watch training in TensorBoard

To watch the current progress of the network training, such as the loss function and **mAP** curves over time, you can use TensorBoard by running:

In [None]:
%%bash
tensorboard \
    --logdir=training:workspace/train/,testing:workspace/eval/ \
    --port=6006 --host=localhost

### Evaluating performance

Now that we have trained the network, we want to use it for inference.
For that, we need to first export the graph with frozen weights.

#### Exporting inference graph

Provided you have let the training process finish, you should have the checkpoint for the 100th iteration in `workspace/train/example/`.
If so, export the frozen inference graph to `workspace/inference_graph/example` by running:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
python models/research/object_detection/export_inference_graph.py \
    --input_type image_tensor \
    --pipeline_config_path workspace/config/example.config \
    --trained_checkpoint_prefix workspace/train/example/model.ckpt-100 \
    --output_directory workspace/inference_graph/example
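
If training ran for a different number of steps, the checkpoint prefix changes accordingly. As a minimal sketch, the highest available step can be found by scanning the training directory for `model.ckpt-<step>.index` files; `latest_checkpoint_prefix` is a hypothetical helper written for this example.

```python
import os
import re
import tempfile


def latest_checkpoint_prefix(train_dir):
    """Return the model.ckpt-<step> prefix with the highest step, or None."""
    pattern = re.compile(r"^model\.ckpt-(\d+)\.index$")
    steps = []
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return os.path.join(train_dir, "model.ckpt-{}".format(max(steps)))


# Demonstration on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for step in (0, 50, 100):
        open(os.path.join(demo_dir, "model.ckpt-{}.index".format(step)), "w").close()
    prefix = latest_checkpoint_prefix(demo_dir)
    print(os.path.basename(prefix))  # model.ckpt-100
```

With TensorFlow available, `tf.train.latest_checkpoint(train_dir)` serves the same purpose by reading the `checkpoint` state file written during training.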

#### Infer detections

To test the trained network on the real image test set, place the corresponding `test.record` in the `workspace/data` folder and run:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
mkdir -p workspace/inference_results/example
python -m object_detection.inference.infer_detections \
    --input_tfrecord_paths=workspace/data/test.record \
    --inference_graph=workspace/inference_graph/example/frozen_inference_graph.pb \
    --discard_image_pixels \
    --output_tfrecord_path=workspace/inference_results/example/detections.tfrecord

#### Compute mAP and Precision-Recall curves

With the generated inferences you can finally obtain the per-class **AP** and its mean (**mAP**), as well as the precision-recall curves per class.
For this, make sure you have a valid `example.pbtxt` file in `workspace/test_input_config/` and the evaluation configuration in `workspace/config/test_eval_config.pbtxt`, then run:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
python -m object_detection.metrics.offline_eval_map_corloc \
    --eval_dir=workspace/inference_results/example/ \
    --eval_config_path=workspace/config/test_eval_config.pbtxt \
    --input_config_path=workspace/test_input_config/example.pbtxt
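
For intuition about the reported numbers, the sketch below computes AP for one class as the area under the interpolated precision-recall curve, and mAP as the mean over classes. It assumes that detections have already been matched to ground truth (for example by an IoU threshold), and it uses made-up scores. The exact metric computed by the evaluation script depends on the evaluation configuration, so treat this as an illustration and not as the script's implementation.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class."""
    # Visit detections from the most to the least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    false_pos = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            true_pos += 1
        else:
            false_pos += 1
        recalls.append(true_pos / num_ground_truth)
        precisions.append(true_pos / (true_pos + false_pos))
    # Interpolate so that precision never increases as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum rectangle areas wherever recall increases.
    area, previous_recall = 0.0, 0.0
    for recall, precision in zip(recalls, precisions):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# Made-up detections for two hypothetical classes.
ap_a = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], 4)
ap_b = average_precision([0.5], [True], 1)
mean_ap = (ap_a + ap_b) / 2
print(ap_a, ap_b, mean_ap)  # 0.625 1.0 0.8125
```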

#### Overlay bounding boxes on real image test set

Finally, it is useful to look at the actual detections overlaid on the original test set images.
For this, run:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
mkdir -p workspace/overlay/example
python overlay.py \
    --images_path=workspace/data/test/images/ \
    --save_path=workspace/overlay/example/ \
    --ckpt_path=workspace/inference_graph/example/frozen_inference_graph.pb
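
Graphs exported by the Object Detection API return boxes as normalized `[ymin, xmin, ymax, xmax]` values, so drawing them requires a conversion to pixel coordinates. The sketch below shows that conversion together with a score filter; the `boxes_to_pixels` helper and the 0.5 threshold are assumptions for illustration and are not taken from `overlay.py`.

```python
def boxes_to_pixels(boxes, scores, image_width, image_height, min_score=0.5):
    """Convert normalized [ymin, xmin, ymax, xmax] boxes to pixel rectangles."""
    rectangles = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue  # skip low-confidence detections
        rectangles.append((
            int(round(xmin * image_width)),
            int(round(ymin * image_height)),
            int(round(xmax * image_width)),
            int(round(ymax * image_height)),
        ))
    return rectangles


# Two made-up detections on a 960x540 image; only the first passes the threshold.
rects = boxes_to_pixels(
    boxes=[(0.25, 0.5, 0.75, 1.0), (0.0, 0.0, 0.5, 0.5)],
    scores=[0.9, 0.3],
    image_width=960,
    image_height=540,
)
print(rects)  # [(480, 135, 960, 405)]
```

The returned tuples are ordered `(left, top, right, bottom)`, which is the form most drawing libraries expect for rectangles.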

### Automating the training procedures for several subdatasets

For our ablation study we were required to train several networks, using different configurations.
Our general approach is to prepare all required configuration files beforehand, and then run a script to automate the training and validation procedures, as well as exporting the inference graphs and calculating the required metrics.
For this, we provide `run_network.sh`, which resembles our final setup:

In [None]:
%%bash
export PYTHONPATH=$PYTHONPATH:`pwd`/models/research:`pwd`/models/research/slim
bash run_network.sh accv2018
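
The idea of driving several runs from their configuration names can be sketched in Python as follows. This is not the content of `run_network.sh`: `build_commands` is a hypothetical helper that only reassembles the training and export commands shown earlier in this document, one pair per run name.

```python
import shlex


def build_commands(run_name, checkpoint_step, num_clones=2):
    """Return the training and export commands for one run as argv lists."""
    config = "workspace/config/{}.config".format(run_name)
    train_dir = "workspace/train/{}".format(run_name)
    train = [
        "python", "models/research/object_detection/train.py",
        "--logtostderr",
        "--train_dir={}".format(train_dir),
        "--pipeline_config_path={}".format(config),
        "--num_clones={}".format(num_clones),
        "--ps_tasks=1",
    ]
    export = [
        "python", "models/research/object_detection/export_inference_graph.py",
        "--input_type", "image_tensor",
        "--pipeline_config_path", config,
        "--trained_checkpoint_prefix",
        "{}/model.ckpt-{}".format(train_dir, checkpoint_step),
        "--output_directory", "workspace/inference_graph/{}".format(run_name),
    ]
    return [train, export]


# Print the commands for each run instead of executing them.
for run_name in ("example",):
    for command in build_commands(run_name, checkpoint_step=100):
        print(" ".join(shlex.quote(part) for part in command))
```

To execute the commands in sequence, each list can be passed to `subprocess.run(command, check=True)`, which stops at the first failing step.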