# Error Analysis on Open Images Evaluation Results

This tutorial demonstrates per-image evaluation of an object detection model on
[the Open Images dataset](https://storage.googleapis.com/openimages/web/index.html).
The evaluation generates:
- true positives
- false positives
- per-class AP
- mAP

It then adds this information to each [Sample](https://voxel51.com/docs/fiftyone/api/fiftyone.core.sample.html#fiftyone.core.sample.Sample)
in a FiftyOne [Dataset](https://voxel51.com/docs/fiftyone/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset).
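
To make the relationship between these metrics concrete, here is a minimal sketch of how per-class AP and mAP can be derived from true-positive and false-positive flags. It uses the common "area under the precision-recall envelope" definition; the function names are hypothetical, and this is an illustration rather than the exact procedure used by the evaluation scripts in this tutorial.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve for a single class.

    Hypothetical helper: generic AP, not the tutorial's own evaluator.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_ground_truth)

    # Make precision non-increasing from left to right (the PR "envelope").
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Accumulate rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Three detections of one class, two ground-truth boxes in total.
ap_cat = average_precision([0.9, 0.8, 0.7], [True, False, True], 2)
m_ap = mean_average_precision({"Cat": ap_cat, "Dog": 0.5})
```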

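The steps are broken down as:
- [2. Download the data and ground-truth labels](#2.-Download-the-data-and-ground-truth-labels)
- [3. Preparing the ground-truth for evaluation](#3.-Preparing-the-ground-truth-for-evaluation)
- [4. Generating predictions](#4.-Generating-predictions)
- [5. Visualizing the data](#5.-Visualizing-the-data)
- [6. Evaluating on a per-image granularity](#6.-Evaluating-on-a-per-image-granularity)
- [7. Error analysis](#7.-Error-analysis)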

If you have already downloaded the data, you may skip step 2.

If you already have predictions from your own model, you can skip step 4.

This tutorial evaluates a model on [Open Images V4](https://storage.googleapis.com/openimages/web/download_v4.html);
however, the code also supports later versions of Open Images. If you use a newer version, make sure to
use the matching hierarchy file and class label map.

## Requirements

This notebook contains bash commands. To run it as a notebook, you must install the [Jupyter bash kernel](https://github.com/takluyver/bash_kernel) via the command below.

Alternatively, you can just copy + paste the code blocks into your shell.

In [None]:
pip install bash_kernel
python -m bash_kernel.install

This workflow requires a few Python packages.

Install the appropriate version of `tensorflow` depending on whether or not you
have a GPU:

In [None]:
pip install tensorflow
# pip install tensorflow-gpu

Install other requirements:

In [None]:
pip install Pillow tensorflow-hub

## 2. Download the data and ground-truth labels

All of the data can be found on the
[official Open Images website](https://storage.googleapis.com/openimages/web/download_v4.html).

If you are using Open Images V4 you can use the following commands to download
all the necessary files.

### Download the data

**WARNING** This is 36GB of data!

In [None]:
aws s3 --no-sign-request sync s3://open-images-dataset/test [target_dir/test]

### Downloading the labels and metadata

In [None]:
wget https://storage.googleapis.com/openimages/2018_04/test/test-annotations-bbox.csv
wget https://storage.googleapis.com/openimages/2018_04/test/test-annotations-human-imagelabels-boxable.csv
wget https://storage.googleapis.com/openimages/2018_04/class-descriptions-boxable.csv
wget https://storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy.json
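
Once downloaded, the class descriptions file can be used to translate machine IDs into readable class names. The sketch below assumes that `class-descriptions-boxable.csv` has no header row and contains one `<machine ID>,<display name>` pair per line; the sample rows and the helper name `load_class_descriptions` are illustrative assumptions, so check them against your copy of the file.

```python
import csv
import io

# Sample rows in the layout assumed for class-descriptions-boxable.csv:
# no header, one "<machine ID>,<display name>" pair per line.
SAMPLE_CSV = "/m/011k07,Tortoise\n/m/011q46kg,Container\n"


def load_class_descriptions(handle):
    """Return a dict mapping machine IDs to display names (hypothetical helper)."""
    return {row[0]: row[1] for row in csv.reader(handle) if len(row) >= 2}


class_names = load_class_descriptions(io.StringIO(SAMPLE_CSV))
print(class_names["/m/011k07"])
```

To read the real file instead of the in-memory sample, pass `open("class-descriptions-boxable.csv", newline="")` to the same function.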

## 4. Generating predictions

In [None]:
cd PATH/TO/open_images_error_analysis

In [None]:
IMAGES_DIR="/PATH/TO/IMAGES"
OUTPUT_DIR="/PATH/TO/PREDICTIONS"

MODEL_HANDLE="https://tfhub.dev/google/faster_rcnn/openimages_v4/inception_resnet_v2/1"
# MODEL_HANDLE="https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1"

python scripts/inference.py \
    --output_dir ${OUTPUT_DIR} \
    --output_format tf_object_detection_api \
    ${IMAGES_DIR} ${MODEL_HANDLE}

## 5. Visualizing the data

### Installing FiftyOne

We are going to use the [fiftyone](https://github.com/voxel51/fiftyone) package
for visualizing the data.

In [None]:
pip install fiftyone

### Loading the data into FiftyOne

In [None]:
DATASET_NAME="open-images-v4-test"
IMAGES_DIR="/PATH/TO/IMAGES"
BOUNDING_BOXES_EXPANDED="/PATH/TO/test-annotations-bbox_expanded.csv"
IMAGE_LABELS_EXPANDED="/PATH/TO/test-annotations-human-imagelabels-boxable_expanded.csv"
PREDICTIONS_PATH="/PATH/TO/PREDICTIONS.csv"
CLASS_DESCRIPTIONS="/PATH/TO/class-descriptions-boxable.csv"

python scripts/load_data.py \
    --bounding_boxes_path ${BOUNDING_BOXES_EXPANDED} \
    --image_labels_path ${IMAGE_LABELS_EXPANDED} \
    --predictions_path ${PREDICTIONS_PATH} \
    --prediction_field_name "faster_rcnn" \
    --class_descriptions_path ${CLASS_DESCRIPTIONS} \
    --load_images_with_preds \
    --max_num_images 1000 \
    ${DATASET_NAME} ${IMAGES_DIR}

### (optional) Visualizing the data

We can optionally visualize the data before evaluating. Open up a `python` or
`ipython` terminal and run the following:

```python
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.load_dataset("open-images-v4-test")

session = fo.launch_app(dataset=dataset)

# Filter the visible detections by confidence
session.view = dataset.filter_detections("faster_rcnn", F("confidence") > 0.4)
```

## 3. Preparing the ground-truth for evaluation

Open Images requires "expanding the hierarchy" of the ground-truth labels before
evaluation. The labels you downloaded contain only leaf-node labels. For
example, for a bounding box labeled `Jaguar`, the hierarchy expansion adds
duplicate boxes with the labels `Carnivore`, `Mammal` and `Animal`.
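
The idea can be sketched in a few lines of Python. This is a toy illustration, not the TF Object Detection API script used in this step: the nested `LabelName`/`Subcategory` layout is assumed to mirror `bbox_labels_600_hierarchy.json`, readable names stand in for the machine IDs in the real file, and the helper names and the `Entity` root are hypothetical.

```python
# Toy hierarchy in the nested shape assumed for bbox_labels_600_hierarchy.json.
HIERARCHY = {
    "LabelName": "Entity",
    "Subcategory": [
        {"LabelName": "Animal", "Subcategory": [
            {"LabelName": "Mammal", "Subcategory": [
                {"LabelName": "Carnivore", "Subcategory": [
                    {"LabelName": "Jaguar"},
                ]},
            ]},
        ]},
    ],
}


def build_ancestors(node, parents=(), table=None):
    """Map every label to the set of labels above it in the hierarchy."""
    table = {} if table is None else table
    label = node["LabelName"]
    table.setdefault(label, set()).update(parents)
    for child in node.get("Subcategory", []):
        build_ancestors(child, parents + (label,), table)
    return table


def expand_boxes(boxes, ancestors, root="Entity"):
    """Duplicate each box once per ancestor label, skipping the root."""
    expanded = []
    for box in boxes:
        expanded.append(box)
        for ancestor in sorted(ancestors.get(box["label"], ())):
            if ancestor != root:
                expanded.append({**box, "label": ancestor})
    return expanded


ancestors = build_ancestors(HIERARCHY)
boxes = [{"label": "Jaguar", "bbox": [0.1, 0.2, 0.5, 0.6]}]
expanded = expand_boxes(boxes, ancestors)
print([b["label"] for b in expanded])
```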

### Installing TF Object Detection API

The first step is to install the TensorFlow Object Detection API by following the
[installation instructions](https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2.md).

### Create expanded hierarchy ground-truth labels

In [None]:
cd PATH/TO/models/research/object_detection

In [None]:
LABELS_DIR=PATH/TO/LABELS

HIERARCHY_FILE=${LABELS_DIR}/bbox_labels_600_hierarchy.json
BOUNDING_BOXES=${LABELS_DIR}/test-annotations-bbox
IMAGE_LABELS=${LABELS_DIR}/test-annotations-human-imagelabels-boxable

python dataset_tools/oid_hierarchical_labels_expansion.py \
    --json_hierarchy_file=${HIERARCHY_FILE} \
    --input_annotations=${BOUNDING_BOXES}.csv \
    --output_annotations=${BOUNDING_BOXES}_expanded.csv \
    --annotation_type=1

python dataset_tools/oid_hierarchical_labels_expansion.py \
    --json_hierarchy_file=${HIERARCHY_FILE} \
    --input_annotations=${IMAGE_LABELS}.csv \
    --output_annotations=${IMAGE_LABELS}_expanded.csv \
    --annotation_type=2

You should now have two new files in `LABELS_DIR`:

In [None]:
test-annotations-bbox_expanded.csv
test-annotations-human-imagelabels-boxable_expanded.csv

## 6. Evaluating on a per-image granularity

### Running evaluation

In [None]:
export TF_MODELS_RESEARCH="/PATH/TO/models/research"
CLASS_LABEL_MAP=${TF_MODELS_RESEARCH}/object_detection/data/oid_v4_label_map.pbtxt

python scripts/evaluate_model.py \
    --prediction_field_name "faster_rcnn" \
    --iou_threshold 0.5 \
    ${DATASET_NAME} ${CLASS_LABEL_MAP}
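
For intuition about what the `--iou_threshold` value controls, here is a simplified sketch of how detections in a single image might be marked as true or false positives. It uses greedy, confidence-ordered matching against same-label ground-truth boxes. The function names and dictionary layout are hypothetical, and the sketch leaves out Open Images specifics such as group-of boxes and image-level labels, so it should not be read as the logic of `evaluate_model.py`.

```python
def iou(a, b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_detections(predictions, ground_truth, iou_threshold=0.5):
    """Return (prediction, "TP" or "FP") pairs for one image."""
    matched = set()  # indices of ground-truth boxes already claimed
    results = []
    # Higher-confidence predictions get the first chance to claim a box.
    for pred in sorted(predictions, key=lambda p: p["confidence"], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truth):
            if idx in matched or gt["label"] != pred["label"]:
                continue
            overlap = iou(pred["box"], gt["box"])
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            results.append((pred, "TP"))
        else:
            results.append((pred, "FP"))
    return results


ground_truth = [{"label": "Cat", "box": [0, 0, 10, 10]}]
predictions = [
    {"label": "Cat", "box": [0, 0, 10, 9], "confidence": 0.9},
    {"label": "Cat", "box": [0, 0, 10, 10], "confidence": 0.8},
    {"label": "Dog", "box": [0, 0, 10, 10], "confidence": 0.7},
]
outcomes = [status for _, status in label_detections(predictions, ground_truth)]
```

In this example the second `Cat` prediction overlaps the ground truth perfectly but is still a false positive, because the higher-confidence prediction has already claimed that box.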

## 7. Error analysis

We can now visualize the true-positive and false-positive detections in the FiftyOne App:

```python
import fiftyone as fo
from fiftyone import ViewField as F

dataset = fo.load_dataset("open-images-v4-test")

session = fo.launch_app(dataset=dataset)

# Filter the visible detections by confidence
session.view = (
    dataset
    .filter_detections("faster_rcnn_TP", F("confidence") > 0.4)
    .filter_detections("faster_rcnn_FP", F("confidence") > 0.4)
)
```