# Evaluate detection and tracking
This notebook evaluates the fish detection and tracking.

## Install software
To install the software on your own computer, follow the steps provided in the [readme](https://github.com/Rick-v-E/automatic_discard_registration/blob/master/README.md). If running on Google Colab, clone the Git repository and install its dependencies:

In [None]:
%%shell

# Check if the repository is already available, if not, clone and install
if [ ! -d .git ]
then
  git clone https://github.com/Rick-v-E/automatic_discard_registration.git
  pip install -r automatic_discard_registration/requirements.txt
  pip install -r automatic_discard_registration/detection/yolov3/requirements.txt
  pip install automatic_discard_registration/detection/apex
  pip install gdown
else
  git pull
fi

If you installed the software in the previous step, enter the repository:

In [None]:
%cd automatic_discard_registration

## Setup dataset
The complete dataset can be downloaded from [4TU.ResearchData](https://doi.org/10.4121/16622566.v1). To use this dataset, extract both `fdf_images.zip` and `results.zip` in the [data](https://github.com/Rick-v-E/automatic_discard_registration/tree/master/data) folder.

For use on Google Colab, we have created a smaller subset of the data. This dataset contains only part of the images of the complete dataset, but contains all results from the complete dataset.

---
**IMPORTANT**

Execute only one of the three cells below! Each cell contains a method to import the data; if one method fails, use another. If the method succeeds, go to the next section in this notebook.

---

**METHOD 1** Download and extract the sample dataset (this will take around 5-10 minutes):

In [None]:
!gdown --id 1TcyeeX0UjhWldbjhLkCRJIuktDNeAMJJ
!unzip -q fdf_sample_dataset.zip -d data
!rm fdf_sample_dataset.zip

**METHOD 2** Download the [sample dataset](https://drive.google.com/file/d/1TcyeeX0UjhWldbjhLkCRJIuktDNeAMJJ/view?usp=sharing) manually and upload it to the `automatic_discard_registration` folder in Google Colab by opening the files tab and right-clicking the folder name:

![Manual upload image](colab_manual_upload.png)

After uploading, extract the dataset:

In [None]:
!unzip -q fdf_sample_dataset.zip -d data
!rm fdf_sample_dataset.zip

**METHOD 3** Download the [sample dataset](https://drive.google.com/file/d/1TcyeeX0UjhWldbjhLkCRJIuktDNeAMJJ/view?usp=sharing) and upload it to your personal Google Drive account. Connect this account to Google Colab:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

!unzip -q ../drive/MyDrive/fdf_sample_dataset.zip -d data

Check if the dataset is loaded correctly:

In [None]:
from pathlib import Path

DATA_PATH = Path("data")
NEEDED_FOLDERS = ["fdf_images", "results"]

# Check if all folders are correct
if not all([(DATA_PATH / f).is_dir() for f in NEEDED_FOLDERS]):
    print("Could not find all data folders! Did you extract both fdf_images.zip and results.zip in the data folder?")

To get the same results as in the paper, use the complete dataset. Upload the dataset to your Google Drive and [mount](https://towardsdatascience.com/downloading-datasets-into-google-drive-via-google-colab-bcb1b30b0166) this folder to your Google Colab environment. 

## Setup evaluation notebook
Start by importing the dependencies:

In [None]:
%matplotlib inline

from collections import defaultdict
from pathlib import Path

import matplotlib.pyplot as plt
from matplotlib import rcParams
from IPython.display import HTML, display
from sklearn.metrics import classification_report
from tabulate import tabulate
from tqdm.notebook import tqdm

from common.io import load_detection_file, load_annotation_files
from common.nb_utils import show_mc_precision_recall_curve, show_confusion_matrix
from evaluation.fdf_detections import FDFDetectionEvaluator
from evaluation.fdf_trackers import count_trackers_by_class

rcParams["font.family"] = "sans-serif"

## Detection evaluation
Load the detection files (using 0, 50, 100 and 200 synthetic images):

In [None]:
path_dict = {
    "detections_validation_0": Path("data/results/model_selection/detections_validation_0_synthetic_images.json"),
    "detections_validation_50": Path("data/results/model_selection/detections_validation_50_synthetic_images.json"),
    "detections_validation_100": Path("data/results/model_selection/detections_validation_100_synthetic_images.json"),
    "detections_validation_200": Path("data/results/model_selection/detections_validation_200_synthetic_images.json"),
    "detections_test_200": Path("data/results/detections_test.json"),
    "names_file": Path("detection/fish_classes.names"),
}

assert all(f.is_file() for f in path_dict.values())

Load the ground truth and the detection files. 

---
**NOTE**

The example dataset contains only a subset of the images, but it includes all annotation files. Because the detection files come from the complete dataset, the results below should be the same as reported in the paper.

---

In [None]:
validation_annotation_folder = Path("data/fdf_images/annotations/validation")
test_annotation_folder = Path("data/fdf_images/annotations/test")

assert validation_annotation_folder.is_dir()
assert test_annotation_folder.is_dir()

# Load ground truth files
validation_ground_truth = load_annotation_files(validation_annotation_folder, skip_classes=["dragonet"])  # There are no dragonets in the validation and test datasets
test_ground_truth = load_annotation_files(test_annotation_folder, skip_classes=["dragonet"]) # There are no dragonets in the validation and test datasets

# Load detection files
detections = {}
for n_synthetic_images in tqdm([0, 50, 100, 200], desc="Loading subsets"):
    detections[f"validation_{ n_synthetic_images }"] = load_detection_file(path_dict[f"detections_validation_{ n_synthetic_images }"])

Create four evaluators (one for each number of synthetic images):

In [None]:
evaluators = {}
for n_synthetic_images in tqdm([0, 50, 100, 200], desc="Create evaluators for subsets"):
    evaluators[f"validation_{ n_synthetic_images }"] = FDFDetectionEvaluator(validation_ground_truth, detections[f"validation_{ n_synthetic_images }"], path_dict, skip_classes=["dragonet"])

Associate all the detections with their ground truth annotation based on the intersection-over-union (IoU) score:

In [None]:
results = {}
for name, ev in tqdm(evaluators.items(), desc="Associate subsets"):
    results[name] = ev.associate_results_with_gt()
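
The association step relies on the IoU between a detected box and a ground-truth box. The block below is a minimal sketch of that idea, not the code inside `FDFDetectionEvaluator`: the `(x_min, y_min, x_max, y_max)` box layout, the helper names `iou` and `best_match`, and the threshold of 0.5 are all assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle (if any)
    x_left = max(box_a[0], box_b[0])
    y_top = max(box_a[1], box_b[1])
    x_right = min(box_a[2], box_b[2])
    y_bottom = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap
    intersection = max(0.0, x_right - x_left) * max(0.0, y_bottom - y_top)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


def best_match(detection_box, gt_boxes, iou_threshold=0.5):
    """Index of the ground-truth box with the highest IoU at or above the threshold, else None."""
    best_index, best_iou = None, iou_threshold
    for index, gt_box in enumerate(gt_boxes):
        score = iou(detection_box, gt_box)
        if score >= best_iou:
            best_index, best_iou = index, score
    return best_index
```

A detection for which `best_match` returns `None` would count as a false positive (predicted against the background) in this simplified view.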

### Performance measures
For each evaluator, calculate the performance measures (F1-score, precision and recall). The support column indicates the number of fish of that specific class in the ground truth.

---
**NOTE**

The macro and weighted scores in the paper are recalculated from the F1-score, precision and recall by omitting the background row.

---


In [None]:
for name, result in results.items():
    print(f"Classification results for { name.split('_')[1] } synthetic images:")
    print(classification_report(result.y_true, result.y_pred, zero_division=0, target_names=result.classes))
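
The note above says that the macro and weighted scores in the paper omit the background row. The following sketch shows one way such a recalculation could be done from per-class rows. The function name `averages_without_background` and the toy numbers are hypothetical and are not results from the paper; the row layout mirrors the per-class entries that `classification_report(..., output_dict=True)` returns, with the summary entries (`accuracy`, `macro avg`, `weighted avg`) already removed.

```python
def averages_without_background(report, metric="f1-score", skip=("background",)):
    """Recompute the macro and support-weighted average of one metric."""
    # Keep only the rows that should contribute to the averages
    rows = {name: row for name, row in report.items() if name not in skip}
    total_support = sum(row["support"] for row in rows.values())

    # Macro: every class counts equally; weighted: classes count by support
    macro = sum(row[metric] for row in rows.values()) / len(rows)
    weighted = sum(row[metric] * row["support"] for row in rows.values()) / total_support
    return macro, weighted


# Toy values for illustration only; these are not results from the paper
toy_report = {
    "plaice": {"f1-score": 0.9, "support": 30},
    "dab": {"f1-score": 0.6, "support": 10},
    "background": {"f1-score": 0.0, "support": 5},
}
macro_f1, weighted_f1 = averages_without_background(toy_report)
```

The same function can be called with `metric="precision"` or `metric="recall"` when the rows contain those keys.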

Based on the performance measures, we see that adding 200 synthetic images yielded the highest F1-score. Therefore, we use this model for the rest of the experiments. Now, calculate the test performance of the 200 synthetic images model:

In [None]:
test_detections = load_detection_file(path_dict["detections_test_200"])
evaluator = FDFDetectionEvaluator(test_ground_truth, test_detections, path_dict, skip_classes=["dragonet"])
result = evaluator.associate_results_with_gt()
print(classification_report(result.y_true, result.y_pred, zero_division=0, target_names=result.classes))

### Confusion matrix
Create the confusion matrix. The colors are normalized over the rows (as a fraction of the ground truth) and the numbers are the actual number of fish in that specific cell.

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
show_confusion_matrix(result, ax)
plt.show()
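
The row normalization used for the colors can be illustrated with a small, self-contained sketch. This is not the implementation of `show_confusion_matrix`; the helper name `normalize_rows` and the toy counts are assumptions, with rows standing for ground-truth classes and columns for predicted classes.

```python
def normalize_rows(matrix):
    """Divide every cell by its row total; rows without samples stay at zero."""
    normalized = []
    for row in matrix:
        row_total = sum(row)
        normalized.append([cell / row_total if row_total else 0.0 for cell in row])
    return normalized


# Toy counts: rows are ground-truth classes, columns are predicted classes
toy_counts = [[8, 2], [1, 3]]
toy_fractions = normalize_rows(toy_counts)
```

Each row of `toy_fractions` sums to one, so a cell's color reflects the share of that ground-truth class assigned to each predicted class, while the raw counts remain available for the cell labels.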

Optionally, save the confusion matrix:

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
show_confusion_matrix(result, ax)
plt.subplots_adjust(bottom=0.25, left=0.21)
plt.tight_layout()
plt.savefig("data/confusion_matrix.eps")
# plt.savefig("data/confusion_matrix.png", dpi=600)
plt.close(fig)

### Precision-Recall curve
Calculate and plot the multi-class precision-recall curves. The score for each detection is the objectness multiplied by the class probability for a specific class.

In [None]:
f, axarr = plt.subplots(1, 2, figsize=(12,6))
plt.subplots_adjust(wspace=0.7)

# Split between low-frequency (lf) and high-frequency (hf) fish species
lf = ["dab", "gurnard", "lesser_spotted_dogfish", "pouting", "turbot"]
hf = ["common_sole", "lemon_sole", "plaice", "ray", "whiting"]

show_mc_precision_recall_curve(result, lf, axarr[0])
show_mc_precision_recall_curve(result, hf, axarr[1])
plt.show()
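
As a rough sketch of how a precision-recall curve for one class can be derived from such scores: sort the detections by decreasing score and track the cumulative true positives. The helpers `detection_score` and `precision_recall_points` are hypothetical names for illustration and do not reproduce `show_mc_precision_recall_curve`; the true-positive flags are assumed to come from the IoU association step.

```python
def detection_score(objectness, class_probability):
    """Score of a detection for one class: objectness times class probability."""
    return objectness * class_probability


def precision_recall_points(scores, is_true_positive, n_ground_truth):
    """Precision and recall after each detection, in order of decreasing score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, index in enumerate(order, start=1):
        true_positives += int(is_true_positive[index])
        # Precision: share of detections so far that are correct
        precisions.append(true_positives / rank)
        # Recall: share of ground-truth objects found so far
        recalls.append(true_positives / n_ground_truth if n_ground_truth else 0.0)
    return precisions, recalls
```

Plotting `recalls` against `precisions` for each class gives one curve per species.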

Optionally, save the figures:

In [None]:
fig, ax = plt.subplots(figsize=(6, 5))
show_mc_precision_recall_curve(result, lf, ax, fontsize=17)
plt.subplots_adjust(right=0.7)
plt.savefig("data/precision_recall_lf.eps")
# plt.savefig("data/precision_recall_lf.png", dpi=600)
plt.close(fig)

fig, ax = plt.subplots(figsize=(6, 5))
show_mc_precision_recall_curve(result, hf, ax, fontsize=17)
plt.subplots_adjust(right=0.7)
plt.savefig("data/precision_recall_hf.eps")
# plt.savefig("data/precision_recall_hf.png", dpi=600)
plt.close(fig)

## Tracking evaluation
The tracking is evaluated mainly by counting the number of trackers for each fish species and comparing this number with the ground truth. This is done for each box individually, with one box per day. Each box has 4 runs (repetitions):

In [None]:
tracker_path_dict = {
    "20191018_run_1_tracking": Path("data/results/EM_comparison/20191018_run_1_tracking.json"),    
    "20191018_run_2_tracking": Path("data/results/EM_comparison/20191018_run_2_tracking.json"),    
    "20191018_run_3_tracking": Path("data/results/EM_comparison/20191018_run_3_tracking.json"),    
    "20191018_run_4_tracking": Path("data/results/EM_comparison/20191018_run_4_tracking.json"),  
    "20191025_run_1_tracking": Path("data/results/EM_comparison/20191025_run_1_tracking.json"),  
    "20191025_run_2_tracking": Path("data/results/EM_comparison/20191025_run_2_tracking.json"),  
    "20191025_run_3_tracking": Path("data/results/EM_comparison/20191025_run_3_tracking.json"),  
    "20191025_run_4_tracking": Path("data/results/EM_comparison/20191025_run_4_tracking.json"),  
    "20191101_run_1_tracking": Path("data/results/EM_comparison/20191101_run_1_tracking.json"),  
    "20191101_run_2_tracking": Path("data/results/EM_comparison/20191101_run_2_tracking.json"),  
    "20191101_run_3_tracking": Path("data/results/EM_comparison/20191101_run_3_tracking.json"),  
    "20191101_run_4_tracking": Path("data/results/EM_comparison/20191101_run_4_tracking.json"),    
}

assert all(f.is_file() for f in tracker_path_dict.values())

Now, load all the tracker files:

In [None]:
tracker_dict = {}
for name, file_path in tqdm(tracker_path_dict.items(), desc="Loading all runs"):
    tracker_dict[name] = load_detection_file(file_path) 

Count the number of fish of each species in each file:

In [None]:
# Initialize the table
table = []
header = ["Date", "Run"]
header += [class_name.replace("_", " ").capitalize() for class_name in result.classes if class_name != "background"]

# Count the total number per species
total = defaultdict(int)

for name, t_dict in tracker_dict.items():
    counts = count_trackers_by_class(t_dict)    

    row = [name.split("_")[0], name.split("_")[2]]
    for class_name in result.classes:
        if class_name == "background":
            continue

        n = counts.get(class_name, 0)

        total[class_name] += n
        row.append(n)

    table.append(row)

total_row = ["Total", ""]
total_row += [total[class_name] for class_name in result.classes if class_name != "background"]
table.append(total_row)

display(HTML(tabulate(table, tablefmt="html", headers=header)))
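
The comparison with the ground truth mentioned at the start of this section can be sketched as follows. This is a simplified illustration, not the repository's `count_trackers_by_class`: the helper names, the flat list of per-track species labels, and the ground-truth dictionary are all assumed structures.

```python
from collections import Counter


def count_by_class(track_labels):
    """Count how many tracks were assigned to each species."""
    return Counter(track_labels)


def compare_with_ground_truth(predicted_counts, ground_truth_counts):
    """Return {species: (predicted, ground truth, difference)} for every species seen."""
    species = sorted(set(predicted_counts) | set(ground_truth_counts))
    return {
        name: (
            predicted_counts.get(name, 0),
            ground_truth_counts.get(name, 0),
            predicted_counts.get(name, 0) - ground_truth_counts.get(name, 0),
        )
        for name in species
    }
```

A positive difference means the tracker counted more fish of a species than were present in the box, and a negative difference means it missed some.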