# Evaluate military vehicle detections on The Search_2 dataset

This notebook provides a small benchmark for evaluating automated military vehicle detection models on a real dataset. Deep-learning object detection models can reach reasonable detection performance when fine-tuned on specific datasets. However, acquiring enough data that corresponds to a real military setting is a challenge, as this project demonstrates. It is therefore important to evaluate these models in a realistic military setting, where the target occupies an area of only tens of pixels in a cluttered environment.

We propose to use [The Search_2](https://figshare.com/articles/dataset/The_Search_2_dataset/1041463) dataset for such an evaluation. The Search_2 dataset consists of 44 high-resolution digital color images of different complex natural scenes, with each scene (image) containing a single military vehicle that serves as a search target. Ground truth annotations are provided for the targets.

### Download and load the dataset

To begin, we download the dataset and load it into FiftyOne.

In [1]:
from pathlib import Path
from adomvi.utils import download_and_extract

search_2_dir = Path() / "search_2"
search_2_url = "https://github.com/jonasrenault/adomvi/releases/download/v1.3.0/search_2.tar.gz"
download_and_extract(search_2_url, "search_2.tar.gz", search_2_dir)

In [2]:
from adomvi.datasets.search2 import load_search_2_dataset

dataset = load_search_2_dataset(search_2_dir / "search_2")



 100% |███████████████████| 44/44 [112.6ms elapsed, 0s remaining, 390.8 samples/s] 


We can then map the labels that identify each target in the dataset to the classes our model was trained on. The model knows four classes (AFV, APC, MEV and LAV); the targets in this dataset map onto three of them.

In [3]:
label_mapping = {
    "M60": "AFV",
    "M3": "AFV",
    "M1": "AFV",
    "T72": "AFV",
    "HVS": "LAV",
    "HVT": "LAV",
    "BMP": "APC",
    "BTR": "APC",
    "M113": "APC",
}

# Map the labels
dataset.map_labels(
    "ground_truth",
    label_mapping
).save()
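As a quick sanity check, the mapping can be compared against the model's class list. The sketch below uses only the `label_mapping` defined above and the class list passed to `add_yolo_detections` later in this notebook; the names `model_classes`, `mapped_classes` and `unused_classes` are introduced here for illustration and are not part of `adomvi`.

```python
# Mapping from Search_2 target labels to the model's classes (copied from above).
label_mapping = {
    "M60": "AFV",
    "M3": "AFV",
    "M1": "AFV",
    "T72": "AFV",
    "HVS": "LAV",
    "HVT": "LAV",
    "BMP": "APC",
    "BTR": "APC",
    "M113": "APC",
}

# Classes the model was trained on (the list used later in this notebook).
model_classes = ["AFV", "APC", "MEV", "LAV"]

mapped_classes = set(label_mapping.values())

# Every class we map to must be one the model can actually predict.
unknown_classes = mapped_classes - set(model_classes)
assert not unknown_classes, f"Unknown classes in mapping: {unknown_classes}"

# Model classes that no Search_2 target maps to.
unused_classes = [c for c in model_classes if c not in mapped_classes]

print(sorted(mapped_classes))  # ['AFV', 'APC', 'LAV']
print(unused_classes)          # ['MEV']
```

Any MEV prediction on this dataset can therefore only be a false positive, since no ground-truth target carries that class.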

### Evaluate a pretrained model on this dataset

Once our test dataset is ready, we can evaluate a pretrained model on it.

In [5]:
from adomvi.yolo.yolo import predict

model = Path() / "runs/detect/train/weights/best.pt"
results_predict = predict(model, source=search_2_dir / "search_2/images")

# Directory where the prediction results were saved
results_predict_dir = Path(results_predict[0].save_dir)


image 1/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0001.jpg: 448x640 (no detections), 89.0ms
image 2/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0002.jpg: 448x640 1 AFV, 6.8ms
image 3/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0003.jpg: 448x640 (no detections), 6.9ms
image 4/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0004.jpg: 448x640 (no detections), 6.7ms
image 5/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0005.jpg: 448x640 (no detections), 6.8ms
image 6/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0006.jpg: 448x640 (no detections), 6.7ms
image 7/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0007.jpg: 448x640 (no detections), 6.6ms
image 8/44 /home/jrenault/workspace/adomvi2/notebooks/search_2/search_2/images/IMG0008.jpg: 448x640 (no detections), 6.6ms
image 9/44 /home/jrenaul

Let's load the model's predictions into our dataset.

In [6]:
from adomvi.yolo.utils import add_yolo_detections

prediction_field = "yolov8"
predictions_dir = results_predict_dir / "labels"
add_yolo_detections(
    dataset,
    prediction_field=prediction_field,
    predictions_dir=predictions_dir,
    class_list=["AFV", "APC", "MEV", "LAV"],
)
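To illustrate what loading YOLO predictions involves, here is a minimal sketch of converting one line of a YOLO label file into a box in FiftyOne's convention. It assumes the usual YOLO text format (`class x_center y_center width height`, normalized to [0, 1], with an optional trailing confidence) and FiftyOne's relative `[top-left-x, top-left-y, width, height]` boxes. The helper `yolo_line_to_box` and the sample line are hypothetical; this is not necessarily how `add_yolo_detections` is implemented.

```python
def yolo_line_to_box(line, class_list):
    """Convert one YOLO label line into (label, [x, y, w, h], confidence).

    Hypothetical helper for illustration only.
    """
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    # The confidence column is only present when predictions are saved with it.
    confidence = float(parts[5]) if len(parts) > 5 else None
    # YOLO stores the box center; FiftyOne expects the top-left corner.
    bounding_box = [x_center - width / 2, y_center - height / 2, width, height]
    return class_list[class_id], bounding_box, confidence


# Made-up sample line, not taken from the actual prediction files.
label, box, confidence = yolo_line_to_box(
    "0 0.5 0.5 0.25 0.125 0.75", ["AFV", "APC", "MEV", "LAV"]
)
print(label, box, confidence)
```

The order of `class_list` matters here: the integer class id in each label file is an index into that list, so it has to match the order used at training time.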

Once that's done, we can evaluate our model's predictions and print the mean Average Precision (mAP).

In [7]:
detection_results = dataset.evaluate_detections(
    prediction_field, 
    eval_key="eval",
    compute_mAP=True,
    gt_field="ground_truth",
)

Evaluating detections...
 100% |███████████████████| 44/44 [79.7ms elapsed, 0s remaining, 552.1 samples/s] 
Performing IoU sweep...
 100% |███████████████████| 44/44 [62.7ms elapsed, 0s remaining, 701.4 samples/s] 


In [8]:
mAP = detection_results.mAP()
print(f"mAP = {mAP}")

mAP = 0.008415841584158416
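To put this number in context: in COCO-style evaluation, a prediction only counts as a true positive if its intersection over union (IoU) with a ground-truth box of the same class reaches a threshold (0.5 by default in FiftyOne, as far as we know), and the mAP is averaged over a sweep of stricter thresholds. The sketch below is a plain-Python IoU, not FiftyOne's implementation, applied to a hypothetical 10x10 pixel target to show how sensitive small targets are to localization error.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical 10x10 pixel target and predictions shifted horizontally.
target = [100, 100, 10, 10]
for shift in (0, 2, 4):
    predicted = [100 + shift, 100, 10, 10]
    print(f"shift={shift}px  IoU={iou(target, predicted):.3f}")
```

With these made-up boxes, a 4 pixel shift already drops the IoU to about 0.43, below a 0.5 threshold, so a roughly correct detection of a small target can still be scored as a miss.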


In [9]:
detection_results.print_report()

              precision    recall  f1-score   support

         AFV       0.50      0.08      0.14        24
         APC       0.00      0.00      0.00         9
         LAV       0.00      0.00      0.00        11

   micro avg       0.50      0.05      0.08        44
   macro avg       0.17      0.03      0.05        44
weighted avg       0.27      0.05      0.08        44
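The averaged rows can be reproduced from the per-class rows. The counts below are not printed by the notebook: they are inferred from the rounded values in the report (2 of 24 AFV targets matched, 4 AFV predictions in total, and no APC or LAV predictions), so treat them as an assumption rather than as logged results.

```python
# (true positives, number of predictions, number of ground-truth targets).
# Inferred from the rounded report above; an assumption, not logged output.
counts = {"AFV": (2, 4, 24), "APC": (0, 0, 9), "LAV": (0, 0, 11)}


def safe_div(numerator, denominator):
    """Division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


rows = {}
for name, (tp, n_pred, n_gt) in counts.items():
    precision = safe_div(tp, n_pred)
    recall = safe_div(tp, n_gt)
    f1 = safe_div(2 * precision * recall, precision + recall)
    rows[name] = (precision, recall, f1, n_gt)

total_tp = sum(tp for tp, _, _ in counts.values())
total_pred = sum(n_pred for _, n_pred, _ in counts.values())
total_gt = sum(n_gt for _, _, n_gt in counts.values())

# Micro average: pool the counts of all classes, then compute the metrics.
micro_p = safe_div(total_tp, total_pred)
micro_r = safe_div(total_tp, total_gt)
micro = (micro_p, micro_r, safe_div(2 * micro_p * micro_r, micro_p + micro_r))

# Macro average: unweighted mean of the per-class metrics.
macro = tuple(sum(r[i] for r in rows.values()) / len(rows) for i in range(3))

# Weighted average: per-class metrics weighted by support.
weighted = tuple(
    sum(r[i] * r[3] for r in rows.values()) / total_gt for i in range(3)
)

for name, values in (("micro", micro), ("macro", macro), ("weighted", weighted)):
    print(name, [round(v, 2) for v in values])
```

Rounded to two decimals, these values match the `micro avg`, `macro avg` and `weighted avg` rows of the report, which suggests that the whole score rests on two correct AFV detections.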



We don't have many test images, but the scores are poor regardless: the model finds only a small fraction of the AFV targets and none of the APC or LAV targets. It may be easier to inspect the results visually in FiftyOne.

In [11]:
import fiftyone as fo

session = fo.launch_app(dataset, auto=False)
session.open_tab()

Session launched. Run `session.show()` to open the App in a cell output.

