Comprehensive Analysis of Model Errors in Blueberry Detection and Maturity Classification: Identifying Limitations and Proposing Future Improvements in Agricultural Monitoring.

Repository for the code used in the paper:

@Article{agriculture14010018,
AUTHOR = {Aguilera, Cristhian A. and Figueroa-Flores, Carola and Aguilera, Cristhian and Navarrete, Cesar},
TITLE = {Comprehensive Analysis of Model Errors in Blueberry Detection and Maturity Classification: Identifying Limitations and Proposing Future Improvements in Agricultural Monitoring},
JOURNAL = {Agriculture},
VOLUME = {14},
YEAR = {2024},
NUMBER = {1},
ARTICLE-NUMBER = {18},
URL = {https://www.mdpi.com/2077-0472/14/1/18},
ISSN = {2077-0472},
ABSTRACT = {In blueberry farming, accurately assessing maturity is critical to efficient harvesting. Deep Learning solutions, which are increasingly popular in this area, often undergo evaluation through metrics like mean average precision (mAP). However, these metrics may only partially capture the actual performance of the models, especially in settings with limited resources like those in agricultural drones or robots. To address this, our study evaluates Deep Learning models, such as YOLOv7, RT-DETR, and Mask-RCNN, for detecting and classifying blueberries. We perform these evaluations on both powerful computers and embedded systems. Using Type-Influence Detector Error (TIDE) analysis, we closely examine the accuracy of these models. Our research reveals that partial occlusions commonly cause errors, and optimizing these models for embedded devices can increase their speed without losing precision. This work improves the understanding of object detection models for blueberry detection and maturity estimation.},
DOI = {10.3390/agriculture14010018}
}

Requirements

Docker
nvidia-docker

Build the Docker image.

export WORKSPACE=/absolute_path_to_this_folder
docker build -t blueberry_detection -f docker/Dockerfile .

Replicate results

The first step is to launch the docker container. All commands are run inside the container.

# shell 1
sh docker/run.sh
# shell 2
docker exec -it blueberry_detection /bin/bash

The second step is to download the dataset and the models to be evaluated.

sh download_dataset.sh
sh download_models.sh

Object Detection Metrics

For each architecture, three models were trained and their metrics averaged. The reported detection metrics are:

Model	Class	Precision	Recall	F1	mAP75
YOLOv7-tiny	Ripe	0.547	0.387	0.443	0.330
YOLOv7-tiny	Pint	0.568	0.433	0.489	0.364
YOLOv7-tiny	Unripe	0.485	0.323	0.388	0.231
YOLOv7-tiny	All	0.533	0.380	0.443	0.309
YOLOv7-default	Ripe	0.626	0.456	0.528	0.435
YOLOv7-default	Pint	0.641	0.508	0.567	0.432
YOLOv7-default	Unripe	0.605	0.415	0.492	0.348
YOLOv7-default	All	0.624	0.460	0.530	0.405
YOLOv7-w6	Ripe	0.598	0.500	0.544	0.445
YOLOv7-w6	Pint	0.631	0.494	0.554	0.431
YOLOv7-w6	Unripe	0.591	0.457	0.516	0.381
YOLOv7-w6	All	0.607	0.484	0.539	0.419
RT-DETR-L	Ripe	0.606	0.429	0.502	0.393
RT-DETR-L	Pint	0.627	0.329	0.431	0.282
RT-DETR-L	Unripe	0.544	0.380	0.447	0.289
RT-DETR-L	All	0.592	0.380	0.462	0.321
Mask-RCNN	Ripe	0.612	0.490	0.543	0.447
Mask-RCNN	Pint	0.680	0.574	0.622	0.558
Mask-RCNN	Unripe	0.582	0.488	0.530	0.426
Mask-RCNN	All	0.625	0.518	0.565	0.477
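
As a rough illustration of how per-run metrics could be combined into one table row, the sketch below averages precision, recall, and F1 over three runs. The per-run numbers and the function names are hypothetical and are not taken from the paper. This README does not say whether F1 was averaged per run or computed from the averaged precision and recall, so the sketch prints both; the two generally differ slightly.

```python
from statistics import mean


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def average_runs(runs):
    """Average (precision, recall) pairs collected from several training runs."""
    mean_p = mean(p for p, _ in runs)
    mean_r = mean(r for _, r in runs)
    return {
        "precision": mean_p,
        "recall": mean_r,
        # Option 1: compute F1 for each run, then average the F1 values.
        "f1_mean_of_runs": mean(f1_score(p, r) for p, r in runs),
        # Option 2: compute F1 once from the averaged precision and recall.
        "f1_of_means": f1_score(mean_p, mean_r),
    }


# Hypothetical (precision, recall) values for three runs of one architecture.
example_runs = [(0.50, 0.40), (0.60, 0.30), (0.55, 0.45)]
summary = average_runs(example_runs)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

To use it with real numbers, replace `example_runs` with the values printed by the three evaluation runs of an architecture.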

You can reproduce the results by running the following scripts.

YOLOv7-tiny

python3 test.py --weights /workspace/models/blueberry_yolov7tiny/model1.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide
python3 test.py --weights /workspace/models/blueberry_yolov7tiny/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide
python3 test.py --weights /workspace/models/blueberry_yolov7tiny/model3.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide

YOLOv7-default

python3 test.py --weights /workspace/models/blueberry_yolov7/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide
python3 test.py --weights /workspace/models/blueberry_yolov7/model2.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide
python3 test.py --weights /workspace/models/blueberry_yolov7/model3.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide

YOLOv7-w6

python3 test.py --weights /workspace/models/blueberry_yolov7w6/model1.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 1280 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide
python3 test.py --weights /workspace/models/blueberry_yolov7w6/model2.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 1280 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide
python3 test.py --weights /workspace/models/blueberry_yolov7w6/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 1280 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7 --no-tide

RT-DETR-L

python3 test.py --weights /workspace/models/blueberry_rt-detr-l/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode rt-detr --no-tide
python3 test.py --weights /workspace/models/blueberry_rt-detr-l/model2.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode rt-detr --no-tide
python3 test.py --weights /workspace/models/blueberry_rt-detr-l/model3.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode rt-detr --no-tide

Mask-RCNN

python3 test.py --weights /workspace/models/blueberry_maskrcnn/model_best.pth --cfg /workspace/models/blueberry_maskrcnn/custom_cfg.yaml --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode mask-rcnn --no-tide
python3 test.py --weights /workspace/models/blueberry_maskrcnn/model2.pth --cfg /workspace/models/blueberry_maskrcnn/custom_cfg.yaml --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode mask-rcnn --no-tide
python3 test.py --weights /workspace/models/blueberry_maskrcnn/model3.pth --cfg /workspace/models/blueberry_maskrcnn/custom_cfg.yaml --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode mask-rcnn --no-tide
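
The evaluation commands above differ only in the weights path, the image size, the inference mode, and (for Mask-RCNN) the extra `--cfg` argument. The sketch below shows one possible way to generate them from Python instead of copying them by hand. The helper `build_test_command` is hypothetical and not part of this repository; it only reuses the flags and paths shown above.

```python
import shlex

DATA_YAML = "/workspace/datasets/blueberry_dataset/data.yaml"


def build_test_command(weights, infer_mode, img_size=640, cfg=None, tide=False):
    """Return the test.py argument list for one checkpoint (hypothetical helper)."""
    cmd = ["python3", "test.py", "--weights", weights]
    if cfg is not None:
        # Only the Mask-RCNN commands pass a config file.
        cmd += ["--cfg", cfg]
    cmd += [
        "--data", DATA_YAML,
        "--batch-size", "1",
        "--img-size", str(img_size),
        "--conf-thres", "0.25",
        "--iou-thres", "0.65",
        "--task", "test",
        "--device", "0",
        "--base-map", "0.75",
        "--infer-mode", infer_mode,
    ]
    if not tide:
        # The detection-metric commands skip the TIDE analysis.
        cmd.append("--no-tide")
    return cmd


yolov7_w6_dir = "/workspace/models/blueberry_yolov7w6"
for name in ("model1.pt", "model2.pt", "model_best.pt"):
    cmd = build_test_command(f"{yolov7_w6_dir}/{name}", "yolov7", img_size=1280)
    print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` inside the container to run the evaluation rather than just print it.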

TIDE Errors

TIDE requires the results in COCO format to calculate errors, so they must first be converted from the YOLO format to a COCO-compliant format.

The following script will perform the conversion automatically.

cd detectron2
python3 yolo2coco.py --images_path /workspace/datasets/blueberry_dataset/test/images/ --labels_path /workspace/datasets/blueberry_dataset/test/labels --out ../coco_gt
cd /workspace
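
For readers unfamiliar with the two formats, the core of such a conversion is a change of box coordinates: YOLO labels store a normalized box center and size, while COCO annotations store the top-left corner and size in pixels. The sketch below illustrates that step only. It is not the repository's `yolo2coco.py`; the function names are hypothetical, and the direct reuse of the YOLO class index as `category_id` is an assumption.

```python
def yolo_to_coco_bbox(cx, cy, w, h, img_width, img_height):
    """Convert a normalized YOLO box (center x, center y, width, height)
    to a COCO box [x_min, y_min, width, height] in pixels."""
    box_w = w * img_width
    box_h = h * img_height
    x_min = cx * img_width - box_w / 2
    y_min = cy * img_height - box_h / 2
    return [x_min, y_min, box_w, box_h]


def yolo_label_line_to_annotation(line, img_width, img_height, image_id, ann_id):
    """Parse one YOLO label line ("class cx cy w h") into a COCO-style dict."""
    class_id, cx, cy, w, h = line.split()
    bbox = yolo_to_coco_bbox(
        float(cx), float(cy), float(w), float(h), img_width, img_height
    )
    return {
        "id": ann_id,
        "image_id": image_id,
        # Assumption: the YOLO class index is used directly as the category id.
        "category_id": int(class_id),
        "bbox": bbox,
        "area": bbox[2] * bbox[3],
        "iscrowd": 0,
    }


# A box centered in a 640x480 image, a quarter of its width and half its height.
ann = yolo_label_line_to_annotation("1 0.5 0.5 0.25 0.5", 640, 480, image_id=0, ann_id=0)
print(ann["bbox"])  # [240.0, 120.0, 160.0, 240.0]
```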

The error metrics reported for each model are:

Model	Cls	Loc	Both	Bkg	Miss	FP	FN
YOLOv7-tiny	5.73	32.3	0.31	0.27	10.87	10.35	43.14
YOLOv7-default	5.8	28.41	0.21	0.36	10.29	10.25	38.28
YOLOv7-w6	6.19	26.47	0.26	0.27	11.42	9.47	38.95
RT-DETR-L	8.43	35.52	0.41	0.71	6.71	11.88	40.75
Mask-RCNN	3.26	30.47	0.17	0.77	8.56	10.93	34.35
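
To make the column names easier to read, the sketch below assigns a single detection to a TIDE-style error type from its best IoU with ground truth of the same class and of a different class. It is a simplified illustration based on the error definitions in the TIDE paper, not the code that TIDE or this repository runs. The function name is hypothetical, and the default thresholds (0.5 foreground, 0.1 background) are the TIDE paper's defaults, which may differ from the settings used for the table above.

```python
def classify_detection(iou_same_class, iou_other_class, gt_already_matched=False,
                       fg_thresh=0.5, bg_thresh=0.1):
    """Assign one detection to a TIDE-style category (simplified sketch).

    iou_same_class:  best IoU with a ground-truth box of the predicted class
    iou_other_class: best IoU with a ground-truth box of any other class
    """
    if iou_same_class >= fg_thresh:
        # Well localized and correctly classified; a duplicate if that
        # ground truth was already claimed by a higher-scoring detection.
        return "Dupe" if gt_already_matched else "TP"
    if bg_thresh <= iou_same_class < fg_thresh:
        return "Loc"   # right class, box not tight enough
    if iou_other_class >= fg_thresh:
        return "Cls"   # good box, wrong class
    if max(iou_same_class, iou_other_class) <= bg_thresh:
        return "Bkg"   # fired on background
    return "Both"      # wrong class and poorly localized


examples = {
    "tight box, right class": (0.80, 0.00),
    "loose box, right class": (0.30, 0.00),
    "tight box, wrong class": (0.00, 0.90),
    "loose box, wrong class": (0.05, 0.30),
    "no nearby ground truth": (0.05, 0.02),
}
for description, (same, other) in examples.items():
    print(f"{description}: {classify_detection(same, other)}")
```

A ground-truth box that no detection is assigned to, and that is not already explained by a Cls or Loc error, would count toward Miss.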

You can measure the TIDE errors and mAP of each model with the following commands:

python3 test.py --weights /workspace/models/blueberry_yolov7tiny/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7
python3 test.py --weights /workspace/models/blueberry_yolov7/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7
python3 test.py --weights /workspace/models/blueberry_yolov7w6/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 1280 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode yolov7
python3 test.py --weights /workspace/models/blueberry_rt-detr-l/model_best.pt --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode rt-detr
python3 test.py --weights /workspace/models/blueberry_maskrcnn/model_best.pth --cfg /workspace/models/blueberry_maskrcnn/custom_cfg.yaml --data /workspace/datasets/blueberry_dataset/data.yaml --batch-size 1 --img-size 640 --conf-thres 0.25 --iou-thres 0.65 --task test --device 0 --base-map 0.75 --infer-mode mask-rcnn

Runtime Benchmark

This benchmark reports the average time each model takes to run inference on a single image. Each model first runs 100 warm-up inferences, which are discarded so that they do not skew the results, and then 100 timed inferences. The reported values are the mean and standard deviation of those 100 recorded times.
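
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's `benchmark.py`: the function `benchmark`, the dummy workload, the use of milliseconds, and the choice of the sample standard deviation are all assumptions made for the example.

```python
import statistics
import time


def benchmark(infer_fn, warmup=100, runs=100):
    """Return (mean, std) of infer_fn's wall-clock time in milliseconds."""
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        infer_fn()

    times_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer_fn()
        times_ms.append((time.perf_counter() - start) * 1000.0)

    # Sample standard deviation; requires runs >= 2.
    return statistics.mean(times_ms), statistics.stdev(times_ms)


def dummy_inference():
    """Stand-in for a model forward pass on one image."""
    return sum(i * i for i in range(10_000))


mean_ms, std_ms = benchmark(dummy_inference)
print(f"mean: {mean_ms:.3f} ms, std: {std_ms:.3f} ms")
```

When timing a real GPU model, the timed call typically has to wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch), since GPU work is launched asynchronously and would otherwise be under-measured.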

RTX 3080 Ti

Times were recorded on a computer with a 12th Gen Core i7 CPU, 32 GB of RAM, a 1 TB SSD, and an NVIDIA RTX 3080 Ti 10 GB GPU.

Model	fp32 mean	fp32 std	fp16 mean	fp16 std
YOLOv7-tiny	3.308	0.013	2.252	0.049
YOLOv7-default	8.059	0.046	3.823	0.059
YOLOv7-w6	19.551	0.142	7.677	0.081
RT-DETR-L	11.512	0.385	8.933	0.522
Mask-RCNN	34.301	0.755	16.998	0.206

To reproduce these timings, first convert the models to TensorRT engines.

To convert the YOLOv7 models:

cd deps/YOLOv7_Tensorrt
sh to_onnx.sh
sh to_trt.sh
cd /workspace

To convert the Mask-RCNN model:

cd detectron2
sh to_trt.sh
cd /workspace

The RT-DETR-L model cannot be converted to a TensorRT engine, so it is converted to a TorchScript model instead.

cd rt-detr
sh to_torchscript.sh
cd /workspace

FP16 inference times.

python3 benchmark.py --weights /workspace/models/blueberry_yolov7tiny/model_best_fp16.trt --imgsz 640 --infer-type yolov7-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_yolov7/model_best_fp16.trt --imgsz 640 --infer-type yolov7-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_yolov7w6/model_best_fp16.trt --imgsz 1280 --infer-type yolov7-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_rt-detr-l/model_best_fp16.torchscript --infer-type rt-detr-ts --im_folder /workspace/datasets/blueberry_dataset/test/images/ --half
python3 benchmark.py --weights /workspace/models/blueberry_maskrcnn/model_best_fp16.trt --infer-type mask-rcnn-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/

FP32 inference times.

python3 benchmark.py --weights /workspace/models/blueberry_yolov7tiny/model_best_fp32.trt --imgsz 640 --infer-type yolov7-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_yolov7/model_best_fp32.trt --imgsz 640 --infer-type yolov7-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_yolov7w6/model_best_fp32.trt --imgsz 1280 --infer-type yolov7-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_rt-detr-l/model_best_fp32.torchscript --infer-type rt-detr-ts --im_folder /workspace/datasets/blueberry_dataset/test/images/
python3 benchmark.py --weights /workspace/models/blueberry_maskrcnn/model_best_fp32.trt --infer-type mask-rcnn-trt --im_folder /workspace/datasets/blueberry_dataset/test/images/

Training

You can train models by following the instructions in each folder corresponding to its architecture.
