MIT License - All rights reserved to the author. This project may be used for study and educational purposes, but **redistribution, redevelopment, or use of the code for personal or commercial purposes is strictly prohibited without the author's written consent.**
🎥 Watch Live Demo (YouTube) | 🎥 Watch Live Demo (Bilibili)
📌 Enhancing Tiny Object Detection Using Guided Object Inference Slicing (GOIS): An Efficient Dynamic Adaptive Framework for Fine-Tuned and Non-Fine-Tuned Deep Learning Models
Guided Object Inference Slicing (GOIS): an innovative framework with open-source code deployed on Google Colab, Gradio Live, and Hugging Face
🔬 Research by: Muhammad Muzammul, Xuewei Li, Xi Li
📄 Published in Neurocomputing: https://doi.org/10.1016/j.neucom.2025.130327
📧 Contact: muzamal@zju.edu.cn
@article{MUZAMMUL2025130327,
title = {Enhancing Tiny Object Detection Using Guided Object Inference Slicing (GOIS): An efficient dynamic adaptive framework for fine-tuned and non-fine-tuned deep learning models},
journal = {Neurocomputing},
volume = {640},
pages = {130327},
year = {2025},
issn = {0925-2312},
doi = {https://doi.org/10.1016/j.neucom.2025.130327},
url = {https://www.sciencedirect.com/science/article/pii/S0925231225009993},
author = {Muhammad Muzammul and Xuewei Li and Xi Li},
keywords = {Tiny Object Detection, Guided Object Inference Slicing (GOIS), Adaptive slicing-based detection, UAV-based real-time inference, High-resolution remote sensing imagery, Computationally efficient object detection, Deep learning for small-object recognition, Non-maximum suppression optimization, Transformer-based object detection},
abstract = {Tiny Object Detection (TOD) in UAV and standard imaging remains challenging due to extreme scale variations, occlusion, and cluttered backgrounds. This paper presents the Dynamic Adaptive Guided Object Inference Slicing (GOIS) framework, a two-stage adaptive slicing strategy that dynamically reallocates computational resources to Regions of Interest (ROIs), enhancing detection precision and recall. Unlike static and semi-adaptive slicing methods like SAHI and ASAHI, evaluated with models such as FNet, TOOD, and TPH-YOLO, GOIS leverages VisDrone and xView datasets to optimize hierarchical slicing and dynamic Non-Maximum Suppression (NMS), improving tiny object detection while reducing boundary artifacts and false positives. Comprehensive experiments using MS COCO-pretrained Ultralytics models under fine-tuning and non-fine-tuning conditions validate its effectiveness. Evaluations across YOLO11, RT-DETR-L, YOLOv8s-WorldV2, YOLOv10, YOLOv8, and YOLOv5 demonstrate that GOIS consistently outperforms Full-Image Inference (FI-Det), achieving up to 3-4× improvements in small-object recall. On the VisDrone2019 dataset, GOIS-Det improved mAP@0.50:0.95 from 0.12 (FI-Det) to 0.33 (+175%) on YOLO11 and from 0.18 to 0.38 (+111.10%) on YOLOv5n. Fine-tuning further enhanced AP-Small by 278.66% and AR-Small by 279.22%, confirming GOIS's adaptability across diverse deployment scenarios. Additionally, GOIS reduced false positives by 40%-60%, improving real-world detection reliability. Ablation studies validate GOIS's hierarchical slicing and parameter optimization, with 640-pixel coarse slices and 256-pixel fine slices achieving an optimal balance between accuracy and efficiency. As the first open-source TOD slicing framework on Hugging Face Apps and Google Colab, GOIS delivers real-time inference, open-source code, and live demonstrations, establishing itself as a breakthrough in object detection. The code and results are publicly available at https://github.com/MMUZAMMUL/GOIS with a live demo at https://youtu.be/ukWUfXBFZ5I.}
}
Step | Command |
---|---|
1️⃣ Clone Repo | `git clone https://github.com/MMUZAMMUL/GOIS.git && cd GOIS` |
2️⃣ Download Data | Follow Dataset Instructions or Download 15% Dataset |
3️⃣ Download Models | `cd Models && python download_models.py` |
4️⃣ Generate Ground Truth | `python scripts/generate_ground_truth.py --annotations_folder "<annotations_path>" --images_folder "<images_path>" --output_coco_path "./data/ground_truth/ground_truth_coco.json"` |
5️⃣ Full Inference (FI-Det) | `python scripts/full_inference.py --images_folder "<path>" --model_path "Models/yolo11n.pt" --model_type "YOLO" --output_base_path "./data/FI_Predictions"` |
6️⃣ GOIS Inference | `python scripts/gois_inference.py --images_folder "<path>" --model_path "Models/yolo11n.pt" --model_type "YOLO" --output_base_path "./data/gois_Predictions"` |
7️⃣ Evaluate FI-Det | `python scripts/evaluate_prediction.py --ground_truth_path "./data/ground_truth/ground_truth_coco.json" --predictions_path "./data/FI_Predictions/full_inference.json" --iou_type bbox` |
8️⃣ Evaluate GOIS-Det | `python scripts/evaluate_prediction.py --ground_truth_path "./data/ground_truth/ground_truth_coco.json" --predictions_path "./data/gois_Predictions/gois_inference.json" --iou_type bbox` |
9️⃣ Compare Results | `python scripts/calculate_results.py --ground_truth_path "./data/ground_truth/ground_truth_coco.json" --full_inference_path "./data/FI_Predictions/full_inference.json" --gois_inference_path "./data/gois_Predictions/gois_inference.json"` |
🔟 Upscale Metrics | `python scripts/evaluate_upscaling.py --ground_truth_path "./data/ground_truth/ground_truth_coco.json" --full_inference_path "./data/FI_Predictions/full_inference.json" --gois_inference_path "./data/gois_Predictions/gois_inference.json"` |
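Steps 4-9 above can also be chained from a single Python driver. The sketch below is a convenience wrapper, not part of the repository: it assumes steps 1-3 (clone, data, models) are already done, and the `<annotations_path>`/`<images_path>` placeholders must be replaced with your local paths.

```python
"""Minimal driver chaining quick-start steps 4-9 via the repo's CLI scripts.
Assumes the repo is cloned, data and models are downloaded (steps 1-3),
and that this file is run from the GOIS repository root."""
import subprocess

ANN = "<annotations_path>"   # replace with your annotations folder
IMG = "<images_path>"        # replace with your images folder
GT = "./data/ground_truth/ground_truth_coco.json"
FI = "./data/FI_Predictions/full_inference.json"
GOIS = "./data/gois_Predictions/gois_inference.json"

steps = [
    ["python", "scripts/generate_ground_truth.py", "--annotations_folder", ANN,
     "--images_folder", IMG, "--output_coco_path", GT],
    ["python", "scripts/full_inference.py", "--images_folder", IMG,
     "--model_path", "Models/yolo11n.pt", "--model_type", "YOLO",
     "--output_base_path", "./data/FI_Predictions"],
    ["python", "scripts/gois_inference.py", "--images_folder", IMG,
     "--model_path", "Models/yolo11n.pt", "--model_type", "YOLO",
     "--output_base_path", "./data/gois_Predictions"],
    ["python", "scripts/evaluate_prediction.py", "--ground_truth_path", GT,
     "--predictions_path", FI, "--iou_type", "bbox"],
    ["python", "scripts/evaluate_prediction.py", "--ground_truth_path", GT,
     "--predictions_path", GOIS, "--iou_type", "bbox"],
    ["python", "scripts/calculate_results.py", "--ground_truth_path", GT,
     "--full_inference_path", FI, "--gois_inference_path", GOIS],
]

for cmd in steps:
    subprocess.run(cmd, check=True)  # abort on the first failing step
```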
📁 GOIS Benchmarks Repository
🎥 Watch Live Demo (YouTube) | 🎥 Watch Live Demo (Bilibili)
📜 MIT License - Study & Educational Use Only
📧 Contact: Author Email
Explore the GOIS-Det vs. FI-Det benchmark results through live interactive applications on Gradio. These applications provide detailed comparisons using graphs, tables, and output images, demonstrating the effectiveness of GOIS-Det in tiny object detection.
- Click on any "Open in Colab" button above to launch the interactive notebook.
- Follow the instructions in the notebook to test GOIS-Det vs. FI-Det.
- Evaluate detection performance using provided visualizations and metrics.
Experience Guided Object Inference Slicing (GOIS) across images, videos, and live cameras with configurable parameters. Evaluate real-time small object detection and compare against full-image inference (FI-Det).
📊 Compatible Datasets: VisDrone, UAV Surveillance (100-150 ft), Pedestrian & Tiny Object Detection, Geo-Sciences
🖥️ Applied Models: YOLO11, YOLOv10, YOLOv9, YOLOv8, YOLOv6, YOLOv5, RT-DETR-L, YOLOv8s-WorldV2
GOIS incorporates a two-stage hierarchical slicing strategy, dynamically adjusting coarse-to-fine slicing and overlap rates to optimize tiny object detection while reducing false positives. These live applications allow users to test GOIS against full-image inference, analyze occlusion handling, boundary artifacts, and false positive reductions, while adjusting key parameters.
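To make the two-stage idea concrete, here is a minimal illustrative sketch of coarse-to-fine guided slicing. It is not the repository's implementation: `detect` stands in for any detector callable returning `[x1, y1, x2, y2, score]` boxes, `roi_min_hits` is an assumed density trigger, and the default sizes simply mirror the 640 px coarse / 256 px fine setting reported in the paper.

```python
"""Illustrative coarse-to-fine guided slicing (not the official GOIS code).
`image` is a NumPy H x W x C array; `detect(crop)` is any detector callable
returning [x1, y1, x2, y2, score] boxes in crop-local coordinates."""
from typing import Callable, List

Box = List[float]  # [x1, y1, x2, y2, score]

def slice_grid(w: int, h: int, size: int, overlap: float):
    """Yield top-left (x, y) offsets of overlapping square slices.
    Edge remainders are omitted for brevity."""
    step = max(1, int(size * (1.0 - overlap)))
    for y in range(0, max(h - size, 0) + 1, step):
        for x in range(0, max(w - size, 0) + 1, step):
            yield x, y

def gois_two_stage(image, detect: Callable[..., List[Box]],
                   coarse: int = 640, fine: int = 256,
                   overlap: float = 0.2, roi_min_hits: int = 2) -> List[Box]:
    h, w = image.shape[:2]
    detections: List[Box] = []
    for cx, cy in slice_grid(w, h, coarse, overlap):
        crop = image[cy:cy + coarse, cx:cx + coarse]
        coarse_boxes = [[x1 + cx, y1 + cy, x2 + cx, y2 + cy, s]
                        for x1, y1, x2, y2, s in detect(crop)]
        detections += coarse_boxes
        # Guided step: only sufficiently dense coarse slices are re-sliced finely.
        if len(coarse_boxes) >= roi_min_hits:
            for fx, fy in slice_grid(crop.shape[1], crop.shape[0], fine, overlap):
                fine_crop = crop[fy:fy + fine, fx:fx + fine]
                detections += [[x1 + cx + fx, y1 + cy + fy,
                                x2 + cx + fx, y2 + cy + fy, s]
                               for x1, y1, x2, y2, s in detect(fine_crop)]
    return detections  # merge with NMS afterwards (see the NMS sketch below)
```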
Test Function | Description | 🔗 Test Link |
---|---|---|
GOIS vs. Full-Image Detection | Evaluates dynamic slicing vs. full-image inference (FI-Det) across images, identifying missed objects and reducing false positives. | |
Video Detection (Single Stage) | Tests frame-wise GOIS slicing to improve small object detection, mitigating occlusion issues. | |
Advanced Video Detection (Two Stage) | Uses coarse-to-fine GOIS slicing based on object density to dynamically adjust slicing strategies and eliminate boundary artifacts. | |
Live Camera Detection (FI vs. GOIS) | Compares full-frame inference vs. GOIS slicing in real-time, highlighting differences in object localization and accuracy. | |
Live Camera Advanced Detection | Demonstrates adaptive slicing based on object density, improving small object retrieval while maintaining efficiency. |
1️⃣ Click a Test Link → 2️⃣ Upload Image/Video → 3️⃣ Adjust Parameters (Slice Size, Overlap, NMS) → 4️⃣ Compare FI vs. GOIS Results → 5️⃣ Analyze Performance in Real-Time
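For readers who want to prototype a comparable interface locally rather than use the hosted apps, the sketch below shows one way such an FI-vs-GOIS comparison could be wired up with Gradio. The `run_fi` and `run_gois` functions are hypothetical placeholders that simply echo the input image, and the slider ranges are assumptions, not the hosted apps' settings.

```python
"""Sketch of a local FI-Det vs. GOIS-Det comparison app using Gradio.
`run_fi` / `run_gois` are placeholders: wire in real pipelines to use this."""
import gradio as gr

def run_fi(image):
    return image  # placeholder: full-image inference would draw FI-Det boxes here

def run_gois(image, slice_size, overlap, nms):
    return image  # placeholder: GOIS slicing + detection would draw boxes here

def compare(image, slice_size, overlap, nms):
    return run_fi(image), run_gois(image, slice_size, overlap, nms)

demo = gr.Interface(
    fn=compare,
    inputs=[
        gr.Image(label="Input image"),
        gr.Slider(128, 1024, value=256, step=64, label="Fine slice size (px)"),
        gr.Slider(0.0, 0.5, value=0.2, step=0.05, label="Overlap rate"),
        gr.Slider(0.1, 0.9, value=0.4, step=0.05, label="NMS IoU threshold"),
    ],
    outputs=[gr.Image(label="FI-Det"), gr.Image(label="GOIS-Det")],
)

if __name__ == "__main__":
    demo.launch()
```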
To validate the Guided Object Inference Slicing (GOIS) framework, the following Google Colab notebooks provide real-time inference and analysis. These tests allow users to compare GOIS vs. Full-Image Detection (FI-Det) across different datasets and models.
📊 Compatible Datasets: VisDrone, UAV Surveillance (100-150 ft), Pedestrian & Tiny Object Detection, Geo-Sciences
🖥️ Applied Models: YOLO11, YOLOv10, YOLOv9, YOLOv8, YOLOv6, YOLOv5, RT-DETR-L, YOLOv8s-WorldV2
GOIS differs from traditional slicing methods (SAHI, ASAHI) by dynamically adjusting slicing parameters based on object density rather than static window sizes. These notebooks enable comparative testing, allowing users to experiment with slicing sizes, overlap rates, and NMS thresholds, addressing key performance trade-offs.
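One way to picture this density-driven adjustment is as a small mapping from the coarse stage's detection density to fine-stage parameters. The sketch below is purely illustrative; the thresholds and returned values are assumptions, not GOIS's actual rules.

```python
"""Illustrative density-to-parameter mapping: denser regions get smaller
fine slices and more overlap. All numbers here are assumed, not GOIS's own."""

def adaptive_params(boxes_in_slice: int, slice_area_px: float):
    """Choose (fine_slice_size, overlap, nms_iou) from coarse-stage density."""
    density = boxes_in_slice / (slice_area_px / 1e6)  # objects per megapixel
    if density > 50:        # very crowded: smallest slices, most overlap
        return 128, 0.3, 0.5
    if density > 10:        # moderately dense
        return 256, 0.2, 0.4
    return 384, 0.1, 0.3    # sparse: larger slices keep inference cheap
```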
Test Function | Description | Colab Link |
---|---|---|
GOIS vs. SAHI/ASAHI (Proposed Method) | Compares GOIS dynamic slicing vs. static slicing (SAHI, ASAHI-like), analyzing boundary artifacts and false positive rates. | |
GOIS - Single Image Inference | Runs GOIS on a single image, adjusting slicing parameters and overlap rates. | |
GOIS vs. FI-Det (Single Image) | Side-by-side visual comparison of GOIS vs. FI-Det, addressing occlusion and small object visibility. | |
GOIS vs. FI-Det (Multiple Images) | Processes multiple images to compare detection consistency across datasets. | |
Detection Count & Metrics Comparison | Evaluates object count, area coverage, and false positive reduction rates. | |
Slice Size Optimization - Speed Test | Tests how different slicing sizes and overlap settings impact speed vs. accuracy. | |
GOIS - 81 Parameter Combinations Test | Tests 81 slicing, overlap, and NMS variations for optimal performance. | |
GOIS - Three Best Slicing Configurations | Evaluates three optimized GOIS slicing setups based on empirical results: C1: 512 px/128 px (0.1 overlap, NMS 0.3); C2: 640 px/256 px (0.2 overlap, NMS 0.4); C3: 768 px/384 px (0.3 overlap, NMS 0.5). These configurations were determined as optimal trade-offs between accuracy, false positive reduction, and computational efficiency (see the sketch after this table). |
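For convenience, here are the three configurations from the last row above expressed as plain Python data, ready to drop into a parameter sweep; the key names are illustrative labels, not the notebook's API.

```python
"""The three empirically best GOIS configurations from the table above.
Key names (coarse, fine, overlap, nms_iou) are illustrative labels."""
BEST_CONFIGS = {
    "C1": {"coarse": 512, "fine": 128, "overlap": 0.1, "nms_iou": 0.3},
    "C2": {"coarse": 640, "fine": 256, "overlap": 0.2, "nms_iou": 0.4},  # paper's best accuracy/efficiency balance
    "C3": {"coarse": 768, "fine": 384, "overlap": 0.3, "nms_iou": 0.5},
}

for name, cfg in BEST_CONFIGS.items():
    print(name, cfg)  # e.g., feed each cfg into your own GOIS sweep
```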
1️⃣ Open any Colab link → 2️⃣ Run the notebook → 3️⃣ Upload images or use datasets → 4️⃣ Adjust GOIS parameters (slice size, overlap, NMS) → 5️⃣ Compare FI vs. GOIS results
The following tables present benchmark evaluations of the Guided Object Inference Slicing (GOIS) framework, comparing Full Inference (FI-Det) vs. GOIS-Det across different datasets and model configurations.
GOIS integrates a two-stage hierarchical slicing strategy, dynamically adjusting slice size, overlap rate, and NMS thresholds to optimize detection performance. These results highlight improvements in small object detection, reduction of boundary artifacts, and comparisons with existing slicing methods like SAHI and ASAHI.
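Merging the detections produced on overlapping slices is where the NMS threshold matters. As a reference point, the snippet below implements plain fixed-threshold IoU NMS; GOIS's dynamic NMS additionally adapts the threshold, which this simplified sketch does not attempt.

```python
"""Plain fixed-threshold NMS: a simplified stand-in for GOIS's dynamic NMS."""
from typing import List

def nms(boxes: List[List[float]], iou_thresh: float = 0.4) -> List[List[float]]:
    """boxes: [x1, y1, x2, y2, score]; keep high-scoring, non-overlapping boxes."""
    def iou(a, b):
        ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
        ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
        area_a = (a[2] - a[0]) * (a[3] - a[1])
        area_b = (b[2] - b[0]) * (b[3] - b[1])
        return inter / (area_a + area_b - inter + 1e-9)

    kept: List[List[float]] = []
    for box in sorted(boxes, key=lambda b: b[4], reverse=True):
        if all(iou(box, k) < iou_thresh for k in kept):
            kept.append(box)
    return kept
```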
Test/Part | Dataset & Setup | Description | Benchmark Link |
---|---|---|---|
Part 1 | Without Fine-Tuning - 15% Dataset (970 Images) - VisDrone2019Train | Evaluates FI-Det vs. GOIS-Det on a small dataset subset. The table presents AP and AR metrics for seven models, comparing detection performance with and without GOIS enhancements. The percentage improvement achieved by GOIS is included for each model. | 🔗 Section 1 - GOIS Benchmarks |
Part 2 | Fine-Tuned Models (10 Epochs) - Full Dataset (6,471 Images) - VisDrone2019Train | GOIS performance is tested after 10 epochs of fine-tuning. The impact of GOIS slicing parameters (coarse-fine slice size, overlap rate, NMS filtering) is analyzed. The table provides detailed AP and AR metrics for five models, highlighting GOIS's ability to improve small object recall while managing computational efficiency. | 🔗 Section 2 - GOIS Benchmarks |
Part 3 | Without Fine-Tuning - Five Models - Full Dataset (6,471 Images) - VisDrone2019Train | Evaluates GOIS on a large-scale dataset without fine-tuning, highlighting its robust generalization ability. Comparative results for five models (YOLO11, YOLOv10, YOLOv9, YOLOv8, YOLOv5) include FI-Det, GOIS-Det, and % improvement achieved by GOIS. This setup assesses GOIS's impact on both small and large object detection. | 🔗 Section 3 - GOIS Benchmarks |
Part 4 | General Analysis - Pretrained Weights on VisDrone, xView, MS COCO | GOIS's adaptability is tested across multiple datasets and model architectures. This section evaluates pretrained YOLO and transformer-based detectors (e.g., RT-DETR-L) to measure cross-domain effectiveness, computational trade-offs, and improvements in occlusion handling. Key focus: Can GOIS be applied universally? | 🔗 Section 4,5 - GOIS Benchmarks |
Part 5 | Comparative Analysis - SAHI vs. ASAHI vs. GOIS | A quantitative and qualitative comparison between GOIS and other slicing frameworks (SAHI, ASAHI) across VisDrone2019 and xView datasets. This section examines: 1️⃣ Boundary artifact reduction, 2️⃣ False positive minimization, and 3️⃣ Effectiveness of dynamic slicing in handling occlusion issues. Detailed benchmark tables are included. | 🔗 Section 4,5 - GOIS vs. SAHI/ASAHI Benchmarks |
✅ **Dynamic Slicing Optimization:** Unlike static SAHI/ASAHI methods, GOIS adjusts slice sizes and overlap rates based on object density, reducing redundant processing.
✅ **Occlusion Handling & Boundary Artifact Reduction:** GOIS minimizes false detections and truncated object artifacts by dynamically refining inference slices.
✅ **Scalability Across Models & Datasets:** Successfully applied to YOLO models, RT-DETR, and various datasets, proving its universal applicability.
✅ **Performance Gains in Small Object Detection:** GOIS consistently improves AP-Small and AR-Small metrics, as validated on the VisDrone and xView datasets.
🔗 For additional benchmark results and evaluation scripts, visit: GOIS Benchmarks Repository
If you use GOIS in your research, please consider citing our paper: