Problems with detecting smaller objects or objects in the distance #100

Open
SkalskiP opened this issue May 28, 2024 · 21 comments

Comments

@SkalskiP
Contributor

Hi 👋🏻

I noticed that YOLOv10 has trouble detecting small objects, especially compared to YOLOv8 and YOLOv9. I have built a small HF Space where you can test this. Is this a known issue? Is there anything I can do to improve its performance relative to the other models?

Here is the comparison of YOLOv8l at 640x640 and YOLOv10l at 640x640:

  • green: detected by YOLOv8 and YOLOv10
  • red: detected only by YOLOv8
  • blue: detected only by YOLOv10
yolov8-yolov10-people-walking.1.mp4
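
The green/red/blue split above can be reproduced offline by matching the two models' boxes on IoU. The following is a minimal sketch, not the code behind the HF Space: the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 matching threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def compare_detections(
    boxes_a: List[Box], boxes_b: List[Box], iou_threshold: float = 0.5
):
    """Greedy one-to-one matching between two detection sets.

    Returns (both, only_a, only_b): index pairs found by both models,
    indices found only in boxes_a, and indices found only in boxes_b.
    """
    unmatched_b = set(range(len(boxes_b)))
    both, only_a = [], []
    for i, a in enumerate(boxes_a):
        best_j, best_iou = None, iou_threshold
        for j in unmatched_b:
            value = iou(a, boxes_b[j])
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j is None:
            only_a.append(i)
        else:
            both.append((i, best_j))
            unmatched_b.remove(best_j)
    return both, only_a, sorted(unmatched_b)


# Hypothetical boxes: one shared detection, one unique to each model.
yolov8_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
yolov10_boxes = [(1, 1, 10, 10), (50, 50, 60, 60)]
both, only_v8, only_v10 = compare_detections(yolov8_boxes, yolov10_boxes)
```

Here `both` would be drawn green, `only_v8` red, and `only_v10` blue. Greedy matching is the simplest choice; a stricter comparison could also require matching class labels.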
@jameslahm
Collaborator

Thanks for the fantastic demo and detailed evaluation! We posted a comment on your demo page to provide some clarification: https://huggingface.co/spaces/SkalskiP/YOLO-ARENA/discussions/1. Thanks!

@pseacrest

I see the same issue with my model. Can we freely set a lower confidence threshold for YOLOv10 to detect more small objects?

@SkalskiP
Contributor Author

Hi @jameslahm 👋🏻 If I understand your comment correctly, the differences are due to:

  • Loss of accuracy resulting from conversion to ONNX (is this expected?)
  • Different optimal confidence thresholds?

@jameslahm
Collaborator

  • Loss of accuracy resulting from conversion to ONNX (is this expected?)

@SkalskiP Thanks. We tried to reproduce the loss of accuracy in our local environment. We ran inference on the image using the same ONNX conversion, following the process below, but the result still differs from that of the demo. We are not sure where the cause lies and would like to ask for your help.

wget -O vehicles.png https://skalskip-yolo-arena.hf.space/file=/tmp/gradio/f878616a10625ce7dba02bcb34df2df279273666/image.png
yolo export model=yolov10m.pt format=onnx opset=13 simplify half=True device=0
yolo predict model=yolov10m.onnx source=vehicles.png half conf=0.4
  • Different optimal confidence thresholds?

Yes, we think so.

@jameslahm
Collaborator

I see the same issue with my model. Can we freely set a lower confidence threshold for YOLOv10 to detect more small objects?

@pseacrest Yes, we think so.

@SkalskiP
Contributor Author

The ONNX models that we are running were converted by our ML team. I'll try to understand how they did it and get back to you.

@NickHerrig

@SkalskiP The ONNX models were converted using the instructions in the README.md.

The steps below were followed for n/s/m/b/l/x pt files:

  1. The model weights were downloaded, for example yolov10n.pt
  2. Model weights were exported to ONNX via yolo export model=yolov10n.pt format=onnx opset=13 simplify

@jameslahm
Collaborator

@NickHerrig Thanks! Would you mind checking whether the following results in your local environment match those of the demo? Thank you!

  • Loss of accuracy resulting from conversion to ONNX (is this expected?)

@SkalskiP Thanks. We tried to reproduce the loss of accuracy in our local environment. We ran inference on the image using the same ONNX conversion, following the process below, but the result still differs from that of the demo. We are not sure where the cause lies and would like to ask for your help.

wget -O vehicles.png https://skalskip-yolo-arena.hf.space/file=/tmp/gradio/f878616a10625ce7dba02bcb34df2df279273666/image.png
yolo export model=yolov10m.pt format=onnx opset=13 simplify half=True device=0
yolo predict model=yolov10m.onnx source=vehicles.png half conf=0.4

@SkalskiP
Contributor Author

Hi @jameslahm 👋🏻

I just updated https://huggingface.co/spaces/SkalskiP/YOLO-ARENA. We now load images with Pillow, and the results are slightly different.

@SkalskiP
Contributor Author

I also added per-model confidence threshold sliders.

Screenshot 2024-05-29 at 09 01 07
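
Per-model thresholds come down to filtering each model's detections against its own cutoff before comparing them. The following is a minimal sketch of that idea, not the Space's actual code: the `(class_name, confidence)` tuple format, the model keys, and the threshold values are hypothetical.

```python
from typing import Dict, List, Tuple

Detection = Tuple[str, float]  # (class_name, confidence), assumed format


def filter_by_confidence(
    detections: List[Detection], threshold: float
) -> List[Detection]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d[1] >= threshold]


# Hypothetical per-model thresholds, as a slider per model would provide.
thresholds: Dict[str, float] = {"yolov8": 0.40, "yolov10": 0.25}

# Hypothetical raw outputs for the same image.
raw: Dict[str, List[Detection]] = {
    "yolov8": [("car", 0.82), ("person", 0.45), ("person", 0.31)],
    "yolov10": [("car", 0.79), ("person", 0.33), ("person", 0.12)],
}

filtered = {
    name: filter_by_confidence(dets, thresholds[name])
    for name, dets in raw.items()
}
```

With a single shared threshold of 0.40, the second model would lose its 0.33 detection; a separate, lower cutoff keeps it. Whether a lower cutoff is appropriate for a given model has to be checked on validation data.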

@jameslahm
Collaborator

@SkalskiP Thank you very much! The results of the demo still seem to differ from those in our local environment. We are investigating this and will get back to you once we identify the root cause.

@salwaghanim

salwaghanim commented May 29, 2024

@SkalskiP Hello, you have designed a simple and elegant interface. Is it open source? By the way, I just checked your GitHub page and it's very impressive. I loved the neural networks NumPy example.

@jameslahm
Collaborator

@SkalskiP @NickHerrig We found that, with the same ONNX file, the inference results differ between our codebase and Roboflow Inference. Here is a minimal example that reproduces the issue.

wget https://skalskip-yolo-arena.hf.space/file=/tmp/gradio/56eee51b0a661453cbf915229dfbadc00b7a0cad/vehicles.png
pip install -q git+https://github.com/THU-MIG/yolov10.git
import numpy as np
import supervision as sv
from inference import get_model
from PIL import Image

# Run the image through Roboflow Inference and print the detected class names.
def detect_and_annotate(
    input_image: np.ndarray,
    confidence_threshold: float,
    iou_threshold: float = 0,
):
    model = get_model(model_id="coco/22")
    result = model.infer(
        input_image,
        confidence=confidence_threshold,
        iou_threshold=iou_threshold
    )[0]
    detections = sv.Detections.from_inference(result)

    print(detections.data['class_name'])

detect_and_annotate(Image.open('vehicles.png'), 0.4)

# Run the same ONNX weights through this codebase (THU-MIG/yolov10) for comparison.
from ultralytics import YOLOv10

model = YOLOv10('/tmp/cache/coco/22/weights.onnx', task='detect')
model.predict(source=Image.open('vehicles.png'), verbose=True, conf=0.4)

The output is:

# Roboflow inference
['truck' 'car' 'car']

# This codebase
Loading /tmp/cache/coco/22/weights.onnx for ONNX Runtime inference...

0: 640x640 3 cars, 1 truck, 16.7ms
Speed: 11.3ms preprocess, 16.7ms inference, 15.7ms postprocess per image at shape (1, 3, 640, 640)

We observe that one truck and two cars are detected with Roboflow inference, while one truck and three cars are detected in our codebase. May we ask for your help? Thanks a lot!

@SkalskiP
Contributor Author

@salwaghanim, thanks a lot! The UI is built with gradio.

@SkalskiP
Contributor Author

@jameslahm I'll let @NickHerrig try to investigate that.

@NickHerrig

@SkalskiP and @jameslahm It appears that the differing prediction confidence scores result from different preprocessing steps (resizing) in inference and the yolo CLI. I was able to run a test with the yolo CLI and roboflow/inference using images already resized to 640px, and I am seeing the same predictions and confidence scores.

Take a look at the image below: inference results are on the right and yolo CLI results are on the left:

image
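
To see why the resize step alone can change confidence scores, compare two common strategies for fitting an image into a 640x640 input. The thread does not state which strategy each tool uses, so this sketch only assumes that one pipeline stretches the image while another letterboxes it (scale with preserved aspect ratio, then pad). The function names are hypothetical.

```python
from typing import Tuple


def stretch_scale(width: int, height: int, target: int = 640) -> Tuple[float, float]:
    """Per-axis scale factors when the image is stretched to target x target."""
    return target / width, target / height


def letterbox_params(
    width: int, height: int, target: int = 640
) -> Tuple[float, float, float]:
    """Uniform scale plus per-side padding when the aspect ratio is preserved."""
    scale = min(target / width, target / height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) / 2
    pad_y = (target - new_h) / 2
    return scale, pad_x, pad_y


def object_size_after(
    obj_w: float, obj_h: float, scale_x: float, scale_y: float
) -> Tuple[float, float]:
    """Size in model-input pixels of an object of obj_w x obj_h source pixels."""
    return obj_w * scale_x, obj_h * scale_y


# A 1280x720 frame containing a distant person about 20x40 pixels in size.
sx, sy = stretch_scale(1280, 720)
scale, pad_x, pad_y = letterbox_params(1280, 720)

stretched = object_size_after(20, 40, sx, sy)
letterboxed = object_size_after(20, 40, scale, scale)
```

For this frame the stretched object is roughly 10x35.6 input pixels while the letterboxed one is 10x20, so the network sees differently shaped inputs for the same source image. Resizing to 640px before either pipeline runs removes that difference, which is consistent with the test described above.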

@jameslahm
Collaborator

@NickHerrig Thanks a lot for your great efforts! Is the difference in preprocessing between inference and the yolo CLI expected?

@jameslahm
Collaborator

@SkalskiP It seems that we and @NickHerrig have identified the root cause. One reason is that Roboflow Inference applies NMS in its YOLOv10 postprocessing, which is unnecessary because YOLOv10 does not rely on NMS. In addition, the exported ONNX files may be corrupted: replacing them with our exported ONNX models gives the same results as our local environment. We have submitted a PR, roboflow/inference#437, to fix these. Thank you!
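
An extra NMS pass can remove detections that an NMS-free model meant to keep. The sketch below is a generic greedy NMS in plain Python, not the Roboflow Inference implementation. The boxes and scores are hypothetical, and it does not claim to reproduce the exact case in this thread; note only that the repro above passes `iou_threshold=0`, the strictest possible setting.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if its
    IoU with every already-kept box does not exceed iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical final outputs of an NMS-free detector: two cars that barely
# overlap (IoU of about 0.02) and one separate truck.
boxes = [(0, 0, 10, 10), (8, 8, 18, 18), (30, 30, 40, 40)]
scores = [0.9, 0.8, 0.7]

kept_strict = nms(boxes, scores, iou_threshold=0.0)
kept_loose = nms(boxes, scores, iou_threshold=0.5)
```

With a threshold of 0.0, any overlap at all suppresses the lower-scoring box, so the second car disappears. With 0.5, all three detections survive. Skipping the pass entirely for a model trained to emit one box per object avoids the question altogether.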

@jameslahm
Collaborator

@SkalskiP The PR roboflow/inference#437 has been merged. The results of roboflow Inference and our local environment are the same now. Would you mind updating the inference version in the requirements.txt of the HF Space? Thanks a lot!

@jameslahm
Collaborator

@SkalskiP Friendly ping :) Thanks!

@jameslahm
Collaborator

@SkalskiP We opened a PR at https://huggingface.co/spaces/SkalskiP/YOLO-ARENA/discussions/2 to update the inference version of the HF Space. Would you mind taking a look? Thanks a lot!


5 participants