predict with onnx #133


Open · wants to merge 6 commits into base: develop

Conversation

@valavanisleonidas (Contributor) commented Apr 9, 2025

Description

This PR adds support for ONNX-based inference in RF-DETR, enabling faster model execution compared to PyTorch CPU inference. This change introduces an alternative inference path using the onnxruntime backend.

Ticket: #64
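
For context, this is roughly what an onnxruntime-based inference call looks like (a minimal sketch: the model path, input shape, and provider list below are illustrative, not the exact wiring inside RF-DETR's predict path):

import numpy as np
import onnxruntime as ort

# Load the exported model; onnxruntime falls back to CPU if CUDA is unavailable.
session = ort.InferenceSession(
    "output/inference_model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
input_name = session.get_inputs()[0].name
dummy_input = np.zeros((1, 3, 560, 560), dtype=np.float32)  # assumed input shape
outputs = session.run(None, {input_name: dummy_input})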

Type of change

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How has this change been tested? Please provide a test case or example of how you tested the change.

Python test file

import time
import cv2
import numpy as np
from PIL import Image
import supervision as sv
from rfdetr.detr import RFDETRLarge
from rfdetr.util.coco_classes import COCO_CLASSES


def predict(model, image_path, im_save):

    image = Image.open(image_path)

    # Time only the forward pass
    start = time.time()
    detections = model.predict(image_path, threshold=0.5)
    end = time.time()
    print("only predict : ", end - start)

    labels = [
        f"{COCO_CLASSES[class_id]} {confidence:.2f}"
        for class_id, confidence in zip(detections.class_id, detections.confidence)
    ]

    # Draw boxes and labels on a copy of the input image
    image = image.convert("RGB")
    annotated_image = sv.BoxAnnotator().annotate(image.copy(), detections)
    annotated_image = sv.LabelAnnotator().annotate(annotated_image, detections, labels)

    sv.plot_image(annotated_image)
    # PIL images are RGB; convert to BGR before writing with OpenCV
    cv2.imwrite(im_save, cv2.cvtColor(np.array(annotated_image), cv2.COLOR_RGB2BGR))


image_url = "path/to/image"

# Baseline: PyTorch inference
start = time.time()
model = RFDETRLarge()
predict(model, image_url, "predict_old.jpg")
end = time.time()
print("old: ", end - start)

# ONNX inference (reset the timer so the measurement excludes the run above)
start = time.time()
model = RFDETRLarge(onnx_path="output/inference_model.onnx")
predict(model, image_url, "predict_onnx.jpg")
end = time.time()
print("onnx: ", end - start)

The bounding boxes and confidence scores for the image I tested are the same.

On GPU, inference takes 0.14 s with the CUDA .pth model and 0.11 s with the ONNX model.

Any specific deployment considerations

Docs

  • Docs updated? What were the changes:

@SkalskiP (Collaborator) commented Apr 9, 2025

Let's hold this PR until #129 is merged.

@SkalskiP (Collaborator) commented:

@valavanisleonidas #129 got merged. We can work on this PR now.

@valavanisleonidas (Contributor, Author) commented:

> @valavanisleonidas #129 got merged. We can work on this PR now.

Yes, I saw that, thank you. Should we also support a dynamic batch size for the ONNX models, since the predict function supports batch inference?

@valavanisleonidas (Contributor, Author) commented:

@SkalskiP I think the PR is mostly ready. Two things to mention:

  1. In the ONNX providers we can add a device ID selecting which GPU to use (and expose it as an argument as well); see the sketch after this list.
  2. I tried to export a dynamic ONNX model, but something crashes. Even though the model input is ['batch', 3, 560, 560], it fails with an error in layer transformer/Reshape_13, so the model probably does not support dynamic batching in ONNX. We have to handle this somehow in predict, or maybe fix the model to support dynamic batching? :D
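
For point 1, onnxruntime's provider-options form already supports a per-provider device_id; a minimal sketch (the device_id value and model path are illustrative):

import onnxruntime as ort

# Pin the session to GPU 0, with CPU as a fallback provider.
providers = [
    ("CUDAExecutionProvider", {"device_id": 0}),
    "CPUExecutionProvider",
]
session = ort.InferenceSession("output/inference_model.onnx", providers=providers)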

@valavanisleonidas
Copy link
Contributor Author

Hey @SkalskiP, any news on this PR?

@NOTGOOOOD commented:

Hi, dear developer, how is it going? I also need to use the ONNX model for prediction.

@valavanisleonidas (Contributor, Author) commented:

> Hi, dear developer, how is it going? I also need to use the ONNX model for prediction.

This PR is not up to date with the latest changes. I can update it if @SkalskiP and the team are still interested in merging it.
