[WIP] Add DINO DETR Model to HuggingFace Transformers #36711
…nd original DINO repo. Substitute Deformable for DINO where needed.
…nly in the DINO class
… are still missing
Hello @qubvel! The forward pass now matches the original implementation up to the required precision. I've marked this as a draft because I wanted a first opinion on whether I'm modifying the correct files in the codebase. Let me know if I'm missing anything big. Regarding tests, I've copied some from Deformable DETR but haven't tried getting them to work yet. Let me know if I need to add any beyond what's already there.
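For readers following along, a forward-pass parity check of the kind described above can be sketched roughly as follows. This is a minimal illustration, not the script used in this PR: the tensor shape, the tolerances, and the helper names `max_abs_diff` and `assert_outputs_match` are assumptions, and the random tensors stand in for real model outputs.

```python
import torch


def max_abs_diff(reference: torch.Tensor, ported: torch.Tensor) -> float:
    """Largest element-wise absolute difference between two outputs."""
    return (reference - ported).abs().max().item()


def assert_outputs_match(reference, ported, atol=1e-4, rtol=1e-4):
    """Raise AssertionError if the ported output drifts beyond the tolerances.

    The tolerances are illustrative assumptions, not the values used in the PR.
    """
    torch.testing.assert_close(ported, reference, atol=atol, rtol=rtol)


# Stand-in tensors: in practice these would be the logits produced by the
# original checkpoint and by the ported model for the same preprocessed image.
torch.manual_seed(0)
reference_logits = torch.randn(1, 900, 91)  # hypothetical (batch, queries, classes)
ported_logits = reference_logits + 1e-6 * torch.randn_like(reference_logits)

assert_outputs_match(reference_logits, ported_logits)
print(f"max abs diff: {max_abs_diff(reference_logits, ported_logits):.2e}")
```

Running the same comparison on intermediate outputs (backbone features, encoder states, decoder states) tends to localize a mismatch faster than comparing only the final logits and boxes.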
Hi @konstantinos-p! Thanks a lot for working on the model, super excited to see it merged! 🚀 Before diving into the implementation details, here are some general comments we need to address in the PR:
Thanks again! Looking forward to the updates 🚀
Thanks for the comments! I'll start addressing them!
The integration tests and most unit tests are passing. The remaining unit test failures are mainly due to gradient checkpointing not being supported and to shared tensors in the implementation, which cause the saving and loading tests to fail.
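To make the shared-tensor issue concrete, here is a small sketch of how such sharing can be detected in a PyTorch module. The `ToyHeads` class is a hypothetical stand-in (one head module reused across decoder layers), not code from this PR, and `find_shared_tensors` is an illustrative helper rather than a Transformers API.

```python
from collections import defaultdict

import torch
from torch import nn


def find_shared_tensors(model: nn.Module) -> list[list[str]]:
    """Group state-dict keys whose tensors start at the same memory address."""
    groups = defaultdict(list)
    for name, tensor in model.state_dict().items():
        groups[tensor.data_ptr()].append(name)
    # Only groups with more than one key indicate sharing.
    return [sorted(names) for names in groups.values() if len(names) > 1]


class ToyHeads(nn.Module):
    """Hypothetical stand-in: the same head object registered for every layer."""

    def __init__(self, num_layers: int = 3):
        super().__init__()
        head = nn.Linear(4, 2)
        # Reusing one module object means all entries share weight and bias.
        self.heads = nn.ModuleList([head for _ in range(num_layers)])


shared = find_shared_tensors(ToyHeads())
for group in shared:
    print(group)
```

Each printed group lists keys that alias one underlying tensor. Serialization formats that reject aliased tensors (safetensors does) will fail on such a state dict unless the ties are declared or the modules are copied, which is one plausible source of the save/load failures mentioned above.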
What does this PR do?
This PR introduces the DINO DETR (DETR with Improved DeNoising Anchor Boxes) model (https://arxiv.org/abs/2203.03605) to the Hugging Face Transformers library. DINO DETR is a state-of-the-art object detection model that builds upon the original DETR architecture, incorporating improvements such as:
The model achieves strong performance on COCO test-dev (https://paperswithcode.com/sota/object-detection-on-coco).
Fixes #36205
What's included
Resources I've used
Who can review?
@amyeroberts, @qubvel