GroundingDino - Loss calculation exceptions #31434

Open
2 of 4 tasks
Nitaym opened this issue Jun 14, 2024 · 6 comments · May be fixed by #31828

@Nitaym

Nitaym commented Jun 14, 2024

System Info

transformers==4.40.2
Python 3.10.14
Ubuntu WSL under Windows 10

Who can help?

@amyeroberts

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I've been trying to fine-tune GroundingDino with transformers' GroundingDinoForObjectDetection. To keep things simple I've been using batch_size = 1 (I haven't tried any other batch size).

When running the model, I got this exception:

Exception has occurred: RuntimeError       (note: full exception trace is shown but execution is paused at: _run_module_as_main)
split_with_sizes expects split_sizes to sum exactly to 2700 (input tensor's size at dimension -1), but got split_sizes=[3]
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/_tensor.py", line 921, in split
    return torch._VF.split_with_sizes(self, split_size, dim)
  File "/home/nitay/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 2723, in forward
    indices = [linear_sum_assignment(c[i]) for i, c in enumerate(cost_matrix.split(sizes, -1))]
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/nitay/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 2866, in forward
    indices = self.matcher(outputs_without_aux, targets)
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/nitay/.local/lib/python3.10/site-packages/transformers/models/grounding_dino/modeling_grounding_dino.py", line 3091, in forward
    loss_dict = criterion(outputs_loss, labels)
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/nitay/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/mnt/folder/main.py", line 84, in train
    outputs = model(input_ids=input_ids, pixel_values=pixel_values, pixel_mask=pixel_mask, labels=labels)
  File "/mnt/folder/main.py", line 98, in <module>
    train()
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main (Current frame)
    return _run_code(code, main_globals, None,
RuntimeError: split_with_sizes expects split_sizes to sum exactly to 2700 (input tensor's size at dimension -1), but got split_sizes=[3]

(There were indeed 3 bounding boxes in the label data)
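
A note on the numbers in that message, offered as a sketch rather than a diagnosis. The split is asked to cut the last dimension of the cost matrix into chunks of size [3] (one chunk per image, 3 targets), but that dimension has length 2700. 2700 equals 900 × 3, which would be consistent with a cost matrix whose query and target axes were flattened together, assuming 900 queries (that count is an assumption, it is not stated in this report). The pure-Python stand-in below only mimics the precondition quoted in the traceback; `split_with_sizes` here is a hypothetical helper, not the PyTorch function.

```python
def split_with_sizes(row, sizes):
    """Split a flat list into consecutive chunks of the given sizes.

    Pure-Python stand-in for the precondition quoted in the traceback:
    the sizes must sum exactly to the length being split.
    """
    if sum(sizes) != len(row):
        raise ValueError(
            f"split sizes must sum to {len(row)}, but got split_sizes={sizes}"
        )
    chunks, start = [], 0
    for size in sizes:
        chunks.append(row[start:start + size])
        start += size
    return chunks


NUM_QUERIES = 900  # assumption: number of queries, not stated in the thread
NUM_TARGETS = 3    # from the report: 3 bounding boxes in the label data

# One cost per target: splitting by [3] works.
per_query_costs = [0.0] * NUM_TARGETS
assert split_with_sizes(per_query_costs, [NUM_TARGETS]) == [[0.0, 0.0, 0.0]]

# Last dimension of length 900 * 3 = 2700: the same kind of mismatch
# as in the traceback.
flattened_costs = [0.0] * (NUM_QUERIES * NUM_TARGETS)
try:
    split_with_sizes(flattened_costs, [NUM_TARGETS])
    mismatch_message = None
except ValueError as error:
    mismatch_message = str(error)
print(mismatch_message)
```

If this reading is right, the split sizes themselves ([3]) are fine and the shape of the cost matrix is what is off, which would also explain why it fails with batch_size = 1.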

Expected behavior

The loss should be calculated without errors.

@Nitaym
Author

Nitaym commented Jun 14, 2024

Hey @amyeroberts, are you the relevant person for this bug?

I have further questions, if possible:

  1. Regarding labels - What should the "class_labels" tensor be filled with? Where should I get the right class indices from? Since this is an open-set detection model, I assume there's no simple class index dictionary.

  2. Is there example code somewhere for fine-tuning this GroundingDino model with huggingface / custom datasets?

Thanks!
Nitay
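
For reference on the first question, here is a minimal sketch of one plausible way to build labels, under stated assumptions. DETR-style detection models in transformers take one dict per image with "class_labels" and "boxes", where boxes are normalized (center_x, center_y, width, height). The mapping used below, where each class index is the position of the class phrase in the text prompt, is an assumption and is not confirmed anywhere in this thread. The helper names (`build_prompt`, `to_normalized_cxcywh`, `build_label`) and the annotation layout are hypothetical, and plain lists are used where the model would need tensors.

```python
def build_prompt(class_names):
    """Join class phrases into one text prompt: lowercase, each ending with a period."""
    return " ".join(f"{name.lower()}." for name in class_names)


def to_normalized_cxcywh(box_xyxy, image_width, image_height):
    """Convert a pixel (x_min, y_min, x_max, y_max) box to normalized (cx, cy, w, h)."""
    x_min, y_min, x_max, y_max = box_xyxy
    return [
        (x_min + x_max) / 2 / image_width,
        (y_min + y_max) / 2 / image_height,
        (x_max - x_min) / image_width,
        (y_max - y_min) / image_height,
    ]


def build_label(annotations, class_names, image_width, image_height):
    """Build one label dict for one image.

    Assumption: a class index is the position of the class phrase in the prompt.
    """
    name_to_index = {name: index for index, name in enumerate(class_names)}
    return {
        "class_labels": [name_to_index[a["name"]] for a in annotations],
        "boxes": [
            to_normalized_cxcywh(a["box"], image_width, image_height)
            for a in annotations
        ],
    }


# Hypothetical example data: two classes, two annotated boxes in a 100x200 image.
class_names = ["cat", "remote control"]
annotations = [
    {"name": "remote control", "box": [10, 20, 30, 60]},
    {"name": "cat", "box": [0, 0, 100, 200]},
]

prompt = build_prompt(class_names)
label = build_label(annotations, class_names, image_width=100, image_height=200)
print(prompt)
print(label["class_labels"])
```

Before passing this to the model, the lists would still have to be converted to tensors (an integer tensor for "class_labels", a float tensor for "boxes"), and the index convention should be checked against whatever the corrected loss ends up expecting.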

@NielsRogge
Contributor

cc @EduardoPach

@EduardoPach
Contributor

TL;DR

I will work on fixing this this week :)

Hey, thanks for opening the issue! The implementation of GroundingDinoLoss is not actually correct: when adding the model, I didn't focus much on getting it right, since the original repo doesn't include training code or the loss calculation.

That being said, I found an issue in the original repo where the authors point to other repos that implement training for Grounding DINO, so I will use those and check them against the paper to fix this :)

@Nitaym
Author

Nitaym commented Jun 21, 2024

Thanks @EduardoPach!

I'll be happy to assist as needed. Could you point me to the reference implementations you've mentioned?

@zappy586

zappy586 commented Jul 5, 2024

Any update @EduardoPach?

@EduardoPach
Contributor

I have added the corrections (I haven't created the PR yet); I just need to test them now. I will probably do that over the weekend.

EduardoPach linked a pull request on Jul 7, 2024 that will close this issue