
Donut model only works on 4.36.2 for inference, not 4.37.2 #28846

Closed
2 of 4 tasks
VikParuchuri opened this issue Feb 2, 2024 · 4 comments
Comments

@VikParuchuri

System Info

I have a Donut model with some slight customizations to the MBart decoder (I added GQA and MoE). Training works fine on both 4.36.2 and 4.37.2.

But inference only works correctly on 4.36.2. When I run inference on 4.37.2, the output degenerates into repetition. @amyeroberts

Here is an example (the text has been OCRed with Donut, then rendered back onto a page image):

This is with 4.36.2:

[image: rendered OCR output with 4.36.2]

And this is with 4.37.2:

[image: rendered OCR output with 4.37.2]

Everything else is identical (same system, same packages). I don't see anything obvious in the release notes that would cause this.

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run inference with the same checkpoint and the same inputs on transformers 4.36.2 and on 4.37.2; the generated output differs, and on 4.37.2 it degenerates into repetition. A minimal comparison sketch is below.
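
A minimal sketch of the comparison (not my exact setup: my custom checkpoint isn't shareable, so the public naver-clova-ix/donut-base-finetuned-cord-v2 checkpoint and a placeholder page.png stand in). Run it once in an environment with transformers==4.36.2 and once with transformers==4.37.2, then diff the decoded outputs:

```python
# Sketch only: the public CORD checkpoint stands in for my custom model,
# and "page.png" is any document page image.
import transformers
from PIL import Image
from transformers import DonutProcessor, VisionEncoderDecoderModel

checkpoint = "naver-clova-ix/donut-base-finetuned-cord-v2"
processor = DonutProcessor.from_pretrained(checkpoint)
model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
model.eval()

image = Image.open("page.png").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

task_prompt = "<s_cord-v2>"  # task start token for this checkpoint
decoder_input_ids = processor.tokenizer(
    task_prompt, add_special_tokens=False, return_tensors="pt"
).input_ids

outputs = model.generate(
    pixel_values,
    decoder_input_ids=decoder_input_ids,
    max_length=model.decoder.config.max_position_embeddings,
    pad_token_id=processor.tokenizer.pad_token_id,
    eos_token_id=processor.tokenizer.eos_token_id,
    use_cache=True,
)

print(f"transformers {transformers.__version__}")
print(processor.batch_decode(outputs, skip_special_tokens=True)[0])
```

On 4.36.2 the decoded text matches the page; on 4.37.2 the same script produces repeated fragments.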

Expected behavior

I expect the output to be the same with both versions, and to not degenerate into repetition.

@ArthurZucker
Collaborator

fyi @NielsRogge


github-actions bot commented Mar 4, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@NielsRogge
Contributor

Hi, sorry for the late reply. I've checked all slow integration tests of Donut, and they all pass:

RUN_SLOW=yes pytest tests/models/vision_encoder_decoder/test_modeling_vision_encoder_decoder.py::DonutModelIntegrationTest

This would mean that Donut works as expected on v4.38.2 dev.

Might this be explained by your customizations? Could you clarify what exactly is changed compared to the original Donut?


github-actions bot commented Apr 2, 2024

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
