Follow ups to DocumentQuestionAnswering Pipeline #18926

ankrgyl · 2022-09-07T16:55:54Z

Feature request

PR #18414 has a number of TODOs left over which we'd like to track as follow up tasks.

Pipeline

Add support for documents with more words than the tokenizer span (e.g. 512)
Add support for multi-page documents (e.g. for Donut, we need to present one image per page)
Rework use of tokenizer to avoid the need for add_prefix_space=True
Re-add support for Donut
Refactor Donut usage in the pipeline or move logic into the tokenizer, so that the pipeline does not have as much Donut-specific code
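
To make the first item above more concrete, here is a minimal sketch of one possible approach: splitting a document's words and bounding boxes into overlapping windows so that each window fits within the span. This is an illustration under assumptions, not the pipeline's implementation. The names `chunk_words`, `max_words`, and `stride` are hypothetical, and a real limit would apply to tokens after subword tokenization rather than to words.

```python
from typing import Iterator, List, Tuple

# Hypothetical box type: (x0, y0, x1, y1) for one OCR'd word.
Box = Tuple[int, int, int, int]


def chunk_words(
    words: List[str],
    boxes: List[Box],
    max_words: int = 512,
    stride: int = 128,
) -> Iterator[Tuple[int, List[str], List[Box]]]:
    """Yield (start_offset, words, boxes) windows of at most max_words words.

    Consecutive windows overlap by `stride` words so that an answer lying
    across a window boundary is still fully contained in some window.
    """
    if len(words) != len(boxes):
        raise ValueError("words and boxes must have the same length")
    if not 0 <= stride < max_words:
        raise ValueError("stride must satisfy 0 <= stride < max_words")
    if not words:
        return

    step = max_words - stride  # how far each new window advances
    start = 0
    while True:
        end = min(start + max_words, len(words))
        yield start, words[start:end], boxes[start:end]
        if end >= len(words):
            break
        start += step


# Usage: a 10-word document split into windows of 4 words overlapping by 1.
words = [f"w{i}" for i in range(10)]
boxes = [(i, 0, i + 1, 1) for i in range(10)]
chunks = list(chunk_words(words, boxes, max_words=4, stride=1))
```

Each window could then be scored separately and the best-scoring answer kept, with `start_offset` used to map the answer back to positions in the full document.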

Testing

Enable test_small_model_pt_donut once hf-internal-testing/tiny-random-donut is implemented

Documentation / Website

Add DocumentQuestionAnswering demo to Hosted Inference API so that model demos work
Add tutorial documentation to Task Summary

Motivation

These are follow ups that we cut from the initial scope of PR #18414.

Your contribution

Happy to contribute many or all of these.

NielsRogge · 2022-09-08T09:32:17Z

cc'ing @Narsil for enabling the model on the inference API, cc'ing @stevhliu for adding tutorial documentation to the task summary

ankrgyl · 2022-09-09T00:12:22Z

@NielsRogge because we removed donut-swin from AutoModelForDocumentQuestionAnswering, you can no longer create a pipeline with donut, i.e.

In [2]: p = pipeline('document-question-answering', model='naver-clova-ix/donut-base-finetuned-docvqa')
/Users/ankur/projects/transformers/venv/lib/python3.10/site-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/TensorShape.cpp:2895.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
The model 'VisionEncoderDecoderModel' is not supported for document-question-answering. Supported models are ['LayoutLMForQuestionAnswering', 'LayoutLMv2ForQuestionAnswering', 'LayoutLMv3ForQuestionAnswering'].

Should we add it back to that list? Or what is the best way to support that?

ankrgyl · 2022-09-26T14:17:21Z

Could we re-open this (I don't think I have permissions to)? There are still a few changes necessary to complete all of the checkboxes.

github-actions · 2022-10-22T15:01:43Z

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

JuheonChu · 2023-03-23T03:34:19Z

@ankrgyl May I work on this?
I would like to add support for multi-page documents (e.g. for Donut, we need to present one image per page). Where would be a good place to start?

ankrgyl · 2023-03-23T04:03:04Z

Absolutely!

Feel free to start looking here: https://github.com/huggingface/transformers/blob/main/src/transformers/pipelines/document_question_answering.py

JuheonChu · 2023-03-24T02:07:58Z

Add support for multi-page documents (e.g. for Donut, we need to present one image per page)

Thank you! I read it carefully. To add support for multi-page documents in document_question_answering.py, should I modify methods in that file such as preprocess()? May I open a pull request against that file once I have modified those methods?

elabongaatuo · 2023-04-03T17:37:44Z

@ankrgyl Hello. I would love to contribute to this task: Add tutorial documentation to Task Summary. Is it still open, and could I get pointers on how to begin working on it?
Thank you.

y3sar · 2023-05-10T05:51:42Z

@elabongaatuo It seems the Add tutorial documentation to Task Summary task is still open. Are you working on it? It seems you would need to start making changes from here

elabongaatuo · 2023-05-10T06:07:41Z

Hello @y3sar, no, I am not working on it at the moment.

y3sar · 2023-05-10T06:16:53Z

@elabongaatuo Then I would like to take it up, if that is no problem for you.

elabongaatuo · 2023-05-10T06:24:15Z

@y3sar, sure thing, no problem. 😊

rajveer43 · 2023-07-26T07:18:59Z

@ankrgyl I would like to work on Add tutorial documentation to Task Summary and also on Add support for multi-page documents (e.g. for Donut, we need to present one image per page).

hackpk · 2023-08-09T17:10:43Z

@ankrgyl Can I work on Refactor Donut usage?

dhivyeshrk · 2023-10-23T05:56:30Z

Hey @ankrgyl! I would be happy to contribute to this issue by adding support for multi-page documents.
Could you assign this to me?

ArthurZucker · 2023-10-23T10:37:34Z

Hey! For anyone wanting to contribute, the best way is to just open a PR and link it here! We don't usually assign issues as they can be taken over in case of inactivity for example! 🤗

ankrgyl mentioned this issue Sep 7, 2022: Add DocumentQuestionAnswering pipeline #18414 (Merged)

ankrgyl mentioned this issue Sep 8, 2022: Tracking integration for Document QA huggingface/hub-docs#299 (Closed)

ankrgyl mentioned this issue Sep 14, 2022: Move the model type check in DocumentQuestionAnswering to support Donut #19027 (Merged)

sgugger closed this as completed in #19027 Sep 26, 2022

sgugger reopened this Sep 26, 2022

github-actions bot closed this as completed Oct 31, 2022

NielsRogge reopened this Nov 1, 2022

NielsRogge added the Good First Issue label Nov 1, 2022

AdiaWu mentioned this issue Mar 24, 2023: Update document_question_answering.py #22354 (Closed)

This was referenced May 10, 2023: Add document-question-answering in task_summary #23252 (Closed)

This was referenced May 10, 2023: Add Multimodal heading and Document question answering in task_summary.mdx #23318 (Merged)
