[AutoMM] Add document classification pipeline #2765

Merged
merged 25 commits into from
Feb 2, 2023

Conversation

cheungdaven
Contributor

@cheungdaven cheungdaven commented Jan 26, 2023

Description of changes:
This pull request adds a document classification pipeline that classifies scanned document images into categories. Specifically:
(1) documents are represented as images;
(2) an OCR pipeline extracts their text and layout information;
(3) document foundation models (document transformers) such as LayoutLM and LayoutLMv* serve as the backbone and can be fine-tuned on document classification datasets.
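
To make the OCR-to-model hand-off concrete, here is a minimal sketch of how OCR words and their pixel-space boxes might be prepared for a LayoutLM-style backbone, which expects box coordinates on a 0-1000 grid. The names `OcrWord`, `normalize_box`, and `build_model_inputs` are hypothetical and used only for illustration; this is not necessarily how the DocumentProcessor in this PR is implemented.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom)


@dataclass
class OcrWord:
    """Hypothetical container for one OCR token and its pixel-space box."""
    text: str
    box: Box


def normalize_box(box: Box, page_width: int, page_height: int, scale: int = 1000) -> Box:
    """Rescale a pixel-space box to the 0..scale grid used by LayoutLM-style models."""
    left, top, right, bottom = box

    def clamp(value: int) -> int:
        # Guard against OCR boxes that spill slightly outside the page.
        return max(0, min(scale, value))

    return (
        clamp(scale * left // page_width),
        clamp(scale * top // page_height),
        clamp(scale * right // page_width),
        clamp(scale * bottom // page_height),
    )


def build_model_inputs(words: List[OcrWord], page_width: int, page_height: int) -> Dict[str, list]:
    """Pair each OCR word with its normalized box (illustrative layout only)."""
    return {
        "words": [w.text for w in words],
        "boxes": [normalize_box(w.box, page_width, page_height) for w in words],
    }


if __name__ == "__main__":
    page = [OcrWord("Invoice", (80, 50, 240, 100)), OcrWord("Total", (80, 900, 200, 950))]
    print(build_model_inputs(page, page_width=800, page_height=1000))
```

Integer arithmetic keeps the normalized coordinates exact, and clamping keeps out-of-page OCR boxes inside the range that the model's position embeddings cover.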

We added

  • DocumentProcessor: for document processing, e.g., OCR, text, layout, and image feature generation.
  • DocumentTransformer: a classification wrapper for document foundation models. Text-focused models are also supported.
  • Tutorial and unit test on document classification.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions

Job PR-2765-05acc33 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/05acc33/index.html

@github-actions

Job PR-2765-dd29319 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/dd29319/index.html

@github-actions

Job PR-2765-cb50221 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/cb50221/index.html

@cheungdaven cheungdaven changed the title from "[WIP] Add document classification pipeline" to "[AutoMM] Add document classification pipeline" Jan 27, 2023
Contributor

@bryanyzhu bryanyzhu left a comment

Great feature and tutorial, thanks.

@github-actions

Job PR-2765-be3db78 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/be3db78/index.html

@github-actions

Job PR-2765-7c87a17 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/7c87a17/index.html

@github-actions

github-actions bot commented Feb 1, 2023

Job PR-2765-db34ed3 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/db34ed3/index.html

@github-actions

github-actions bot commented Feb 1, 2023

Job PR-2765-d095b6a is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/d095b6a/index.html

Contributor

@zhiqiangdon zhiqiangdon left a comment

LGTM! Awesome new feature! We may need to sync up with Zihan on the OCR design.

Collaborator

@sxjscience sxjscience left a comment

LGTM.

@github-actions

github-actions bot commented Feb 1, 2023

Job PR-2765-3336abc is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/3336abc/index.html

@cheungdaven cheungdaven added the "model list checked" label (You have updated the model list after modifying multimodal unit tests/docs) Feb 1, 2023
@github-actions

github-actions bot commented Feb 1, 2023

Job PR-2765-f154a60 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/f154a60/index.html

@github-actions

github-actions bot commented Feb 2, 2023

Job PR-2765-598a05d is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/598a05d/index.html

@github-actions

github-actions bot commented Feb 2, 2023

Job PR-2765-e1b6c11 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/e1b6c11/index.html

Contributor

@FANGAreNotGnu FANGAreNotGnu left a comment

LGTM with a few minor comments.

Two review threads on multimodal/src/autogluon/multimodal/data/utils.py (outdated, resolved)
@github-actions

github-actions bot commented Feb 2, 2023

Job PR-2765-1d97c17 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/1d97c17/index.html

@github-actions

github-actions bot commented Feb 2, 2023

Job PR-2765-fe711d4 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/fe711d4/index.html

@github-actions

github-actions bot commented Feb 2, 2023

Job PR-2765-e6343bc is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/e6343bc/index.html

@github-actions

github-actions bot commented Feb 2, 2023

Job PR-2765-c46d7cf is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-2765/c46d7cf/index.html

@sxjscience
Collaborator

@cheungdaven Feel free to merge when you think it's ready.

@cheungdaven cheungdaven merged commit f3719e1 into autogluon:master Feb 2, 2023
Labels
model list checked You have updated the model list after modifying multimodal unit tests/docs
6 participants