[Feature] Add Dataset Preparer #1484

Merged 20 commits on Nov 2, 2022

xinke-wang (Collaborator)

This PR adds the dataset preparer base classes, along with configs for three datasets: ICDAR2015, TotalText, and Wildreceipt. See the docs for usage.
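
To make the idea of a preparer concrete, here is a minimal sketch of an obtain, parse, and dump pipeline. It is not the code added by this PR: `SketchPreparer`, `parse_ic15_style_line`, and the output key `data_list` are hypothetical names chosen for illustration, and the sample line assumes an ICDAR2015-style annotation of eight comma-separated coordinates followed by the transcription.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List

# Every name in this block is hypothetical and not taken from MMOCR.
Sample = Dict[str, object]


def parse_ic15_style_line(line: str) -> Sample:
    """Parse one annotation line: 8 integer coordinates, then the text.

    Assumed format: x1,y1,x2,y2,x3,y3,x4,y4,transcription
    The transcription may itself contain commas, so split at most 8 times.
    """
    parts = line.strip().split(",", 8)
    polygon = [int(value) for value in parts[:8]]
    text = parts[8]
    # "###" is assumed to mark an illegible region that should be ignored.
    return {"polygon": polygon, "text": text, "ignore": text == "###"}


@dataclass
class SketchPreparer:
    """Chains three stages: obtain raw lines, parse them, dump the result."""

    obtain: Callable[[], List[str]]       # returns raw annotation lines
    parse: Callable[[str], Sample]        # one raw line -> one sample
    dump: Callable[[List[Sample]], str]   # all samples -> serialized text

    def run(self) -> str:
        raw_lines = self.obtain()
        samples = [self.parse(line) for line in raw_lines]
        return self.dump(samples)


preparer = SketchPreparer(
    obtain=lambda: ["377,117,463,117,465,130,378,130,Genaxis Theatre"],
    parse=parse_ic15_style_line,
    dump=lambda samples: json.dumps({"data_list": samples}),
)
result = preparer.run()
```

Keeping the three stages as separate callables means a new dataset would only need its own parser in this sketch, while the obtain and dump stages could be reused.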

@xinke-wang xinke-wang changed the base branch from dev-1.x to test-1.x October 26, 2022 06:41
@xinke-wang xinke-wang changed the base branch from test-1.x to dev-1.x October 26, 2022 06:42
@xinke-wang xinke-wang changed the base branch from dev-1.x to test-1.x October 26, 2022 08:09
@xinke-wang xinke-wang changed the base branch from test-1.x to dev-1.x October 26, 2022 08:09
@xinke-wang xinke-wang changed the base branch from dev-1.x to test-1.x October 27, 2022 02:29
@xinke-wang xinke-wang changed the base branch from test-1.x to dev-1.x October 27, 2022 02:29
@xinke-wang xinke-wang changed the base branch from dev-1.x to 1.x October 27, 2022 07:38
@xinke-wang xinke-wang changed the base branch from 1.x to dev-1.x October 27, 2022 07:38
@xinke-wang xinke-wang changed the base branch from dev-1.x to 1.x October 27, 2022 11:16
@xinke-wang xinke-wang changed the base branch from 1.x to dev-1.x October 27, 2022 11:16
xinke-wang and others added 2 commits October 28, 2022 10:33
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
Co-authored-by: Tong Gao <gaotongxiao@gmail.com>
codecov bot commented Oct 28, 2022

Codecov Report

Base: 88.16% // Head: 83.33% // Decreases project coverage by -4.82% ⚠️

Coverage data is based on head (8ee2b7b) compared to base (ff04034).
Patch coverage: 2.01% of modified lines in pull request are covered.

❗ Current head 8ee2b7b differs from the pull request's most recent head a1d450e. Consider uploading reports for commit a1d450e to get more accurate results.

Additional details and impacted files
@@             Coverage Diff             @@
##           dev-1.x    #1484      +/-   ##
===========================================
- Coverage    88.16%   83.33%   -4.83%     
===========================================
  Files          147      156       +9     
  Lines         9249     9792     +543     
  Branches      1268     1350      +82     
===========================================
+ Hits          8154     8160       +6     
- Misses         863     1401     +538     
+ Partials       232      231       -1     
| Flag | Coverage Δ |
|---|---|
| unittests | 83.33% <2.01%> (-4.83%) ⬇️ |
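
The percentages in the coverage diff above can be reproduced from the raw counts. The sketch below assumes coverage is hits divided by total lines, where total lines equals hits plus misses plus partials; that assumption is consistent with the figures in the diff. The function name `coverage_pct` is made up for this example.

```python
def coverage_pct(hits: int, misses: int, partials: int) -> float:
    """Coverage as a percentage of all tracked lines (assumed formula)."""
    total_lines = hits + misses + partials
    return 100.0 * hits / total_lines


# Counts copied from the coverage diff above.
base = coverage_pct(hits=8154, misses=863, partials=232)    # 9249 lines
head = coverage_pct(hits=8160, misses=1401, partials=231)   # 9792 lines
delta = head - base

print(f"base={base:.2f}% head={head:.2f}% delta={delta:+.2f}%")
```

The drop comes almost entirely from the added lines: the patch adds 543 lines but only 6 new hits, so the denominator grows much faster than the numerator.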

| Impacted Files | Coverage Δ |
|---|---|
| mmocr/datasets/preparers/data_converter.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/data_obtainer.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/data_preparer.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/dumpers/dumpers.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/parsers/base.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/parsers/coco_parser.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/parsers/ic15_parser.py | 0.00% <0.00%> (ø) |
| ...ocr/datasets/preparers/parsers/totaltext_parser.py | 0.00% <0.00%> (ø) |
| mmocr/datasets/preparers/parsers/wildreceipt.py | 0.00% <0.00%> (ø) |
| mmocr/utils/polygon_utils.py | 97.40% <ø> (-1.30%) ⬇️ |
| ... and 6 more | |

@gaotongxiao gaotongxiao mentioned this pull request Oct 31, 2022
gaotongxiao (Collaborator) left a comment

Try not to introduce new changes to this PR, including the documentation. Otherwise this PR would never be able to be merged.

xinke-wang (Collaborator, Author)

> Try not to introduce new changes to this PR, including the documentation. Otherwise this PR would never be able to be merged.

I added one final feature to this PR: it automatically generates a dataset config file if one does not already exist. I will not add any new code to this PR unless further comments are made.
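
As a rough illustration of the "generate only if missing" behavior, a minimal sketch follows. It is not the implementation in this PR: the function name `ensure_dataset_config`, the placeholder file contents, and the paths are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def ensure_dataset_config(config_path: Path, dataset_name: str) -> bool:
    """Write a placeholder config only when none exists.

    Returns True if a file was created, False if one was already there.
    The file contents are a hypothetical template, not MMOCR's real one.
    """
    if config_path.exists():
        # Never overwrite a config the user may have edited by hand.
        return False
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(
        f"# Auto-generated placeholder config for {dataset_name}\n"
        f"{dataset_name}_data_root = 'data/{dataset_name}'\n"
    )
    return True


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "configs" / "toy.py"
    created_first = ensure_dataset_config(demo_path, "toy")   # creates the file
    created_again = ensure_dataset_config(demo_path, "toy")   # leaves it alone
```

Checking for existence first keeps a hand-edited config intact when the preparer is run a second time.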

gaotongxiao (Collaborator)

@xinke-wang Nice feature!

@gaotongxiao gaotongxiao merged commit 8864fa1 into open-mmlab:dev-1.x Nov 2, 2022
@xinke-wang xinke-wang deleted the dp branch November 2, 2022 07:56