[references] TF / PT crop & document orientation classifier train scripts #1432

Merged (16 commits) on Jan 22, 2024

Conversation

felixdittrich92
Contributor

@felixdittrich92 felixdittrich92 commented Jan 17, 2024

This PR:

  • Adds training scripts (TF & PT) for the crop orientation classifier and, additionally, for general document orientation classification. The latter is needed for 0.9.0 (similar to the crop_orientation_predictor) to recognize a document's overall rotation correctly before we compute the angle from the bin_map, which only works well for documents rotated within roughly -60 to 60 degrees.
  • Adds a simple dataset (prefilled with 0-degree targets to match the AbstractDataset checks)
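
As a rough illustration of why a dataset with only 0-degree targets can be a workable starting point: one plausible scheme is to rotate each upright sample by a random multiple of 90 degrees at training time and use the rotation index as the class label. The sketch below assumes that scheme; `ANGLES`, `make_rotated_sample`, and the class order are hypothetical names and choices, not taken from the scripts in this PR.

```python
import numpy as np

# Assumed class set: index i means "rotated by ANGLES[i] degrees" (hypothetical order).
ANGLES = (0, 90, 180, 270)


def make_rotated_sample(img: np.ndarray, rng: np.random.Generator) -> tuple[np.ndarray, int]:
    """Rotate an upright (H, W, C) image by a random multiple of 90 degrees.

    Returns the rotated image and the class index of the applied rotation.
    """
    class_idx = int(rng.integers(len(ANGLES)))
    # np.rot90 rotates counter-clockwise by k quarter turns in the (H, W) plane.
    rotated = np.rot90(img, k=class_idx, axes=(0, 1))
    return rotated, class_idx


rng = np.random.default_rng(0)
upright = np.zeros((32, 64, 3), dtype=np.uint8)  # stand-in for a 0-degree crop
rotated, class_idx = make_rotated_sample(upright, rng)
print(ANGLES[class_idx], rotated.shape)
```

With this kind of on-the-fly labeling, the stored targets can all stay at 0 degrees, since the real label is produced together with the rotation.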

Any feedback is welcome :)

  • Tests
  • Optimization (input_size tests left): for crop orientation, 32x32 is enough and works well; for document orientation, 256x256 seems to be enough
  • Clean up
  • Readme

Dummy runs (mobilenet_v3_small)

crop orientation (800k train / 200k val)
Validation loss decreased 0.372947 --> 0.355291: saving state...
Epoch 5/30 - Validation loss: 0.355291 (Acc: 81.51%)

doc orientation (only 4k train / 1k val)
Validation loss decreased 0.559973 --> 0.481535: saving state...
Epoch 14/30 - Validation loss: 0.481535 (Acc: 79.22%)
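
The "Validation loss decreased ... saving state..." lines above follow a common save-on-improvement pattern. A minimal sketch of that pattern, assuming a checkpoint is written whenever the validation loss drops below the best value seen so far; `track_best` is a hypothetical helper, and the actual save call used by the training scripts is not reproduced here:

```python
import math


def track_best(val_losses):
    """Return the 1-based epochs at which a checkpoint would be saved."""
    min_loss = math.inf
    saved_epochs = []
    for epoch, val_loss in enumerate(val_losses, start=1):
        if val_loss < min_loss:
            print(f"Validation loss decreased {min_loss:.6f} --> {val_loss:.6f}: saving state...")
            # A real script would persist the model weights here (assumption).
            saved_epochs.append(epoch)
            min_loss = val_loss
    return saved_epochs


# Example with made-up intermediate values around the losses logged above.
epochs = track_best([0.40, 0.372947, 0.39, 0.355291])
print(epochs)
```

Only improving epochs trigger a save, so the checkpoint on disk always corresponds to the lowest validation loss observed so far.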


codecov bot commented Jan 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (bcf3cd4) 95.76% compared to head (8a65bd3) 95.84%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1432      +/-   ##
==========================================
+ Coverage   95.76%   95.84%   +0.07%     
==========================================
  Files         155      162       +7     
  Lines        6941     7095     +154     
==========================================
+ Hits         6647     6800     +153     
- Misses        294      295       +1     
Flag Coverage Δ
unittests 95.84% <100.00%> (+0.07%) ⬆️


@felixdittrich92 felixdittrich92 added this to the 0.9.0 milestone Jan 18, 2024
@felixdittrich92 felixdittrich92 added labels Jan 18, 2024: topic: documentation (Improvements or additions to documentation); type: enhancement (Improvement); topic: ci (Related to CI); module: datasets (Related to doctr.datasets); ext: references (Related to references folder); framework: pytorch (Related to PyTorch backend); framework: tensorflow (Related to TensorFlow backend); topic: character classification (Related to the task of character classification)
@felixdittrich92 felixdittrich92 self-assigned this Jan 18, 2024
@felixdittrich92 felixdittrich92 changed the title [DRAFT] orientation train script [DRAFT] [references] TF / PT crop & document orientation classifier train scripts Jan 18, 2024
Collaborator

@odulcy-mindee odulcy-mindee left a comment


Code seems good 👍

@felixdittrich92
Contributor Author

@odulcy-mindee So this part could be merged into 0.8.0 and the rest of this task is for 0.9.0 :)
Are you fine with the additional OrientationDataset? I was thinking of putting it into references/classification/utils, but I think the dataset folder is a better fit 😅

@felixdittrich92 felixdittrich92 marked this pull request as ready for review January 18, 2024 15:10
@felixdittrich92 felixdittrich92 changed the title [DRAFT] [references] TF / PT crop & document orientation classifier train scripts [references] TF / PT crop & document orientation classifier train scripts Jan 18, 2024
Collaborator

@odulcy-mindee odulcy-mindee left a comment


Are you fine with the additional OrientationDataset? I was thinking of putting it into references/classification/utils, but I think the dataset folder is a better fit 😅
@felixdittrich92 yeah, that makes more sense 👍

I added one more suggestion.
For another PR: is it possible to have only one train_{framework}.py for both purposes? Otherwise, maybe it makes more sense to rename the other scripts to train_{framework}_character.py or something similar?

references/classification/train_pytorch_orientation.py (outdated, resolved)
references/classification/train_tensorflow_orientation.py (outdated, resolved)
Collaborator

@odulcy-mindee odulcy-mindee left a comment


Thank you @felixdittrich92 ! 🚀

@felixdittrich92 felixdittrich92 merged commit f316489 into mindee:main Jan 22, 2024
68 of 69 checks passed
@felixdittrich92
Contributor Author

Are you fine with the additional OrientationDataset? I was thinking of putting it into references/classification/utils, but I think the dataset folder is a better fit 😅
@felixdittrich92 yeah, that makes more sense 👍

I added one more suggestion. For another PR: is it possible to have only one train_{framework}.py for both purposes? Otherwise, maybe it makes more sense to rename the other scripts to train_{framework}_character.py or something similar?

I totally missed this 😅
Merging both into one makes no sense to me; I think it would be confusing.
But renaming the other one with a character suffix would be fine 👍

@odulcy-mindee
Collaborator

But renaming the other one with a character suffix would be fine 👍

Ok, we can do that in another PR

Labels
ext: references (Related to references folder); framework: pytorch (Related to PyTorch backend); framework: tensorflow (Related to TensorFlow backend); module: datasets (Related to doctr.datasets); topic: character classification (Related to the task of character classification); topic: ci (Related to CI); topic: documentation (Improvements or additions to documentation); type: enhancement (Improvement)
Development

Successfully merging this pull request may close these issues.

[references] Add a training script for crop orientation classification
2 participants