Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Fix] fix obj det train and suppress endless warning prints #1267

Merged
merged 1 commit into from
Jul 27, 2023

Conversation

felixdittrich92
Copy link
Contributor

@felixdittrich92 felixdittrich92 commented Jul 24, 2023

This PR:

  • fix object detection training also with pretrained (change to torchvision's new weights loading -> pretrained is deprecated)
  • fix the endless warnings from each torchvision resize operation (warning that antialias will change by default to True in v0.17)

Any feedback is welcome

@felixdittrich92 felixdittrich92 added type: bug Something isn't working module: models Related to doctr.models module: transforms Related to doctr.transforms framework: pytorch Related to PyTorch backend topic: object detection Related to the task of object detection labels Jul 24, 2023
@felixdittrich92 felixdittrich92 added this to the 0.7.0 milestone Jul 24, 2023
@felixdittrich92 felixdittrich92 self-assigned this Jul 24, 2023
@codecov
Copy link

codecov bot commented Jul 24, 2023

Codecov Report

Merging #1267 (87ef727) into main (f113530) will decrease coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #1267      +/-   ##
==========================================
- Coverage   95.75%   95.72%   -0.03%     
==========================================
  Files         154      154              
  Lines        6901     6901              
==========================================
- Hits         6608     6606       -2     
- Misses        293      295       +2     
Flag Coverage Δ
unittests 95.72% <100.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
doctr/models/obj_detection/faster_rcnn/pytorch.py 100.00% <100.00%> (ø)
doctr/transforms/modules/pytorch.py 72.22% <100.00%> (ø)

... and 1 file with indirect coverage changes

@felixdittrich92 felixdittrich92 merged commit efe7ca0 into mindee:main Jul 27, 2023
57 of 58 checks passed
@felixdittrich92 felixdittrich92 deleted the fix-obj-det branch July 27, 2023 13:56
@@ -26,7 +26,7 @@ def __init__(
preserve_aspect_ratio: bool = False,
symmetric_pad: bool = False,
) -> None:
super().__init__(size, interpolation)
super().__init__(size, interpolation, antialias=True)
Copy link

@tvasenin tvasenin Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change of antialias parameter makes detection of single-digit words consistently worse for some pages in our dataset.

@felixdittrich92 Is it possible to change this back to antialias=False (at least until new models are trained with this set to True) or at least make it configurable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tvasenin 👋🏼,

Thanks for the feedback.
Are you sure that the problem is antialising ?
We have changed the default value from preserve_aspect_ratio=False to True see https://github.com/mindee/doctr/releases/tag/v0.7.0

I think reverting it is not an option because in torch under the hood Pillow will always use antialising.
Making it configurable if you use a custom preprocessor would be possible.
But i suggest to test it in front of this change with the old behaviour ocr_predictor(...., preserve_aspect_ratio=False)

Copy link

@tvasenin tvasenin Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @felixT2K !

Yes, I'm absolutely sure that changing antialias value in these 2 lines in this commit causes the output difference.

This PR has been done before preserve_aspect_ratio=True change, and I specifically took merge commit for the current PR (efe7ca0), replaced antialias=True to antialias=False and got the old results back :)

I think reverting it is not an option because in torch under the hood Pillow will always use antialising.

Since the current models were trained before this change, it seems correct to:

  1. Temporarily revert (remove) antialias setting (or set it back to False to avoid warnings) to retain the old behavior.
  2. Release re-trained models with antialias=True and then re-apply this change back.

Copy link
Contributor

@felixT2K felixT2K Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@odulcy-mindee What do you think ?
I would prefer to update (resume) the "old" models instead of reverting this because we have also added some more augmentations in the pipeline which should make the models more robust. (Luckily PT trainings are much faster as the corresponding in TF)
@tvasenin FYI we started already with the training period :) On the other hand i got already some feedback that the detection after the last release for the existing models seems to work better 😅

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felixT2K Yeah, I also prefer to udpate the "old" models as you said!

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@felixT2K @odulcy-mindee Thanks for the update, waiting for the new release with models trained with the new antialias value!

As for release v0.7.0, while preserve_aspect_ratio=True indeed improves results, changing antialias value for tensor resize (both affected resize operations in the current code are tensor-only) breaks the behavior for the models shipped with this release.
As a workaround, we had to downgrade to the previous commit 0a47726 and thus can wait for the updated models (though we'd still prefer temporary revert and have v0.7.1 released as a hotfix).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI we started already with the training period :)

@felixT2K Any updates regarding re-trained models with new antialias value?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tvasenin mostly done -> #1364 :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
framework: pytorch Related to PyTorch backend module: models Related to doctr.models module: transforms Related to doctr.transforms topic: object detection Related to the task of object detection type: bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants