[Fix] fix obj det train and suppress endless warning prints #1267
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1267 +/- ##
==========================================
- Coverage 95.75% 95.72% -0.03%
==========================================
Files 154 154
Lines 6901 6901
==========================================
- Hits 6608 6606 -2
- Misses 293 295 +2
Flags with carried forward coverage won't be shown.
@@ -26,7 +26,7 @@ def __init__(
         preserve_aspect_ratio: bool = False,
         symmetric_pad: bool = False,
     ) -> None:
-        super().__init__(size, interpolation)
+        super().__init__(size, interpolation, antialias=True)
This change of the antialias parameter makes detection of single-digit words consistently worse for some pages in our dataset.
@felixdittrich92 Is it possible to change this back to antialias=False (at least until new models are trained with this set to True), or at least make it configurable?
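For illustration, a minimal sketch of what a configurable flag could look like, based on the constructor shown in the diff above (the extra antialias argument, its default, and the torchvision base class import are assumptions for this sketch, not the actual doctr code):

```python
from torchvision.transforms import InterpolationMode
from torchvision.transforms import Resize as TVResize


class Resize(TVResize):
    def __init__(
        self,
        size,
        interpolation=InterpolationMode.BILINEAR,
        preserve_aspect_ratio: bool = False,
        symmetric_pad: bool = False,
        antialias: bool = False,  # assumed default that keeps the pre-change behavior
    ) -> None:
        # Forward the flag instead of hard-coding antialias=True
        super().__init__(size, interpolation, antialias=antialias)
        self.preserve_aspect_ratio = preserve_aspect_ratio
        self.symmetric_pad = symmetric_pad
```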
Hi @tvasenin 👋🏼,
Thanks for the feedback.
Are you sure that the problem is antialiasing?
We have changed the default value from preserve_aspect_ratio=False to True, see https://github.com/mindee/doctr/releases/tag/v0.7.0
I think reverting it is not an option, because in torch, under the hood, Pillow will always use antialiasing.
Making it configurable if you use a custom preprocessor would be possible.
But I suggest first testing it against this change with the old behaviour: ocr_predictor(..., preserve_aspect_ratio=False)
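For reference, a minimal sketch of how that comparison could be run on a sample page (the file name and the default model choices are placeholders, not a prescribed setup):

```python
from doctr.io import DocumentFile
from doctr.models import ocr_predictor

doc = DocumentFile.from_pdf("sample.pdf")  # any test page from your dataset

# Old resizing behaviour (pre-v0.7.0 default)
predictor = ocr_predictor(pretrained=True, preserve_aspect_ratio=False)
result = predictor(doc)
print(result.render())
```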
Hi @felixT2K!
Yes, I'm absolutely sure that changing the antialias value in these 2 lines in this commit causes the output difference.
This PR was done before the preserve_aspect_ratio=True change, and I specifically took the merge commit of the current PR (efe7ca0), replaced antialias=True with antialias=False, and got the old results back :)
> I think reverting it is not an option because in torch under the hood Pillow will always use antialiasing.

Since the current models were trained before this change, it seems correct to:
- Temporarily revert (remove) the antialias setting (or set it back to False to avoid warnings) to retain the old behavior.
- Release re-trained models with antialias=True and then re-apply this change.
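A rough, self-contained way to see the input-level effect being described (the tensor shape and target size are placeholders; only the antialias flag differs between the two calls):

```python
import torch
from torchvision.transforms import functional as F

img = torch.rand(3, 2048, 1536)  # stand-in for a document page tensor

out_old = F.resize(img, [1024, 1024], antialias=False)  # behaviour before this commit
out_new = F.resize(img, [1024, 1024], antialias=True)   # behaviour after this commit

# A non-zero difference means the models receive different inputs,
# which is what shifts the detection results described above.
print((out_old - out_new).abs().max())
```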
@odulcy-mindee What do you think?
I would prefer to update (resume) the "old" models instead of reverting this, because we have also added some more augmentations to the pipeline, which should make the models more robust. (Luckily, the PT trainings are much faster than the corresponding TF ones.)
@tvasenin FYI, we have already started the training period :) On the other hand, I have already received some feedback that detection with the existing models seems to work better after the last release 😅
@felixT2K Yeah, I also prefer to update the "old" models as you said!
@felixT2K @odulcy-mindee Thanks for the update, waiting for the new release with models trained with the new antialias value!
As for release v0.7.0: while preserve_aspect_ratio=True indeed improves results, changing the antialias value for tensor resize (both affected resize operations in the current code are tensor-only) breaks the behavior of the models shipped with this release.
As a workaround, we had to downgrade to the previous commit 0a47726 and can therefore wait for the updated models (though we would still prefer a temporary revert, with v0.7.1 released as a hotfix).
> FYI, we have already started the training period :)

@felixT2K Any updates regarding the re-trained models with the new antialias value?
This PR suppresses the warnings for now; antialias will change by default to True in v0.17.
Any feedback is welcome.