feat: add MASTER in models.recognition module #300
Conversation
Codecov Report
@@           Coverage Diff            @@
##            main     #300      +/-  ##
=========================================
- Coverage   96.11%   93.98%   -2.14%
=========================================
  Files          48       50       +2
  Lines        2135     2326     +191
=========================================
+ Hits         2052     2186     +134
- Misses         83      140      +57
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
Thanks for the PR! I added a few comments. Could you also add the entry in the documentation and in the list of implemented papers in the README & landing page?
name='transform'
)

@tf.function
curious: what does the decorator add?
It disables eager mode (the function is traced into a TensorFlow graph) to improve efficiency. However, I'm not sure it's better to use it in our case.
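For context, a minimal standalone sketch (a toy example, not the doctr code) of what the decorator does: tf.function traces the Python function into a graph on the first call, and later calls reuse the traced graph instead of running eagerly.

```python
import tensorflow as tf

# Toy example (not the doctr code): tf.function traces the Python function
# into a static graph on the first call; subsequent calls with the same input
# signature reuse the traced graph instead of running op-by-op in eager mode.
@tf.function
def scaled_dot(x):
    return tf.nn.softmax(tf.matmul(x, x, transpose_b=True))

x = tf.random.normal((2, 4))
out = scaled_dot(x)  # first call traces the graph, later calls reuse it
```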
doctr/models/recognition/master.py
final_logits = tf.zeros(shape=(B, max_len - 1, self.vocab_size + 1), dtype=tf.float32)  # don't forget EOS
# max_len = len + 2
for i in range(self.max_length - 1):
    tf.autograph.experimental.set_loop_options(
I'm not familiar with set_loop_options, what does it change?
Without enforcing the shape with this function (as done in the official implementation), I had an issue before, but it seems to pass now, so let's remove it.
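For reference, a self-contained sketch of the pattern (mirroring the idea from the official implementation, not the exact doctr code): set_loop_options lets you pass an explicit shape invariant to AutoGraph for a tensor that is reassigned inside a converted loop.

```python
import tensorflow as tf

# Hypothetical sketch: inside a tf.function, set_loop_options(shape_invariants=...)
# tells AutoGraph that `logits` keeps a compatible shape across iterations,
# even though it is reassigned in the loop body.
@tf.function
def greedy_decode(features, max_length=32, vocab_size=100):
    batch_size = tf.shape(features)[0]
    logits = tf.zeros((batch_size, max_length, vocab_size + 1), dtype=tf.float32)
    for _ in tf.range(max_length):
        tf.autograph.experimental.set_loop_options(
            shape_invariants=[(logits, tf.TensorShape([None, max_length, vocab_size + 1]))]
        )
        logits = logits + 0.  # placeholder for the real decoding step
    return logits
```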
I added the model to the landing page and the README; the entry in the models page of the docs will appear in the next PR, once the model loss & wrapping function are available.
Thanks for the edits! Looks good to me!
This PR implements the MASTER model, following the official TF implementation.
This model (based on a transformer decoder and a ResNet/Global Context block encoder) achieved very impressive performance compared to SAR and others at ICDAR 2021.
The decoder is exactly the one presented in "Attention is all you need", so it is implemented in recognition.transformer.py; MASTER itself is implemented in recognition.master.py.
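For readers less familiar with the architecture, here is a heavily simplified, hypothetical sketch (class and layer choices are mine, not the actual doctr API) of how a convolutional encoder and a transformer-style decoder can be wired together for recognition:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Heavily simplified, hypothetical sketch (not the actual doctr API): a small
# convolutional encoder produces a flattened feature sequence, and a single
# cross-attention block stands in for the full transformer decoder that
# attends to it to predict characters.
class TinyMASTER(tf.keras.Model):
    def __init__(self, vocab_size=100, d_model=128, max_length=32):
        super().__init__()
        self.encoder = tf.keras.Sequential([
            layers.Conv2D(d_model, 3, strides=2, padding="same", activation="relu"),
            layers.Conv2D(d_model, 3, strides=2, padding="same", activation="relu"),
        ])
        self.embed = layers.Embedding(vocab_size + 2, d_model)   # +2 for SOS/EOS
        self.cross_attn = layers.MultiHeadAttention(num_heads=4, key_dim=d_model // 4)
        self.head = layers.Dense(vocab_size + 1)                 # vocab + EOS
        self.max_length = max_length

    def call(self, images, tokens):
        feat = self.encoder(images)                               # (B, H', W', d_model)
        b = tf.shape(feat)[0]
        feat = tf.reshape(feat, (b, -1, feat.shape[-1]))          # (B, H'*W', d_model)
        queries = self.embed(tokens)                              # (B, T, d_model)
        attended = self.cross_attn(queries, feat)                 # decoder attends to image features
        return self.head(attended)                                # (B, T, vocab_size + 1)

model = TinyMASTER()
logits = model(tf.random.normal((2, 32, 128, 3)), tf.zeros((2, 10), dtype=tf.int32))
```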
Any feedback is welcome!
Closes #215