feat: add MASTER in models.recognition module #300

Merged
merged 7 commits into main from MASTER on Jun 11, 2021
Conversation

charlesmindee (Collaborator) commented on Jun 9, 2021

This PR implements the MASTER model, following the official TF implementation.

This model, based on a transformer decoder and a ResNet / Global Context block encoder, achieved very impressive performance compared to SAR and other models at ICDAR 2021.

The decoder used is exactly the one presented in "Attention is all you need", so it is implemented in recognition/transformer.py. The MASTER model itself is implemented in recognition/master.py.
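For readers unfamiliar with the architecture, here is a rough, hypothetical sketch (not the code from this PR) of how the pieces fit together: a small convolutional backbone stands in for the ResNet / Global Context encoder and produces a feature map, which is flattened and used as the memory a transformer-style decoder attends to. Self-attention, masking, positional encodings and layer stacking are omitted for brevity; all names and dimensions below are illustrative assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

d_model, vocab_size = 512, 110  # illustrative values, not the PR's defaults

# Stand-in for the ResNet / Global Context encoder: any backbone returning a
# (batch, H', W', d_model) feature map plays the same role.
backbone = tf.keras.Sequential([
    layers.Conv2D(64, 3, strides=2, padding='same', activation='relu'),
    layers.Conv2D(d_model, 3, strides=2, padding='same', activation='relu'),
])

embed = layers.Embedding(vocab_size, d_model)
# One simplified decoder block (cross-attention + feed-forward); the real
# "Attention is all you need" decoder adds masked self-attention, layer norm,
# positional encodings, and stacks several such blocks.
cross_attn = layers.MultiHeadAttention(num_heads=8, key_dim=d_model // 8)
ffn = tf.keras.Sequential([layers.Dense(2048, activation='relu'), layers.Dense(d_model)])
out_proj = layers.Dense(vocab_size + 1)  # +1 for EOS

def forward(images, target_tokens):
    feat = backbone(images)                               # (B, H', W', d_model)
    memory = tf.reshape(feat, (tf.shape(feat)[0], -1, d_model))  # flatten spatial dims
    x = embed(target_tokens)                              # (B, T, d_model)
    x = x + cross_attn(query=x, value=memory, key=memory) # attend over image features
    x = x + ffn(x)
    return out_proj(x)                                    # (B, T, vocab_size + 1)

logits = forward(tf.random.normal((2, 32, 128, 3)), tf.zeros((2, 50), tf.int32))
print(logits.shape)  # (2, 50, 111)
```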

Any feedback is welcome!
Closes #215

@charlesmindee charlesmindee added the type: enhancement (Improvement) and module: models (Related to doctr.models) labels on Jun 9, 2021
@charlesmindee charlesmindee added this to the 0.3.0 milestone Jun 9, 2021
@charlesmindee charlesmindee self-assigned this Jun 9, 2021
codecov bot commented on Jun 9, 2021

Codecov Report

Merging #300 (c72e2f2) into main (820634b) will decrease coverage by 2.13%.
The diff coverage is 70.15%.


@@            Coverage Diff             @@
##             main     #300      +/-   ##
==========================================
- Coverage   96.11%   93.98%   -2.14%     
==========================================
  Files          48       50       +2     
  Lines        2135     2326     +191     
==========================================
+ Hits         2052     2186     +134     
- Misses         83      140      +57     
Flag        Coverage Δ
unittests   93.98% <70.15%> (-2.14%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files                                           Coverage Δ
doctr/models/recognition/master.py                       53.12% <53.12%> (ø)
doctr/models/recognition/transformer.py                  87.09% <87.09%> (ø)
doctr/models/recognition/__init__.py                     100.00% <100.00%> (ø)
...tr/models/detection/differentiable_binarization.py    93.25% <0.00%> (-0.40%) ⬇️
doctr/models/core.py                                     95.00% <0.00%> (+0.83%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

fg-mindee (Contributor) left a comment:


Thanks for the PR! I added a few comments. Could you also add the entry to the documentation, and add the model to the list of implemented papers in the README & landing page?

doctr/models/recognition/master.py
name='transform'
)

@tf.function
fg-mindee (Contributor) commented:
curious: what does the decorator add?

charlesmindee (Collaborator, Author) replied:
It disables eager mode to improve efficiency. However, I am not sure using it is actually better in our case.
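For context, a minimal illustrative sketch (not code from this PR): tf.function traces the decorated Python function into a TensorFlow graph, so subsequent calls skip eager, op-by-op execution.

```python
import tensorflow as tf

# @tf.function compiles the Python function into a TF graph on first call
# (tracing); later calls with compatible inputs reuse the graph instead of
# running eagerly op by op.
@tf.function
def scaled_sum(x):
    return tf.reduce_sum(x) * 2.0

print(scaled_sum(tf.constant([1.0, 2.0, 3.0])))  # tf.Tensor(12.0, shape=(), dtype=float32)
```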

doctr/models/recognition/master.py
final_logits = tf.zeros(shape=(B, max_len - 1, self.vocab_size + 1), dtype=tf.float32)  # don't forget EOS
# max_len = len + 2
for i in range(self.max_length - 1):
    tf.autograph.experimental.set_loop_options(
fg-mindee (Contributor) commented:
I'm not familiar with set_loop_options, what does it change?

charlesmindee (Collaborator, Author) replied on Jun 11, 2021:
Without enforcing the shape with this function (as done in the official implementation), I previously had an issue, but it seems to pass now, so let's remove it.
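For context, a minimal illustrative example (not the PR's code) of what set_loop_options does: inside a tf.function-traced loop it can declare a relaxed shape invariant for a tensor whose shape changes across iterations, which is the situation in the decoding loop above.

```python
import tensorflow as tf

@tf.function
def grow(n):
    x = tf.zeros([0, 3])
    for _ in tf.range(n):
        # Without this hint, graph mode rejects the loop because `x` changes
        # shape on every iteration; the invariant relaxes its first dimension.
        tf.autograph.experimental.set_loop_options(
            shape_invariants=[(x, tf.TensorShape([None, 3]))]
        )
        x = tf.concat([x, tf.ones([1, 3])], axis=0)
    return x

print(grow(tf.constant(4)).shape)  # (4, 3)
```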

charlesmindee (Collaborator, Author) commented:
I added the model to the landing page and the README; the entry in the models page of the documentation will appear in the next PR, once the model loss & wrapping function are available.

fg-mindee (Contributor) left a comment:

Thanks for the edits! Looks good to me!

@charlesmindee charlesmindee merged commit f27c9e2 into main Jun 11, 2021
@charlesmindee charlesmindee deleted the MASTER branch June 11, 2021 16:49
Labels
module: models (Related to doctr.models), type: enhancement (Improvement)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[models] Implement HRGAN or MASTER
2 participants