feat: add dropout in MASTER #349
Codecov Report
@@            Coverage Diff             @@
##             main     #349      +/-   ##
==========================================
+ Coverage   95.40%   96.06%   +0.65%
==========================================
  Files          83       83
  Lines        3396     3404       +8
==========================================
+ Hits         3240     3270      +30
+ Misses        156      134      -22
Thanks for the PR! I added some questions
@@ -239,6 +255,8 @@ def call(
         x *= tf.math.sqrt(tf.cast(self.d_model, tf.float32))
         x += self.pos_encoding[:, :seq_len, :]

+        x = self.dropout(x, **kwargs)
Mmmh, we already have dropout in the DecoderLayer, are you positive we're supposed to put one here?
The TensorFlow tutorial applies another dropout at this point (https://www.tensorflow.org/text/tutorials/transformer#decoder), but we can remove it.
In PyTorch it is also applied here, right after the positional embedding: https://pytorch.org/tutorials/beginner/transformer_tutorial.html#define-the-model
Thanks! Seems better aligned between PyTorch and TF with your last commits 👌
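For readers following the thread, here is a minimal, framework-free sketch of the ordering discussed above: scale the embeddings by sqrt(d_model), add the positional encoding, then apply dropout only in training mode. It is not the code from this PR. The function names, the sinusoidal encoding, the nested-list representation, and the default rate of 0.2 are all assumptions chosen for illustration.

```python
import math
import random


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding as a seq_len x d_model nested list (assumed form)."""
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            # Even feature indices use sin, odd ones use cos.
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


def dropout(x, rate, training, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the survivors."""
    if not training or rate == 0.0:
        return [row[:] for row in x]
    keep = 1.0 - rate
    return [[v / keep if rng.random() < keep else 0.0 for v in row] for row in x]


def embed(x, d_model, rate=0.2, training=False, rng=None):
    """Hypothetical helper: scale, add positional encoding, then apply dropout."""
    rng = rng or random.Random(0)
    pos = positional_encoding(len(x), d_model)
    scale = math.sqrt(d_model)
    summed = [[v * scale + p for v, p in zip(row, prow)] for row, prow in zip(x, pos)]
    return dropout(summed, rate, training, rng)
```

With `training=False` the call is deterministic and returns the scaled, position-encoded input unchanged; with `training=True` each value is either zeroed or divided by `1 - rate`, which mirrors how the `training` flag gates dropout in both frameworks.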
This PR adds dropout to both the TensorFlow and PyTorch MASTER models.
Any feedback is welcome!