IWSLT'14 Results using ESPnet2-MT #4132
Conversation
Hi @siddalmia @ftshijt @brianyan918, could you check this PR? If the changes are reasonable, we can continue the experiments.
Looks good! I believe your initial result on this dataset is reasonable as well.
Codecov Report

@@           Coverage Diff           @@
##           master    #4132   +/-   ##
=======================================
  Coverage   80.43%   80.43%
=======================================
  Files         442      442
  Lines       38557    38557
=======================================
  Hits        31015    31015
  Misses       7542     7542

Flags with carried forward coverage won't be shown. Continue to review the full report at Codecov.
Perfect, thank you @pyf98! Some minor comments:
I'll leave the detailed reviews to @siddalmia @ftshijt @brianyan918.
Thanks for your comments. I'm fixing them now. BTW, I have a question about the generated hypotheses. In some examples (131 out of 6750 in the test set), there is a trailing <sos/eos> token, e.g.:
They 're even hurting .<sos/eos> (They-utt000382)
I 'm a writer .<sos/eos> (I-utt000553)
Thank you .<sos/eos> (Thank-utt000722)
If we directly use this result to calculate BLEU, will it adversely affect the performance?
I think it would. Not sure that sacrebleu knows to ignore that.
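Since sacrebleu tokenizes the hypothesis as-is, a stray trailing <sos/eos> would count as an extra, always-wrong token in each affected sentence. A minimal cleanup sketch (the helper name is hypothetical, not part of ESPnet):

```python
import re

def strip_special_tokens(hyp: str, token: str = "<sos/eos>") -> str:
    """Remove any trailing occurrences of `token` (and surrounding
    whitespace) from a hypothesis before BLEU scoring."""
    pattern = re.escape(token)
    return re.sub(rf"\s*(?:{pattern})+\s*$", "", hyp)

hyps = [
    "They 're even hurting .<sos/eos>",
    "I 'm a writer .<sos/eos>",
    "Thank you .<sos/eos>",
]
cleaned = [strip_special_tokens(h) for h in hyps]
# cleaned: ["They 're even hurting .", "I 'm a writer .", "Thank you ."]
```

The cleaned hypotheses can then be passed to sacrebleu as usual, so the special token never reaches the scorer.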
This pull request is now in conflict :(
Hi @siddalmia, about the second point: I checked the code, and it seems that the source text is not combined with the target text under a certain condition.
I have another question about the encoder input embedding layer: where should we put it? The current results and the weight-tying implementation are based on the first option.
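For reference, weight tying in PyTorch amounts to pointing the output projection at the embedding table so both layers share one parameter. A minimal sketch (class and argument names are assumptions, not the actual ESPnet implementation):

```python
import torch
import torch.nn as nn

class TiedDecoderHead(nn.Module):
    """Sketch of input-embedding / output-projection weight tying."""

    def __init__(self, vocab_size: int, d_model: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        # Linear(d_model, vocab_size) has weight shape (vocab_size, d_model),
        # identical to the embedding table, so the two can share storage.
        self.output = nn.Linear(d_model, vocab_size, bias=False)
        self.output.weight = self.embed.weight  # weight tying

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.output(hidden)  # logits over the shared vocabulary
```

A gradient step on either layer updates the shared matrix, which reduces the parameter count and often helps on small datasets such as IWSLT'14.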
I think it is actually a bug to be fixed. Could you help me combine the tgt_text ahead of time?
Since the PR already looks great (and could also be beneficial for upcoming deadlines), I will merge it now. @pyf98, could you also fix the same bug in the ST implementation in another PR?
Cool! Thanks @ftshijt |
Hi, I tried some configs for IWSLT'14 De-En using ESPnet2-MT. I followed the standard Transformer config in fairseq, but the batchifying method is different: I think my effective batch size is a few times larger than fairseq's, and my learning rate and warmup steps are also larger. Note that the current script for BLEU calculation is different from the standard script for this dataset.
The best result is reported in README.md. Besides, I added two options in mt/espnet_model.py to allow weight tying of the input embedding and the output linear layer in the encoder and decoder. We probably need to test the implementation further on other datasets.
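On the batch-size/learning-rate point above: Transformer recipes typically pair a larger effective batch with a higher peak learning rate and more warmup steps under an inverse-sqrt schedule. A rough sketch of that schedule (the numeric values below are illustrative, not this PR's config):

```python
def inverse_sqrt_lr(step: int, peak_lr: float, warmup_steps: int) -> float:
    """Inverse-sqrt (Noam-style) schedule used by Transformer recipes:
    linear warmup to `peak_lr`, then decay proportional to 1/sqrt(step)."""
    step = max(step, 1)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps        # linear warmup
    return peak_lr * (warmup_steps / step) ** 0.5   # inverse-sqrt decay

# Illustrative comparison: a larger-batch run might use, say, a higher
# peak LR and longer warmup than the fairseq single-GPU defaults.
small_batch = [inverse_sqrt_lr(s, 5e-4, 4000) for s in range(1, 20001, 4000)]
large_batch = [inverse_sqrt_lr(s, 1e-3, 8000) for s in range(1, 20001, 4000)]
```

The schedule peaks exactly at `warmup_steps` and then decays, so scaling both the peak LR and the warmup length keeps early updates stable while taking advantage of the larger batch.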