Why use the expected output in decoder training? #76
Comments
OK, thanks.
In reply to Yang Tian (2019-02-26): Maybe you can find the answer to this question in the paper "Attention Is All You Need".
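See Section 3.1 of the paper:
It uses a masking mechanism so that each decoder position can only attend to earlier positions of decoder_input, which keeps the expected output at later positions from leaking into the prediction. As for the reason for using decoder_input, I think it is used to calculate the loss value.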
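To make the masking idea concrete, here is a minimal pure-Python sketch of a causal (lower-triangular) attention mask. It is not the code from this repository; the helper names `causal_mask`, `masked_scores`, and `softmax`, and the uniform scores, are assumptions made up for illustration.

```python
import math


def causal_mask(length):
    """Return a length x length mask; mask[i][j] is True when position i may attend to j (j <= i)."""
    return [[j <= i for j in range(length)] for i in range(length)]


def masked_scores(scores, mask):
    """Set disallowed positions to -inf so that softmax assigns them zero weight."""
    return [
        [s if allowed else float("-inf") for s, allowed in zip(row, mask_row)]
        for row, mask_row in zip(scores, mask)
    ]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical uniform attention scores for a 4-token decoder input.
scores = [[0.0] * 4 for _ in range(4)]
weights = [softmax(r) for r in masked_scores(scores, causal_mask(4))]

for i, row in enumerate(weights):
    print(i, row)
```

In this sketch, row 0 puts all of its weight on position 0 and row 3 spreads it over all four positions, so no position receives information from tokens that come after it.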
I am also confused about this question.
@ywl0911 if you understand it, please contact me, thanks!
@ty5491003
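@ywl0911 When I tell you the reason, you will surely find it funny.
Hmm... so does that mean this part of the code has a problem? Shouldn't it be changed to feed the decoder's output at the previous time step as the input at the next time step?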
In Transformer.eval(), at model.py:152, the second input is not used for predictions.
But why not use this method in training? Is it because it would be very slow?
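For anyone wondering what "this method" costs, here is a rough sketch of step-by-step (greedy) decoding, where each new token needs its own decoder call on the prefix generated so far. The function `toy_next_token` is a hypothetical stand-in for a real decoder forward pass, and the tokens are made up; none of this is taken from model.py.

```python
BOS, EOS = "<s>", "</s>"


def toy_next_token(prefix):
    """Hypothetical stand-in for one decoder forward pass: returns the next token for a prefix."""
    target = ["ich", "bin", "hier", EOS]
    return target[len(prefix) - 1]


def greedy_decode(next_token_fn, max_len=10):
    """Generate tokens one at a time, feeding each output back in as the next input."""
    prefix = [BOS]
    calls = 0
    while len(prefix) <= max_len:
        token = next_token_fn(prefix)  # one forward pass per generated token
        calls += 1
        prefix.append(token)
        if token == EOS:
            break
    return prefix[1:], calls


decoded, num_calls = greedy_decode(toy_next_token)
print(decoded, num_calls)
```

Here a four-token output takes four sequential calls, and each call has to wait for the previous one. That sequential dependency is the main reason this loop is usually reserved for inference.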
During training, the expected output is fed to the decoder to accelerate convergence. This is called teacher forcing.
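A small sketch of how teacher forcing is typically set up, assuming the common convention that the decoder input is the target sequence shifted right by one position with a start token in front. The token ids and the helper `shift_right` are hypothetical and are not taken from this repository.

```python
def shift_right(target_ids, bos_id):
    """Build the decoder input: start token followed by the target without its last token."""
    return [bos_id] + target_ids[:-1]


# Hypothetical token ids: 1 = <s>, 2 = </s>.
target = [11, 12, 13, 2]
decoder_input = shift_right(target, bos_id=1)

# With a causal mask, position i sees decoder_input[:i + 1] and is trained to predict target[i].
training_pairs = [(decoder_input[: i + 1], target[i]) for i in range(len(target))]

for visible, label in training_pairs:
    print(visible, "->", label)
```

Under these assumptions, every position is trained in a single pass. The token that a position has to predict never appears in what that position can see, so feeding the expected output does not hand the model the answer.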
@Kyubyong I have a question: why use decoder_input in decoder training? I think it will influence the model output,