Using trg[:,:-1] during training #136
Thank you for this awesome repo you have made public. I had one question: during the training loop, you perform the following step:

output, _ = model(src, trg[:,:-1])

I was wondering why we are doing the trg[:,:-1] step?

Kind regards,
Wajih

Comments
This is because we have a target sequence that looks like: <sos> y1 y2 ... yn <eos>. Thus, we input everything except the <eos> token into the decoder, i.e. trg[:,:-1], and we compare the decoder's predictions against everything except the <sos> token, i.e. trg[:,1:], since at every position the decoder should predict the next token in the sequence. Let me know if this needs clarifying.
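To make the slicing concrete, here is a minimal, self-contained sketch (the token ids and the random stand-in for the model's logits are hypothetical, not the repo's actual training loop):

```python
import torch
import torch.nn as nn

# Toy target batch laid out as <sos> y1 y2 y3 <eos>
# (hypothetical ids: 1 = <sos>, 2 = <eos>), over a 10-token vocab.
vocab_size = 10
trg = torch.tensor([[1, 4, 5, 6, 2],
                    [1, 7, 8, 6, 2]])

decoder_input = trg[:, :-1]  # <sos> y1 y2 y3  -> fed to the decoder
targets       = trg[:, 1:]   # y1 y2 y3 <eos>  -> what it should predict

# Stand-in for `output, _ = model(src, trg[:,:-1])`:
# one logit vector per decoder input position.
logits = torch.randn(trg.shape[0], decoder_input.shape[1], vocab_size)

criterion = nn.CrossEntropyLoss()
loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())
```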
Oh, I understand now. Thanks indeed for the elaborate reply. Wajih
Hi, how does this work when the trg sentence is padded? In that case I imagine the <eos> token would no longer be in the last position, right? Or am I missing something? EDIT: never mind, I figured it out. In case anyone else is wondering: it works with padded inputs anyway because of ignore_index in the loss function.
Sorry for the late reply - seems like you've figured it out now, but just in case someone else is reading this, I'll explain. When we have padding, our target sequence looks like: <sos> y1 y2 ... yn <eos> <pad> ... <pad>. This means that yes, the <eos> token is no longer in the last position: the decoder's input, trg[:,:-1], now ends in <pad> tokens, and the targets, trg[:,1:], also end in <pad> tokens. However, the loss function is created with ignore_index set to the <pad> token's index, so any position whose target is <pad> contributes nothing to the loss. So in the above example, we only calculate the losses at positions whose target is a real token or <eos>, never when it is <pad>.
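A short sketch of the padded case (again with hypothetical ids, adding 0 = <pad>), showing how ignore_index makes the <pad> targets drop out of the loss:

```python
import torch
import torch.nn as nn

pad_idx, vocab_size = 0, 10
# Padded target: <sos> y1 y2 y3 <eos> <pad> <pad>
trg = torch.tensor([[1, 4, 5, 6, 2, 0, 0]])

decoder_input = trg[:, :-1]  # <sos> y1 y2 y3 <eos> <pad>
targets       = trg[:, 1:]   # y1 y2 y3 <eos> <pad> <pad>

logits = torch.randn(1, targets.shape[1], vocab_size)

# Positions whose target equals pad_idx are skipped entirely,
# so the trailing <pad> inputs never affect the gradient.
criterion = nn.CrossEntropyLoss(ignore_index=pad_idx)
loss = criterion(logits.reshape(-1, vocab_size), targets.reshape(-1))
print(loss.item())
```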
I have a question: the size of trg is [3, 7]. I checked torchtext - how exactly is each sentence in the batch concatenated with the <sos>, <eos>, and <pad> tokens?
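For illustration, assuming the fields were built with init_token, eos_token, and batch_first=True, a [3, 7] trg batch from torchtext would be laid out like this (hypothetical ids: 1 = <sos>, 2 = <eos>, 0 = <pad>):

```python
import torch

# Each sentence is <sos> tokens <eos>, then <pad> up to the
# length of the longest sentence in the batch.
trg = torch.tensor([
    [1, 4, 5, 6, 9, 3, 2],  # longest sentence: no padding needed
    [1, 4, 5, 2, 0, 0, 0],  # shorter sentence, padded after <eos>
    [1, 7, 8, 6, 2, 0, 0],
])
print(trg.shape)  # torch.Size([3, 7])
```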
For anyone who finds this in the future: see #182 for a more in-depth explanation.
You are correct. Seems to have been updated now.