
Teacher-forcing in the Transformer tutorial #30852

UltraSpecialException opened this issue Jul 18, 2019 · 2 comments


commented Jul 18, 2019

Thank you for submitting a TensorFlow documentation issue. Per our GitHub
policy, we only address code/doc bugs, performance issues, feature requests, and
build/installation issues on GitHub.

The TensorFlow docs are open source! To get involved, read the documentation
contributor guide:

Please provide a link to the documentation entry, for example:

Description of issue (what needs changing):

Teacher-forcing seems to not be implemented?

Clear description

The documentation here mentions that training uses teacher forcing; however, the code shown doesn't seem to implement it. The variable tar_real holds the true outputs, but it appears to be used only for the loss and accuracy computations.

Please let me know if I'm making a mistake here! Thanks in advance.
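For concreteness, here is a minimal sketch of the usage being described (plain Python standing in for the tutorial's TensorFlow code; `toy_loss` and the token values are made up for illustration):

```python
# Illustration of the observation above: tar_real appears only when the
# model's predictions are scored, i.e. in the loss/accuracy computation.

def toy_loss(real, predictions):
    """Toy stand-in for the tutorial's loss_function: fraction of
    positions where the predicted token differs from the real one."""
    mismatches = sum(1 for r, p in zip(real, predictions) if r != p)
    return mismatches / len(real)

tar_real = [7, 8, 9, 2]      # ground-truth output tokens (hypothetical)
predictions = [7, 8, 5, 2]   # hypothetical model outputs after argmax

loss = toy_loss(tar_real, predictions)  # the only place tar_real is used
```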

@oanush oanush self-assigned this Jul 19, 2019

@oanush oanush added the comp:model label Jul 19, 2019

@oanush oanush assigned ymodak and unassigned oanush Jul 19, 2019



commented Jul 21, 2019

Each train_step takes in inp and tar objects from the dataset in the training loop. Teacher forcing is indeed used since the correct example from the dataset is always used as input during training (as opposed to the "incorrect" output from the previous training step):

  • tar is split into tar_inp and tar_real (offset by one token)
  • inp and tar_inp are used as input to the model
  • the model produces an output, which is compared with tar_real to calculate the loss
  • the model output is then discarded (not used again)
  • repeat the loop
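The steps above can be sketched as follows (plain Python lists in place of the tutorial's batched tensors; the token values are hypothetical):

```python
# Minimal sketch of the teacher-forcing data flow described above.
# `tar` stands in for one tokenized target sentence; in the tutorial it
# is a batched tf.Tensor, here a plain list for illustration.
tar = [1, 7, 8, 9, 2]  # e.g. [<start>, tok, tok, tok, <end>]

# Split the target into decoder input and expected output,
# offset by one position:
tar_inp = tar[:-1]   # [<start>, tok, tok, tok]  -- fed to the decoder
tar_real = tar[1:]   # [tok, tok, tok, <end>]    -- compared with the output

# At every training step the decoder sees the ground-truth prefix
# (tar_inp), never its own previous predictions -- that is teacher
# forcing. The model's output is used only to compute the loss against
# tar_real and is then discarded.
```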

Teacher forcing is a procedure ... in which during training the model receives the ground truth output y(t) as input at time t+1.
Page 372, Deep Learning, 2016.



commented Jul 22, 2019

@tlkh Thanks for the response! I see what you mean.
