Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use the last char instead of the first in prediction #22247

Merged
merged 5 commits into from
Oct 16, 2018
Merged

Use the last char instead of the first in prediction #22247

merged 5 commits into from
Oct 16, 2018

Conversation

leondgarse
Copy link
Contributor

  1. As it's mentioned the start_string could be a string here, in my understanding, the correctly predicted next character should be the last one in predictions.
  2. As tf.multinomial is not used any more here, temperature has no effect either.
  • i.e.
    • With start_string = 'QUEEN', the model prediction string is UEEN:, so the next char should be :
    • In the loop run for i in range(num_generate):
      • With predicted_id = tf.argmax(predictions[0]).numpy(), output is QUEENUS:\nI think so ...
      • With predicted_id = tf.argmax(predictions[-1]).numpy(), output is QUEEN:\nThe grand the ...

@alextp
Copy link
Contributor

alextp commented Sep 13, 2018

@yashk2810 does this make sense to you?

I'm a little confused about what this change entails.

@yifeif yifeif added the awaiting review Pull request awaiting review label Sep 13, 2018
@yifeif yifeif self-assigned this Sep 13, 2018
@yashk2810
Copy link
Member

yashk2810 commented Sep 13, 2018

So, I tried out the change and for a single character start_string, the predictions shape is (1, 65). But if the start_string is a word like 'Queen', the shape of the first prediction is (5, 65). Hence we need to use the last character from the prediction to predict the next character, hence predictions[-1] makes sense there.

This doesn't matter if the input is a single character. So changing to predictions[-1] will make the predictions right if the start_string is a word and it won't change anything if the start_string is a character.

As far as temperature goes, it makes sense to remove it since tf.multinomial is not being used.

@leondgarse Can you also remove the comment above the predicted_id line which mentions tf.multinomial and change the text in the markdown above?

Replace 'a multinomial distribution' in comment by 'argmax', and also in the text introduction.
@leondgarse
Copy link
Contributor Author

Sorry for the delay, I just replaced the sentence a multinomial distribution by argmax in comment and the above text.

Remove a temperature line in Next Steps part
@alextp alextp added awaiting testing (then merge) kokoro:force-run Tests on submitted change ready to pull PR ready for merge process labels Sep 17, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Sep 17, 2018
@tensorflowbutler tensorflowbutler removed the awaiting review Pull request awaiting review label Sep 18, 2018
@asimshankar asimshankar removed their request for review September 24, 2018 22:16
@tensorflowbutler
Copy link
Member

Nagging Assignee @yifeif: It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

@drpngx drpngx added the kokoro:force-run Tests on submitted change label Oct 15, 2018
@kokoro-team kokoro-team removed the kokoro:force-run Tests on submitted change label Oct 15, 2018
@tensorflow-copybara tensorflow-copybara merged commit a1cd5c2 into tensorflow:master Oct 16, 2018
tensorflow-copybara pushed a commit that referenced this pull request Oct 16, 2018
PiperOrigin-RevId: 217334880
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes ready to pull PR ready for merge process
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants