Thanks for your solid work and for sharing the code!
May I ask why you chose to predict the label index (e.g., if the masked token has three possible values, you output an index from 0 to 2 instead of the actual word ID corresponding to the label) when generating the output? Have you tried predicting the actual word instead of the index?
Thank you!
To add on: since the model classifies the index instead of generating the label, can this method still be considered prompt-tuning? It does not actually use the semantics of the label words; it looks more like plain multiclass classification.
Our model is based on the probability distribution over actual tokens; the index is used only for convenience of implementation. In the code, we first predict the energy scores of all tokens in the vocabulary, then select the scores of the tokens used as label words and renormalize them into a probability distribution.
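A minimal sketch of that renormalization step, assuming hypothetical token IDs and a random tensor standing in for the model's output at the [MASK] position:

```python
import torch
import torch.nn.functional as F

# Hypothetical setup: the model produces a score for every token in the
# vocabulary at the [MASK] position.
vocab_size = 30522                       # e.g. BERT's vocabulary size
mask_logits = torch.randn(vocab_size)    # stand-in for the model's output

# Suppose the three labels verbalize to these (hypothetical) token IDs.
label_word_ids = torch.tensor([2204, 2919, 8699])

# Select the scores of the label words and renormalize with softmax,
# giving a probability distribution over the label words only.
label_logits = mask_logits[label_word_ids]
label_probs = F.softmax(label_logits, dim=-1)

# The predicted "index" (0..2) is just the argmax over the label words;
# it maps directly back to an actual token ID.
pred_index = label_probs.argmax().item()
pred_token_id = label_word_ids[pred_index].item()
```

So the index the code outputs is a position into `label_word_ids`, not a replacement for the token itself: the distribution is still defined over real vocabulary tokens.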