Hi,
While using your training code, I ran into something I don't understand in the model's forward pass.
During training, the model first predicts the family token (y_type) and then predicts the other kinds of tokens.
In the code below, you directly use the ground-truth family token to predict the other tokens.
In the generate process, however, you use the family token predicted earlier to predict the other tokens.
I'm wondering why you chose to train and run inference in these different ways (teacher forcing during training)?
Could this mismatch cause an inconsistency between training and inference (i.e., exposure bias)?
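To pin down what I mean, here is a minimal, hypothetical sketch of the two paths; none of these function names come from your repository, and the toy "heads" just stand in for the real model:

```python
# Toy stand-ins for the model's heads (hypothetical, for illustration only).

def predict_type(hidden):
    # Toy family-token head: pick the index of the largest score.
    return max(range(len(hidden)), key=lambda i: hidden[i])

def predict_other(hidden, family_token):
    # Toy second head: the other tokens are conditioned on the family token.
    return [h + family_token for h in hidden]

def train_step(hidden, gt_family):
    # Training (teacher forcing): condition on the GROUND-TRUTH family token,
    # regardless of what the type head predicted.
    type_pred = predict_type(hidden)
    other_pred = predict_other(hidden, gt_family)
    return type_pred, other_pred

def generate_step(hidden):
    # Inference (free running): condition on the PREDICTED family token,
    # since no ground truth is available at generation time.
    type_pred = predict_type(hidden)
    other_pred = predict_other(hidden, type_pred)
    return type_pred, other_pred
```

If `gt_family` differs from `predict_type`'s output, the two steps condition the second head on different tokens, which is exactly the training/inference gap I'm asking about.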
Thanks to anyone who can help me figure this out!