Thanks for open-sourcing your work. I really appreciate your simple yet parameter-efficient method for tuning PLMs.
That said, I had a hard time re-implementing your original experiments.
The results were truly mysterious until I realized that you had modified `modeling_gpt2.py` / `GPT2LMHeadModel.prepare_inputs_for_generation()` (with, perhaps, some small changes in `generation_utils.py` as well).
That modified function is what makes the method actually work at inference time: it preserves the `past_key_values` that are passed in. Without it, the PLM never incorporates the learned prefix embeddings during generation.
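For anyone else hitting this, here is a minimal sketch of the kind of override involved. To be clear, this is not the authors' exact code: `PrefixGPT2LMHeadModel` and `prefix_past` are made-up names, batch expansion of the prefix is elided, and it assumes a transformers version where the GPT-2 cache kwarg is named `past_key_values`.

```python
from transformers import GPT2LMHeadModel

class PrefixGPT2LMHeadModel(GPT2LMHeadModel):
    """Sketch: seed generation with learned prefix key/value states.

    `self.prefix_past` is a hypothetical attribute holding the trained
    prefix as a past_key_values tuple: one (key, value) pair per layer,
    each of shape [batch, n_head, prefix_len, head_dim].
    """

    def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **kwargs):
        if past_key_values is None:
            # First decoding step: instead of starting from an empty cache,
            # seed it with the learned prefix so the prompt and every
            # generated token attend to the prefix key/value states.
            past_key_values = self.prefix_past
        else:
            # Later steps: the cache already holds prefix + previous tokens,
            # so only the newest token needs to be fed in.
            input_ids = input_ids[:, -1].unsqueeze(-1)
        return {
            "input_ids": input_ids,
            "past_key_values": past_key_values,
            "use_cache": kwargs.get("use_cache"),
        }
```

One more gotcha: `generate()` builds an `attention_mask` that only covers `input_ids`, so once the prefix sits in the cache, the mask also has to be extended by `prefix_len` — presumably the "lil' modifications" to `generation_utils.py`.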
Tracking this down was a really painful process. You hinted at the modified data collators, but not at the generation part of transformers, which is a critical piece of the implementation. Meh 😕.
Hope this helps other visitors.