
Should've mentioned the "CRITICAL" modifications made to the transformers source code #37

Open
sonsus opened this issue Feb 15, 2022 · 0 comments


sonsus commented Feb 15, 2022

Thanks for releasing your work publicly. I really appreciate this simple yet parameter-efficient method for tuning PLMs.

In fact, I had a hard time re-implementing your original experiments.
Until I realized that you had modified modeling_gpt2.py / GPT2LMHeadModel.prepare_inputs_for_generation() (and possibly made small modifications in generation_utils.py as well), the results were truly mysterious.

The modified function is necessary for making this method actually work: it preserves the past_key_values that are passed in. Otherwise, the PLM will not incorporate the learned prefix embeddings during generation.
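
For anyone else stuck on this, here is a minimal sketch of the kind of override involved. It is not the repository's actual patch: the class name and the prefix_past attribute are made up for illustration, and it assumes a recent transformers API where the cache kwarg is named past_key_values.

```python
# Hypothetical sketch, not the repo's actual modification: the point is only that
# prepare_inputs_for_generation() must keep the prefix key/value cache alive.
from transformers import GPT2LMHeadModel


class PrefixPreservingGPT2(GPT2LMHeadModel):
    def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **kwargs):
        if past_key_values is None and getattr(self, "prefix_past", None) is not None:
            # First decoding step: seed the cache with the learned prefix so the
            # prompt and every generated token attend to it.
            # (prefix_past is an assumed attribute holding the per-layer
            # key/value tensors produced from the prefix embeddings.)
            past_key_values = self.prefix_past
        elif past_key_values is not None:
            # Later steps: the cache already holds prefix + previous context,
            # so only the newest token needs to be fed.
            input_ids = input_ids[:, -1:]
        return {
            "input_ids": input_ids,
            "past_key_values": past_key_values,  # must be preserved, not reset
            "use_cache": True,
            # The attention mask also has to cover the prefix length, which is
            # presumably why generation_utils.py needed small changes too.
            "attention_mask": kwargs.get("attention_mask"),
        }
```

The essential point is the past_key_values entry in the returned dict: if it gets reset to None on the first step, generation proceeds as if the prefix had never been trained.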

It was a really painful process to track this down. You hinted at the modifications to the data collators, but not at the changes to the generation part of transformers, which is a critical part of the implementation. Meh 😕.

Hope this helps other visitors.
