
Should've mentioned the "CRITICAL" modifications made to the transformers source code #37

Open
sonsus opened this issue Feb 15, 2022 · 0 comments


sonsus commented Feb 15, 2022

Thanks for releasing your work publicly. I really appreciate this simple yet parameter-efficient method for tuning PLMs.

In fact, I had a hard time re-implementing your original experiments.
Until I realized that you had modified modeling_gpt2.py / GPT2LMHeadModel.prepare_inputs_for_generation() (and possibly made small modifications in generation_utils.py as well), the results were truly mysterious.

The modified function is necessary for making this method actually work: it preserves the past_key_values that are passed in. Otherwise, the PLM will not incorporate the learned prefix embeddings during generation.
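
For anyone else stuck on this, here is a minimal sketch of the kind of override involved. It is not the repository's actual patch: the class name and the prefix_past attribute are made up for illustration, and it assumes a recent transformers API where the cache kwarg is named past_key_values.

```python
# Hypothetical sketch, not the repo's actual modification: the point is only that
# prepare_inputs_for_generation() must keep the prefix key/value cache alive.
from transformers import GPT2LMHeadModel


class PrefixPreservingGPT2(GPT2LMHeadModel):
    def prepare_inputs_for_generation(self, input_ids, past_key_values=None, **kwargs):
        if past_key_values is None and getattr(self, "prefix_past", None) is not None:
            # First decoding step: seed the cache with the learned prefix so the
            # prompt and every generated token attend to it.
            # (prefix_past is an assumed attribute holding the per-layer
            # key/value tensors produced from the prefix embeddings.)
            past_key_values = self.prefix_past
        elif past_key_values is not None:
            # Later steps: the cache already holds prefix + previous context,
            # so only the newest token needs to be fed.
            input_ids = input_ids[:, -1:]
        return {
            "input_ids": input_ids,
            "past_key_values": past_key_values,  # must be preserved, not reset
            "use_cache": True,
            # The attention mask also has to cover the prefix length, which is
            # presumably why generation_utils.py needed small changes too.
            "attention_mask": kwargs.get("attention_mask"),
        }
```

The essential point is the past_key_values entry in the returned dict: if it gets reset to None on the first step, generation proceeds as if the prefix had never been trained.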

It was a really painful process to track this down. You hinted at the modifications to the data collators, but not at the changes to the generation part of transformers, which is a critical part of the implementation. Meh 😕.

Hope this helps other visitors.
