
Adopt PreTrainedModelWrapper for Hugging Face models #215

Merged
merged 32 commits into CarperAI:main from update-save-pretrained on Feb 22, 2023

Conversation

@jon-tow (Collaborator) commented on Jan 23, 2023

Summary

This PR adds the following changes:

  • Adopts a modified version of the PreTrainedModelWrapper implemented in the trl package, to allow for flexible wrapping of Hugging Face (HF) models. This provides intuitive access to Hugging Face PreTrainedModel methods such as push_to_hub and save_pretrained without having to reach into the underlying model (a minimal sketch of this wrapping pattern follows the list below).

  • Adds ILQL save_pretrained support

  • Introduces the following for HF-based architectures:

    • Renames architectures to AutoModelFor... to match Hugging Face counterparts.
    • Removes base_model.transformer and base_model.lm_head references and instead extracts the final hidden states from the full set of hidden states returned by the forward pass. This roughly halves the on-disk size of saved checkpoints, since previously a model's state_dict stored both .transformer and .lm_head as separate entries alongside the underlying transformer, duplicating its weights.
  • Moves modeling code out of the trainer dir into a separate models dir.

  • Renames some utils.modeling HF attribute getters from causal_lm to decoder, since many of these utilities are also useful for encoder-decoder (seq2seq) models (e.g. T5Branch uses the same getter to access the decoder parts of the transformer).
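
For illustration, here is a minimal sketch of the wrapping pattern described above. It is an assumption-laden example, not the trlx implementation: only the `transformers` APIs (AutoModelForCausalLM, from_pretrained, save_pretrained, push_to_hub, output_hidden_states) are real, while the wrapper class, its forward signature, and its return values are illustrative placeholders.

```python
# Minimal sketch of the wrapping pattern described above; an
# illustration, not the exact trlx/trl implementation. Only the
# `transformers` APIs used here are real; the wrapper class itself
# is a hypothetical placeholder.
import torch.nn as nn
from transformers import AutoModelForCausalLM


class PreTrainedModelWrapper(nn.Module):
    """Wraps an HF PreTrainedModel and forwards hub/serialization calls."""

    def __init__(self, base_model: nn.Module):
        super().__init__()
        self.base_model = base_model

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *args, **kwargs):
        # Load the underlying HF model, then wrap it.
        base_model = AutoModelForCausalLM.from_pretrained(
            pretrained_model_name_or_path, *args, **kwargs
        )
        return cls(base_model)

    def save_pretrained(self, *args, **kwargs):
        # Delegate to the wrapped model so callers never have to reach
        # into `.base_model` themselves.
        return self.base_model.save_pretrained(*args, **kwargs)

    def push_to_hub(self, *args, **kwargs):
        return self.base_model.push_to_hub(*args, **kwargs)

    def forward(self, input_ids, attention_mask=None, **kwargs):
        # Run the base model once and take the final hidden states from
        # `hidden_states`, rather than keeping separate `.transformer`
        # and `.lm_head` copies of the base weights.
        outputs = self.base_model(
            input_ids=input_ids,
            attention_mask=attention_mask,
            output_hidden_states=True,
            **kwargs,
        )
        last_hidden_state = outputs.hidden_states[-1]
        return outputs.logits, last_hidden_state
```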

TODOs

  • Convert test scripts to proper unit tests for save_pretrained and from_pretrained for ILQL and PPO
  • Add a from_config method to support custom model configs, e.g. GPT2Config(n_layer=6, n_embd=144, vocab_size=23) as used in some examples (see the sketch after this list)
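
A hedged sketch of the from_config TODO above: GPT2Config and AutoModelForCausalLM.from_config are real `transformers` APIs, but the wrapper class is the illustrative one from the earlier sketch, repeated here only as far as needed to run standalone.

```python
# Hedged sketch of the `from_config` TODO above. GPT2Config and
# AutoModelForCausalLM.from_config are real `transformers` APIs; the
# wrapper class is the illustrative one from the earlier sketch.
import torch.nn as nn
from transformers import AutoModelForCausalLM, GPT2Config


class PreTrainedModelWrapper(nn.Module):
    def __init__(self, base_model: nn.Module):
        super().__init__()
        self.base_model = base_model

    @classmethod
    def from_config(cls, config, **kwargs):
        # Build the underlying HF model from a config (randomly
        # initialized, no pretrained weights), then wrap it just like
        # `from_pretrained` would.
        base_model = AutoModelForCausalLM.from_config(config, **kwargs)
        return cls(base_model)


# Usage with a small custom config, as in the examples referenced above:
model = PreTrainedModelWrapper.from_config(
    GPT2Config(n_layer=6, n_embd=144, vocab_size=23)
)
```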

Reports

@jon-tow jon-tow marked this pull request as ready for review February 16, 2023 17:20
@jon-tow jon-tow marked this pull request as draft February 16, 2023 19:19
@jon-tow jon-tow marked this pull request as ready for review February 16, 2023 20:05
@jon-tow jon-tow marked this pull request as draft February 16, 2023 22:36
@jon-tow jon-tow marked this pull request as ready for review February 16, 2023 22:51
@cat-state (Collaborator) commented on Feb 21, 2023

Thanks for this PR, I like the reorganization and changes overall!

@cat-state cat-state self-requested a review February 21, 2023 22:35
@cat-state (Collaborator) left a comment

thanks! this LGTM

@jon-tow jon-tow merged commit 715894a into CarperAI:main Feb 22, 2023
@jon-tow jon-tow deleted the update-save-pretrained branch February 22, 2023 20:00