[megatron_gpt2] dynamic gelu, add tokenizer, save config #13928

Merged
stas00 merged 4 commits into huggingface:master from
stas00:meg_gpt2_convert_save_pretrained
Oct 26, 2021

Conversation

Contributor

@stas00 stas00 commented Oct 8, 2021

This PR improves megatron_gpt2/convert_megatron_gpt2_checkpoint.py:

  • dynamically selects the correct activation function based on the arguments Megatron was trained with
  • dynamically determines the correct tokenizer and sets config.tokenizer_class
  • adds the tokenizer files to the converted checkpoint
  • switches to config.save_pretrained, so the saved config is written by the installed version of transformers and picks up its current fields
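
As a rough illustration of the first two bullets, here is a minimal sketch of how an activation name and a tokenizer class name could be derived from the arguments stored in a Megatron checkpoint. It is not the code from this PR: the helper names `pick_activation` and `pick_tokenizer_class` are hypothetical, and the mapping from the `bias_gelu_fusion` and `openai_gelu` flags to activation names, the `"gelu_new"` fallback, and the handling of `tokenizer_type` are all assumptions made for the example.

```python
from types import SimpleNamespace
from typing import Optional


def pick_activation(megatron_args: Optional[SimpleNamespace]) -> str:
    """Hypothetical helper: map Megatron training flags to an activation name."""
    if megatron_args is None:
        # Assumption: checkpoints without saved args fall back to "gelu_new".
        return "gelu_new"
    if getattr(megatron_args, "bias_gelu_fusion", False):
        # Assumption: the fused bias+gelu kernel corresponds to "gelu_fast".
        return "gelu_fast"
    if getattr(megatron_args, "openai_gelu", False):
        # Assumption: OpenAI's tanh approximation corresponds to "gelu_new".
        return "gelu_new"
    return "gelu"


def pick_tokenizer_class(megatron_args: Optional[SimpleNamespace]) -> str:
    """Hypothetical helper: choose a value for config.tokenizer_class."""
    tokenizer_type = getattr(megatron_args, "tokenizer_type", None)
    if tokenizer_type in (None, "GPT2BPETokenizer"):
        # Assumption: a missing or GPT-2 BPE tokenizer type maps to GPT2Tokenizer.
        return "GPT2Tokenizer"
    raise ValueError(f"Unrecognized tokenizer_type: {tokenizer_type}")


# Example: arguments as they might be stored in a checkpoint (made-up values).
example_args = SimpleNamespace(
    bias_gelu_fusion=False, openai_gelu=True, tokenizer_type="GPT2BPETokenizer"
)
activation = pick_activation(example_args)
tokenizer_class = pick_tokenizer_class(example_args)
```

The two values could then be passed to `GPT2Config(activation_function=activation, tokenizer_class=tokenizer_class)` and written out with `config.save_pretrained(output_dir)`, where `output_dir` is a placeholder for the conversion target directory.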

Some of this functionality was discussed in #13906.

@stas00 stas00 marked this pull request as ready for review October 25, 2021 20:32
@stas00 stas00 requested review from LysandreJik and sgugger October 25, 2021 20:32
Collaborator

@sgugger sgugger left a comment

Looks good, thanks for adding those improvements!

stas00 and others added 2 commits October 25, 2021 13:41
…eckpoint.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Member

@LysandreJik LysandreJik left a comment

This looks good, thank you @stas00

@stas00 stas00 merged commit bfd8176 into huggingface:master Oct 26, 2021
@stas00 stas00 deleted the meg_gpt2_convert_save_pretrained branch October 26, 2021 16:09
3 participants