Conversation

@jackzhxng commented Oct 3, 2024

Summary

  • Removes redundant steps from the Llama2 export
  • Factors out checkpoint loading so it can be shared with future Llama models (namely 3.2 multimodal)
  • Adds comments and orders the code more clearly

PR chain:

  • Add kwarg example inputs to eager model base (#5765)
  • YOU ARE HERE ~> Llama2 model cleanup (#5859)
  • Accept model type parameter in export_llama (#5910)
  • Export TorchTune llama3_2_vision in ET (#5911)
  • Add et version of TorchTune MHA for swapping with custom op (#5912)

Test plan

Ensure export + eval results are similar before and after for Stories 110M:

python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000

Before:

wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}

After:

wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
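The before/after numbers above agree to well within a reasonable tolerance; a quick sketch of that comparison (the metric keys are taken from the eval output above, the helper itself is just illustrative):

```python
# Eval metrics quoted in the test plan above (lm-eval style keys).
before = {"word_perplexity,none": 14464.645927166595,
          "byte_perplexity,none": 5.99788806086652,
          "bits_per_byte,none": 2.5844545973083983}
after = {"word_perplexity,none": 14464.299192404438,
         "byte_perplexity,none": 5.997861173678705,
         "bits_per_byte,none": 2.584448130015399}

def within_tolerance(a, b, rel_tol=1e-3):
    """True if every shared metric agrees to within a relative tolerance."""
    return all(abs(a[k] - b[k]) / abs(a[k]) <= rel_tol for k in a)

print(within_tolerance(before, after))  # True: deltas are ~0.002% or less
```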

pytorch-bot bot commented Oct 3, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/5859

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit fb1312f with merge base 3a7056e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on Oct 3, 2024
@jackzhxng force-pushed the jz/eager-model-inputs branch from f205927 to 0902dea on October 4, 2024 20:35
@jackzhxng force-pushed the jz/eager-model-inputs branch from 0902dea to 6cd759d on October 4, 2024 20:46
@jackzhxng force-pushed the jz/eager-model-inputs branch from 6cd759d to a6b8704 on October 7, 2024 20:49
@jackzhxng force-pushed the jz/eager-model-inputs branch from 6a285ea to 9be5f57 on October 8, 2024 06:41
@jackzhxng marked this pull request as ready for review on October 8, 2024 07:13
facebook-github-bot pushed a commit that referenced this pull request Oct 8, 2024
Summary:
For situations where the forward has non-positional arguments, such as https://github.com/pytorch/torchtune/blob/3c450ef5f1fbe8237f899e942fd5222491a47ca7/torchtune/modules/transformer.py#L519

PR chain:
- **YOU ARE HERE ~>** [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)


Test Plan:
Exported Stories110M model.
```
wget "https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.pt"
echo '{"dim": 768, "multiple_of": 32, "n_heads": 12, "n_layers": 12, "norm_eps": 1e-05, "vocab_size": 32000}' > params.json
python -m examples.models.llama2.export_llama -c stories110M.pt -p params.json -X -kv
```

Differential Revision: D64027696

Pulled By: dvorjackz
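The idea behind kwarg example inputs can be sketched without torch: a model base exposes keyword examples alongside positional ones, so an exporter can trace forwards that take keyword-only arguments. All names below (`EagerModelBase`, `get_example_kwarg_inputs`, the toy decoder) are illustrative assumptions, not necessarily the actual ExecuTorch API:

```python
# Illustrative sketch: an eager-model base that supplies keyword example
# inputs for export, for forwards like TorchTune's
# TransformerDecoder.forward(tokens, *, mask=None, input_pos=None).
class EagerModelBase:
    def get_example_inputs(self):
        """Positional example inputs passed to forward()."""
        return ()

    def get_example_kwarg_inputs(self):
        """Keyword example inputs; empty by default, so models that only
        use positional arguments are unaffected."""
        return {}

class ToyDecoder(EagerModelBase):
    def forward(self, tokens, *, input_pos=None):
        # Toy stand-in for a transformer forward with a keyword-only arg.
        return [t + (input_pos or 0) for t in tokens]

    def get_example_inputs(self):
        return ([1, 2, 3],)

    def get_example_kwarg_inputs(self):
        return {"input_pos": 10}

model = ToyDecoder()
args = model.get_example_inputs()
kwargs = model.get_example_kwarg_inputs()
# An exporter would trace forward with both positional and keyword examples:
print(model.forward(*args, **kwargs))  # [11, 12, 13]
```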
jackzhxng added a commit that referenced this pull request Oct 8, 2024
@jackzhxng force-pushed the jz/eager-model-inputs branch from 63e3b9e to 6ff6615 on October 8, 2024 20:09
facebook-github-bot pushed a commit that referenced this pull request Oct 8, 2024
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
facebook-github-bot pushed a commit that referenced this pull request Oct 9, 2024
Pull Request resolved: #5765

Reviewed By: tarun292

fbshipit-source-id: 15ecfb458c6194159140d4c601e5443a2e524fdc
@jackzhxng changed the base branch from jz/eager-model-inputs to main on October 9, 2024 23:10
@dvorjackz has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot pushed a commit that referenced this pull request Oct 11, 2024
Summary:
- Removes redundant steps in the Llama2 export
- Factors out checkpointing to be shared with future Llama models (namely 3.2 multimodal)
- Comments and orders code more clearly

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- **YOU ARE HERE ~>** [Llama2 model cleanup](#5859)
- [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)


Test Plan:
Ensure export + eval is similar before and after for Stories 110M:
```
python -m examples.models.llama2.eval_llama -c <checkpoint.pth> -p <params.json> -t <tokenizer.model/bin> -d fp32 --max_seq_len 2048 --limit 1000
```


Before:
```
wikitext: {'word_perplexity,none': 14464.645927166595, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.99788806086652, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.5844545973083983, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

After:
```
wikitext: {'word_perplexity,none': 14464.299192404438, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 5.997861173678705, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 2.584448130015399, 'bits_per_byte_stderr,none': 'N/A', 'alias': 'wikitext'}
```

Differential Revision: D64145852

Pulled By: dvorjackz
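The "factors out checkpointing" bullet above amounts to moving checkpoint loading and key normalization into a helper that future Llama variants can reuse. A minimal sketch, assuming a hypothetical `normalize_checkpoint` helper and key names (this is not the actual ExecuTorch code):

```python
# Illustrative: a shared helper that strips a wrapper prefix (e.g. one
# added by a training harness) from checkpoint keys so they match the
# eager model's parameter names.
def normalize_checkpoint(state_dict, prefix="model."):
    return {
        (k[len(prefix):] if k.startswith(prefix) else k): v
        for k, v in state_dict.items()
    }

ckpt = {"model.layers.0.wq.weight": 1, "tok_embeddings.weight": 2}
print(normalize_checkpoint(ckpt))
# {'layers.0.wq.weight': 1, 'tok_embeddings.weight': 2}
```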
This pull request was exported from Phabricator. Differential Revision: D64145852

facebook-github-bot pushed a commit that referenced this pull request Oct 14, 2024
Reviewed By: dbort
@dvorjackz merged this pull request in 4745070.

facebook-github-bot pushed a commit that referenced this pull request Nov 11, 2024
Summary:
Specify model to export in the CLI.


Test Plan:
Exported the stories 110M model.
```
python -m examples.models.llama.export_llama -c stories110M/stories110M.pt -p stories110M/params.json -X -kv
```

PR chain:
- [Add kwarg example inputs to eager model base](#5765)
- [Llama2 model cleanup](#5859)
- **YOU ARE HERE ~>** [Accept model type parameter in export_llama](#5910)
- [Export TorchTune llama3_2_vision in ET](#5911)
- [Runner changes for TorchTune Llama3.2 vision text decoder](#6610)
- [Add et version of TorchTune MHA for swapping with custom op](#5912)

Differential Revision: D65612837

Pulled By: dvorjackz
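"Specify model to export in the CLI" suggests a model-type parameter that dispatches to a registered model. A sketch of that shape, assuming a hypothetical `--model` flag and registry (the real `export_llama` interface may differ):

```python
import argparse

# Hypothetical registry mapping a model-type name to its module path.
MODEL_REGISTRY = {
    "llama2": "examples.models.llama",
    "llama3_2_vision": "examples.models.llama3_2_vision",
}

def build_parser():
    parser = argparse.ArgumentParser("export_llama")
    parser.add_argument("--model", choices=sorted(MODEL_REGISTRY),
                        default="llama2",
                        help="Which eager model to export.")
    parser.add_argument("-c", "--checkpoint", required=True)
    parser.add_argument("-p", "--params", required=True)
    return parser

args = build_parser().parse_args(
    ["--model", "llama3_2_vision", "-c", "ckpt.pth", "-p", "params.json"])
print(MODEL_REGISTRY[args.model])  # examples.models.llama3_2_vision
```

Using `choices` makes the parser reject unknown model types up front, so adding a new model only requires a registry entry.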
facebook-github-bot pushed a commit that referenced this pull request Nov 12, 2024
Reviewed By: helunwencser
facebook-github-bot pushed a commit that referenced this pull request Nov 12, 2024
facebook-github-bot pushed a commit that referenced this pull request Nov 13, 2024
Labels: ciflow/trunk, CLA Signed, fb-exported, Merged