
Adding support for Rotary Position Embeddings #675

Merged: 119 commits into mosaicml:main from rotary_hf_imp on Nov 6, 2023

Conversation

ShashankMosaicML (Contributor) commented Oct 13, 2023

Adding support for Rotary Positional Embeddings (RoPE).

tl;dr: Advantages of RoPE: the embedding is applied directly to the query and key matrices, so (unlike ALiBi embeddings) it is agnostic to the attention implementation and works out of the box with any of them. Recent work has also shown that some variants of RoPE extrapolate well beyond the training sequence length.

Design doc

Experiments: 125M model, 1B model.
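Because RoPE only rewrites the query and key tensors before attention scores are computed, it composes with any attention kernel. Below is a minimal, hypothetical PyTorch sketch of that idea; the function names and tensor shapes are illustrative and are not taken from this PR, which instead wires configurable rotary implementations (compared in tests/test_rope_dail_vs_hf.py) into llm-foundry's attention layers.

```python
# Minimal RoPE sketch for illustration only; not the llm-foundry implementation.
import torch

def build_rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    """Precompute cos/sin tables of shape (seq_len, head_dim)."""
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float()
    freqs = torch.outer(positions, inv_freq)   # (seq_len, head_dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)    # (seq_len, head_dim)
    return emb.cos(), emb.sin()

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    """Swap and negate the two halves of the last dimension."""
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(q, k, cos, sin):
    """Rotate q and k by position; the attention computation itself is untouched,
    which is why RoPE is agnostic to the attention implementation."""
    # q, k: (batch, n_heads, seq_len, head_dim); cos/sin broadcast over batch/heads.
    q_rot = q * cos + rotate_half(q) * sin
    k_rot = k * cos + rotate_half(k) * sin
    return q_rot, k_rot

# Hypothetical usage:
# cos, sin = build_rope_cache(seq_len=2048, head_dim=64)
# q, k = apply_rope(q, k, cos, sin)  # then feed q, k into any attention kernel
```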

ShashankMosaicML (Contributor, Author) commented:

@dakinggg Will this merge cause any problems when merging the MPT code into the Hugging Face GitHub codebase?

Review threads (all resolved) on the following files:

- llmfoundry/models/layers/attention.py
- llmfoundry/models/layers/blocks.py
- llmfoundry/models/mpt/configuration_mpt.py
- llmfoundry/models/mpt/modeling_mpt.py
- tests/test_flash_triton_torch.py
- tests/test_rope_dail_vs_hf.py
- tests/test_model.py
- TUTORIAL.md
ShashankMosaicML and others added 8 commits on November 4, 2023.
Co-authored-by: Daniel King <43149077+dakinggg@users.noreply.github.com>
@ShashankMosaicML ShashankMosaicML enabled auto-merge (squash) November 6, 2023 22:33
@ShashankMosaicML ShashankMosaicML merged commit 1d504c8 into mosaicml:main Nov 6, 2023
12 checks passed
@ShashankMosaicML ShashankMosaicML deleted the rotary_hf_imp branch November 6, 2023 23:00