Update Llama2 cached sin/cos to use max_sequence_length #780
Conversation
Hzfengsy left a comment:
General LGTM. also cc @MasterJH5574
mlc_llm/core.py
cache_path = os.path.join(args.artifact_path, "mod_cache_before_build.pkl")
args.raw_params_path = os.path.join(args.artifact_path, "raw_params")
use_cache = args.use_cache and os.path.isfile(cache_path)
fix style plz
Done!
This one fixes an error introduced in #780, where `max_seq_len` is set to `-1` when it is not specified in the config file.
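The guard described above can be sketched as follows. This is a minimal illustration, not mlc-llm's actual code; the function and config key names are hypothetical:

```python
# Hypothetical sketch: fall back to a default context window instead of
# propagating the -1 sentinel when the config omits max_seq_len.
DEFAULT_MAX_SEQ_LEN = 2048

def resolve_max_seq_len(config: dict) -> int:
    """Return a usable max sequence length from a model config dict."""
    value = config.get("max_sequence_length", -1)
    # A missing or sentinel value (-1) must not be used to size buffers.
    return value if value > 0 else DEFAULT_MAX_SEQ_LEN
```

With this, `resolve_max_seq_len({})` yields the 2048 default rather than `-1`, while an explicit `{"max_sequence_length": 4096}` is passed through unchanged.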
This PR fixes an issue introduced by mlc-ai#780, which broke our intended behavior of making the cos/sin shape independent of the max sequence length, so that regardless of which max sequence length people use, they can always reuse the same set of prebuilt weights and do not need to clone different weight repositories. However, it is true that the need for larger max sequence lengths is growing: prior to mlc-ai#780, when the max sequence length was larger than 2048, the cached cos/sin no longer worked and the model broke. To stay as compatible as possible, this PR changes the behavior to take the maximum of 2048 and the specified max sequence length when building the model lib. With this fix, when the maximum sequence length is smaller than 2048, we are still able to use the prebuilt weights; when it is larger than 2048, we can only use the weights converted during the build.
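The sizing rule described above is a one-liner; a sketch (the function name is illustrative, not mlc-llm's):

```python
def cached_rope_table_len(max_sequence_length: int) -> int:
    """Length to allocate for the cached cos/sin tables.

    Taking the maximum of 2048 and the requested length keeps the table
    shape compatible with prebuilt weights (sized for 2048) whenever the
    requested context is 2048 or shorter.
    """
    return max(2048, max_sequence_length)
```

So a 1024-token config still allocates 2048 rows (prebuilt weights stay usable), while a 4096-token config allocates 4096 rows (and requires weights converted during the build).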
* 'main' of https://github.com/jimscard/mlc-llm:
  - [Doc] Minor update to `Build Android Package from Source` section (mlc-ai#785)
  - added cors to fast api (mlc-ai#757)
  - Update Llama2 cached sin/cos to use max_sequence_length (mlc-ai#780)
  - Update gpu.rst to add sudo apt update before first install (mlc-ai#784)
  - [Doc] Update doc for prebuilt models (mlc-ai#767)
  - Improve code completion experience (mlc-ai#772)
  - Automatically set 'offset' parameter if 'messages' parameter is set (mlc-ai#754)
  - Update tokenizers-cpp to latest and fix rust build error (mlc-ai#762)
  - [Utils] Skip generating benchmark scripts in cases (mlc-ai#759)
  - [Android] Add libOpenCL-pixel for supporintg Pixel phones. (mlc-ai#723)
… and cos functions.
Currently, if you make the max sequence length of a Llama2 model larger than 2048, the model runs into issues actually handling more than 2048 tokens: it emits garbage output or segfaults when more than 2048 tokens are sent in. I believe the root cause is the hardcoded sizing of the cached sin and cos tables, and with these updates I have been able to increase the max sequence length successfully.
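The fix amounts to sizing the precomputed rotary-embedding tables by the configured max sequence length instead of a hardcoded 2048. A minimal, self-contained sketch (names and the pure-Python table layout are illustrative, not mlc-llm's actual TVM code):

```python
import math

def build_rope_cache(max_sequence_length: int, head_dim: int,
                     base: float = 10000.0):
    """Precompute cos/sin tables for rotary position embeddings.

    One row per position up to max_sequence_length; sizing by the
    configured length (rather than a hardcoded 2048) is the change
    this PR describes.
    """
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    inv_freq = [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]
    cos = [[math.cos(pos * f) for f in inv_freq]
           for pos in range(max_sequence_length)]
    sin = [[math.sin(pos * f) for f in inv_freq]
           for pos in range(max_sequence_length)]
    return cos, sin
```

With a hardcoded 2048-row table, any position index beyond 2047 reads out of bounds (hence the garbage output or segfault); building the table with `max_sequence_length=4096` gives every position up to 4095 a valid row.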