Conversation

Contributor

@nverke nverke commented Aug 17, 2023

… and cos functions.

Currently, if you raise the max sequence length of a Llama2 model above 2048, the model fails to actually handle more than 2048 tokens: it either emits garbage output or segfaults when more than 2048 tokens are sent in. I believe the root cause is the hardcoded size of the cached sin and cos tensors; with these updates I have been able to increase the max sequence length successfully.
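The bug described above comes from sizing the precomputed RoPE cos/sin tables with a fixed constant instead of the configured max sequence length. A minimal sketch of the corrected sizing, assuming a standard rotary-embedding formulation (the function name and NumPy usage are illustrative, not the actual mlc-llm code):

```python
import numpy as np

def build_rope_cache(max_seq_len: int, head_dim: int, theta: float = 10000.0):
    """Precompute RoPE cos/sin tables of shape (max_seq_len, head_dim // 2).

    Sizing the tables by max_seq_len, rather than a hardcoded 2048, is the
    essence of the fix: positions beyond the table length would otherwise
    read invalid cache entries and produce garbage attention scores.
    """
    # Per-dimension inverse frequencies: theta^(-2i/d) for even i.
    inv_freq = 1.0 / (theta ** (np.arange(0, head_dim, 2) / head_dim))
    positions = np.arange(max_seq_len)
    # Outer product gives the rotation angle for every (position, dim) pair.
    angles = np.outer(positions, inv_freq)  # (max_seq_len, head_dim // 2)
    return np.cos(angles), np.sin(angles)
```

With this, a model configured for 4096 tokens gets a 4096-row cache, so positions 2048..4095 are covered instead of falling off the end of a fixed-size table.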

@nverke nverke changed the title Nverke/llama seq len Update Llama2 cached sin/cos to use max_sequence_length Aug 17, 2023
Member

@Hzfengsy Hzfengsy left a comment

General LGTM. also cc @MasterJH5574

mlc_llm/core.py Outdated
cache_path = os.path.join(args.artifact_path, "mod_cache_before_build.pkl")
args.raw_params_path = os.path.join(args.artifact_path, "raw_params")
use_cache = args.use_cache and os.path.isfile(cache_path)
use_cache = args.use_cache and os.path.isfile(cache_path)
Member

fix style plz

Contributor Author

Done!

@Hzfengsy Hzfengsy merged commit 4127782 into mlc-ai:main Aug 19, 2023
Hzfengsy pushed a commit to Hzfengsy/mlc-llm that referenced this pull request Aug 23, 2023
This one fixes an error introduced in mlc-ai#780, where `max_seq_len` is set
to `-1` when `max_seq_len` is not specified in the config file.
tqchen pushed a commit that referenced this pull request Aug 23, 2023
MasterJH5574 added a commit to MasterJH5574/mlc-llm that referenced this pull request Aug 30, 2023
This PR fixes an issue introduced by mlc-ai#780, which broke our intended
behavior of keeping the cos/sin shape independent of the max sequence
length, so that no matter what max sequence length people use, they
can always use the same set of prebuilt weights and do not need to
clone different weight repositories.

However, the need for larger max sequence lengths is indeed growing.
Prior to mlc-ai#780, when the max sequence length was larger than 2048,
the cached cos/sin values no longer worked and the model broke. To stay
as compatible as possible, this PR changes the behavior to "taking the
maximum value of 2048 and the specified max sequence length when
building the model lib".

With this fix, when the maximum sequence length is smaller than 2048,
we are still able to use the prebuilt weights. And when it is larger
than 2048, we can only use the weights converted along with the build.
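The compromise in the commit message above reduces to a one-line sizing rule. A sketch, assuming 2048 is the cache length the prebuilt weights expect (names are illustrative, not the actual mlc-llm code):

```python
PREBUILT_ROPE_LEN = 2048  # cache length the prebuilt weights were built with

def rope_cache_len(max_seq_len: int) -> int:
    """Size the cos/sin cache as max(2048, max_seq_len).

    Configs at or below 2048 keep the exact shape the prebuilt weights
    expect, preserving compatibility; larger configs get a cache big
    enough to cover every position, at the cost of needing weights
    converted alongside the build.
    """
    return max(PREBUILT_ROPE_LEN, max_seq_len)
```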
MasterJH5574 added a commit that referenced this pull request Aug 30, 2023
…840)

jimscard added a commit to jimscard/mlc-llm that referenced this pull request Sep 15, 2023
* 'main' of https://github.com/jimscard/mlc-llm:
  [Doc] Minor update to `Build Android Package from Source` section (mlc-ai#785)
  added cors to fast api (mlc-ai#757)
  Update Llama2 cached sin/cos to use max_sequence_length (mlc-ai#780)
  Update gpu.rst to add sudo apt update before first install (mlc-ai#784)
  [Doc] Update doc for prebuilt models (mlc-ai#767)
  Improve code completion experience (mlc-ai#772)
  Automatically set 'offset' parameter if 'messages' parameter is set (mlc-ai#754)
  Update tokenizers-cpp to latest and fix rust build error (mlc-ai#762)
  [Utils] Skip generating benchmark scripts in cases (mlc-ai#759)
  [Android] Add libOpenCL-pixel for supporting Pixel phones. (mlc-ai#723)
