Implement customizable RoPE #2054
Merged
Commits on Jul 7, 2023
The original RoPE has pre-defined parameters

    theta_i = 10000^(-2(i-1)/d), for i in [1, 2, ..., d/2]

Our customizable RoPE, ggml_rope_custom_inplace, uses

    theta_i = scale * base^(-2(i-1)/d), for i in [1, 2, ..., d/2]

with defaults that match the original: scale = 1.0, base = 10000. The new command line arguments --rope-freq-base and --rope-freq-scale set these two new RoPE parameters. Recent research shows that changing these two parameters extends the context limit with minimal loss:

1. Extending Context to 8K, kaiokendev: https://kaiokendev.github.io/til#extending-context-to-8k
2. Extending Context Window of Large Language Models via Positional Interpolation, Shouyuan Chen, Sherman Wong, Liangjian Chen, Yuandong Tian: https://arxiv.org/abs/2306.15595
3. NTK-Aware Scaled RoPE allows LLaMA models to have extended (8k+) context size without any fine-tuning and minimal perplexity degradation, by https://www.reddit.com/user/bloc97: https://www.reddit.com/r/LocalLLaMA/comments/14lz7j5/ntkaware_scaled_rope_allows_llama_models_to_have/

For the bold, try adding the following command line parameters to your favorite model:

    -c 16384 --rope-freq-base 80000 --rope-freq-scale 0.5
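The frequency formula above can be sketched in a few lines of Python (a minimal illustration of the formula as stated in this PR description, not the actual ggml implementation; the function name rope_thetas is hypothetical):

```python
# Sketch of the per-dimension RoPE frequencies with the customizable
# scale/base from this PR. Defaults reproduce the original
# theta_i = 10000^(-2(i-1)/d).
def rope_thetas(d, base=10000.0, scale=1.0):
    """Return theta_i = scale * base^(-2(i-1)/d) for i = 1..d/2."""
    return [scale * base ** (-2.0 * (i - 1) / d) for i in range(1, d // 2 + 1)]

default = rope_thetas(128)                           # original RoPE
extended = rope_thetas(128, base=80000, scale=0.5)   # --rope-freq-base 80000 --rope-freq-scale 0.5
print(default[0])   # theta_1 = scale * base^0 = 1.0
print(extended[0])  # 0.5
```

With scale = 1.0 and base = 10000 the list is identical to the original pre-defined parameters, which is why existing models are unaffected by default.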
Commit dc0d0eb
Commit 1ae4318
Commit 41819b0
llama: increase MEM_REQ_EVAL for MODEL_3B
This avoids crashes with quantized weights on CPU. A better way to calculate the required buffer size is still needed.
Commit 5c6eed3
Commit a728a0d
server: use proper Content-Type in curl examples
Without the header Content-Type: application/json, curl POSTs with Content-Type: application/x-www-form-urlencoded. Though our simple server doesn't care, the httplib.h we use caps form-urlencoded payloads at CPPHTTPLIB_FORM_URL_ENCODED_PAYLOAD_MAX_LENGTH (8192 bytes). With Content-Type: application/json, we can send large JSON data.
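The size issue is easy to demonstrate without running a server: a moderately long prompt already pushes a form-urlencoded body past the 8192-byte cap, while the equivalent JSON body is not subject to that limit. A minimal sketch (the 9000-character prompt is an arbitrary illustrative value):

```python
# Why the Content-Type header matters: form-urlencoded bodies are capped by
# httplib.h at CPPHTTPLIB_FORM_URL_ENCODED_PAYLOAD_MAX_LENGTH (8192 bytes
# by default); JSON bodies are not.
import json
from urllib.parse import urlencode

prompt = "x" * 9000                          # e.g. a large context prompt
form_body = urlencode({"prompt": prompt})    # what curl -d sends by default
json_body = json.dumps({"prompt": prompt})   # what the examples should send

print(len(form_body) > 8192)  # True: exceeds the form-urlencoded cap
```

This is why the curl examples must pass -H "Content-Type: application/json" explicitly.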
Commit a3b4d93
Commits on Jul 13, 2023
Commit a6b5695
Commit da730c5
Commits on Jul 15, 2023
Commit d0b6c94
Commit 6024bcc