
llama.cpp : add documentation about rope_freq_base and scale values #3401

Merged: slaren merged 4 commits into master from cparams-doc on Sep 29, 2023

Conversation

@slaren (Collaborator) commented on Sep 29, 2023

Previously, setting rope_freq_base to 10000 and rope_freq_scale to 1 in llama_context_params would cause llama.cpp to use the model's default values. Now, to use the model's default values, these parameters must be set to zero.

Setting rope_freq_base to 10000 will cause problems with models trained with a different base, such as CodeLlama-7B. Downstream users should therefore be careful to update the default values of these parameters.

When llama_context_default_params() is used to initialize llama_context_params and these parameters are left unchanged, no further action is required: llama_context_default_params() already sets them to zero.
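
For downstream C/C++ code, a minimal sketch of the intended usage (assuming the post-#3301 API in which model and context parameters are split; the model path and the omitted error handling are placeholders):

```c
#include "llama.h"

int main(void) {
    llama_backend_init(false /* numa */);

    struct llama_model_params mparams = llama_model_default_params();
    struct llama_model * model = llama_load_model_from_file("model.gguf", mparams);

    struct llama_context_params cparams = llama_context_default_params();
    // rope_freq_base  == 0.0f -> use the base stored in the model
    //                            (e.g. CodeLlama was trained with 1e6, not 10000)
    // rope_freq_scale == 0.0f -> use the scale stored in the model
    // Hard-coding 10000.0f / 1.0f here would now override the model's values.
    struct llama_context * ctx = llama_new_context_with_model(model, cparams);

    // ... run inference with ctx ...

    llama_free(ctx);
    llama_free_model(model);
    llama_backend_free();
    return 0;
}
```

Only code that deliberately wants to override the model (for example, to apply extra RoPE scaling for longer contexts) should set these fields to non-zero values.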

@slaren (Collaborator, Author) commented on Sep 29, 2023

I noticed that this is an issue in llama-cpp-python: abetlen/llama-cpp-python#765

Downstream users should be careful to update the default values.

@ggerganov (Owner) left a comment

Maybe also add a hot topics entry given the impact and importance of this change

slaren merged commit 40e07a6 into master on Sep 29, 2023 (32 of 33 checks passed).
slaren deleted the cparams-doc branch on September 29, 2023 at 16:42.
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 2, 2023
…example

* 'master' of github.com:ggerganov/llama.cpp:
  ggml-cuda : perform cublas mat mul of quantized types as f16 (ggerganov#3412)
  llama.cpp : add documentation about rope_freq_base and scale values (ggerganov#3401)
  train : fix KQ_pos allocation (ggerganov#3392)
  llama : quantize up to 31% faster on Linux and Windows with mmap (ggerganov#3206)
  readme : update hot topics + model links (ggerganov#3399)
  readme : add link to grammars app (ggerganov#3388)
  swift : fix build on xcode 15 (ggerganov#3387)
  build : enable more non-default compiler warnings (ggerganov#3200)
  ggml_tensor: update the structure comments. (ggerganov#3283)
  ggml : release the requested thread pool resource (ggerganov#3292)
  llama.cpp : split llama_context_params into model and context params (ggerganov#3301)
  ci : multithreaded builds (ggerganov#3311)
  train : finetune LORA (ggerganov#2632)
  gguf : basic type checking in gguf_get_* (ggerganov#3346)
  gguf : make token scores and types optional (ggerganov#3347)
  ci : disable freeBSD builds due to lack of VMs (ggerganov#3381)
  llama : custom attention mask + parallel decoding + no context swaps (ggerganov#3228)
  docs : mark code as Bash (ggerganov#3375)
  readme : add Mistral AI release 0.1 (ggerganov#3362)
  ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (ggerganov#3370)
jhen0409 added a commit to mybigday/llama.rn that referenced this pull request Oct 3, 2023
yusiwen pushed a commit to yusiwen/llama.cpp that referenced this pull request Oct 7, 2023
llama.cpp : add documentation about rope_freq_base and scale values (ggerganov#3401)

* llama.cpp : add documentation about rope_freq_base and scale values

* add notice to hot topics