
llama : allow gguf RoPE keys to be overridden with defaults #3240

Merged 4 commits into ggerganov:master on Sep 20, 2023

Conversation

cebtenzzre (Collaborator)

@klosax is this what you had in mind? It seems like the simplest way to resolve your TODO comment. I'd like to find a solution to this before I add more RoPE keys in PR #2268.

This way, the fallback value is hardcoded in llm_load_hparams and not exposed anywhere else. This shouldn't matter, because with GGUF the API user can't assume the value of these parameters before they've loaded the model.
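For readers following along, here is a minimal sketch of the fallback pattern described above, written against the public gguf_* API. The helper name gguf_get_f32_or and the hparams field names are illustrative, not the actual code in llm_load_hparams; only the key names follow the GGUF naming convention:

```cpp
#include "ggml.h" // gguf_* API (declared here in 2023-era llama.cpp)

// Read a float32 metadata key, falling back to a hardcoded default when
// the key is absent. The default never leaves the loader, so nothing is
// exposed through the public API.
static float gguf_get_f32_or(const struct gguf_context * ctx,
                             const char * key, float def) {
    const int kid = gguf_find_key(ctx, key); // -1 if the key is missing
    return kid == -1 ? def : gguf_get_val_f32(ctx, kid);
}

// Inside the loader, roughly:
//   hparams.rope_freq_base  = gguf_get_f32_or(ctx, "llama.rope.freq_base",    10000.0f);
//   hparams.rope_freq_scale = gguf_get_f32_or(ctx, "llama.rope.scale_linear", 1.0f);
```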

@cebtenzzre changed the title from "llama : allow gguf rope keys to be overridden with defaults" to "llama : allow gguf RoPE keys to be overridden with defaults" on Sep 18, 2023
@cebtenzzre merged commit a5661d7 into ggerganov:master on Sep 20, 2023
34 checks passed
@cebtenzzre deleted the rope-params-override branch on September 20, 2023 16:12
@cebtenzzre removed the request for review from klosax on September 20, 2023 16:13
@cebtenzzre restored the rope-params-override branch on September 20, 2023 16:13
shibe2 (Collaborator) commented Sep 22, 2023

I have an older model codellama-34b.Q6_K.gguf, and it outputs garbage after commit a5661d7. Why would that be?

cebtenzzre (Collaborator, Author)

> I have an older model codellama-34b.Q6_K.gguf, and it outputs garbage after commit a5661d7.

Well, codellama-34b does use a different rope_freq_base, so maybe this commit did somehow break it. What is the sha256sum of your model file? There are two versions.
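For context: CodeLlama models were trained with rope_freq_base = 1e6 rather than the LLaMA default of 10000, so silently falling back to the default (a plausible failure mode here, not a confirmed diagnosis) would shift every RoPE rotation frequency and scramble long-range attention. A standalone sketch of the difference:

```cpp
#include <cmath>
#include <cstdio>

// RoPE rotates dimension pair p of a head at angle pos * base^(-2p/d).
// Compare per-pair frequencies under the two bases.
int main() {
    const int d = 128; // head dimension, typical for LLaMA-family models
    for (int p = 0; p < d / 2; p += 16) {
        const double f_1e4 = std::pow(10000.0,   -2.0 * p / d);
        const double f_1e6 = std::pow(1000000.0, -2.0 * p / d);
        std::printf("pair %2d: base=1e4 -> %.3e, base=1e6 -> %.3e\n",
                    p, f_1e4, f_1e6);
    }
    return 0;
}
```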

shibe2 (Collaborator) commented Sep 22, 2023

1002ef07afb15d208819c844f9b0e28b98b3c0c8da3df54521e727c17b45917c

cebtenzzre (Collaborator, Author)

I'm still looking into this, but there's something really odd about that model file: the key strings all have a trailing NUL byte that is counted toward their length. I don't know of any version of the llama.cpp GGUF writer that did that.
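To see why a length that counts the trailing NUL breaks key lookup (a sketch of the general mechanism, not the exact reader code): GGUF strings carry an explicit length prefix rather than a terminator, so an exact-match comparison against the expected key name fails on the length check alone:

```cpp
#include <cstdio>
#include <cstring>

// GGUF strings are length-prefixed, not NUL-terminated (simplified layout).
struct gguf_str { size_t n; const char * data; };

// Exact match: lengths must agree before the bytes are compared.
static bool key_eq(const gguf_str & s, const char * key) {
    return s.n == std::strlen(key) && std::memcmp(s.data, key, s.n) == 0;
}

int main() {
    // A well-formed key vs. one whose length counts a trailing NUL.
    const gguf_str good = { 20, "llama.rope.freq_base" };
    const gguf_str bad  = { 21, "llama.rope.freq_base\0" }; // n includes the '\0'

    std::printf("good: %d, bad: %d\n",
                key_eq(good, "llama.rope.freq_base"),  // 1 (match)
                key_eq(bad,  "llama.rope.freq_base")); // 0 (length mismatch)
    return 0;
}
```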

cebtenzzre added a commit to cebtenzzre/llama.cpp that referenced this pull request Sep 23, 2023
pkrmf pushed a commit to morlockstudios-com/llama.cpp that referenced this pull request Sep 26, 2023
ggerganov (Owner)

@shibe2 Do you observe the issue with a model newly converted using the latest version of the code?

shibe2 (Collaborator) commented Sep 29, 2023

LlongOrca-7B-16k works with default parameters on 569550d. I converted it again just now; the output on a short prompt exactly matches that of my previous conversion. Information retrieval from a long text also works.
