The CodeQwen 1.5 model supports Fill-in-the-middle (https://github.com/QwenLM/CodeQwen1.5?tab=readme-ov-file#2-file-level-code-completion-fill-in-the-middle), so I was hoping to use the /infill API to leverage it. After #6689 was merged I expected it to work out of the box, but I guess the FIM tokens are only set correctly in the GGUF model files for Codellama and CodeGemma, not for CodeQwen?
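One way to check whether the FIM token metadata is present in the GGUF (assuming the gguf-py dump script handles this file) is something like:

python3 gguf-py/scripts/gguf-dump.py codeqwen-1_5-7b-chat-q3_k_m.gguf | grep token_id
# if FIM were wired up, I would expect keys such as
# tokenizer.ggml.prefix_token_id, tokenizer.ggml.suffix_token_id and tokenizer.ggml.middle_token_id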
I tested it with codeqwen-1_5-7b-chat-q3_k_m.gguf:
curl --location 'http://localhost:9090/infill' \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/json' \
  --data '{
    "prompt": "",
    "input_prefix": "public int gcd(int x, int y) {",
    "input_suffix": "\n}",
    "n_predict": 100,
    "stream": false
  }'
Which gave the following response:
{ "content": "WriteLine (\n '\n{\n \"id\": \"x\",\n \"name\": \"x\",\n \"description\": \"x\",\n \"version\": \"x\",\n \"author\": \"x\",\n \"license\": \"x\",\n \"type\": \"x\",\n \"main\": \"x\",\n \"dependencies\": [],\n \"devDependencies\": [],\n \"scripts\": {\n \"start\": \"node x", "id_slot": 0, "stop": true, "model": "/home/user/Downloads/codeqwen-1_5-7b-chat-q3_k_m.gguf", //... }
This looks like gibberish. I suppose llama.cpp can't find the FIM prefix, suffix, and middle tokens, so the assembled prompt doesn't make any sense?
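For context, my understanding is that /infill assembles the prompt from the model's FIM special tokens around the given prefix and suffix, so with CodeQwen's documented tokens the prompt should look roughly like this (the exact assembly in llama.cpp may differ):

<fim_prefix>public int gcd(int x, int y) {<fim_suffix>
}<fim_middle>

If those special tokens are missing from the GGUF metadata, the model presumably just sees an unstructured prompt, which would explain the output above.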
The same request with Codellama responds with a much more expected answer:
{ "content": "\n return (x % y == 0) ? y : gcd(y, x % y);\n }\n\n public static void main(String[] args) {\n int x = 30, y = 20;\n GCD gcd = new GCD();\n System.out.println(gcd.gcd(x, y));\n }\n}\n\n// 30\n\n// 20", "id_slot": 0, "stop": true, "model": "/home/user/.codegpt/models/gguf/codellama-7b-instruct.Q4_K_M.gguf", "tokens_predicted": 100, "tokens_evaluated": 18, //... }
Correct, you need to set the metadata yourself, using the updated script in #6778 or get a GGUF with the metadata already set, like this one. :)
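For example, something along these lines should do it (flag names are taken from #6778, so double-check the PR for the exact usage; CodeQwen's FIM tokens are assumed to be <fim_prefix>, <fim_suffix> and <fim_middle>):

python3 gguf-py/scripts/gguf-new-metadata.py \
  codeqwen-1_5-7b-chat-q3_k_m.gguf codeqwen-1_5-7b-chat-q3_k_m-fim.gguf \
  --special-token prefix '<fim_prefix>' \
  --special-token suffix '<fim_suffix>' \
  --special-token middle '<fim_middle>'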
see #7166