
/infill for CodeQwen #7102

Closed
PhilKes opened this issue May 6, 2024 · 2 comments


PhilKes commented May 6, 2024

The CodeQwen 1.5 model supports fill-in-the-middle (https://github.com/QwenLM/CodeQwen1.5?tab=readme-ov-file#2-file-level-code-completion-fill-in-the-middle), so I was hoping to use the /infill API to leverage it.
After #6689 was merged I expected it to work out of the box, but I guess the FIM tokens are only set correctly in the GGUF model files for CodeLlama and CodeGemma, not for CodeQwen?

I tested it with codeqwen-1_5-7b-chat-q3_k_m.gguf:

curl --location 'http://localhost:9090/infill' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data '{
    "prompt": "",
    "input_prefix": "public int gcd(int x, int y) {",
    "input_suffix": "\n}",
    "n_predict": 100,
    "stream": false
}'

Which gave the following response:

{
    "content": "WriteLine (\n        '\n{\n    \"id\": \"x\",\n    \"name\": \"x\",\n    \"description\": \"x\",\n    \"version\": \"x\",\n    \"author\": \"x\",\n    \"license\": \"x\",\n    \"type\": \"x\",\n    \"main\": \"x\",\n    \"dependencies\": [],\n    \"devDependencies\": [],\n    \"scripts\": {\n        \"start\": \"node x",
    "id_slot": 0,
    "stop": true,
    "model": "/home/user/Downloads/codeqwen-1_5-7b-chat-q3_k_m.gguf",
    //...
}

This looks like gibberish. I suppose llama.cpp can't find the FIM prefix, suffix, and middle tokens, so the assembled prompt doesn't make any sense to the model?

The same request with CodeLlama responds with a much more sensible answer:

{
    "content": "\n        return (x % y == 0) ? y : gcd(y, x % y);\n    }\n\n    public static void main(String[] args) {\n        int x = 30, y = 20;\n        GCD gcd = new GCD();\n        System.out.println(gcd.gcd(x, y));\n    }\n}\n\n// 30\n\n// 20",
    "id_slot": 0,
    "stop": true,
    "model": "/home/user/.codegpt/models/gguf/codellama-7b-instruct.Q4_K_M.gguf",
    "tokens_predicted": 100,
    "tokens_evaluated": 18,
    //...
}
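For context on why the missing tokens matter: the /infill endpoint has to wrap the prefix and suffix in the model's special FIM markers, whose ids llama.cpp reads from GGUF metadata. Below is a minimal sketch (not llama.cpp's actual code) of how such a prompt is assembled, assuming CodeQwen's `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` marker strings from the QwenLM README; if the metadata keys are absent, these markers are never inserted and the model sees an unstructured prompt.

```python
# Sketch of FIM ("fill-in-the-middle") prompt assembly.
# Marker strings are CodeQwen 1.5's FIM tokens (assumption based on the
# QwenLM README); other models use different markers, which is why the
# ids must come from per-model GGUF metadata.

FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_infill_prompt(input_prefix: str, input_suffix: str) -> str:
    # PSM ordering: the model generates the code that belongs between
    # prefix and suffix after seeing the middle marker.
    return f"{FIM_PREFIX}{input_prefix}{FIM_SUFFIX}{input_suffix}{FIM_MIDDLE}"

prompt = build_infill_prompt("public int gcd(int x, int y) {", "\n}")
print(prompt)
```

Without the markers, the request above degenerates to roughly `"public int gcd(int x, int y) {\n}"` with no signal that infilling is wanted, which matches the garbage output seen with CodeQwen.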

CISC commented May 6, 2024

Correct, you need to set the metadata yourself, using the updated script in #6778 or get a GGUF with the metadata already set, like this one. :)
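For readers wanting to try this themselves: llama.cpp's gguf-py package ships scripts for dumping and editing GGUF metadata. A command sketch (the script paths may differ by version, and `<PREFIX_ID>`/`<SUFFIX_ID>`/`<MIDDLE_ID>` are placeholders; look up the actual ids of `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` in your model's vocabulary first):

```shell
# Dump existing metadata to find the FIM token ids in the vocabulary.
python gguf-py/scripts/gguf-dump.py codeqwen-1_5-7b-chat-q3_k_m.gguf | grep -i token

# Set the FIM metadata keys llama.cpp reads for /infill.
python gguf-py/scripts/gguf-set-metadata.py codeqwen-1_5-7b-chat-q3_k_m.gguf \
    tokenizer.ggml.prefix_token_id <PREFIX_ID>
python gguf-py/scripts/gguf-set-metadata.py codeqwen-1_5-7b-chat-q3_k_m.gguf \
    tokenizer.ggml.suffix_token_id <SUFFIX_ID>
python gguf-py/scripts/gguf-set-metadata.py codeqwen-1_5-7b-chat-q3_k_m.gguf \
    tokenizer.ggml.middle_token_id <MIDDLE_ID>
```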


PhilKes commented May 29, 2024

see #7166

@PhilKes PhilKes closed this as completed May 29, 2024