
/infill for CodeQwen #7102

Closed
PhilKes opened this issue May 6, 2024 · 2 comments


PhilKes commented May 6, 2024

The CodeQwen 1.5 model supports fill-in-the-middle (https://github.com/QwenLM/CodeQwen1.5?tab=readme-ov-file#2-file-level-code-completion-fill-in-the-middle), so I was hoping to use the /infill API to leverage it.
After #6689 was merged I expected it to work out of the box, but I guess the FIM tokens are only set correctly in the GGUF model files for CodeLlama and CodeGemma, not for CodeQwen?

I tested it with codeqwen-1_5-7b-chat-q3_k_m.gguf:

curl --location 'http://localhost:9090/infill' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data '{
    "prompt": "",
    "input_prefix": "public int gcd(int x, int y) {",
    "input_suffix": "\n}",
    "n_predict": 100,
    "stream": false
}'

Which gave the following response:

{
    "content": "WriteLine (\n        '\n{\n    \"id\": \"x\",\n    \"name\": \"x\",\n    \"description\": \"x\",\n    \"version\": \"x\",\n    \"author\": \"x\",\n    \"license\": \"x\",\n    \"type\": \"x\",\n    \"main\": \"x\",\n    \"dependencies\": [],\n    \"devDependencies\": [],\n    \"scripts\": {\n        \"start\": \"node x",
    "id_slot": 0,
    "stop": true,
    "model": "/home/user/Downloads/codeqwen-1_5-7b-chat-q3_k_m.gguf",
    //...
}

This looks like gibberish. I suppose llama.cpp can't find the FIM prefix, suffix, and middle tokens, so the assembled prompt doesn't make any sense to the model?

The same request with CodeLlama responds with a much more sensible answer:

{
    "content": "\n        return (x % y == 0) ? y : gcd(y, x % y);\n    }\n\n    public static void main(String[] args) {\n        int x = 30, y = 20;\n        GCD gcd = new GCD();\n        System.out.println(gcd.gcd(x, y));\n    }\n}\n\n// 30\n\n// 20",
    "id_slot": 0,
    "stop": true,
    "model": "/home/user/.codegpt/models/gguf/codellama-7b-instruct.Q4_K_M.gguf",
    "tokens_predicted": 100,
    "tokens_evaluated": 18,
    //...
}
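For context on why the missing tokens matter: the /infill endpoint has to wrap the prefix and suffix in the model's special FIM markers, whose ids llama.cpp reads from GGUF metadata. Below is a minimal sketch (not llama.cpp's actual code) of how such a prompt is assembled, assuming CodeQwen's `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` marker strings from the QwenLM README; if the metadata keys are absent, these markers are never inserted and the model sees an unstructured prompt.

```python
# Sketch of FIM ("fill-in-the-middle") prompt assembly.
# Marker strings are CodeQwen 1.5's FIM tokens (assumption based on the
# QwenLM README); other models use different markers, which is why the
# ids must come from per-model GGUF metadata.

FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_infill_prompt(input_prefix: str, input_suffix: str) -> str:
    # PSM ordering: the model generates the code that belongs between
    # prefix and suffix after seeing the middle marker.
    return f"{FIM_PREFIX}{input_prefix}{FIM_SUFFIX}{input_suffix}{FIM_MIDDLE}"

prompt = build_infill_prompt("public int gcd(int x, int y) {", "\n}")
print(prompt)
```

Without the markers, the request above degenerates to roughly `"public int gcd(int x, int y) {\n}"` with no signal that infilling is wanted, which matches the garbage output seen with CodeQwen.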

CISC commented May 6, 2024

Correct, you need to set the metadata yourself, using the updated script in #6778 or get a GGUF with the metadata already set, like this one. :)
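For readers wanting to try this themselves: llama.cpp's gguf-py package ships scripts for dumping and editing GGUF metadata. A command sketch (the script paths may differ by version, and `<PREFIX_ID>`/`<SUFFIX_ID>`/`<MIDDLE_ID>` are placeholders; look up the actual ids of `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` in your model's vocabulary first):

```shell
# Dump existing metadata to find the FIM token ids in the vocabulary.
python gguf-py/scripts/gguf-dump.py codeqwen-1_5-7b-chat-q3_k_m.gguf | grep -i token

# Set the FIM metadata keys llama.cpp reads for /infill.
python gguf-py/scripts/gguf-set-metadata.py codeqwen-1_5-7b-chat-q3_k_m.gguf \
    tokenizer.ggml.prefix_token_id <PREFIX_ID>
python gguf-py/scripts/gguf-set-metadata.py codeqwen-1_5-7b-chat-q3_k_m.gguf \
    tokenizer.ggml.suffix_token_id <SUFFIX_ID>
python gguf-py/scripts/gguf-set-metadata.py codeqwen-1_5-7b-chat-q3_k_m.gguf \
    tokenizer.ggml.middle_token_id <MIDDLE_ID>
```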


PhilKes commented May 29, 2024

see #7166

@PhilKes PhilKes closed this as completed May 29, 2024