Description
Hi! I'm trying to run the Q4_K_M quantization of Meta-Llama-3-8B-Instruct on my Mac (M2 Pro, 16GB VRAM) using llama-cpp-python, with the following test code:
```python
from llama_cpp import Llama

llm4 = Llama(model_path="/path/to/model/Q4_K_M.gguf", chat_format="chatml")

response = llm4.create_chat_completion(
    messages=[
        {
            "role": "system",
            "content": "You are a helpful dietologist.",
        },
        {
            "role": "user",
            "content": "Can I eat oranges after 7 pm?",
        },
    ],
    response_format={
        "type": "json_object",
    },
    temperature=0.7,
)
print(response)
```
However, the returned content is consistently empty (just an empty JSON object):
```
{'id': 'chatcmpl-d6b4c8ae-0f0a-4112-bb32-3c567f383d13', 'object': 'chat.completion', 'created': 1724142021, 'model': 'path/to/model/Q4_K_M.gguf', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': '{} '}, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 51, 'completion_tokens': 2, 'total_tokens': 53}}
```
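In case it matters, this is roughly how I'm reading the assistant's reply out of that dictionary (the key names are exactly the ones in the output above):

```python
# Extract the assistant's reply from the chat-completion dict printed above.
content = response["choices"][0]["message"]["content"]
print(repr(content))  # consistently just '{} ' -- an empty JSON object
```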
Everything works fine when I run the same model with llama-cli in the terminal (a rough version of the command is included below). I've also reinstalled llama-cpp-python and rebuilt llama.cpp as per the instructions, but it didn't help. The same thing happens with the Q8 and F16 quantizations (F16 gives an insufficient-memory error when run through llama-cli, but still an empty response through llama-cpp-python). Is there anything obvious I may be missing here?
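For reference, the terminal run is along these lines (illustrative; the exact flags I pass may differ slightly), and it produces a normal, non-empty answer:

```bash
# Rough shape of the llama-cli invocation that works (illustrative, not the exact command).
./llama-cli -m /path/to/model/Q4_K_M.gguf \
    -p "Can I eat oranges after 7 pm?" \
    -n 256
```

So the model file itself seems fine; only the llama-cpp-python path gives the empty reply.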