Add Q3_K_M weight dequant support #5

@unamedkr

The GGUF weight loader (src/engine/tq_gguf_quants.c) handles Q2_K through Q6_K dequantization, but Q3_K_M (3-bit with medium grouping) needs testing and verification against the llama.cpp reference implementation.

What to do:

  • Verify that the dequantize_q3_K path correctly handles the Q3_K_M nibble ordering and scale layout.
  • Compare output against refs/llama.cpp/ggml-quants.c to ensure bit-level compatibility.
  • Add a unit test in tests/ that roundtrips a known vector through Q3_K_M and checks MSE.

Files to touch: src/engine/tq_gguf_quants.c and tests/ (either a new test file or additions to an existing one).
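
A rough sketch of what the roundtrip/MSE test could look like. This issue does not show the signature of dequantize_q3_K or any quantizer in this repo, so `toy_quantize3` and `toy_dequantize3` below are hypothetical stand-ins that use one scale per 256-element block. They are not the Q3_K bit layout. The input ramp and the 0.01 tolerance are also illustrative assumptions, and only the harness shape (known vector in, dequantized vector out, MSE against a threshold) is meant to carry over.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK 256      /* assumed block length for this sketch */
#define MSE_TOL 0.01   /* illustrative tolerance, not a project value */

/* Mean squared error between two float vectors of length n. */
double mse(const float *a, const float *b, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)a[i] - (double)b[i];
        acc += d * d;
    }
    return acc / (double)n;
}

/* Hypothetical stand-in: signed 3-bit codes in [-4, 3] with a single
 * per-block scale. Returns the scale. NOT the Q3_K layout. */
float toy_quantize3(const float *x, int8_t *q, size_t n) {
    float amax = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    float scale = amax / 4.0f;
    float inv = scale > 0.0f ? 1.0f / scale : 0.0f;
    for (size_t i = 0; i < n; i++) {
        float t = x[i] * inv;
        int v = (int)(t >= 0.0f ? t + 0.5f : t - 0.5f); /* round to nearest */
        if (v < -4) v = -4;
        if (v > 3) v = 3;                               /* clamp to 3 bits */
        q[i] = (int8_t)v;
    }
    return scale;
}

/* Hypothetical stand-in for the dequant path under test. */
void toy_dequantize3(const int8_t *q, float scale, float *y, size_t n) {
    for (size_t i = 0; i < n; i++) y[i] = scale * (float)q[i];
}

/* Roundtrip a deterministic ramp over [-1, 1) and return the MSE. */
double roundtrip_mse(void) {
    float x[BLOCK], y[BLOCK];
    int8_t q[BLOCK];
    for (int i = 0; i < BLOCK; i++) x[i] = (float)(i - 128) / 128.0f;
    float scale = toy_quantize3(x, q, BLOCK);
    toy_dequantize3(q, scale, y, BLOCK);
    return mse(x, y, BLOCK);
}

/* Test entry point: fails via assert if the error exceeds the tolerance. */
void test_roundtrip_mse(void) {
    assert(roundtrip_mse() < MSE_TOL);
}
```

For the real test, the stand-ins would be replaced by the loader's dequant path, and the same input block would also be run through the reference routine in refs/llama.cpp/ggml-quants.c so the two outputs can be compared element by element in addition to the MSE check.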
