Add Q3_K_M weight dequant support #5
Open
Labels
good first issue, help wanted
Description
The GGUF weight loader (src/engine/tq_gguf_quants.c) handles Q2_K through Q6_K dequantization, but Q3_K_M (3-bit with medium grouping) needs testing and verification against the llama.cpp reference implementation.
What to do:
- Verify that the `dequantize_q3_K` path correctly handles the Q3_K_M nibble ordering and scale layout.
- Compare output against `refs/llama.cpp/ggml-quants.c` to ensure bit-level compatibility.
- Add a unit test in `tests/` that roundtrips a known vector through Q3_K_M and checks MSE.
Files to touch: src/engine/tq_gguf_quants.c, tests/ (new test file or add to existing).
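To make the requested test shape concrete, here is a minimal sketch of a roundtrip-and-MSE check plus a bit-exact comparison helper. Everything in it is an assumption for illustration: `toy_quantize_3bit`, `toy_dequantize_3bit`, `first_bit_mismatch`, the 16-element group size, and the MSE bound are hypothetical stand-ins, and the toy quantizer does not implement the real Q3_K block layout. An actual test would presumably call the project's own `dequantize_q3_K` path and compare against the llama.cpp reference instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_GROUP 16 /* hypothetical group size, NOT the real Q3_K super-block */
#define TOY_N     64 /* length of the known test vector */

/* Mean squared error between two float vectors of length n. */
static double mse(const float *a, const float *b, size_t n) {
    double acc = 0.0;
    for (size_t i = 0; i < n; i++) {
        double d = (double)a[i] - (double)b[i];
        acc += d * d;
    }
    return n ? acc / (double)n : 0.0;
}

/* Index of the first element whose bit pattern differs, or n if all match.
 * Useful when comparing against a reference for bit-level compatibility. */
static size_t first_bit_mismatch(const float *a, const float *b, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (memcmp(&a[i], &b[i], sizeof(float)) != 0) return i;
    return n;
}

/* Stand-in quantizer: signed 3-bit values in [-4, 3], one float scale per
 * group. This is an illustrative placeholder, not the Q3_K format. */
static void toy_quantize_3bit(const float *x, int8_t *q, float *scales, size_t n) {
    for (size_t g = 0; g < n; g += TOY_GROUP) {
        size_t end = (g + TOY_GROUP < n) ? g + TOY_GROUP : n;
        float amax = 0.0f;
        for (size_t i = g; i < end; i++) {
            float ax = x[i] < 0.0f ? -x[i] : x[i];
            if (ax > amax) amax = ax;
        }
        float d = amax / 4.0f; /* scale so that -amax maps to -4 */
        scales[g / TOY_GROUP] = d;
        for (size_t i = g; i < end; i++) {
            float r = (d > 0.0f) ? x[i] / d : 0.0f;
            int v = (int)(r >= 0.0f ? r + 0.5f : r - 0.5f); /* round to nearest */
            if (v > 3) v = 3;   /* clamp to the 3-bit signed range */
            if (v < -4) v = -4;
            q[i] = (int8_t)v;
        }
    }
}

/* Inverse of the stand-in quantizer: value = group scale * 3-bit code. */
static void toy_dequantize_3bit(const int8_t *q, const float *scales, float *y, size_t n) {
    for (size_t i = 0; i < n; i++)
        y[i] = scales[i / TOY_GROUP] * (float)q[i];
}

/* Roundtrip a deterministic vector with values in [-1, 1) and return the MSE. */
static double toy_roundtrip_mse(void) {
    float x[TOY_N], y[TOY_N], scales[TOY_N / TOY_GROUP];
    int8_t q[TOY_N];
    for (int i = 0; i < TOY_N; i++)
        x[i] = (float)((i * 37) % 64 - 32) / 32.0f;
    toy_quantize_3bit(x, q, scales, TOY_N);
    toy_dequantize_3bit(q, scales, y, TOY_N);
    return mse(x, y, TOY_N);
}

/* Per-element error is at most one scale step (amax/4 <= 0.25), so the MSE
 * of this toy scheme must stay below 0.25^2 = 0.0625. */
static void test_toy_roundtrip(void) {
    assert(toy_roundtrip_mse() < 0.0625);
}
```

The bound in `test_toy_roundtrip` follows from the toy scheme's step size and would not carry over to Q3_K; a real test would need a threshold justified by the actual format or by the reference output. For the bit-level comparison, `first_bit_mismatch` returning `n` would indicate that two dequantized buffers agree exactly.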