[BUG-GGUF-001] GGUF outputs garbage - LAYOUT-001 regression #155

@noahgift

Summary

GGUF inference produces garbage output like "Domainuster random_ random random.Mult" instead of correct answers.

Evidence

apr trace hf://Qwen/Qwen2.5-Coder-0.5B-Instruct-GGUF/qwen2.5-coder-0.5b-instruct-q4_k_m.gguf --payload

Test prompt: "What is 2+2?"
Expected: "4" or "The answer is 4"
Actual: "Domainuster random_ random random.Mult"

Root Cause

LAYOUT-001: Column-major vs row-major kernel mismatch in quantized matmul.

GGUF and APR store weights in ROW-MAJOR layout, but some kernels assume COLUMN-MAJOR, so those kernels index the weight buffer incorrectly.
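
To illustrate the class of bug described above, here is a minimal sketch in plain `f32` rather than the real Q4K/Q6K block formats. The function names `matvec_row_major` and `matvec_col_major` are hypothetical and do not come from realizar or trueno. They only show how the same buffer gives different results when a kernel assumes the wrong layout.

```rust
/// Matrix-vector product that reads `w` as ROW-MAJOR:
/// element (r, c) lives at index `r * cols + c`.
fn matvec_row_major(w: &[f32], rows: usize, cols: usize, x: &[f32]) -> Vec<f32> {
    assert_eq!(w.len(), rows * cols);
    assert_eq!(x.len(), cols);
    (0..rows)
        .map(|r| (0..cols).map(|c| w[r * cols + c] * x[c]).sum::<f32>())
        .collect()
}

/// The same product, but the kernel assumes COLUMN-MAJOR:
/// element (r, c) is looked up at index `c * rows + r`.
/// Feeding it a row-major buffer reproduces the layout mismatch.
fn matvec_col_major(w: &[f32], rows: usize, cols: usize, x: &[f32]) -> Vec<f32> {
    assert_eq!(w.len(), rows * cols);
    assert_eq!(x.len(), cols);
    (0..rows)
        .map(|r| (0..cols).map(|c| w[c * rows + r] * x[c]).sum::<f32>())
        .collect()
}
```

For the 2×3 matrix [[1, 2, 3], [4, 5, 6]] stored row-major as `[1, 2, 3, 4, 5, 6]` and an input of all ones, the row-major kernel returns `[6, 15]`, while the column-major kernel returns `[9, 12]`. No error is raised; the numbers are simply wrong, which is consistent with a model that runs to completion but emits garbage tokens.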

Acceptance Criteria

  • apr trace --payload on GGUF shows correct "4" output
  • QA matrix CPU×GGUF passes (15/15 points)
  • QA matrix GPU×GGUF passes (15/15 points)
  • No regression in SafeTensors path

Files to Investigate

  • realizar/src/gguf_monolith.rs - forward_single_with_cache
  • realizar/src/quantize/ - Q4K/Q6K kernels
  • trueno/src/backends/q4k.rs - matmul kernels

Labels

bug, P0, LAYOUT-001, quantization
