
fix(export): infer GGUF metadata (num_heads/etc.) so apr->gguf works on metadata-light .apr (PMAT-920)#2212

Merged
noahgift merged 2 commits into main from beat/apr-gguf-export-metadata on Jun 24, 2026


@noahgift


Bug (EV#4 — user-facing export correctness)

apr export --format gguf model.apr -o out.gguf hard-fails with a
C-07 num_heads required for GGUF export (missing in APR metadata) error
on any .apr whose metadata block lacks an explicit num_heads (or any other
GGUF-required dim). A user who trained a model, or produced a .apr via
apr convert, without a fully populated header could not export it to GGUF
for llama.cpp / ollama at all, even though the tensor shapes carry those
dimensions unambiguously.

Fix

In the APR→GGUF raw passthrough path (export_apr_to_gguf_raw), add
infer_missing_gguf_dims_from_shapes, which reuses the existing
shape-inference engine (infer_model_config_from_tensors) to derive the
missing GGUF-required dimensions from tensor shapes before the C-07 error
fires:

  • num_heads / num_kv_heads from q/k/v projection shapes (via head_dim)
  • hidden_size / vocab_size from the embedding shape
  • intermediate_size from the FFN gate/up shapes

Explicit APR metadata always wins; inference only fills None fields.
Shapes are interpreted row-major (LAYOUT-001), consistent with import.
(The non-raw export_to_gguf path already inferred via
resolve_gguf_config; this closes the gap in the raw passthrough path,
which still hard-failed.)
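
The "explicit metadata wins; inference only fills None fields" rule can be sketched as a field-by-field merge. This is a minimal illustration only: `GgufDims` and `fill_missing` are hypothetical names introduced here, not the types or functions used in aprender.

```rust
/// Hypothetical stand-in for the GGUF-required dims (not the real aprender type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct GgufDims {
    pub num_heads: Option<usize>,
    pub num_kv_heads: Option<usize>,
    pub hidden_size: Option<usize>,
    pub vocab_size: Option<usize>,
    pub intermediate_size: Option<usize>,
}

/// Keep every explicit value; take the inferred value only where the
/// explicit field is `None`. `Option::or` expresses exactly that.
pub fn fill_missing(explicit: GgufDims, inferred: GgufDims) -> GgufDims {
    GgufDims {
        num_heads: explicit.num_heads.or(inferred.num_heads),
        num_kv_heads: explicit.num_kv_heads.or(inferred.num_kv_heads),
        hidden_size: explicit.hidden_size.or(inferred.hidden_size),
        vocab_size: explicit.vocab_size.or(inferred.vocab_size),
        intermediate_size: explicit.intermediate_size.or(inferred.intermediate_size),
    }
}
```

With this shape, an explicit `num_heads` of 12 survives even if inference proposes 24, and a field that neither side provides stays `None` so the caller can still raise the C-07 error.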

Falsifier (RED→GREEN, mutation-verified)

FALSIFY-EXPORT-INFER-METADATA-001
(ft_apr_gguf_export_infers_heads_when_metadata_light): a metadata-light
.apr (num_heads=None; shapes imply 2 heads / 1 kv head / hidden 128)
now exports to GGUF and the produced GGUF re-reads with the correct
attention.head_count (2) / head_count_kv (1) / embedding_length (128).

  • RED on the unfixed exporter (C-07 hard-fail).
  • GREEN on the fix.
  • Mutation-verified: reverting the inference call drives the falsifier RED.

Contract

contracts/apr-export-num-layers-v1.yaml bumped to 1.1.0 — new equation
attn_dim_inference, a proof obligation, and the single-line falsifier
ref. pv validate + pv lint contracts/ pass (0 errors).

Tests

  • cargo test -p aprender-core --lib — 14000 pass
  • cargo test -p apr-cli --lib — 5993 pass
  • all existing export_apr_to_gguf_raw tests still green

🤖 Generated with Claude Code

…on metadata-light .apr (PMAT-920)

`apr export --format gguf model.apr` hard-failed with a C-07
"num_heads required for GGUF export" error on any .apr whose metadata
block lacked explicit num_heads / hidden_size / vocab_size /
intermediate_size — e.g. a model produced by training or `apr convert`
without a fully-populated header. Users could not export such .apr files
to GGUF for llama.cpp / ollama at all, even though the tensor shapes
carry these dimensions unambiguously.

Fix: in the APR->GGUF raw passthrough path (export_apr_to_gguf_raw), add
infer_missing_gguf_dims_from_shapes, which reuses the existing
shape-inference engine (infer_model_config_from_tensors) to derive the
missing GGUF-required dimensions from tensor shapes before the C-07
error fires:
  - num_heads / num_kv_heads from q/k/v projection shapes (head_dim)
  - hidden_size / vocab_size from the embedding shape
  - intermediate_size from the FFN gate/up shapes
Explicit APR metadata always wins; inference only fills None fields.
Shapes are interpreted row-major (LAYOUT-001), consistent with import.
(The non-raw export_to_gguf path already infers via resolve_gguf_config;
this closes the gap in the raw passthrough path.)
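
As a rough sketch of reading dims straight off row-major shapes: the helper names below are hypothetical, and the layouts `[vocab_size, hidden_size]` for the embedding and `[intermediate_size, hidden_size]` for the FFN gate projection are assumptions made for illustration, not something this commit states.

```rust
/// Assumes a row-major embedding of shape [vocab_size, hidden_size].
/// Returns (vocab_size, hidden_size), or None if the tensor is not 2-D.
pub fn dims_from_embedding(shape: &[usize]) -> Option<(usize, usize)> {
    match shape {
        [vocab, hidden] => Some((*vocab, *hidden)),
        _ => None,
    }
}

/// Assumes a row-major gate projection of shape [intermediate_size, hidden_size].
/// The second dim is cross-checked against the hidden size taken from the
/// embedding, so an unexpected layout yields None rather than a wrong value.
pub fn intermediate_from_gate(shape: &[usize], hidden_size: usize) -> Option<usize> {
    match shape {
        [intermediate, hidden] if *hidden == hidden_size => Some(*intermediate),
        _ => None,
    }
}
```

Returning `None` on any shape that does not match leaves the field unfilled, so the existing hard-fail still fires instead of a mis-stamped value being written.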

Falsifier FALSIFY-EXPORT-INFER-METADATA-001
(ft_apr_gguf_export_infers_heads_when_metadata_light): a metadata-light
.apr (num_heads=None, shapes imply 2 heads / 1 kv head / hidden 128) now
exports to GGUF AND the produced GGUF re-reads with the correct
attention.head_count / head_count_kv / embedding_length. RED on the
unfixed exporter (C-07 hard-fail), GREEN on the fix; mutation-verified by
reverting the inference call (falsifier goes RED).

Contract apr-export-num-layers-v1 bumped to 1.1.0: new equation
attn_dim_inference, a proof obligation, and the single-line falsifier
ref. pv validate + pv lint contracts/ pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge June 24, 2026 08:39
@noahgift noahgift disabled auto-merge June 24, 2026 08:44
…r guess (PMAT-920)

The first cut of PMAT-920 inferred num_heads for apr->gguf export via
infer_head_counts, which GUESSES head_dim from a hardcoded
[64, 128, 96, 80] list and takes the first divisor of q_dim. That
SILENTLY mis-stamps real models: Qwen2-1.5B (q_dim=1536, head_dim=128,
12 heads) gets 1536/64 = 24 heads written into a valid-looking GGUF with
no error. A wrong-but-valid head count is worse than the original honest
hard-fail.

Root cause: num_heads is NOT inferable from shapes alone —
q_dim = num_heads x head_dim has no unique factorization without
head_dim (1536 = 12x128 = 24x64 = 16x96, all valid).
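
The ambiguity is easy to reproduce. The sketch below mimics a first-divisor guess over the [64, 128, 96, 80] list described above; the function names are illustrative and this is not the actual infer_head_counts code.

```rust
/// Candidate head_dim values, in the order quoted in this commit message.
const CANDIDATE_HEAD_DIMS: [usize; 4] = [64, 128, 96, 80];

/// Mimics the guess: take the first candidate head_dim that divides q_dim
/// and report q_dim / head_dim as the head count.
pub fn first_divisor_guess(q_dim: usize) -> Option<usize> {
    CANDIDATE_HEAD_DIMS
        .iter()
        .find(|&&hd| q_dim % hd == 0)
        .map(|&hd| q_dim / hd)
}

/// Every (num_heads, head_dim) pair from the candidate list that is
/// consistent with q_dim. More than one entry means the shape alone
/// cannot decide.
pub fn all_factorizations(q_dim: usize) -> Vec<(usize, usize)> {
    CANDIDATE_HEAD_DIMS
        .iter()
        .filter(|&&hd| q_dim % hd == 0)
        .map(|&hd| (q_dim / hd, hd))
        .collect()
}
```

For q_dim = 1536, `first_divisor_guess` returns 24 while `all_factorizations` lists (24, 64), (12, 128) and (16, 96): three shape-consistent answers, of which only 12 x 128 matches Qwen2-1.5B.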

Fix (export_apr_to_gguf_raw / infer_missing_gguf_dims_from_shapes):
  - num_heads / num_kv_heads: derived ONLY from an EXPLICIT head_dim in
    the APR header (AprV2Metadata.head_dim), num_heads = q_dim/head_dim,
    num_kv_heads = kv_dim/head_dim — EXACT and sound
    (fill_head_counts_from_explicit_head_dim). Explicit num_heads always
    wins. The [64,128,96,80] guess is no longer consulted on the export
    path (infer_head_counts stays for the import fallback only).
  - When head_dim AND num_heads are BOTH absent → actionable hard-fail
    (missing_num_heads_err) naming the missing dim and a working remedy
    (stamp head_dim/num_heads, or re-convert from the source config). No
    GGUF is written.
  - hidden_size / vocab_size / intermediate_size stay inferred from
    embedding/FFN shapes (these ARE unambiguous).
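
A minimal sketch of the resolution order described in the bullets above, under stated assumptions: `HeadCounts` and `resolve_head_counts` are hypothetical names, the error strings are illustrative, and the real fill_head_counts_from_explicit_head_dim / missing_num_heads_err may differ in signature and wording.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct HeadCounts {
    pub num_heads: usize,
    pub num_kv_heads: usize,
}

/// Explicit counts win; otherwise divide by an explicit head_dim;
/// otherwise fail with a message naming the missing dim. Never guesses.
pub fn resolve_head_counts(
    explicit_num_heads: Option<usize>,
    explicit_num_kv_heads: Option<usize>,
    head_dim: Option<usize>,
    q_dim: usize,
    kv_dim: usize,
) -> Result<HeadCounts, String> {
    // Exact derivation, only when head_dim is explicit and divides both dims.
    let derived = match head_dim {
        Some(hd) if hd == 0 || q_dim % hd != 0 || kv_dim % hd != 0 => {
            return Err(format!(
                "head_dim {hd} does not evenly divide q_dim {q_dim} / kv_dim {kv_dim}"
            ));
        }
        Some(hd) => Some((q_dim / hd, kv_dim / hd)),
        None => None,
    };

    let num_heads = explicit_num_heads
        .or(derived.map(|(n, _)| n))
        .ok_or_else(|| {
            "num_heads required for GGUF export: stamp head_dim or num_heads \
             in the APR header, or re-convert from the source config"
                .to_string()
        })?;

    let num_kv_heads = explicit_num_kv_heads
        .or(derived.map(|(_, kv)| kv))
        .ok_or_else(|| {
            "num_kv_heads required for GGUF export: stamp head_dim or \
             num_kv_heads in the APR header"
                .to_string()
        })?;

    Ok(HeadCounts { num_heads, num_kv_heads })
}
```

In this sketch, head_dim = 128 with q_dim = 256 and kv_dim = 128 resolves to 2 heads and 1 kv head, and omitting both head_dim and num_heads returns an `Err` before anything would be written.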

Falsifiers (both directions, mutation-verified):
  FALSIFY-EXPORT-INFER-METADATA-001
  (ft_apr_gguf_export_uses_explicit_head_dim_exactly_not_guess):
    metadata-light .apr with explicit head_dim=128, q_dim=256 → GGUF
    head_count == 2 EXACTLY (the old guess would say 256/64 = 4).
  FALSIFY-EXPORT-INFER-METADATA-002
  (ft_apr_gguf_export_hard_fails_when_head_dim_and_num_heads_absent):
    head_dim AND num_heads absent → Err naming num_heads + remedy, NO
    GGUF written. Reverting to the guess flips BOTH falsifiers RED.

Contract apr-export-num-layers-v1 bumped to 1.2.0: drops the
"q_dim/head_dim always inferable" overclaim, states num_heads requires
explicit head_dim (else honest error), adds both falsifier refs
(single-line). pv validate + pv lint contracts/ pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge June 24, 2026 09:09
@noahgift noahgift added this pull request to the merge queue Jun 24, 2026
Merged via the queue into main with commit 45a2ce2 Jun 24, 2026
10 checks passed
@noahgift noahgift deleted the beat/apr-gguf-export-metadata branch June 24, 2026 10:03