fix(export): infer GGUF metadata (num_heads/etc.) so apr->gguf works on metadata-light .apr (PMAT-920) by noahgift · Pull Request #2212 · paiml/aprender

noahgift · 2026-06-24T08:38:54Z

Bug (EV#4 — user-facing export correctness)

`apr export --format gguf model.apr -o out.gguf` hard-fails with a
C-07 "num_heads required for GGUF export (missing in APR metadata)" error
on any .apr whose metadata block lacks explicit num_heads (or other
GGUF-required dims). A user who trained a .apr, or produced one with
`apr convert`, without a fully-populated header could not export it to
GGUF for llama.cpp / ollama at all, even though the tensor shapes carry
those dimensions unambiguously.

Fix

In the APR→GGUF raw passthrough path (export_apr_to_gguf_raw), add
infer_missing_gguf_dims_from_shapes, which reuses the existing
shape-inference engine (infer_model_config_from_tensors) to derive the
missing GGUF-required dimensions from tensor shapes before the C-07 error
fires:

- num_heads / num_kv_heads from q/k/v projection shapes (via head_dim)
- hidden_size / vocab_size from the embedding shape
- intermediate_size from the FFN gate/up shapes

Explicit APR metadata always wins; inference only fills None fields.
Shapes are interpreted row-major (LAYOUT-001), consistent with import.
(The non-raw export_to_gguf path already inferred via
resolve_gguf_config; this closes the gap in the raw passthrough path,
which still hard-failed.)
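
A minimal sketch of the "explicit metadata wins; inference only fills None fields" rule for the shape-derived dimensions. The `Dims` and `Shapes` types and the `fill_missing_from_shapes` function are hypothetical stand-ins, not the real aprender API, and the `[vocab_size, hidden_size]` / `[intermediate_size, hidden_size]` row-major shape orders are assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for the GGUF-required dimensions
/// carried in an APR header (NOT the real aprender metadata type).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Dims {
    pub hidden_size: Option<usize>,
    pub vocab_size: Option<usize>,
    pub intermediate_size: Option<usize>,
}

/// Hypothetical view of the tensor shapes relevant to inference.
/// Shapes are row-major `[rows, cols]`; the orders below are assumptions.
pub struct Shapes {
    /// Assumed `[vocab_size, hidden_size]`.
    pub embedding: Option<[usize; 2]>,
    /// Assumed `[intermediate_size, hidden_size]`.
    pub ffn_gate: Option<[usize; 2]>,
}

/// Fill only the fields that are `None`; an explicit value is never overwritten.
pub fn fill_missing_from_shapes(explicit: &Dims, shapes: &Shapes) -> Dims {
    Dims {
        vocab_size: explicit.vocab_size.or(shapes.embedding.map(|s| s[0])),
        hidden_size: explicit.hidden_size.or(shapes.embedding.map(|s| s[1])),
        intermediate_size: explicit
            .intermediate_size
            .or(shapes.ffn_gate.map(|s| s[0])),
    }
}
```

`Option::or` keeps the explicit value when it is `Some` and only falls back to the shape-derived value otherwise, which is the precedence the PR describes.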

Falsifier (RED→GREEN, mutation-verified)

FALSIFY-EXPORT-INFER-METADATA-001
(ft_apr_gguf_export_infers_heads_when_metadata_light): a metadata-light
.apr (num_heads=None; shapes imply 2 heads / 1 kv head / hidden 128)
now exports to GGUF and the produced GGUF re-reads with the correct
attention.head_count (2) / head_count_kv (1) / embedding_length (128).

- RED on the unfixed exporter (C-07 hard-fail).
- GREEN on the fix.
- Mutation-verified: reverting the inference call drives the falsifier RED.

Contract

contracts/apr-export-num-layers-v1.yaml bumped to 1.1.0 — new equation
attn_dim_inference, a proof obligation, and the single-line falsifier
ref. pv validate + pv lint contracts/ pass (0 errors).

Tests

- cargo test -p aprender-core --lib: 14000 pass
- cargo test -p apr-cli --lib: 5993 pass
- all existing export_apr_to_gguf_raw tests still green

🤖 Generated with Claude Code

…on metadata-light .apr (PMAT-920)

`apr export --format gguf model.apr` hard-failed with a C-07 "num_heads required for GGUF export" error on any .apr whose metadata block lacked explicit num_heads / hidden_size / vocab_size / intermediate_size, e.g. a model produced by training or `apr convert` without a fully-populated header. Users could not export such .apr files to GGUF for llama.cpp / ollama at all, even though the tensor shapes carry these dimensions unambiguously.

Fix: in the APR->GGUF raw passthrough path (export_apr_to_gguf_raw), add infer_missing_gguf_dims_from_shapes, which reuses the existing shape-inference engine (infer_model_config_from_tensors) to derive the missing GGUF-required dimensions from tensor shapes before the C-07 error fires:
- num_heads / num_kv_heads from q/k/v projection shapes (head_dim)
- hidden_size / vocab_size from the embedding shape
- intermediate_size from the FFN gate/up shapes

Explicit APR metadata always wins; inference only fills None fields. Shapes are interpreted row-major (LAYOUT-001), consistent with import. (The non-raw export_to_gguf path already infers via resolve_gguf_config; this closes the gap in the raw passthrough path.)

Falsifier FALSIFY-EXPORT-INFER-METADATA-001 (ft_apr_gguf_export_infers_heads_when_metadata_light): a metadata-light .apr (num_heads=None, shapes imply 2 heads / 1 kv head / hidden 128) now exports to GGUF AND the produced GGUF re-reads with the correct attention.head_count / head_count_kv / embedding_length. RED on the unfixed exporter (C-07 hard-fail), GREEN on the fix; mutation-verified by reverting the inference call (falsifier goes RED).

Contract apr-export-num-layers-v1 bumped to 1.1.0: new equation attn_dim_inference, a proof obligation, and the single-line falsifier ref. pv validate + pv lint contracts/ pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…r guess (PMAT-920)

The first cut of PMAT-920 inferred num_heads for apr->gguf export via infer_head_counts, which GUESSES head_dim from a hardcoded [64, 128, 96, 80] list and takes the first divisor of q_dim. That SILENTLY mis-stamps real models: Qwen2-1.5B (q_dim=1536, head_dim=128, 12 heads) gets 1536/64 = 24 heads written into a valid-looking GGUF with no error. A wrong-but-valid head count is worse than the original honest hard-fail.

Root cause: num_heads is NOT inferable from shapes alone. q_dim = num_heads x head_dim has no unique factorization without head_dim (1536 = 12x128 = 24x64 = 16x96, all valid).

Fix (export_apr_to_gguf_raw / infer_missing_gguf_dims_from_shapes):
- num_heads / num_kv_heads: derived ONLY from an EXPLICIT head_dim in the APR header (AprV2Metadata.head_dim), num_heads = q_dim/head_dim, num_kv_heads = kv_dim/head_dim, which is EXACT and sound (fill_head_counts_from_explicit_head_dim). Explicit num_heads always wins. The [64,128,96,80] guess is no longer consulted on the export path (infer_head_counts stays for the import fallback only).
- When head_dim AND num_heads are BOTH absent → actionable hard-fail (missing_num_heads_err) naming the missing dim and a working remedy (stamp head_dim/num_heads, or re-convert from the source config). No GGUF is written.
- hidden_size / vocab_size / intermediate_size stay inferred from embedding/FFN shapes (these ARE unambiguous).

Falsifiers (both directions, mutation-verified):
- FALSIFY-EXPORT-INFER-METADATA-001 (ft_apr_gguf_export_uses_explicit_head_dim_exactly_not_guess): metadata-light .apr with explicit head_dim=128, q_dim=256 → GGUF head_count == 2 EXACTLY (the old guess would say 256/64 = 4).
- FALSIFY-EXPORT-INFER-METADATA-002 (ft_apr_gguf_export_hard_fails_when_head_dim_and_num_heads_absent): head_dim AND num_heads absent → Err naming num_heads + remedy, NO GGUF written.

Reverting to the guess flips BOTH falsifiers RED.

Contract apr-export-num-layers-v1 bumped to 1.2.0: drops the "q_dim/head_dim always inferable" overclaim, states num_heads requires explicit head_dim (else honest error), adds both falsifier refs (single-line). pv validate + pv lint contracts/ pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
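
To make the second commit's reasoning concrete, here is a rough sketch contrasting the old first-divisor guess with derivation from an explicit head_dim. The function names `guess_num_heads` and `resolve_head_counts`, their signatures, and the error strings are hypothetical and do not reproduce the real infer_head_counts / fill_head_counts_from_explicit_head_dim code. One extra assumption of this sketch: when num_heads is explicit but head_dim is absent, head_dim is recovered as q_dim / num_heads so that num_kv_heads can still be computed.

```rust
/// Candidate list quoted in the commit message; order matters for the guess.
const GUESSED_HEAD_DIMS: [usize; 4] = [64, 128, 96, 80];

/// Sketch of the OLD behavior: take the first candidate that divides q_dim.
pub fn guess_num_heads(q_dim: usize) -> Option<usize> {
    GUESSED_HEAD_DIMS
        .iter()
        .find(|&&hd| q_dim % hd == 0)
        .map(|&hd| q_dim / hd)
}

/// Sketch of the NEW behavior: never guess. Returns (num_heads, num_kv_heads).
pub fn resolve_head_counts(
    num_heads: Option<usize>,
    head_dim: Option<usize>,
    q_dim: usize,
    kv_dim: usize,
) -> Result<(usize, usize), String> {
    // Exact division or an error; no rounding and no fallback candidates.
    let exact = |dim: usize, divisor: usize| -> Result<usize, String> {
        if divisor == 0 || dim % divisor != 0 {
            Err(format!("{} is not an exact multiple of {}", dim, divisor))
        } else {
            Ok(dim / divisor)
        }
    };

    let head_dim = match (num_heads, head_dim) {
        (_, Some(hd)) => hd,
        // Assumption of this sketch: recover head_dim from explicit num_heads.
        (Some(h), None) => exact(q_dim, h)?,
        (None, None) => {
            return Err(
                "num_heads required for GGUF export: stamp head_dim or num_heads \
                 in the APR header, or re-convert from the source config"
                    .to_string(),
            )
        }
    };

    // Explicit num_heads always wins over the derived value.
    let heads = match num_heads {
        Some(h) => h,
        None => exact(q_dim, head_dim)?,
    };
    let kv_heads = exact(kv_dim, head_dim)?;
    Ok((heads, kv_heads))
}
```

With q_dim = 1536 the guess picks 64 first and reports 24 heads, while an explicit head_dim of 128 yields 12; with both head_dim and num_heads absent the sketch returns an error instead of a count.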

noahgift enabled auto-merge June 24, 2026 08:39

noahgift disabled auto-merge June 24, 2026 08:44

noahgift enabled auto-merge June 24, 2026 09:09

noahgift added this pull request to the merge queue Jun 24, 2026

Merged via the queue into main with commit 45a2ce2 Jun 24, 2026
10 checks passed

noahgift deleted the beat/apr-gguf-export-metadata branch June 24, 2026 10:03

noahgift mentioned this pull request Jun 24, 2026

chore(release): 0.55.0 — PMAT-918..921 wave (convert/export runnable + GPU parity reconciled + autograd training proven) #2218

Merged

noahgift merged 2 commits into main from beat/apr-gguf-export-metadata
