Skip to content

fix(format): f16 export uses round-to-nearest-even — match half::f16::from_f32 (PMAT-905)#2193

Merged
noahgift merged 3 commits into
mainfrom
beat/f16-export-rne-v054
Jun 23, 2026
Merged

fix(format): f16 export uses round-to-nearest-even — match half::f16::from_f32 (PMAT-905)#2193
noahgift merged 3 commits into
mainfrom
beat/f16-export-rne-v054

Conversation

@noahgift

Copy link
Copy Markdown
Contributor

Summary

The SafeTensors f32 -> F16 export path (f32_slice_to_f16_bytes / f32_to_f16_bits_rne in crates/aprender-core/src/serialization/safetensors.rs, reached from apr export --format safetensors for F16-dtype tensors) did not match half::f16::from_f32 (the IEEE-754 / PyTorch / HuggingFace round-to-nearest-even reference). It truncated the mantissa (mantissa >> 13) and flushed the entire f16 subnormal range to signed zero.

This is the F16 sibling of the PMAT-859 BF16 fix. Unlike BF16, F16 has a real 5-bit exponent and a genuine subnormal range (2^-24 .. 2^-14), so two distinct roundtrip-fidelity defects followed:

  1. RNE bias. Every value with a non-zero discarded mantissa rounded toward zero instead of nearest-even — e.g. f32 0x476A_7E00 must encode to 0x7B54 but truncation produced 0x7B53; and the near-overflow boundary 65520.0 must round up to +Inf (0x7C00) but truncation kept it finite (0x7BFF).
  2. Subnormal flush. The smallest representable magnitudes (f16 subnormals 0x0001..0x03FF) were silently destroyed — f32 2^-24 must encode to 0x0001 but flush-to-zero produced 0x0000.

Fix

The encoder now performs round-to-nearest-even across both the normal mantissa (drop 13 low bits) and the subnormal grid (variable right-shift), propagates the rounding carry into the exponent field (and onward to Inf on overflow), and preserves NaN — matching half::f16::from_f32 bit-for-bit. No new dependency: implemented manually like the BF16 sibling (the half crate is feature-gated; the encoder is always compiled).

Contract

contracts/safetensors-f16-round-v1.yaml — obligation OBLIG-SAFETENSORS-F16-EXPORT-RNE (equations C-F16-001..004). pv validate + pv lint contracts/ both PASS.

Verification (full beat discipline)

  • RED: 4 falsifiers FAIL on the truncation/flush code (ours 0x0000 vs half 0x0200, near-overflow 31743 vs 31744).
  • GREEN: all 7 F16 falsifiers pass; cargo test -p aprender-core --lib --features format-quantize = 14044 passed, 0 failed.
  • Mutation-verified: re-disabling the RNE rounding makes the falsifier FAIL (non-tautological), then reverted.

🤖 Generated with Claude Code

…:from_f32 (PMAT-905)

The SafeTensors f32 -> F16 export path (f32_slice_to_f16_bytes /
f32_to_f16_bits_rne, reached from `apr export --format safetensors` for
F16-dtype tensors) did NOT match half::f16::from_f32. It truncated the
mantissa (mantissa >> 13) and flushed the entire f16 subnormal range to
signed zero, causing two distinct roundtrip-fidelity defects:

  1. Every value with a non-zero discarded mantissa was biased toward
     zero instead of round-to-nearest-even (e.g. f32 0x476A_7E00 must
     encode to 0x7B54 but truncation produced 0x7B53), and the
     near-overflow boundary 65520.0 must round UP to +Inf (0x7C00) but
     truncation kept it finite (0x7BFF).
  2. The smallest representable magnitudes (f16 subnormals 0x0001..0x03FF,
     i.e. |x| in [2^-24, 2^-14)) were silently destroyed — f32 2^-24 must
     encode to 0x0001 but flush-to-zero produced 0x0000.

This is the F16 sibling of the PMAT-859 BF16 fix. Unlike BF16, F16 has a
real subnormal range, so the encoder now applies round-to-nearest-even
across BOTH the normal mantissa and the subnormal grid, propagates the
rounding carry into the exponent (and onward to Inf on overflow), and
preserves NaN — matching half::f16::from_f32 bit-for-bit.

Contract: OBLIG-SAFETENSORS-F16-EXPORT-RNE in
contracts/safetensors-f16-round-v1.yaml (C-F16-001..004), with 7
falsifiers in safetensors_tests_core.rs verified RED on the
truncation/flush code and GREEN on the fix; mutation-verified by
re-disabling the RNE rounding (falsifier FAILS, non-tautological).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge June 23, 2026 18:11
@noahgift noahgift disabled auto-merge June 23, 2026 18:27
… follow-up)

Adversarial review of #2193 found two of six named falsifiers did not actually
catch the bug they claim to discharge:

1. FALSIFY-F16-ROUND-001 used f32 0x3C01_2000, whose low 13 discarded mantissa
   bits are 0x0000 — no rounding decision, so the OLD truncation code (man>>13)
   produced the identical result and the test PASSED on the bug. Switched to
   0x476A_7E00 (discarded 0x1E00 > 0x1000 halfway): RNE rounds up to 0x7B54
   while truncation keeps 0x7B53. Verified non-tautological — oracle 0x7B54 !=
   truncation 0x7B53 (numpy RNE + bit math) — and GREEN on the fix.

2. C-F16-004 / FALSIFY-F16-NAN-001 claimed a NaN->Inf regression, but the prior
   code already preserved NaN (its Inf/NaN branch ORed 0x0200 for non-zero
   mantissa). Reframed honestly as a forward invariant (pins NaN-preservation
   against a future flush) rather than a discharged bug, removing the false
   RED-claim from the contract.

The numerical fix itself was already correct (verified vs a 50M-sample
brute-force RNE oracle). The other four falsifiers (PARITY/SUBNORMAL/
SUBNORMAL-RNE/OVERFLOW) were already genuine RED-on-bug. This restores beat
discipline: every obligation now has a dedicated, truthful falsifier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@noahgift noahgift enabled auto-merge June 23, 2026 19:06
@noahgift noahgift added this pull request to the merge queue Jun 23, 2026
Merged via the queue into main with commit 9e81a84 Jun 23, 2026
10 checks passed
@noahgift noahgift deleted the beat/f16-export-rne-v054 branch June 23, 2026 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant