fix(format): f16 export uses round-to-nearest-even — match half::f16::from_f32 (PMAT-905)#2193
Merged
…:from_f32 (PMAT-905)
The SafeTensors f32 -> F16 export path (f32_slice_to_f16_bytes /
f32_to_f16_bits_rne, reached from `apr export --format safetensors` for
F16-dtype tensors) did NOT match half::f16::from_f32. It truncated the
mantissa (mantissa >> 13) and flushed the entire f16 subnormal range to
signed zero, causing two distinct roundtrip-fidelity defects:
1. Every value with a non-zero discarded mantissa was biased toward
zero instead of round-to-nearest-even (e.g. f32 0x476A_7E00 must
encode to 0x7B54 but truncation produced 0x7B53), and the
near-overflow boundary 65520.0 must round UP to +Inf (0x7C00) but
truncation kept it finite (0x7BFF).
2. The smallest representable magnitudes (f16 subnormals 0x0001..0x03FF,
i.e. |x| in [2^-24, 2^-14)) were silently destroyed — f32 2^-24 must
encode to 0x0001 but flush-to-zero produced 0x0000.
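For illustration, the two defects can be reproduced with a small truncating encoder. This is a hypothetical reconstruction from the description above, not the original source: the function name `f32_to_f16_bits_truncating` and its exact branch layout are assumptions.

```rust
/// Hypothetical reconstruction of the pre-fix behaviour: truncate the
/// mantissa with `>> 13` and flush the f16 subnormal range to signed zero.
/// Illustration only; the name and structure are assumptions.
fn f32_to_f16_bits_truncating(x: f32) -> u16 {
    let bits = x.to_bits();
    let sign = ((bits >> 16) & 0x8000) as u16;
    let exp = ((bits >> 23) & 0xFF) as i32;
    let man = bits & 0x007F_FFFF;

    if exp == 0xFF {
        // Inf stays Inf; NaN keeps a non-zero mantissa via the 0x0200 bit.
        let nan_bit: u16 = if man != 0 { 0x0200 } else { 0 };
        return sign | 0x7C00 | nan_bit;
    }
    let e16 = exp - 127 + 15; // rebias from f32 (127) to f16 (15)
    if e16 >= 0x1F {
        return sign | 0x7C00; // too large for f16: overflow to Inf
    }
    if e16 <= 0 {
        return sign; // defect 2: the whole subnormal range becomes zero
    }
    // defect 1: the low 13 mantissa bits are dropped, never rounded
    sign | ((e16 as u16) << 10) | ((man >> 13) as u16)
}
```

Under these assumptions the sketch returns 0x7B53 for f32 0x476A_7E00, 0x7BFF for 65520.0, and 0x0000 for 2^-24 (f32 bits 0x3380_0000), which are the incorrect encodings listed above.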
This is the F16 sibling of the PMAT-859 BF16 fix. Unlike BF16, F16 has a
real subnormal range, so the encoder now applies round-to-nearest-even
across BOTH the normal mantissa and the subnormal grid, propagates the
rounding carry into the exponent (and onward to Inf on overflow), and
preserves NaN — matching half::f16::from_f32 bit-for-bit.
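A minimal sketch of an encoder with those properties follows. It is not the code merged in this PR: the name `f32_to_f16_rne_sketch` is hypothetical, and the structure is just one straightforward way to round both the normal mantissa and the subnormal grid to nearest-even.

```rust
/// Sketch of f32 -> f16 conversion with round-to-nearest-even.
/// Hypothetical name; written for illustration, not copied from the PR.
fn f32_to_f16_rne_sketch(x: f32) -> u16 {
    let bits = x.to_bits();
    let sign = ((bits >> 16) & 0x8000) as u16;
    let exp = ((bits >> 23) & 0xFF) as i32;
    let man = bits & 0x007F_FFFF;

    if exp == 0xFF {
        // Inf or NaN: for NaN, keep the top payload bits and force 0x0200
        // so the f16 mantissa can never collapse to zero (which would be Inf).
        let nan: u16 = if man != 0 { 0x0200 | (man >> 13) as u16 } else { 0 };
        return sign | 0x7C00 | nan;
    }
    let e16 = exp - 127 + 15; // rebias from f32 (127) to f16 (15)
    if e16 >= 0x1F {
        return sign | 0x7C00; // magnitude beyond the f16 range: Inf
    }
    if e16 <= 0 {
        // Target is subnormal (or zero). Below 2^-25 everything rounds to zero.
        if e16 < -10 {
            return sign;
        }
        let m = man | 0x0080_0000; // restore the implicit leading 1
        let shift = (14 - e16) as u32; // variable right-shift, 14..=24
        let kept = m >> shift;
        let rem = m & ((1u32 << shift) - 1);
        let half = 1u32 << (shift - 1);
        let round_up = rem > half || (rem == half && (kept & 1) == 1);
        // A carry out of 0x03FF lands on 0x0400, the smallest normal.
        return sign | (kept as u16 + round_up as u16);
    }
    // Normal range: drop 13 bits, round on the remainder.
    let base = ((e16 as u16) << 10) | (man >> 13) as u16;
    let rem = man & 0x1FFF;
    let round_up = rem > 0x1000 || (rem == 0x1000 && (base & 1) == 1);
    // Adding to the packed exponent|mantissa lets the carry reach the
    // exponent, and 0x7BFF + 1 becomes 0x7C00 (Inf).
    sign | (base + round_up as u16)
}
```

Adding the rounding increment to the packed exponent and mantissa field, rather than to the mantissa alone, is what lets a single `+ 1` handle both the carry into the exponent and the overflow to Inf.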
Contract: OBLIG-SAFETENSORS-F16-EXPORT-RNE in
contracts/safetensors-f16-round-v1.yaml (C-F16-001..004), with 7
falsifiers in safetensors_tests_core.rs verified RED on the
truncation/flush code and GREEN on the fix; mutation-verified by
re-disabling the RNE rounding (falsifier FAILS, non-tautological).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… follow-up)

Adversarial review of #2193 found two of six named falsifiers did not
actually catch the bug they claim to discharge:

1. FALSIFY-F16-ROUND-001 used f32 0x3C01_2000, whose low 13 discarded
mantissa bits are 0x0000. There is no rounding decision, so the OLD
truncation code (man>>13) produced the identical result and the test
PASSED on the bug. Switched to 0x476A_7E00 (discarded 0x1E00 > 0x1000
halfway): RNE rounds up to 0x7B54 while truncation keeps 0x7B53.
Verified non-tautological (oracle 0x7B54 != truncation 0x7B53, per
numpy RNE + bit math) and GREEN on the fix.
2. C-F16-004 / FALSIFY-F16-NAN-001 claimed a NaN->Inf regression, but
the prior code already preserved NaN (its Inf/NaN branch ORed 0x0200
for non-zero mantissa). Reframed honestly as a forward invariant (pins
NaN-preservation against a future flush) rather than a discharged bug,
removing the false RED-claim from the contract.

The numerical fix itself was already correct (verified vs a 50M-sample
brute-force RNE oracle). The other four falsifiers
(PARITY/SUBNORMAL/SUBNORMAL-RNE/OVERFLOW) were already genuine
RED-on-bug. This restores beat discipline: every obligation now has a
dedicated, truthful falsifier.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
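The tautology check described in item 1 comes down to a few lines of bit math. The sketch below uses hypothetical helper names (`discarded_bits`, `rne_differs_from_truncation`) and assumes the input lies in the f16 normal range, where exactly the low 13 mantissa bits are dropped.

```rust
/// Low 13 bits of an f32 bit pattern: the mantissa bits an f16 encoder
/// drops for values in the f16 normal range. Hypothetical helper.
fn discarded_bits(f32_bits: u32) -> u32 {
    f32_bits & 0x1FFF
}

/// True when round-to-nearest-even rounds up, so RNE and truncation give
/// different f16 bits. Assumes a normal-range input. Hypothetical helper.
fn rne_differs_from_truncation(f32_bits: u32) -> bool {
    let rem = discarded_bits(f32_bits);
    let kept_lsb = (f32_bits >> 13) & 1;
    // Above halfway always rounds up; exactly halfway rounds up only
    // when the kept mantissa is odd (ties-to-even).
    rem > 0x1000 || (rem == 0x1000 && kept_lsb == 1)
}
```

With these helpers, 0x3C01_2000 has no discarded bits and cannot distinguish the two encoders, while 0x476A_7E00 (discarded 0x1E00) and 65520.0 (an exact tie with an odd kept mantissa) both can.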
Summary

The SafeTensors `f32 -> F16` export path (`f32_slice_to_f16_bytes` / `f32_to_f16_bits_rne` in `crates/aprender-core/src/serialization/safetensors.rs`, reached from `apr export --format safetensors` for F16-dtype tensors) did not match `half::f16::from_f32` (the IEEE-754 / PyTorch / HuggingFace round-to-nearest-even reference). It truncated the mantissa (`mantissa >> 13`) and flushed the entire f16 subnormal range to signed zero.

This is the F16 sibling of the PMAT-859 BF16 fix. Unlike BF16, F16 has a real 5-bit exponent and a genuine subnormal range (`2^-24 .. 2^-14`), so two distinct roundtrip-fidelity defects followed:

1. Every value with a non-zero discarded mantissa was biased toward zero instead of rounded to nearest-even: f32 `0x476A_7E00` must encode to `0x7B54` but truncation produced `0x7B53`; and the near-overflow boundary `65520.0` must round up to `+Inf` (`0x7C00`) but truncation kept it finite (`0x7BFF`).
2. The smallest representable magnitudes (f16 subnormals `0x0001..0x03FF`) were silently destroyed: f32 `2^-24` must encode to `0x0001` but flush-to-zero produced `0x0000`.

Fix

The encoder now performs round-to-nearest-even across both the normal mantissa (drop 13 low bits) and the subnormal grid (variable right-shift), propagates the rounding carry into the exponent field (and onward to `Inf` on overflow), and preserves NaN, matching `half::f16::from_f32` bit-for-bit. No new dependency: implemented manually like the BF16 sibling (the `half` crate is feature-gated; the encoder is always compiled).

Contract

`contracts/safetensors-f16-round-v1.yaml`: obligation OBLIG-SAFETENSORS-F16-EXPORT-RNE (equations C-F16-001..004). `pv validate` + `pv lint contracts/` both PASS.

Verification (full beat discipline)

… ours `0x0000` vs half `0x0200`, near-overflow `31743` vs `31744`).

`cargo test -p aprender-core --lib --features format-quantize` = 14044 passed, 0 failed.

🤖 Generated with Claude Code