fix(bit_hash): off-by-one in sha1_compute padding; restore one-shot FFI #91
Merged
The native one-shot `sha1_compute` C entry had `if (remainder < 55)` for the "padding fits in one block" branch. At `remainder == 55`, exactly the case where the 0x80 marker plus the 8-byte big-endian length still fit in the current 64-byte block, the comparison rejected the input and routed it through the two-block path, producing a different digest than the incremental `Sha1State` walk.

Commit c25986d disabled `sha1_compute_ffi` entirely as a workaround, masking the bug and making `sha1_raw` pay one C FFI call per 64-byte block.

Fix the boundary (`< 56`) and restore `sha1_raw` to the one-shot path; add a sweep test that walks both padding cliffs (55-byte and 119-byte).

`moon bench -p mizchi/bit_hash --target native --release` on a scalar (non-SHA-NI) CPU:

    sha1_raw 64 bytes     1.18 µs -> 842 ns     (-29%)
    sha1_raw 1 KiB        9.46 µs -> 6.72 µs    (-29%)
    sha1_raw 8 KiB       69.51 µs -> 50.65 µs   (-27%)
    sha1_raw 64 KiB     546.63 µs -> 403.00 µs  (-26%)

bit_hash, bit_object, bit_lib, bitx_hub, bit_pack tests all pass (15 + 27 + 285 + 113 + 72 = 512).

HubStore's `put_record` / `get_record` round-trip (the original c25986d regression) exercises the fixed path via `array_to_bytes` -> `sha1`.

https://claude.ai/code/session_01FgY6EMujwhzucSQkBVcodR
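To make the boundary concrete, here is a minimal C sketch of how the final padded tail of a SHA-1 message is laid out. It is not the code in `sha1_ni.c`: the function name `sha1_pad_tail` and its signature are hypothetical, and only the standard SHA-1 padding rule (0x80 marker, zero fill, 8-byte big-endian bit length) is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration only (not taken from sha1_ni.c).
 * Builds the padded tail of a SHA-1 message.
 *   rem        trailing bytes that did not fill a whole block
 *   remainder  number of those bytes (total_len % 64)
 *   total_len  full message length in bytes
 *   out        receives one or two 64-byte blocks
 * Returns the number of 64-byte blocks written (1 or 2). */
int sha1_pad_tail(const uint8_t *rem, size_t remainder,
                  uint64_t total_len, uint8_t out[128])
{
    assert(remainder < 64);

    /* The 0x80 marker (1 byte) and the length (8 bytes) must fit after the
     * remainder: remainder + 9 <= 64, which is remainder < 56. */
    int blocks = (remainder < 56) ? 1 : 2;
    size_t padded = (size_t)blocks * 64;
    uint64_t bit_len = total_len * 8;

    memset(out, 0, 128);
    memcpy(out, rem, remainder);
    out[remainder] = 0x80;

    /* Message length in bits, big-endian, in the last 8 bytes of the tail. */
    for (int i = 0; i < 8; i++) {
        out[padded - 1 - i] = (uint8_t)(bit_len >> (8 * i));
    }
    return blocks;
}
```

With `remainder == 55`, the marker lands at offset 55 and the length occupies offsets 56 to 63, so the tail is exactly one block. A `< 55` comparison would send this case to the two-block layout instead, which is a different padded message and therefore a different digest.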
Summary
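Comparing `moon bench` results against the baseline in `docs/benchmarks.md` showed that native SHA-1 had regressed by ~70-80%. I set up the environment intending to use `moon-pprof`, tracked down the cause along the way, and fixed it.

Root cause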
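A padding off-by-one at `modules/bit_hash/src/sha1_ni.c:324`: `remainder == 55` is the boundary at which the 0x80 marker (1 byte) plus the 8-byte length fit exactly in the current block. With the old `remainder < 55` condition, 55 was routed to the two-block path, where the extra block contained the same data region as the previous block, so the returned digest differed from the correct incremental `Sha1State` computation.

The symptom behind c25986d ("fix: use pure MoonBit SHA-1 path on native to fix key lookup in HubStore"), HubStore's `get_record` returning `None`, was a side effect of this bug. That commit message misdiagnosed it as "`Bytes::from_iter` has a different memory layout" and disabled the fast path entirely. In fact, `Bytes::from_iter` just calls `Bytes::from_array(iter.collect())` internally, so the representation is identical.

Fix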
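- `< 55` → `< 56` (one-block padding is valid for `remainder ≤ 55`)
- Restore `sha1_raw` / `sha1_bytes` to the one-shot FFI (`sha1_compute_ffi`) and drop the slow path that goes through a per-block FFI call

Bench (`moon bench -p mizchi/bit_hash --target native --release`, scalar fallback CPU)

(The numbers at the top of `docs/benchmarks.md` are the ceiling for a CPU with SHA-NI. This PR alone does not reach them, but it does reach the theoretical limit of the scalar fallback.)

`docs/benchmarks.md` now also records how the fix came about, along with a finding from `moon-pprof`: `mizchi/simd::simdhash::rotl` / `rotr` are not inlined (45% of self time on wasm-gc). The latter is a proposal for upstream `mizchi/simd` and is not touched in this PR.

Test plan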
- `moon test -p mizchi/bit_hash --target native --release`: 15/15 passed (including the new padding cliff sweep test)
- `moon test -p mizchi/bit_hash --target wasm-gc --release`: 15/15 passed
- `moon test -p mizchi/bit_object --target native --release`: 27/27 passed
- `moon test -p mizchi/bit_lib --target native --release`: 285/285 passed
- `moon test -p mizchi/bitx_hub --target native --release`: 113/113 passed (the path that hits the original c25986d regression)
- `moon test -p mizchi/bit_pack --target native --release`: 72/72 passed
- `node tools/check-layers.mjs`: clean
- Confirm that the `cmd/bit` monolithic test passes (could not run to completion locally; `gcc -O2` takes more than 10 minutes on this VM)

https://claude.ai/code/session_01FgY6EMujwhzucSQkBVcodR
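As a rough illustration of what a padding cliff sweep looks for, the C sketch below walks a range of message lengths and compares the block count implied by each comparison (`< 55` and `< 56`) against the closed-form padded size. This is not the test added in this PR, which compares digests. All function names here are hypothetical, and only the standard SHA-1 padding rule is assumed.

```c
#include <stddef.h>

/* Tail blocks chosen by the pre-fix comparison. */
int tail_blocks_old(size_t remainder) { return remainder < 55 ? 1 : 2; }

/* Tail blocks chosen by the fixed comparison. */
int tail_blocks_new(size_t remainder) { return remainder < 56 ? 1 : 2; }

/* Reference: the message plus 1 marker byte plus 8 length bytes, rounded up
 * to whole 64-byte blocks. ceil((len + 9) / 64) == (len + 8) / 64 + 1. */
size_t expected_total_blocks(size_t len) { return (len + 8) / 64 + 1; }

/* Sweeps len = 0..max_len and counts the lengths where the predicate
 * disagrees with the reference. The first bad_cap offending lengths are
 * stored in bad[]. */
size_t sweep(int (*tail_blocks)(size_t), size_t max_len,
             size_t *bad, size_t bad_cap)
{
    size_t n_bad = 0;
    for (size_t len = 0; len <= max_len; len++) {
        /* Whole blocks consumed before the tail, plus the padded tail. */
        size_t total = len / 64 + (size_t)tail_blocks(len % 64);
        if (total != expected_total_blocks(len)) {
            if (n_bad < bad_cap) {
                bad[n_bad] = len;
            }
            n_bad++;
        }
    }
    return n_bad;
}
```

Sweeping past at least 119 bytes covers both cliffs named in the commit message. A block-count check like this only catches a wrong layout choice; comparing one-shot and incremental digests, as the real test does, also catches errors in the block contents.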
Generated by Claude Code