Shrink crates.io footprint: build precompute_hash/sample tables lazily #48
Merged
Move the two 65,541-line geometric-sampling tables out of the committed
source and regenerate them deterministically from build.rs into OUT_DIR.
Each precompute_sample{,2}.rs in src/common/ now reduces to a single
`include!(concat!(env!("OUT_DIR"), "/..."))` line.
This is a temporary fix to shrink the SLoC reported on crates.io
(currently ~169K, ~83% of which is these tables). precompute_hash.rs
is left as-is because the generator depends on crate-internal hashing
and can't be trivially moved into build.rs.
Note: the seeded SmallRng produces different table values than the
previously committed snapshot, but the values are only used as a
pre-rolled stream of random samples — no test relies on exact values
and all 394 lib tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
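A minimal sketch of the build-script approach this commit describes, under stated assumptions: `SplitMix64` is a std-only stand-in for the seeded `SmallRng` (it does not reproduce `SmallRng`'s values), and `render_table` / `write_table` are hypothetical names, not the crate's. A real `build.rs` would call `write_table` from `main` with the directory read from the `OUT_DIR` environment variable.

```rust
use std::fmt::Write as _;
use std::path::Path;

/// Small deterministic PRNG (SplitMix64). Stand-in for the seeded `SmallRng`
/// so the sketch needs no build dependency; the values differ from `SmallRng`.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Renders `pub static NAME: [f64; LEN] = [ ... ];` as Rust source text.
/// The same seed always yields the same text, so rebuilds are reproducible.
fn render_table(name: &str, seed: u64, len: usize) -> String {
    let mut rng = SplitMix64(seed);
    let mut src = format!("pub static {name}: [f64; {len}] = [\n");
    for _ in 0..len {
        // `{:?}` on a finite f64 prints a valid Rust float literal.
        writeln!(src, "    {:?},", rng.next_f64()).unwrap();
    }
    src.push_str("];\n");
    src
}

/// Writes the rendered table into `out_dir` (hypothetically `OUT_DIR`), where
/// an `include!(concat!(env!("OUT_DIR"), "/..."))` line would pick it up.
fn write_table(out_dir: &Path, file: &str, name: &str, seed: u64, len: usize) -> std::io::Result<()> {
    std::fs::write(out_dir.join(file), render_table(name, seed, len))
}
```

The cost of this design is visible in the sketch: every clean build re-renders the full table as source text, which the compiler then has to parse again.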
…LazyLock
Replace the giant checked-in `PRECOMPUTED_HASH` array (and the build-script codegen that the previous commit introduced for the two `PRECOMPUTED_SAMPLE*` tables) with `LazyLock`-backed tables that materialise once on first access.
For `PRECOMPUTED_HASH` this is meaningfully better than build-time codegen: the table is by definition "what `hash128_seeded` returns for these inputs", so computing it from the crate's own hasher at runtime guarantees the two can never drift, and avoids either duplicating the XxHash3_128 byte-conversion logic in `build.rs` or pulling `twox-hash` in as a build dependency.
For the two sample tables the LazyLock path is a strict simplification of the previous build-script approach: same fixed seed, same determinism across runs/versions, but no `OUT_DIR` codegen, no extra build-dep on `rand`, and no megabytes of generated source under `target/` on every clean build.
While here, drop the three `generate_precomputed_*` bins, the `internal-bins` feature, and the `src/bin/` directory they lived in; they only existed to regenerate the checked-in tables and have no remaining purpose.
Net diff vs main: +115 / -147,611 lines.
API impact: `PRECOMPUTED_HASH`, `PRECOMPUTED_SAMPLE`, and `PRECOMPUTED_SAMPLE_RATE_1PERCENT` keep the same names and the same indexed-access pattern (`X[i]`, `.iter()`, `.len()` all work via `Deref`). The element container changes from `[T; N]` to `Box<[T]>`, so callers spelling the type as `&[u128; 0x4000]` need `&[u128]` instead; new `PRECOMPUTED_HASH_LEN` / `PRECOMPUTED_SAMPLE_LEN` constants are exposed for compile-time length.
Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
Replaces the ~131K-line checked-in precompute tables (`PRECOMPUTED_HASH`, `PRECOMPUTED_SAMPLE`, `PRECOMPUTED_SAMPLE_RATE_1PERCENT`) with `std::sync::LazyLock`-backed tables that materialise once on first access.
Net diff vs `main`: +115 / -147,611 lines.
Why not just generate them at build time?
Two clean library-idiomatic options exist for a "table that is really just a deterministic function of `i`":
1. `OUT_DIR`: the build-script codegen approach the previous commit on this branch took for the sample tables. It works, but for `PRECOMPUTED_HASH` specifically the generator would need to either duplicate the byte-conversion + `XxHash3_128` logic in `build.rs` or pull `twox-hash` in as a build dependency (drift risk; the contract is literally "what `hash128_seeded` returns for these inputs"). It also writes 64K+ lines of generated Rust under `target/` on every clean build.
2. `LazyLock`: describe the function, materialise the table on first access. No build script involvement, no extra build deps, and for the hash table the values are computed by the crate's own `hash128_seeded`, so by construction they can never drift.
This PR picks (2) for all three tables.
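A minimal sketch of option (2), assuming nothing about the crate's internals: `hash_stub` is a hypothetical FNV-1a-style stand-in for `hash128_seeded`, and `TABLE` / `TABLE_LEN` are illustrative names. The point is only the shape: the table is defined as a function of `i` and built once on first access.

```rust
use std::sync::LazyLock;

/// Illustrative length; the PR text gives 0x4000 for the hash table.
pub const TABLE_LEN: usize = 0x4000;

/// Hypothetical stand-in for the crate's `hash128_seeded`: an FNV-1a-style
/// mix over the input's little-endian bytes. NOT the real XxHash3_128.
fn hash_stub(seed: u64, input: u64) -> u128 {
    const OFFSET: u128 = 0x6c62272e07bb014262b821756295c58d;
    const PRIME: u128 = 0x0000000001000000000000000000013b;
    let mut h = OFFSET ^ seed as u128;
    for byte in input.to_le_bytes() {
        h ^= byte as u128;
        h = h.wrapping_mul(PRIME);
    }
    h
}

/// Built by the first thread that touches it; later accesses reuse the
/// same boxed slice. Entry `i` is defined as `hash_stub(0, i)`.
pub static TABLE: LazyLock<Box<[u128]>> =
    LazyLock::new(|| (0..TABLE_LEN as u64).map(|i| hash_stub(0, i)).collect());
```

Because each entry is produced by calling the hasher itself, the table cannot disagree with the hasher, which is the no-drift property the PR relies on.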
Per-table summary
- `precompute_hash.rs`: 16,390 lines → 31 lines. `pub static PRECOMPUTED_HASH: LazyLock<Box<[u128]>>` computed from `hash128_seeded(0, &DataInput::U64(i))`.
- `precompute_sample.rs` / `precompute_sample2.rs`: `LazyLock<Box<[f64]>>` from a `SmallRng` seeded with the same `0xA5A0_5A71_B11B_C0DE` the previous commit on this branch settled on, so values remain reproducible across runs/versions. The 1%-rate table reuses a private `build_ln_one_minus_u_table(scale)` helper.
- `build.rs`: stripped of all codegen; only compiles `.proto` files now.
- `Cargo.toml`: removed `rand` from `[build-dependencies]`, the three `[[bin]]` entries for `generate_precomputed_*`, and the now-unused `internal-bins` feature.
- `src/bin/`: deleted (only ever held the three generator bins; directory removed entirely).
- `docs/library_map.md`: updated to point at the new `LazyLock` design instead of the removed bins.
API impact
Names are unchanged. Indexed access (`PRECOMPUTED_HASH[i]`), iteration (`.iter()`), and length (`.len()`) continue to work via `Deref` to the underlying slice; the existing call sites in `src/sketch_framework/nitro.rs` and `src/common/structure_utils.rs` are unmodified.
The element container changes from `[T; N]` to `Box<[T]>`:
- `PRECOMPUTED_HASH[i] -> u128`: unchanged.
- `&PRECOMPUTED_HASH as &[u128; 0x4000]`: no longer compiles; use `&PRECOMPUTED_HASH[..]` or `&[u128]`.
- `PRECOMPUTED_HASH_LEN` (= 0x4000) and `PRECOMPUTED_SAMPLE_LEN` (= 0x10000) constants are exposed for compile-time length needs.
Runtime cost
One-time initialisation on first access:
- Hash table: 0x4000 `XxHash3_128` evaluations + one 256 KiB heap alloc
- Sample tables: 0x10000 `SmallRng` draws + `ln` + one 512 KiB heap alloc
Steady state: indexed access adds one atomic load of the initialisation state (well-cached) versus the previous static array.
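The API and runtime notes above can be checked against a stand-in table. This is a sketch under assumptions: `HASH_TABLE`, `HASH_LEN`, `SAMPLE_LEN`, `as_array`, and `payload_bytes` are hypothetical names that mirror the lengths quoted in the PR, and the table contents are placeholders rather than real hashes.

```rust
use std::convert::TryFrom;
use std::mem::size_of;
use std::sync::LazyLock;

pub const HASH_LEN: usize = 0x4000; // mirrors PRECOMPUTED_HASH_LEN
pub const SAMPLE_LEN: usize = 0x10000; // mirrors PRECOMPUTED_SAMPLE_LEN

/// Placeholder table: entry `i` is just `i`, not a real hash.
pub static HASH_TABLE: LazyLock<Box<[u128]>> =
    LazyLock::new(|| (0..HASH_LEN as u128).collect());

/// Callers that still want a fixed-size view can recover one from the slice;
/// this returns `None` if the slice length is not exactly `HASH_LEN`.
pub fn as_array(table: &[u128]) -> Option<&[u128; HASH_LEN]> {
    <&[u128; HASH_LEN]>::try_from(table).ok()
}

/// Heap payload of one hash table and one sample table, in bytes.
pub fn payload_bytes() -> (usize, usize) {
    (HASH_LEN * size_of::<u128>(), SAMPLE_LEN * size_of::<f64>())
}
```

With these definitions, `HASH_TABLE[3]`, `HASH_TABLE.iter()`, and `HASH_TABLE.len()` all go through `Deref`, `&HASH_TABLE[..]` yields a plain `&[u128]`, and `payload_bytes()` works out to 256 KiB and 512 KiB, matching the allocation sizes quoted above.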
Test plan
- `cargo build` succeeds
- `cargo test --lib`: 394/394 pass, including the `nitro_batch_*` tests that hammer `PRECOMPUTED_SAMPLE_RATE_1PERCENT` in the hot path
- `cargo clippy --lib --tests --no-deps` clean
- `Box<[T]>` API change is acceptable for downstream callers (the names and indexed-access pattern are preserved)
Made with Cursor