PoC: FFI Bundle Size by mizchi · Pull Request #1 · mizchi/core

mizchi · 2025-12-22T08:27:16Z

No description provided.

Add benches_js/ with benchmarks for:

- map: HashMap operations (84KB bundle)
- json: JSON parse/stringify (289KB bundle)
- double: Double::to_string with Ryu (72KB bundle)
- regex: Regex compile/execute (358KB bundle)

Includes measure.sh script for automated size measurement and benchmark execution.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Documentation covers:

- Package structure and dependency analysis
- Hasher global side effect issue
- Double::to_string Ryu algorithm details
- JSON package size analysis
- Map implementation comparison
- JS backend optimization approaches
- JS FFI implementation guide

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Benchmarks for:

- hashmap: HashMap/SortedMap insert/get/iteration
- double: Double::to_string with various number formats
- json: JSON parse/stringify operations
- hasher: Hash operations and collision testing
- string: String operations and manipulation
- array: Array operations
- bigint: BigInt arithmetic operations
- strconv: String parsing functions
- stringview: StringView slicing and iteration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Replace Ryu algorithm with native JavaScript String(x) for JS target. This reduces bundle size by 97% (72KB → 2.3KB) and improves performance.

- Add to_string_js.mbt: JS FFI implementation
- Add to_string_nonjs.mbt: Keep Ryu for wasm/native targets
- Update moon.pkg.json with target configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
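For reference, a minimal TypeScript sketch of what delegating to the engine looks like on the JS side. `doubleToString` and `roundTrips` are hypothetical names, not the contents of to_string_js.mbt. `String(x)` follows the ECMAScript Number-to-string rules, which produce the shortest decimal text that parses back to the same double, so it covers the same ground as Ryu. The exact formatting may still differ from the non-JS output in edge cases (for example, `-0` prints as `0`), which is something the test suite would need to confirm.

```typescript
// Hypothetical helper: delegate double formatting to the JS engine.
function doubleToString(x: number): string {
  // ECMAScript Number::toString emits the shortest round-trippable decimal text.
  return String(x);
}

// Checks that parsing the text gives back the same value.
// Note: `===` treats 0 and -0 as equal, so the lost sign of -0 is not detected,
// and NaN never compares equal to itself, so NaN reports false.
function roundTrips(x: number): boolean {
  return Number(doubleToString(x)) === x;
}
```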

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Implement JSON.parse/stringify FFI with 90% bundle size reduction
- Handle MoonBit Map conversion via intermediate array representation
- All 19 tests passing
- Add optimization report documenting limitations and recommendations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Investigate hash collision vulnerability (hashdos) for fixed seed hashing:

- Analyze attack feasibility: ~52 hours to generate 1000 colliding keys
- Robin Hood hashing mitigates worst-case O(n) degradation
- random_seed() adds only 224 bytes (1.2% of bundle)

Propose lazy seed initialization to fix ESM treeshake and Cloudflare Workers:

- Current: top-level `const seed = random_seed()` blocks treeshake
- Proposed: lazy init with `Ref[Int?]` pattern, no top-level side effects
- PoC verified in benches_js/lazy_seed/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
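A rough TypeScript sketch of the lazy-seed idea, written against the generated-JS side rather than MoonBit. `randomSeed`, `seedCell`, and `getSeed` are hypothetical stand-ins, and `seedCalls` exists only to make the behaviour observable. The point is that module evaluation performs no call, so a bundler is free to drop the code when nothing imports `getSeed`.

```typescript
// Stand-in for the runtime's random_seed(); the real implementation is not shown here.
let seedCalls = 0;
function randomSeed(): number {
  seedCalls += 1;
  return (Math.random() * 0x100000000) | 0;
}

// Eager form, for comparison only: a top-level call is a side effect the bundler must keep.
// const seed = randomSeed();

// Lazy form: the cell starts empty, mirroring the `Ref[Int?]` pattern.
const seedCell: { value: number | null } = { value: null };

function getSeed(): number {
  let seed = seedCell.value;
  if (seed === null) {
    // First use: compute once and cache.
    seed = randomSeed();
    seedCell.value = seed;
  }
  return seed;
}
```

Every later call returns the cached value, so all hashers created in one process still share a single seed.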

Implement JavaScript native RegExp wrapper for JS backend:

- 97% bundle size reduction (320KB → 11KB)
- Up to 80x faster execution using V8 native regex engine

Test coverage increased from 18 to 44 tests:

- Unicode escape sequences (\u{XXXX}, \uXXXX)
- Multiline/singleline flags (m, s)
- Non-word boundaries (\B)
- Nested groups, named groups
- Edge cases (empty pattern, long strings, boundaries)
- Character classes with unicode properties
- All quantifier types including {n,}

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
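A small TypeScript sketch of the kind of thin wrapper this describes. The `MatchResult` shape and the `compile` / `execute` names are assumptions for illustration; the actual MoonBit-facing API in top_ffi_js.mbt is not reproduced here. It only shows that compile and execute reduce to `new RegExp` and `RegExp.prototype.exec`.

```typescript
// Hypothetical result shape; field names are illustrative.
interface MatchResult {
  start: number;    // index of the first matched UTF-16 code unit
  end: number;      // index one past the last matched code unit
  groups: string[]; // groups[0] is the whole match, then capture groups
}

function compile(pattern: string, flags: string): RegExp {
  // The "u" flag is what enables \u{XXXX} escapes and \p{...} classes.
  return new RegExp(pattern, flags);
}

function execute(re: RegExp, input: string): MatchResult | null {
  re.lastIndex = 0; // make repeated calls independent when g or y is set
  const m = re.exec(input);
  if (m === null) return null;
  // m[0] is always the full match; the fallback only satisfies strict index checks.
  const whole: string = m[0] ?? "";
  return { start: m.index, end: m.index + whole.length, groups: m.slice() };
}
```

Offsets here are UTF-16 code unit indices, so a character outside the BMP spans two positions.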

- Add json/parse_ffi_js.mbt with native JSON.parse FFI
- Configure json/moon.pkg.json targets to exclude lexer/parser on JS
- Add string/regex JS FFI implementation (top_ffi_js.mbt)
- Add 43 JS-compatible regex tests
- Bundle size: regex+json 305KB → 103KB (66% reduction)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add stringify_ffi_js.mbt with JS JSON.stringify FFI
- Add stringify_nonjs.mbt for non-JS targets
- Move stringify/escape from json.mbt to target-specific files
- Use LinkedHashMap.head iteration for Map stringify
- Fallback to MoonBit implementation when Replacer is used
- Bundle size: regex+json 305KB → 104KB (66% reduction)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add linked_hash_map_ffi_js.mbt with simplified Map for JS target
- Uses linked list for iteration (maintains Entry.head/next structure)
- Linear search instead of hash table (simpler, smaller code)
- Compatible with stringify FFI that accesses entry.key/value/next
- Performance: json_stringify 9us → 1.8us (5x faster)
- Bundle size: regex+json 104KB → 92KB (10% reduction, 70% total)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
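To make the trade-off concrete, here is a minimal TypeScript sketch of an insertion-ordered map backed by a singly linked list with linear lookup. Only the head/next chain and the key/value fields come from the description above; `SimpleMap`, `tail`, and the method names are illustrative, and keys are restricted to strings for brevity.

```typescript
interface Entry<V> {
  key: string;
  value: V;
  next: Entry<V> | null;
}

class SimpleMap<V> {
  head: Entry<V> | null = null;
  tail: Entry<V> | null = null;
  size = 0;

  // Linear scan along the chain: O(n) per lookup, but no hashing code at all.
  private find(key: string): Entry<V> | null {
    for (let e = this.head; e !== null; e = e.next) {
      if (e.key === key) return e;
    }
    return null;
  }

  get(key: string): V | undefined {
    return this.find(key)?.value;
  }

  set(key: string, value: V): void {
    const existing = this.find(key);
    if (existing !== null) {
      existing.value = value; // overwrite keeps the original position
      return;
    }
    const entry: Entry<V> = { key, value, next: null };
    if (this.tail === null) this.head = entry;
    else this.tail.next = entry;
    this.tail = entry;
    this.size += 1;
  }

  // Iteration walks head -> next, the same traversal a stringify routine would use.
  keys(): string[] {
    const out: string[] = [];
    for (let e = this.head; e !== null; e = e.next) out.push(e.key);
    return out;
  }
}
```

Lookups cost O(n), so this shape suits small maps such as typical JSON objects and degrades on large ones.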

- Remove unused JS FFI functions (js_map_new/set/get/has/delete/size)
- Extract common linked list append to Map::append_entry helper
- Simplify merge_in_place, map, set using helper
- Remove unused equal_to_string/equal_to_bytes functions
- Remove unused mut from Entry fields (prev, psl)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Use next_power_of_two() for capacity in new()
- Calculate proper capacity in from_array() and of()
- Use same grow threshold formula (cap * 13 / 16)
- No bundle size increase

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
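A worked sketch of the sizing arithmetic in TypeScript. Only the power-of-two rounding and the `cap * 13 / 16` formula come from the commit; the minimum capacity of 8, the `capacityFor` helper, and the choice to treat the threshold as the largest count that still fits are assumptions made for the example.

```typescript
// Smallest power of two >= n (intended for small positive n).
function nextPowerOfTwo(n: number): number {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

// Grow threshold: 13/16 of capacity, using integer division.
function growThreshold(capacity: number): number {
  return Math.floor((capacity * 13) / 16);
}

// Hypothetical sizing rule for building from `len` known elements:
// pick the smallest power-of-two capacity whose threshold is at least `len`.
function capacityFor(len: number): number {
  let cap = nextPowerOfTwo(Math.max(len, 8)); // 8 is an assumed minimum
  while (growThreshold(cap) < len) cap *= 2;
  return cap;
}
```

For example, a capacity of 16 gives a threshold of 13, so 13 elements fit in 16 slots while 14 elements need 32.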

- Check v._1 (repr field) for Number type in stringify
- Use marker prefix "__REPR__:" and replace after JSON.stringify
- Fixes Infinity/special number representation in JS target
- All tests now pass on JS target (2255/2255)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
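The marker trick can be sketched in TypeScript as follows. Plain `JSON.stringify` turns `Infinity` into `null`, so a number that carries its own text is first emitted as a tagged string and then unquoted after serialization. `NumberNode`, `encodeNumber`, and `stringifyWithRepr` are hypothetical names, and the flat record input is a simplification of the real Json value.

```typescript
const REPR_MARKER = "__REPR__:";

// Hypothetical number node: `repr` holds the text to emit verbatim when present.
interface NumberNode {
  value: number;
  repr?: string;
}

function encodeNumber(n: NumberNode): number | string {
  return n.repr !== undefined ? REPR_MARKER + n.repr : n.value;
}

function stringifyWithRepr(numbers: Record<string, NumberNode>): string {
  const plain: Record<string, number | string> = {};
  for (const key of Object.keys(numbers)) {
    const node = numbers[key];
    if (node !== undefined) plain[key] = encodeNumber(node);
  }
  const text = JSON.stringify(plain);
  // Strip the quotes and the marker, leaving the raw repr text in place.
  return text.replace(/"__REPR__:([^"\\]*)"/g, "$1");
}
```

One limitation of this sketch: an ordinary string value that happens to start with the marker would also be unquoted, so a real implementation needs a marker that cannot collide with user data.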

- Add linked_hash_set_ffi_js.mbt with simplified Set for JS target
- Uses linked list for iteration (maintains Entry.head/next structure)
- Linear search instead of hash table (simpler, smaller code)
- Same pattern as Map JS FFI implementation
- All tests pass: JS (9/9), wasm-gc (60/60)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add Map FFI investigation report (.private/13)
- Add JSON replacement log (.private/16)
- Add intermediate summary (.private/17)
- Add Set FFI investigation (.private/18)
- Add JS benchmarks for double, map, regex

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Files replaced by JS FFI target-specific implementations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

`ParseContext::lex_skip_whitespace` is called at the start of every `lex_value` / `lex_after_*` step. The original implementation built a `StringView` of the remaining input on every call and ran `lexmatch [ \t\r\n]+` on it. That StringView wrapper showed up as the #1 allocation source in JSON parsing workloads.

Since JSON whitespace is ASCII-only, we can scan UTF-16 code units directly on the existing input view with identical semantics and zero allocations. `offset` is held in a local for the duration of the scan so each consumed whitespace character is one `local.set` instead of a `struct.set` against `ctx`.

Measured on a 197 KB JSON-array-of-1000-objects parsed 50 times (wasm-gc, wasmtime):

| metric | before | after | Δ |
|---|---|---|---|
| Total alloc bytes | 145.13 MB | 107.37 MB | −26.0 % |
| Total #allocs | 13.15 M | 9.85 M | −25.1 % |
| StringView::view | 32.62 MB | 7.44 MB | −77.2 % |
| String::view | 12.59 MB | 0 | gone |

Adds a focused `test "lex_skip_whitespace"` covering the empty-input, all-whitespace, leading-whitespace, no-leading-whitespace, and non-ASCII (U+00A0 NBSP must NOT count as JSON whitespace per RFC 8259) cases.

Reproduction: github.com/mizchi/pprof-mbt (moon-pprof memprofile + summary --diff).

Tests: moon test passes (6507 / 6507).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
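The scan itself is small enough to sketch. The version below is TypeScript rather than MoonBit, and `skipWhitespace` is a hypothetical free function taking the offset as a parameter instead of reading it from `ctx`. It checks the four whitespace code units RFC 8259 allows and returns the new offset without creating any intermediate view.

```typescript
// JSON whitespace per RFC 8259: space, horizontal tab, line feed, carriage return.
function skipWhitespace(input: string, offset: number): number {
  let i = offset; // kept in a local; the caller stores the result back once
  while (i < input.length) {
    const c = input.charCodeAt(i); // one UTF-16 code unit, no substring allocation
    if (c === 0x20 || c === 0x09 || c === 0x0a || c === 0x0d) {
      i += 1;
    } else {
      break;
    }
  }
  return i;
}
```

Because all four code points are below 0x80, comparing code units is enough. A surrogate half can never be mistaken for whitespace, and U+00A0 is correctly left alone.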

Three related allocation removals along the JSON number lex chain, all stemming from the same observation: the native target boxes every returned struct / tuple / enum payload by default.

## 1. Drop the `(Double, StringView?)` tuple

The number-lex chain (`lex_zero` / `lex_decimal_*` / `lex_number_end` / `lex_integer_end`) returned `(Double, StringView?)`, where the second component is the source-text view that's only ever `Some(_)` on the infinity-overflow path. Native boxes every tuple, so every parsed JSON number cost one heap allocation just to carry a value that was structurally `None`.

Side-channel the optional view through a new private `ParseContext.last_number_repr` field. Each leaf return writes it (usually `None`); `lex_main.mbt` reads + clears when constructing the `Number` token.

## 2. Stack-allocate `JsonNumberScan` via `#valtype`

`scan_json_number` builds + returns a five-field `JsonNumberScan` struct on every JSON number, used immediately by `lex_number_end` and discarded. Without `#valtype` the native target boxes that struct as a ~32-byte heap object per call.

## 3. NaN sentinel for `try_fast_double`

`JsonNumberScan::try_fast_double` returned `Double?`: `Some(d)` on the fast path, `None` to fall back to strconv. The fast path can only produce 0 or a finite `Double` (the `checked_mul` guard rules out infinity), so `NaN` is a free sentinel. Returning a plain `Double` (with NaN meaning "not handled") skips the boxed `Option<Double>` allocation on every JSON number that hits this path.

## Numbers

Measured on a native-target alloc profiler over mizchi/pprof-mbt's bench suite (`--sample-rate 100`):

| bench | metric | before | after | Δ |
|---|---|---|---|---|
| `json_numbers` (10 k integers × 30) | allocs | 1 200 200 | 600 200 | **−50 %** |
| | bytes | 18.31 MB | 9.16 MB | **−50 %** |
| `json_parse` (1 000-obj × 50) | allocs | 3 250 200 | 2 300 200 | **−29 %** |
| | bytes | 58.56 MB | 46.93 MB | **−20 %** |

(#3 only fires for non-integer / non-overflow numbers, so it adds 100 k allocs / 681 kB on top of #1+#2 on `json_parse` and nothing on `json_numbers`.)

Tests: `moon test --target native -p json` and `moon test --target wasm-gc -p json` both pass (171 / 171 each). `moon fmt --check` clean.
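The control flow of the NaN-sentinel change can be sketched in TypeScript. This is an illustration only: `NumberScan` and its fields are hypothetical (not the real `JsonNumberScan` layout), the bounds used for the fast path are the textbook ones (mantissa at most 2^53, power of ten at most 10^22) rather than the PR's `checked_mul` guard, and JavaScript has no boxing cost to save. What carries over is the shape: the fast path can never legitimately yield NaN, so NaN signals "fall back".

```typescript
// Hypothetical scan result; exponent10 is assumed to be an integer.
interface NumberScan {
  negative: boolean;
  mantissa: number;   // decimal digits as an integer
  exponent10: number; // power of ten to apply
}

const MAX_EXACT = 9007199254740992; // 2^53
// Powers of ten up to 1e22 are exactly representable as doubles.
const POW10: number[] = [
  1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6, 1e7, 1e8, 1e9, 1e10, 1e11,
  1e12, 1e13, 1e14, 1e15, 1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22,
];

// Returns a finite double on the fast path, NaN when the input is not handled.
function tryFastDouble(scan: NumberScan): number {
  const p = POW10[Math.abs(scan.exponent10)];
  if (p === undefined || scan.mantissa > MAX_EXACT) return NaN;
  const magnitude = scan.exponent10 < 0 ? scan.mantissa / p : scan.mantissa * p;
  return scan.negative ? -magnitude : magnitude;
}

// Caller: a NaN result means "use the slow, general parser instead".
function parseNumber(scan: NumberScan, sourceText: string): number {
  const fast = tryFastDouble(scan);
  return isNaN(fast) ? Number(sourceText) : fast;
}
```

The sentinel is safe here because the largest fast-path product, about 9e37, is far below the double overflow limit, so the fast path yields neither infinity nor NaN.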

mizchi and others added 18 commits December 22, 2025 13:01

docs: add Double::to_string optimization report

ec103e5

🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

refactor(regex): remove alias.mbt and top.mbt

327ac87

Files replaced by JS FFI target-specific implementations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

finish

d838e9b

mizchi changed the title ~~Poc:~~ PoC: FFI Bundle Size Dec 22, 2025

mizchi wants to merge 18 commits into main from jsbench
