perf(cypher): VarMap replaces Binding HashMap for clone-per-match path (FU-006 H2) by coseto6125 · Pull Request #433 · coseto6125/egent-code-plexus

coseto6125 · 2026-05-24T00:11:52Z

Summary

Third and final fix from the FU-006 follow-up spike's top-3 hotspot list. PR #432 attacked the iteration count (CSR kind lookup + walk_rel closure); this PR attacks the per-iteration clone cost. They compose orthogonally.

The cypher executor's Binding row holds three name→index maps that are cloned once per matched node in exec_pattern's frontier expansion. The old HashMap<String, u32> clone was the dominant cost — 67-85% of query time per the spike — because each clone heap-allocates a fresh bucket table and per-key String.

Replace node_vars and edge_vars with VarMap: a SmallVec of (CompactString, u32) entries that stays inline for the typical 2-4 vars per cypher query. Inline storage means Binding.clone() is a fixed-size memcpy with zero heap allocations on the hot path. CompactString keys inline up to 24 bytes (every realistic cypher var name), so the per-key clone is also alloc-free. Linear scan over ≤4 entries beats HashMap lookup at this size because there is no hashing cost.

Wider Binding API kept compatible — get(key: &str) -> Option<&u32>, contains_key(key: &str) -> bool, insert(key: &str, value: u32) — identical to the HashMap surface so existing call sites just drop the now-redundant var.clone() before insert. Five sites updated.

The computed field stays HashMap<String, Value>. It carries non-Copy Value items and is populated only by WITH clauses, never during frontier expansion. Empty-HashMap clone is a 56-byte memcpy with no allocation, so the savings would not justify the Option<Box<HashMap>> ergonomics churn.

Empirical (--runs 10, freshly-rebuilt `.sample_repo`, same-session A/B vs main)

Phase	Main	H2	Δ
*cypher count() ungrouped**	61.4 ms	35.6 ms	-42%
cypher decorator IN	48.8 ms	41.4 ms	-15%
find (bm25)	22.1 ms	18.9 ms	-14%
inspect (Class)	23.3 ms	20.3 ms	-13%
impact upstream	18.3 ms	16.0 ms	-13%
impact downstream	18.1 ms	16.0 ms	-12%
summary	19.9 ms	17.8 ms	-11%
summary --detailed	19.6 ms	17.4 ms	-11%
routes	12.7 ms	11.4 ms	-10%
cypher Method-Calls→Method	74.4 ms	73.3 ms	-1%
cypher Class→Method	45.0 ms	47.9 ms	+6% (noise)

count(*) is the headline (~26 ms saved on the 110k-Method frontier-expansion clone path); the consistent -10% to -14% across other query shapes comes from the same Binding clone being on more code paths than just exec_pattern (inspect / impact / find all build bindings during their internal traversals).

Composes with PR #432 (kind-CSR + walk_rel closure)

Query	Main	#432 H1+H3	This H2	Stacked expected
cypher count(*)	61.4 ms	32.4 ms (-47%)	35.6 ms (-42%)	~20 ms (-67%)
cypher decorator IN	48.8 ms	39.1 ms (-20%)	41.4 ms (-15%)	~33 ms (-33%)

H1+H3 cuts iteration count from 303k → 110k; H2 cuts per-iteration clone from 74 ns → ~5 ns. Both halve their respective axis, so combined should compound. Will validate empirically once #432 lands and this rebases on top.

Spike attribution

Spike report identified all three hotspots; this PR completes the trio:

H1 (kind-CSR scan): shipped in PR perf(cypher): exec_pattern kind-CSR scan + walk_rel closure (FU-006 follow-up) #432
H2 (Binding SmallVec): this PR
H3 (walk_rel closure): shipped in PR perf(cypher): exec_pattern kind-CSR scan + walk_rel closure (FU-006 follow-up) #432

Test plan

cargo test -p egent-code-plexus --tests — 1182 passed
cargo test -p ecp-core --tests — 268 passed
cargo clippy -p egent-code-plexus --tests -- -D warnings — clean
cargo clippy -p ecp-core --tests -- -D warnings — clean
Same-session A/B against main on 11-phase benchmark — zero regression, -10% to -42% wins
Manual cross-check that the apparent impact --baseline HEAD~1 slowdown in the bench was a bench-state confound (both main and H2 take ~1.4s when L1 needs refresh; the bench's 9 ms reading on main was from a previously-warm L1 state)

github-actions · 2026-05-24T00:22:23Z

ecp impact cache (0 symbols) — internal, used by ecp dev pr-analyze

[]

The cypher executor's `Binding` row holds three name->index maps that are cloned once per matched node in `exec_pattern`'s frontier expansion. The old `HashMap<String, u32>` clone was the dominant cost — 67-85% of query time per the FU-006 follow-up spike — because each clone heap-allocates a fresh bucket table and per-key `String`. Replace `node_vars` and `edge_vars` with `VarMap`: a `SmallVec` of `(CompactString, u32)` entries that stays inline for the typical 2-4 vars per cypher query. Inline storage means `Binding.clone()` is a fixed-size memcpy with zero heap allocations on the hot path. `CompactString` keys inline up to 24 bytes (every realistic cypher var name), so the per-key clone is also alloc-free. Linear scan over <=4 entries beats HashMap lookup at this size because there is no hashing cost. Wider Binding API kept compatible: `get(key: &str) -> Option<&u32>`, `contains_key(key: &str) -> bool`, `insert(key: &str, value: u32)` — identical to the HashMap surface so existing call sites only drop the now-redundant `var.clone()` before insert. Five sites updated. The `computed` field stays `HashMap<String, Value>` — it carries non-Copy `Value` items and is populated only by WITH clauses, never during frontier expansion. Empty-HashMap clone is a 56-byte memcpy with no allocation, so the savings would not justify the `Option<Box<HashMap>>` ergonomics churn. Deps: ecp-core picks up direct `smallvec` and `compact_str` (both already present transitively, so no new vendored grammar work). ## Empirical (--runs 10, freshly-rebuilt .sample_repo, same-session A/B) cypher count(*) ungrouped 61.4 ms -> 35.6 ms -42% find (bm25) 22.1 ms -> 18.9 ms -14% impact upstream 18.3 ms -> 16.0 ms -13% impact downstream 18.1 ms -> 16.0 ms -12% inspect (Class) 23.3 ms -> 20.3 ms -13% summary 19.9 ms -> 17.8 ms -11% summary --detailed 19.6 ms -> 17.4 ms -11% routes 12.7 ms -> 11.4 ms -10% cypher decorator IN 48.8 ms -> 41.4 ms -15% cypher Class->Method 45.0 ms -> 47.9 ms +6% (noise) cypher Method-Calls->Method 74.4 ms -> 73.3 ms -1% Count(*) is the headline (~26 ms saved on the 110k-Method frontier- expansion clone path); the consistent -10% to -14% across other query shapes comes from the same Binding clone being on more code paths than just exec_pattern. Tests: 1182 (egent-code-plexus) + 268 (ecp-core) all pass, clippy clean. Composes orthogonally with PR #432's exec_pattern kind-CSR + walk_rel closure refactor — those attack the iteration count, this attacks the per-iteration clone cost.

coseto6125 enabled auto-merge (squash) May 24, 2026 00:11

coseto6125 added the merge-queue Opt-in to Mergify merge queue label May 24, 2026

github-actions Bot added the ecp:risk-high ecp signal label May 24, 2026

coseto6125 force-pushed the perf/cypher-binding-smallvec branch from d68534e to bd61b75 Compare May 24, 2026 00:29

github-actions Bot added ecp:risk-low ecp signal and removed ecp:risk-high ecp signal labels May 24, 2026

coseto6125 merged commit 64c7cfd into main May 24, 2026
18 checks passed

coseto6125 deleted the perf/cypher-binding-smallvec branch May 24, 2026 00:46

coseto6125 mentioned this pull request May 24, 2026

release: 0.4.0 — bump workspace + crate versions #436

Merged

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(cypher): VarMap replaces Binding HashMap for clone-per-match path (FU-006 H2)#433

perf(cypher): VarMap replaces Binding HashMap for clone-per-match path (FU-006 H2)#433
coseto6125 merged 1 commit into
mainfrom
perf/cypher-binding-smallvec

coseto6125 commented May 24, 2026

Uh oh!

github-actions Bot commented May 24, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

coseto6125 commented May 24, 2026

Summary

Empirical (--runs 10, freshly-rebuilt .sample_repo, same-session A/B vs main)

Composes with PR #432 (kind-CSR + walk_rel closure)

Spike attribution

Test plan

Uh oh!

github-actions Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Empirical (--runs 10, freshly-rebuilt `.sample_repo`, same-session A/B vs main)

github-actions Bot commented May 24, 2026 •

edited

Loading