perf(zeroex_v2 trades): materialize settler-txs staging across all 17 chains (collapse ~14x traces re-scan) by a-monteiro · Pull Request #9734 · duneanalytics/spellbook

a-monteiro · 2026-06-09T18:29:57Z

Materializes the 0x Settler transaction scan that the zeroex_v2_<chain>_trades models build inline, across all 17 chains (arbitrum, avalanche_c, base, berachain, blast, bnb, ethereum, ink, linea, mantle, mode, monad, optimism, polygon, scroll, unichain, worldchain).

The settler filter searches transaction calldata (varbinary_position(input, <selector>)), which is non-pushable: the predicate cannot be pushed down into the table scan. As a result, the zeroex_tx CTE (referenced four times in the zeroex_v2_trades macro and re-expanded through tbl_all_logs) was inlined by Trino into ~14 full scans of <chain>.traces on every build. This was the single biggest cost driver on the spellbook-dex cluster (~110 CPU-hrs/day, ~24% of the cluster).

The fix extracts zeroex_settler_txs_cte into a materialized incremental staging model (zeroex_v2_<chain>.settler_txs, partitioned by block_month) and reads it via ref() in each trades model, so traces is scanned once per run instead of ~14x. No macro logic changes; the trades models just source the pre-computed CTE.
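As a toy illustration of why materializing helps, the sketch below counts how often an expensive scan runs when a result is recomputed per reference versus computed once and reused. Everything here is hypothetical (`scan_traces`, the sample rows, the reference count of 4); it models only the four direct CTE references, not the further re-expansion through tbl_all_logs that brings the real figure to ~14.

```python
from collections import Counter

# Counts how many times each chain's traces table is "scanned".
scan_counts = Counter()


def scan_traces(chain):
    """Hypothetical stand-in for the expensive calldata filter over <chain>.traces."""
    scan_counts[chain] += 1
    # Made-up rows of the form (tx_hash, zid).
    return [("0xaa", 1), ("0xbb", 2)]


def build_inline(chain, references=4):
    # Every reference re-runs the scan, as when the engine inlines the CTE.
    return [scan_traces(chain) for _ in range(references)]


def build_materialized(chain, references=4):
    # The staging result is computed once and every reference reads it.
    staged = scan_traces(chain)
    return [staged for _ in range(references)]
```

Both builders return the same data; only the number of scans differs, which is the property the staging model relies on.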

Measured A/B (warm runs, prod, fixed 1-day window). NEW total includes the staging build cost:

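| chain | OLD CPU | NEW CPU | CPU change | OLD scan | NEW scan | IO change | peak mem (OLD→NEW) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| bnb | 2583s | 1622s | -37% | 268 GB | 177 GB | -34% | 11.2→13.6 GB |
| base | 1786s | 1073s | -40% | 198 GB | 163 GB | -18% | 19.8→11.4 GB |
| polygon | 2123s | 1008s | -52% | 240 GB | 172 GB | -28% | 16.2→2.9 GB |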

The IO cut varies with how traces-dominated each chain is (bnb scans the most traces); the CPU cut is consistently 37-52%, and per-task peak memory drops on base and polygon but rises on bnb (11.2→13.6 GB). <chain>.traces goes from ~14 inline scans to 1 staging scan.
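The percentage columns can be re-derived from the raw CPU and scan figures in the A/B table. A minimal sketch, using only the numbers reported above (the helper name `pct_change` is illustrative):

```python
# (old, new) pairs copied from the A/B table: CPU in seconds, scan in GB.
measurements = {
    "bnb": {"cpu": (2583, 1622), "scan_gb": (268, 177)},
    "base": {"cpu": (1786, 1073), "scan_gb": (198, 163)},
    "polygon": {"cpu": (2123, 1008), "scan_gb": (240, 172)},
}


def pct_change(old, new):
    """Relative change from old to new, in percent (negative means a reduction)."""
    return (new - old) / old * 100


for chain, m in measurements.items():
    cpu = pct_change(*m["cpu"])
    io = pct_change(*m["scan_gb"])
    print(f"{chain}: CPU {cpu:.1f}%, IO {io:.1f}%")
```

The polygon CPU figure works out to roughly -52.5%, which the table reports as -52%.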

Equivalence proven per chain:

- bnb: full-pipeline (old EXCEPT new) = (new EXCEPT old) = 0 over a fixed window.
- base & polygon: the OLD (inline) and NEW (reads staging) full models return identical row counts and identical checksum() over all 29 output columns.
- The materialization is safe because settler_txs.rn (ROW_NUMBER over tx_hash ordered by zid) is deterministic. Determinism checks on prod (7-day window) confirm the staging unique key (block_month, tx_hash, rn) holds with 0 collisions on all 17 chains; the only zid ties found (base 5, ethereum 2; all others 0) are byte-identical in every downstream column, so which row gets rn=1 is irrelevant. (If a tie were ever non-benign, the current 4x-inlined model would already be nondeterministic, since base_filtered_logs filters rn = 1; materializing freezes one consistent assignment.)
- Staging unique_combination_of_columns test passes on real data for all chains. (blast has no settler activity in-window → empty staging, harmless.)
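The two-way EXCEPT check used for bnb can be sketched with Python sets, which share EXCEPT's distinct-row semantics. This is an illustration under assumptions: the rows are invented tuples, and the extra row-count comparison is added here because set difference alone ignores duplicate multiplicity (the base and polygon checks compare row counts for the same reason).

```python
def except_rows(left, right):
    """Rows in left that are absent from right, with set semantics like SQL EXCEPT."""
    return set(left) - set(right)


def equivalent(old_rows, new_rows):
    """True when both EXCEPT directions are empty and the row counts match."""
    return (
        not except_rows(old_rows, new_rows)
        and not except_rows(new_rows, old_rows)
        and len(old_rows) == len(new_rows)
    )


# Hypothetical model outputs: (tx_hash, token, amount).
old = [("0xaa", "WBNB", 10), ("0xbb", "USDT", 25)]
new = [("0xbb", "USDT", 25), ("0xaa", "WBNB", 10)]  # same rows, different order
changed = [("0xaa", "WBNB", 10), ("0xbb", "USDT", 26)]

print(equivalent(old, new))
print(equivalent(old, changed))
```

Checking both directions matters: a one-sided difference would miss rows that exist only in the new output.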

Estimated family saving ~30-40 CPU-hrs/day; the three measured chains alone (bnb+base+polygon) save ~30 CPU-hrs/day, and the measured CPU reductions exceed the original -37% estimate.

The related zeroex_*_api_fills family (same zeroex_tx-from-traces pattern, referenced ~11x) is a separate follow-up.

Fixes CUR2-2706

…se repeated traces scans

Extract zeroex_settler_txs_cte into a materialized incremental staging model (zeroex_v2_bnb.settler_txs) and read it via ref() in the trades model. The settler-trace filter searches calldata (varbinary_position on input), which is non-pushable, so Trino re-inlined the zeroex_tx CTE into ~14 full scans of bnb.traces per build. Materializing scans traces once.

Measured on a fixed 1-day window (warm): CPU 2583s->1622s (-37%), scan 268GB->177GB (-34%), bnb.traces IO 14x->1x. Proven equivalent: full-pipeline (old EXCEPT new) = (new EXCEPT old) = 0.

a-monteiro · 2026-06-09T18:30:10Z

This stack of pull requests is managed by Graphite. Learn more about stacking.

cursor · 2026-06-09T18:49:11Z

PR Summary

Medium Risk
Changes the build graph and incremental merge path for a core DEX trades model; logic is delegated to staging but row identity depends on deterministic rn assignment frozen at materialization time.

Overview
Performance fix for BNB 0x v2 trades: the heavy settler transaction extraction from bnb.traces is no longer inlined in zeroex_v2_bnb_trades. A new incremental staging model zeroex_v2_bnb_settler_txs (zeroex_v2_bnb.settler_txs) runs zeroex_settler_txs_cte once per build; trades now ref() that table with the same incremental block_time filter instead of expanding the macro (~14 repeated trace scans under Trino).

Schema YAML adds the staging model with column docs and a uniqueness test on (block_month, tx_hash, rn). Downstream trade logic and macros are unchanged—only the data source for the zeroex_tx CTE.

Reviewed by Cursor Bugbot for commit fbe197c.

a-monteiro · 2026-06-09T18:49:33Z

Regression passed on prod (7-day window): 162,467 staging rows, 0 duplicate (block_month, tx_hash, rn), 0 dup (tx_hash, rn), 0 dup cow_rn — the staging unique_combination_of_columns test holds.
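The duplicate-key regression amounts to grouping staging rows by the candidate key and looking for any group larger than one. A minimal sketch, assuming rows are dictionaries with the key columns named in the test; the sample data is made up:

```python
from collections import Counter


def duplicate_keys(rows, key_columns):
    """Return each key tuple that occurs more than once, with its count."""
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical staging rows.
staging = [
    {"block_month": "2026-06-01", "tx_hash": "0xaa", "rn": 1},
    {"block_month": "2026-06-01", "tx_hash": "0xaa", "rn": 2},
    {"block_month": "2026-06-01", "tx_hash": "0xbb", "rn": 1},
]

print(duplicate_keys(staging, ["block_month", "tx_hash", "rn"]))
print(duplicate_keys(staging, ["tx_hash"]))
```

An empty result for (block_month, tx_hash, rn) corresponds to the "0 duplicate" outcome reported above.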

Per-chain rollout estimate (spellbook-dex, 24h). The 14x traces re-scan is structural (same macro, same 4 CTE references) on every chain, so the one-CTE swap applies identically. Applying the bnb-measured -37% CPU:

| chain | CPU-hrs/day | avg scan GB/run | est. saving @-37% |
| --- | --- | --- | --- |
| bnb (shipped here) | 28.6 | 43.8 | 10.6 |
| base | 21.6 | 14.9 | 8.0 |
| polygon | 21.5 | 28.6 | 8.0 |
| worldchain | 9.6 | 11.8 | 3.6 |
| monad | 7.8 | 7.2 | 2.9 |
| arbitrum | 4.8 | 4.4 | 1.8 |
| optimism | 4.1 | 4.5 | 1.5 |
| ethereum | 3.6 | 3.9 | 1.3 |
| berachain | 3.3 | 5.7 | 1.2 |
| avalanche_c | 2.6 | 3.4 | 1.0 |
| ink | 1.6 | 2.6 | 0.6 |
| unichain | 1.3 | 1.0 | 0.5 |
| mantle / scroll / linea / mode | ~1.1 | <0.2 | ~0.3 |
| family total | 111.5 | | ~41 CPU-hrs/day |

Top 4 (bnb, base, polygon, worldchain) = 81.3 CPU-hrs/day = 73% of the family — priority rollout. Caveat: -37% is bnb-measured; per-chain actuals scale with each chain's traces-scan share, and steady-state is marginally lower once the staging table's own incremental MERGE + OPTIMIZE is counted. Worth measuring base/polygon before the full rollout.
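The family total, the top-4 share, and the overall estimate follow directly from the per-chain CPU-hrs/day column. A short sketch that re-derives them, treating the approximate combined figure for mantle / scroll / linea / mode as exactly 1.1 (an assumption made only for the arithmetic):

```python
# CPU-hrs/day per chain, copied from the rollout table.
cpu_hrs_per_day = {
    "bnb": 28.6, "base": 21.6, "polygon": 21.5, "worldchain": 9.6,
    "monad": 7.8, "arbitrum": 4.8, "optimism": 4.1, "ethereum": 3.6,
    "berachain": 3.3, "avalanche_c": 2.6, "ink": 1.6, "unichain": 1.3,
    "mantle/scroll/linea/mode": 1.1,  # listed as ~1.1 combined
}

CPU_CUT = 0.37  # the bnb-measured reduction applied to every chain

family_total = sum(cpu_hrs_per_day.values())
top4 = sum(cpu_hrs_per_day[c] for c in ("bnb", "base", "polygon", "worldchain"))
est_saving = family_total * CPU_CUT

print(f"family total: {family_total:.1f} CPU-hrs/day")
print(f"top 4: {top4:.1f} CPU-hrs/day ({top4 / family_total:.0%} of family)")
print(f"estimated saving: {est_saving:.1f} CPU-hrs/day")
```

Swapping `CPU_CUT` for a per-chain value is the natural next step once base and polygon have their own measurements.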

…worldchain

Same materialization as bnb applied to the next 3 highest-cost chains. Each gets a zeroex_v2_<chain>.settler_txs staging model and a ref() swap so <chain>.traces is scanned once per build instead of ~14x inline.

Per-chain rn determinism verified on prod (7d): unique (block_month, tx_hash, rn) holds with 0 collisions; worldchain/polygon 0 zid-ties, base 5 ties all byte-identical in downstream columns (benign). Estimated saving at the bnb-measured -37% CPU: base ~8.0, polygon ~8.0, worldchain ~3.6 CPU-hrs/day.

…7 chains

Adds the remaining 13 chains (arbitrum, avalanche_c, berachain, blast, ethereum, ink, linea, mantle, mode, monad, optimism, scroll, unichain) to the settler-txs materialization. Each gets a zeroex_v2_<chain>.settler_txs staging model + ref() swap so <chain>.traces is scanned once per build instead of ~14x inline.

Per-chain rn determinism verified on prod (7d) before shipping: unique (block_month, tx_hash, rn) holds with 0 collisions on every chain; only ethereum had zid-ties (2) and all were byte-identical in downstream columns (benign). blast has no settler activity in-window (empty staging, harmless).
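Why a zid tie is benign when the tied rows are byte-identical can be shown with a small simulation. This is a sketch under assumptions: rows are invented (tx_hash, zid, payload) tuples, and reversing the input order stands in for an engine breaking the tie differently between runs.

```python
from itertools import groupby


def rn1_rows(rows):
    """Keep the first row per tx_hash after ordering by zid (the rn = 1 row).

    Python's sort is stable, so rows tied on (tx_hash, zid) keep their input
    order, which mimics an arbitrary tie-break.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    return {tx: next(group) for tx, group in groupby(ordered, key=lambda r: r[0])}


# Tie on zid with identical payloads: either row may win, the result is the same.
benign = [("0xaa", 7, "payload"), ("0xaa", 7, "payload")]

# Tie on zid with different payloads: the result depends on the tie-break.
non_benign = [("0xaa", 7, "payload_a"), ("0xaa", 7, "payload_b")]

print(rn1_rows(benign) == rn1_rows(benign[::-1]))
print(rn1_rows(non_benign) == rn1_rows(non_benign[::-1]))
```

Materializing the staging table fixes one tie-break for all downstream readers, whereas inlined copies could in principle each pick a different row.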

The new settler_txs staging models build as state:new in CI (full refresh), and the 17 modified trades models also full-refresh on the initial CI run - each scanning traces/logs/prices from start_date '2024-07-15'. Building all 17 chains at once blew the 1h30m CI budget. Floor start_date to the last 14 days when target.name == 'ci' inside the three shared macros (zeroex_settler_txs_cte, zeroex_v2_trades, zeroex_v2_trades_detail), which govern every full-refresh date filter. Production (target dunesql) is unchanged.
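The CI-only date floor can be sketched as a small function. This is a Python illustration of the rule described above, not the macro itself: the function name, its parameters, and passing `today` explicitly are assumptions made so the example is testable.

```python
from datetime import date, timedelta


def effective_start_date(start_date, target_name, today, ci_window_days=14):
    """Return the start date a full refresh should scan from.

    On the 'ci' target the start is floored to the last `ci_window_days` days;
    on any other target (for example 'dunesql') it is left unchanged.
    """
    if target_name == "ci":
        return max(start_date, today - timedelta(days=ci_window_days))
    return start_date


model_start = date(2024, 7, 15)
run_day = date(2026, 6, 9)

print(effective_start_date(model_start, "ci", run_day))
print(effective_start_date(model_start, "dunesql", run_day))
```

Using `max` keeps a model whose own start_date is already inside the window from being pushed earlier than it should be.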

github-actions Bot added the WIP (work in progress) label Jun 9, 2026

github-actions Bot added the dbt: dex (covers the DEX dbt subproject) label Jun 9, 2026

a-monteiro changed the title ~~perf(zeroex_v2_bnb_trades): materialize settler-txs staging to collapse repeated traces scans~~ perf(zeroex_v2_bnb_trades): materialize settler-txs staging to collapse ~14x bnb.traces re-scan Jun 9, 2026

a-monteiro marked this pull request as ready for review June 9, 2026 18:49

a-monteiro requested a review from a team June 9, 2026 18:49

github-actions Bot added the ready-for-review (this PR development is complete, please review) label and removed the WIP (work in progress) label Jun 9, 2026

a-monteiro changed the title ~~perf(zeroex_v2_bnb_trades): materialize settler-txs staging to collapse ~14x bnb.traces re-scan~~ perf(zeroex_v2 trades): materialize settler-txs staging for bnb/base/polygon/worldchain (collapse ~14x traces re-scan) Jun 9, 2026

a-monteiro changed the title ~~perf(zeroex_v2 trades): materialize settler-txs staging for bnb/base/polygon/worldchain (collapse ~14x traces re-scan)~~ perf(zeroex_v2 trades): materialize settler-txs staging across all 17 chains (collapse ~14x traces re-scan) Jun 9, 2026

jeff-dude approved these changes Jun 10, 2026


Merge branch 'main' into andre/zeroex-v2-settler-staging

3c9b891

jeff-dude added the ready-for-merging label and removed the ready-for-review (this PR development is complete, please review) label Jun 10, 2026

a-monteiro mentioned this pull request Jun 10, 2026

perf(zeroex api_fills): materialize settler-txs staging for bnb + polygon (collapse ~9x traces re-scan) #9744

Merged

tomfutago merged commit c882cbe into main Jun 11, 2026
11 checks passed

tomfutago deleted the andre/zeroex-v2-settler-staging branch June 11, 2026 12:06

github-actions Bot locked and limited conversation to collaborators Jun 11, 2026

tomfutago merged 5 commits into main from andre/zeroex-v2-settler-staging

3 participants
