benchctl: clargs + sqlite_linq rewrite, sql/ benchmarks, where+count fold #2599
Merged
…fold

benchctl
- replace hand-rolled flags.das with daslib/clargs (single struct, positional command + files, structured filters)
- replace raw sqlite3_* calls in bench_sql.das with [sql_table] Benchmark + with_sqlite + db |> insert(rows) bulk + _sql(... |> select_from |> _where(...)) for query and compare paths
- structured filter flags (--commit / --tag / --old-commit / --old-tag / --new-commit / --new-tag) replace user-supplied --select raw-WHERE; flag composition uses _sql || empty-string short-circuit so one call site covers all flag combinations
- isolate clargs in bench_args.das (private require) to work around an Option<string> ambiguity bug when daslib/clargs and sqlite/sqlite_boost are in the same module (filed as #2598)
- delete bench_sql.das, flags.das; add bench_args.das, bench_table.das
- README rewritten for the structured-flag surface

benchmarks/sql/
- new comparison suite mirroring tests/dasSQLITE/parity_check_*.das shape but oriented to throughput: _common.das fixture + select_where.das, select_where_order_take.das, count_aggregate.das
- 6 modes per file: m1m / m1d (_sql, mem/disk), m2m / m2d (no _sql, select_from materializes, mem/disk), m3 (array LINQ), m3f (_fold fused array LINQ); disk DBs created+deleted outside timed block
- benchmarks/README.md adds the sql/ section

daslib/linq_boost
- new fold_where_count + ["where_", "count"] FoldSequence: emits a single-pass invoke($(source) { var n; for it in source; if pred(it) n++; return n }, top) with the predicate spliced via fold_linq_cond
- eliminates intermediate filter array AND block-call overhead
- count_aggregate m3f: 25.5 -> 8 ns/op (INTERP), 3 ns/op (JIT); zero alloc; ~5x faster than _sql at 10K rows in memory

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
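The empty-string short-circuit mentioned in the commit message (an unset filter makes its clause trivially true, so one query serves every flag combination) can be illustrated outside daslang. What follows is a minimal Python sketch using the standard `sqlite3` module, not the benchctl code: the table layout, the column names `commit_hash` and `tag`, and the helpers `make_demo_db` and `query_benchmarks` are all assumptions made for illustration.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory table (hypothetical schema, for illustration only)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE benchmark ("
        "id INTEGER PRIMARY KEY, name TEXT, commit_hash TEXT, tag TEXT)"
    )
    db.executemany(
        "INSERT INTO benchmark (name, commit_hash, tag) VALUES (?, ?, ?)",
        [("a", "abc123", "before"), ("b", "abc123", "after"), ("c", "def456", "after")],
    )
    return db


def query_benchmarks(db, commit="", tag=""):
    """Return benchmark names matching the filters; an empty filter matches every row."""
    # Each clause has the shape "(filter is empty OR column matches)", so an
    # unset flag collapses to TRUE and the same statement covers all combinations.
    sql = (
        "SELECT name FROM benchmark "
        "WHERE (:commit = '' OR commit_hash = :commit) "
        "AND (:tag = '' OR tag = :tag) "
        "ORDER BY id"
    )
    return [row[0] for row in db.execute(sql, {"commit": commit, "tag": tag})]
```

Calling `query_benchmarks(db)` with no filters returns every row, while passing both `commit` and `tag` narrows the result with AND semantics, without building the SQL string conditionally.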
Pull request overview
This PR modernizes utils/benchctl to use the newer daslib/clargs and typed sqlite_* APIs, adds a dedicated SQL/LINQ throughput benchmark suite, and introduces a new _fold optimization that fuses where + count into a single pass.
Changes:
- Rewrites `benchctl` CLI parsing to `daslib/clargs` and replaces raw `sqlite3_*` calls with `[sql_table]` + `SqlRunner` + `select_from`/`_where`/`_sql`.
- Adds `benchmarks/sql/` suite comparing `_sql` vs non-macro DB LINQ vs pure in-memory LINQ (materializing vs `_fold`).
- Adds a new `daslib/linq_boost` fold rule to fuse `_where(p) |> count()` into an inlined single-pass counter loop.
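To make the where + count fusion concrete, here is a small Python sketch of the two evaluation strategies. It is an analogy only: the actual rule is a daslang compile-time macro that splices the predicate into the loop body, which Python cannot express, so the predicate is still a function call here. The function names are hypothetical.

```python
def count_materialized(source, pred):
    """Unfused shape: build the filtered list first, then take its length."""
    filtered = [item for item in source if pred(item)]  # intermediate allocation
    return len(filtered)


def count_fused(source, pred):
    """Fused shape: a single pass with one counter and no intermediate list."""
    n = 0
    for item in source:
        if pred(item):
            n += 1
    return n
```

Both functions return the same count for any input; the fused form simply never allocates the filtered sequence, which is the saving the fold rule targets.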
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| utils/benchctl/utils.das | Enables gen2 and reads input files in binary mode for cross-platform JSON ingest consistency. |
| utils/benchctl/table_fmt.das | Enables gen2 for benchctl table formatting helpers. |
| utils/benchctl/README.md | Updates docs for structured filters, --no-color, and removal of raw SQL selection flags. |
| utils/benchctl/main.das | New clargs-based flow and typed SQLite interactions for reset/insert/query/compare. |
| utils/benchctl/flags.das | Removes legacy hand-rolled flag parsing. |
| utils/benchctl/benchstat.das | Switches from BenchmarkEntry to the new typed [sql_table] Benchmark. |
| utils/benchctl/bench_table.das | Introduces typed [sql_table] Benchmark schema for benchctl storage. |
| utils/benchctl/bench_sql.das | Removes legacy raw SQL helpers and schema strings. |
| utils/benchctl/bench_args.das | Adds clargs wrapper module to avoid Option<string> ambiguity with sqlite modules. |
| daslib/linq_boost.das | Adds where_ + count fold rule emitting an inlined single-pass loop. |
| benchmarks/sql/_common.das | Shared fixture (Car table + DB/array setup helpers) for SQL throughput benchmarks. |
| benchmarks/sql/select_where.das | Adds benchmark for _where filtering across DB/array modes. |
| benchmarks/sql/select_where_order_take.das | Adds benchmark for _where + _order_by + take across DB/array modes. |
| benchmarks/sql/count_aggregate.das | Adds benchmark for _where + count, showcasing new fold optimization. |
| benchmarks/README.md | Documents the new benchmarks/sql/ suite and its mode matrix. |
Copilot review on PR #2599:
- run_reset_cmd now uses try_drop_table_if_exists / try_create_table and returns an error string; the implicit-init path propagates it instead of crashing on a corrupt or locked DB.
- run_query_cmd rejects multiple --tag values for query (previously silently used only the first); compare-side scalar flags unchanged.
- Insert tag loop skips empty tags and rejects '[' / ']' in tag values (the bracket scheme cannot delimit them safely).
- README annotates --tag as single-value for query, bracket-free for insert.

benchmarks/sql/ simplification (per Boris): drop the m1d / m2m / m2d modes (disk vs memory was a one-shot finding; no-_sql DB modes only re-prove that select_from materializes -- already documented). Each file now compares 3 modes: m1 (_sql over :memory:), m3 (plain array LINQ), m3f (_fold-fused array LINQ). Removed disk_db_setup / cleanup helpers from _common.das; updated benchmarks/README.md mode matrix.

count_aggregate bumped to 1M; per-element cost is flat from 100k -> 1M across all three modes (m1 38, m3 12, m3f 3 ns/op).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
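The tag rule above (skip empty tags, reject `[` and `]`) is easier to see with a concrete encoding. The sketch below is Python and rests on an assumption not stated in the commit message: that tags are stored concatenated as `[a][b]`, which is why a bracket inside a value would be ambiguous. The helper names `check_tag` and `encode_tags` are hypothetical.

```python
def check_tag(tag):
    """Return an error message for an unsafe tag, or '' when the tag is acceptable."""
    if "[" in tag or "]" in tag:
        return f"tag {tag!r} must not contain '[' or ']'"
    return ""


def encode_tags(tags):
    """Join tags as '[a][b]' (assumed storage format); empty tags are skipped."""
    parts = []
    for tag in tags:
        if tag == "":
            continue  # empty tags carry no information, so they are dropped
        err = check_tag(tag)
        if err:
            raise ValueError(err)  # a bracket would break the delimiter scheme
        parts.append(f"[{tag}]")
    return "".join(parts)
```

Under this assumed format, `encode_tags(["before", "", "jit"])` yields `[before][jit]`, while a value such as `a]b` is rejected up front because it could not be split back unambiguously.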
…nt fold test

Copilot review round 2 on PR #2599:
- run_subcommand validates `cmd` against {reset, insert, query, compare} before opening the SQLite DB. Typos (e.g. `qurey`) no longer create an empty benchdata.db as a side effect.
- Extracted validate_tag_chars helper; insert/query/compare all reject '[' or ']' in tag values consistently (previously only insert did).
- run_compare_cmd restores the overlap-exclusion semantic the old code had: collects --old result IDs into a table<int;bool> and post-filters --new entries against it. Identical/empty filters on both sides can no longer compare a result set against itself. Implemented in daslang (no raw-SQL escape hatch).
- benchmarks/README.md table previously claimed `count_aggregate` ran at 100K; updated to 1M to match the shipped benchmark constant.

Two adds:
- benchmarks/sql/indexed_lookup.das: point-lookup benchmark (`_where(_.id == K)` against PRIMARY KEY) at 1M rows. Inverse-asymmetry pair to count_aggregate: SQLite's b-tree wins by ~1000x (m1 ~3 us, m3 ~9 ms, m3f ~3.4 ms per lookup at JIT). Documents where indexed storage earns its keep.
- tests/linq/test_linq_fold.das: regression test for the where+count fold rule added in fcb6481. Covers half-match, zero-match, all-match, and empty-source cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
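The overlap-exclusion step can be sketched in a few lines. This is a Python illustration of the idea (collect the old-side IDs, then post-filter the new side), not the daslang code; the row shape (dicts with an `id` key) and the function name `exclude_overlap` are assumptions, and a Python `set` stands in for the `table<int;bool>` lookup.

```python
def exclude_overlap(old_rows, new_rows):
    """Drop every new-side row whose id already appears on the old side."""
    old_ids = {row["id"] for row in old_rows}  # membership lookup, like table<int;bool>
    return [row for row in new_rows if row["id"] not in old_ids]
```

When both sides resolve to the same rows (for example, identical or empty filters), the filtered new side comes back empty, so a result set is never compared against itself.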
Three coupled changes that fall out of modernizing `utils/benchctl/` onto the new SQL surface.

1. `utils/benchctl/` rewrite

- Hand-rolled flag parsing (`flags.das`) replaced with `daslib/clargs`: single `[CommandLineArgs]` struct, positional `command` + `files`, structured filter flags. Help via `-?` (script-host eats `--help`).
- Raw `sqlite3_*` calls in `bench_sql.das` replaced with `[sql_table] Benchmark`, `with_sqlite` RAII, `db |> insert(rows)` bulk transactions, and `_sql(... |> select_from(type<Benchmark>) |> _where(...))` for both `query` and `compare`.
- `--select` / `--select_old` / `--select_new` raw-WHERE flags removed. Replaced with structured composable filters: `--commit`, `--tag` (insert-side: repeatable; query-side: single-value, bracket-free), `--old-commit`, `--old-tag`, `--new-commit`, `--new-tag`. Filters compose with AND via the `_sql` `||` empty-string short-circuit pattern (one call site, all flag combinations). Workflow becomes `compare --old-tag before --new-tag after`.
- `--colors false` → `--no-color` (clargs convention).
- Arbitrary predicates (`string_allocs > 0`, etc.): README points users at the `sqlite3` shell directly. No raw-SQL escape hatch in the tool itself.
- `bench_sql.das` and `flags.das` deleted; `bench_table.das` (the `[sql_table]` struct) and `bench_args.das` (clargs wrapper) added. `BenchmarkEntry` collapses into the `[sql_table] Benchmark` struct.
- Added `options gen2` to `utils.das` and `table_fmt.das` (transitive macro requirements force gen2 parser mode in callers); `read_file` opens in `"rb"` mode (Windows CRLF was tripping `fread`'s byte-count check on the JSON ingest path).

Behavioral hardening (review round 2)

- `run_subcommand` validates `cmd` against {reset, insert, query, compare} before opening the SQLite DB. Typos like `qurey` no longer create an empty `benchdata.db` as a side effect.
- `validate_tag_chars` helper rejects `[` / `]` in tag values consistently across insert/query/compare (insert-side originally; now query and compare too). Empty tags are skipped on insert.
- `--tag` for `query` rejects multiplicity > 1 with a clear error (insert-side `--tag` remains repeatable; compare-side flags are scalar already).
- `run_compare_cmd` post-filters `--new` rows against `--old` IDs (`table<int;bool>` lookup), restoring the overlap-exclusion semantic the old code had via `WHERE id NOT IN (...)`. Without it, identical/empty filters on both sides would compare a result set against itself.
- `run_reset_cmd` uses `try_drop_table_if_exists` / `try_create_table` and returns an error string; the implicit-init path threads the error through `run_subcommand` instead of crashing on a corrupt or locked DB.

Workaround for an upstream typer bug

`bench_args.das` exists because combining `daslib/clargs` and `sqlite/sqlite_boost` in the same module makes `Option<string>` ambiguous in instantiation contexts: three mod-aliased views of the same `daslib/option.das:30` template collide. Filed as #2598 with a minimal repro. The wrapper requires clargs privately and returns a plain struct with no `Option` fields, keeping `clargs::Option` out of `main.das`'s namespace.

2. `benchmarks/sql/` (new)

Comparison suite mirroring `tests/dasSQLITE/parity_check_*.das` but oriented to throughput. Four files (`select_where`, `select_where_order_take`, `count_aggregate`, `indexed_lookup`) plus a `_common.das` fixture with a `[sql_table] Car` struct + `fixture_db` / `fixture_array` helpers. Three modes per file:

| Mode | Source | Query path |
|---|---|---|
| m1 | `:memory:` SQLite | `_sql`: compile-time SQL emission, work pushed to the engine |
| m3 | `array<Car>` | plain array LINQ |
| m3f | `array<Car>` | `_fold` from `daslib/linq_boost`: fuses the chain into a single pass |

| File | What it measures |
|---|---|
| `select_where.das` | `_where` filtering |
| `select_where_order_take.das` | `_where` + `_order_by` + `take` |
| `count_aggregate.das` | `count()` after `_where` over 1M rows. m3f wins 12× over `_sql` and 4× over m3: the where+count fold beats SQLite's COUNT(*) roundtrip on already-materialized data. |
| `indexed_lookup.das` | `_where(_.id == K)` against PRIMARY KEY over 1M rows. Inverse-asymmetry: SQLite wins by ~1000× (m1 ≈ 3.1 µs, m3 ≈ 9 ms, m3f ≈ 3.4 ms per lookup). daslang has no indexed-lookup primitive on `array<T>`, so it pays full O(n). Documents where indexed storage earns its keep. |

3. `daslib/linq_boost`: `where + count` fold

New `["where_", "count"]` FoldSequence + `fold_where_count` macro function. Emits an inlined single-pass loop (`var n = 0; for it in src; if <pred> n++; return n`) with the predicate spliced via `fold_linq_cond`. No intermediate filter array, no block-call overhead.

Numbers from `count_aggregate_1m_m3f` (1M Cars, predicate match rate ~50%): JIT-compiled m3f at 3 ns/op is ~12× faster than `_sql` (38 ns) at 1M rows. Per-element cost is flat from 100K → 1M; all three modes scale linearly.

Regression test in `tests/linq/test_linq_fold.das::test_where_count_fold` covers half-match, zero-match, all-match, and empty-source cases.

Test plan

- Lint (`utils/lint/main.das` on every changed `.das` file)
- `tests/` suite (7944 passed, 0 failed/errors): no regressions in `tests/linq/` or `tests/dasSQLITE/`
- `.das` files MCP-formatted
- `reset` → `insert --tag before` → re-run → `insert --tag after` → `compare --old-tag before --new-tag after` produces clean Welch's t-test output
- `qurey` rejected without creating `benchdata.db`

🤖 Generated with Claude Code