sqlite_linq: add long_count + _distinct_by dispatch arms#2845
Merged
borisbat merged 1 commit intoMay 24, 2026
Conversation
Two new chain shapes lower in `_sql(...)`: - `|> long_count()` → `SELECT COUNT(*)` with an int64 column binding, symmetric with `|> count()` (int). Also recognised in the grouped projection arm so `_select((..., N = _._1 |> long_count()))` widens the per-group counter to int64. - `|> _distinct_by(_.col) |> count()` (and the long_count variant) → `SELECT COUNT(DISTINCT "col")`. Receiver-pinned to ExprVar so `u.opt.X` keys reject loudly instead of silently emitting the wrong COUNT. `peel_count_terminal` factors the shared count/long_count peel so both share the distinct-by recognition path. Test surface: - `tests/dasSQLITE/test_31_long_count_distinct_by_canary.das` — 8 new tests covering SQL text emission + runtime values for both terminals, plus the int64-grouped projection. - `failed_sql_macro.das` case 23 — computed `_distinct_by` key (e.g. `_.Price % 100`) rejects with a precise diagnostic. Bench corpus: - `benchmarks/sql/distinct_by_count.das` — m1 lane backfilled (`_sql(... |> _distinct_by(_.brand) |> count())`); the dated TODO comment is removed. - `benchmarks/sql/long_count_aggregate.das` — m1 lane now actually calls `long_count()` instead of `count()`; the int64 accept matches m3f/m4. Latent bench-bug fixed. - `benchmarks/sql/results.md` — refreshed INTERP + JIT tables; commit header bumped; the `distinct_by_count SQL` bullet in "Notes on missing lanes" is removed now that the lane has a number. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR extends the SQLite LINQ lowering (_sql(...)) to support two additional terminal chain shapes: long_count() (returning int64) and _distinct_by(_.Field) |> count()/long_count() (lowering to COUNT(DISTINCT "col")). It also updates benches and adds canary tests to validate both SQL text emission and runtime behavior.
Changes:
- Add
_sqllowering forlong_count()including group aggregate translation (_._1 |> long_count()→COUNT(*)typed as int64). - Add
_distinct_by(_.Field) |> count()/long_count()lowering toCOUNT(DISTINCT "Field"), plus a shared peel helper forcount/long_count. - Add tests and update benchmarks/results to cover the new SQL-lowered lanes.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| modules/dasSQLITE/daslib/sqlite_linq.das | Adds shared count/long_count terminal peeling and _distinct_by→COUNT(DISTINCT ...) support; adds long_count support in grouped aggregates. |
| tests/dasSQLITE/test_31_long_count_distinct_by_canary.das | New canary tests for long_count and distinct_by+count/long_count (SQL text + runtime values). |
| tests/dasSQLITE/failed_sql_macro.das | Adds a new negative case ensuring computed _distinct_by keys reject with a diagnostic. |
| benchmarks/sql/distinct_by_count.das | Adds/reenables the SQL (m1) lane for distinct_by_count using the new lowering. |
| benchmarks/sql/long_count_aggregate.das | Fixes the SQL lane to actually benchmark long_count() and updates the zero-check to int64. |
| benchmarks/sql/results.md | Refreshes benchmark tables to include the new SQL lane and updated numbers. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+620
to
+624
| if (cCall.arguments |> length != 1) { | ||
| macro_error(prog, at, "_sql: `_{linqName}(predicate)` is not yet supported — use `|> _where(predicate) |> _{linqName}()` instead") | ||
| q.hadError = true | ||
| return true | ||
| } |
Comment on lines
+590
to
+605
| var body = peel_lambda_single_return(dbCall.arguments[1]) | ||
| if (body == null) { | ||
| macro_error(prog, at, "_sql: _distinct_by key lambda has unexpected shape") | ||
| q.hadError = true | ||
| return "" | ||
| } | ||
| var receiver : ExpressionPtr | ||
| var fieldName : string | ||
| if (!qmatch(body, $e(receiver).$f(fieldName)).matched | ||
| || receiver == null || !(receiver is ExprVar)) { | ||
| macro_error(prog, at, "_sql: _distinct_by key must be `<arg>.<field>` (single column reference) — computed keys are not yet translatable to SQL") | ||
| q.hadError = true | ||
| return "" | ||
| } | ||
| distinctSource = dbCall.arguments[0] | ||
| return fieldName |
6 tasks
borisbat
added a commit
that referenced
this pull request
May 24, 2026
…-groupby-expression-keys sqlite_linq: expression keys in _group_by + PR #2845 review fixes
pull Bot
pushed a commit
to forksnd/daScript
that referenced
this pull request
May 24, 2026
… review fixes Two improvements this PR: 1. Expression-key support in `_group_by` (Session 2 from the post-PR GaijinEntertainment#2843 plan). `_group_by(_.Price % 100)` (and tuples mixing field keys with expression keys) now lower to `GROUP BY ((price) % (100))`. The rendered fragment is reused verbatim in SELECT (`K = _._0`) and ORDER BY (`_order_by(_._0)`) positions, so the SQL stays a single source of truth across all three clauses. Mechanism: `collect_group_keys` calls `pred_to_sql` with a new `q.inlineConstants` mode so the bound `_.X` field refs render as columns and `ExprConst*` literals (ints, floats, strings, bools) inline as SQL literals instead of `?` placeholders. The fragment carries no binds, which lets us re-use it at multiple SQL positions without bind-position bookkeeping. Runtime values (`_.Col - capturedVar`) reject loudly with a precise diagnostic — same one exercised by `failed_sql_macro` case 25. Order swap in the `_group_by` peel: recurse into the source first so `q.rootType` is set before `pred_to_sql` runs on the key (the existing field-key path didn't need this since translation was deferred to emission; expression keys do). Backfills `benchmarks/sql/groupby_select_sum.das` m1 lane. The bench uses the explicit-inner-select shape inside SUM (`_._1 |> select($(c : Car) => c.price) |> sum()`) — m3f/m4 keep their splice-friendly bare-`sum` form; both emit equivalent SQL/compute. results.md refreshed per the living-doc policy, "Notes on missing lanes" bullet for `groupby_select_sum SQL` removed. 2. Two PR GaijinEntertainment#2845 review fixes (Copilot, both real): - `peel_count_terminal`'s predicate-overload error mis-named the terminal: it said `_count(predicate)` and suggested `_count()`, but in `_sql` chains the bare `count()` / `long_count()` linq functions are the actual terminals. Drop the underscores in both the offending name and the suggested fix. - `try_peel_distinct_by_field` pinned receiver to `ExprVar` but didn't verify the var IS the lambda's bound parameter. So `_distinct_by(capturedRow.Brand) |> count()` would silently emit `COUNT(DISTINCT "Brand")` against the SQL source — wrong result, no diagnostic. Now extracts the lambda's arg name from `keyLambda._block.arguments[0]` and rejects when receiver name differs (`failed_sql_macro` case 24). Test surface: - `tests/dasSQLITE/test_32_group_by_expression_keys.das` — 6 tests covering SQL emission + runtime for single expression keys, multi-key tuples mixing field + expression keys, ORDER BY on the group key, and aggregate-over-expression-key projection. - `failed_sql_macro.das` cases 24 + 25 added (50503:23). Validation: 796/796 dasSQLITE (interp + AOT), 1390/1390 linq, 0 JIT bench failures, lint clean. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Session 1 of the post-#2843 follow-up plan. Two new chain shapes now lower in
_sql(...):|> long_count()→SELECT COUNT(*)with an int64 column binding, symmetric with the existing|> count()(int). Also recognised inside_selectafter_group_byso(..., N = _._1 |> long_count())widens the per-group counter to int64.|> _distinct_by(_.col) |> count()(and thelong_countvariant) →SELECT COUNT(DISTINCT "col"). Receiver-pinned toExprVarsou.opt.Xkeys reject loudly instead of silently emitting the wrong COUNT.A small refactor extracts
peel_count_terminalso both terminals share the distinct-by recognition path.What's in the diff
modules/dasSQLITE/daslib/sqlite_linq.das— new helperstry_peel_distinct_by_field+peel_count_terminal; the inlinecountpeel block becomes two calls to the shared helper.try_translate_group_aggregatelearnslong_count(widens to int64).tests/dasSQLITE/test_31_long_count_distinct_by_canary.das(new) — 8 tests covering SQL-text emission and runtime values for both terminals, plus the int64-grouped projection.tests/dasSQLITE/failed_sql_macro.das— case 23 added: computed_distinct_bykey (_.Price % 100) rejects with a precise diagnostic.expect 50503bumped 20 → 21.benchmarks/sql/distinct_by_count.das— m1 lane backfilled with_sql(... |> _distinct_by(_.brand) |> count()); the dated TODO comment is removed.benchmarks/sql/long_count_aggregate.das— m1 lane now actually callslong_count()instead ofcount()(latent bench-bug). Accept is nowc == 0lto match the int64 return.benchmarks/sql/results.md— refreshed INTERP + JIT tables per the [[feedback-living-results-md]] policy; commit-hash header bumped; the "distinct_by_count SQL" bullet in "Notes on missing lanes" is removed now that the lane has a number.Headline new cells:
distinct_by_countSQL = 40.5 ns/op (INTERP) / 40.9 (JIT);long_count_aggregateSQL = 29.4 (INTERP) / 29.5 (JIT) and now genuinely measureslong_count.Test plan
bin/daslang dastest/dastest.das -- --test tests/dasSQLITE/— 790/790 pass (interpreted)bin/test_aot dastest/dastest.das -- --test tests/dasSQLITE/— 790/790 pass (AOT)bin/daslang dastest/dastest.das -- --test tests/linq/— 1390/1390 pass (no shared-machinery regression)benchmarks/sql/— 0 failures, results.md refreshedmcp__daslang__linton all touched.dasfiles — cleanmcp__daslang__format_file— all already-formatted🤖 Generated with Claude Code