linq_fold: plan_decs_unroll Slice 5c — distinct/distinct_by on decs#2785
Merged
Conversation
Contributor
There was a problem hiding this comment.
Pull request overview
This PR adds a dedicated LINQ fold planner for distinct / distinct_by on DEC systems (decs bridge) to enable streaming-style de-duplication using a hoisted seen table that persists across archetypes, improving performance and enabling early-exit for take(N).
Changes:
- Added
plan_decs_distincttodaslib/linq_fold.dasand inserted it beforeplan_distinctin the fold cascade. - Added a new Slice 5c DEC-based test suite covering parity + AST-shape gating for
distinct/distinct_bywith multiple terminators (to_array,count,long_count,sum,take). - Updated benchmark documentation to describe Slice 5c behavior and results.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| tests/linq/test_linq_from_decs.das | Adds Slice 5c tests for DEC-backed distinct / distinct_by, including AST shape gates and single-eval side-effect checks. |
| daslib/linq_fold.das | Introduces plan_decs_distinct macro planner and wires it into the fold cascade before plan_distinct. |
| benchmarks/sql/M4_DECS_EXPANSION.md | Documents Slice 5c planner behavior, emission shapes, and benchmark deltas. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
New `plan_decs_distinct` planner mirrors `plan_distinct`, inserted in the cascade BEFORE `plan_distinct` (same rule as 5d/5e — decs bridge match is stricter). Streaming dedup via `var inscope seen : table<typedecl(_::unique_key (...))>` hoisted above `for_each_archetype` so the seen-table persists across archetypes. Two emit shapes: - **Buffer-required** (`to_array` default, optionally `take(N)`): per-element splice computes the key, `key_exists` check, on miss does `taken++ → seen|>insert → buf|>push_clone`. When `take(N)` is present, outer iteration switches to `for_each_archetype_find` and the take-cap `return true` propagates across archetypes — true streaming early-exit. Take-limit bound to a `let` at outer scope so a side-effecting `take(arg)` evaluates exactly once. - **Buffer-elided** (`count`/`long_count` → `length(seen)` / `int64(length(seen))`; `sum` folds inline at fresh-key site via `acc += <projection>`): no buffer allocation, only the seen-table is materialized. count + take both present cascades to tier 2 (matches array- side `plan_distinct`). Side-effecting `_select(proj)` upstream binds once per element to a fresh `decs_pv` local — key + buf push (or sum fold) share the bind, matching array- side single-eval per source element. `distinct_by(key)` wraps the key block in `invoke(<keyBlock>, <projection-or-tup>)` and the seen-table's value type is derived via `_::unique_key(invoke(<keyBlock>, default<elemT>))`. Copilot review fixes: - **plan_distinct predicate-drop bug (pre-existing).** Adds the symmetric arg-count guard on the terminator pop. `plan_distinct` was popping `count`/`long_count`/`sum` unconditionally — so `each(arr).distinct().count(predicate)` silently dropped the predicate and emitted `length(seen)` instead of filtering. Repro returned 5 instead of 3 for `[1,1,2,2,3,3,4,4,5,5].distinct().count(_>2)`. Fix mirrors the guard already in `plan_decs_distinct`: pop only when `length(args)==1`; otherwise cascade to tier-2 which honors the predicate via `count(seq, pred)`. +4 parity tests (2 array, 2 decs) that fail pre-fix and pass post-fix. - Comment wording in `plan_decs_distinct` no longer references the non-existent `sum(src, sel)` overload. Benchmarks (100K rows, INTERP): | benchmark | m1 sql | m3 | m3f | m4 (was) | m4 (now) | win | |---|---:|---:|---:|---:|---:|---:| | distinct_count | 41 | 44 | 15 | 97 | **28** | 3.5× | | distinct_take | 0 | 30 | 0 | 34 | **0** | matches m3f | Coverage: `_select(_.brand).distinct()` + to_array / count / long_count / sum / take(N) / take(0) / take(N>num_distinct), `_where(_.id<8)._select(_.brand). distinct()`, `_distinct_by(_.brand)` + to_array / count / take(N), empty-decs distinct yields empty, side-effecting take(N) arg evaluates exactly once. `count(pred)`/`long_count(pred)` after distinct (both array + decs sources) must honor the predicate. AST shape gates for: distinct+count (no to_sequence, no decs_buf, single key_exists, plain for_each_archetype), distinct+take (for_each_archetype_find + decs_buf + decs_seen + decs_taken counters), distinct_by+to_array (unique_key wrapping the key invocation), distinct+sum (no decs_buf, decs_acc declared+folded). +19 tests (122 → 141 in test_linq_from_decs; +2 in test_linq_fold). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
53c8b98 to
343f579
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
New
plan_decs_distinctplanner mirrorsplan_distinct, inserted in the cascade BEFOREplan_distinct(same rule as 5d/5e — decs bridge match is stricter). Streaming dedup viavar inscope seen : table<typedecl(_::unique_key(...))>hoisted abovefor_each_archetypeso the seen-table persists across archetypes.Emission shapes
Buffer-required (
to_arraydefault, optionallytake(N)): per-element splice computes the key, runskey_existscheck, on miss doestaken++ → seen|>insert → buf|>push_clone. Whentake(N)is present, outer iteration switches tofor_each_archetype_findand the take-capreturn truepropagates across archetypes — true streaming early-exit. Take-limit bound to aletat outer scope so a side-effectingtake(arg)evaluates exactly once.Buffer-elided (
count/long_count→length(seen)/int64(length(seen));sumfolds inline at fresh-key site viaacc += <projection>): no buffer allocation, only the seen-table is materialized.count + takeboth present cascades to tier 2 (matches array-sideplan_distinct).Side-effecting
_select(proj)upstream binds once per element to a freshdecs_pvlocal — key + buf push (or sum fold) share the bind, matching array-side single-eval per source element.distinct_by(key)wraps the key block ininvoke(<keyBlock>, <projection-or-tup>)and the seen-table's value type is derived via_::unique_key(invoke(<keyBlock>, default<elemT>)).Benchmarks (100K rows, INTERP)
distinct_countlands within 2× of m3f's 15 — the gap is the Wave 4 multi-componentget_rofloor (3 components participate in the inner for-loop even when the chain only readsbrand).distinct_takecollapses to 0 ns/op — early-exit at the 3rd distinct brand visits only ~3 source elements regardless of N=100K, matching the array-side splice.Coverage
_select(_.brand).distinct()+ to_array / count / long_count / sum / take(N) / take(0) / take(N>num_distinct),_where(_.id<8)._select(_.brand).distinct(),_distinct_by(_.brand)+ to_array / count / take(N), empty-decs distinct yields empty, side-effecting take(N) arg evaluates exactly once. AST shape gates for: distinct+count (no to_sequence, no decs_buf, single key_exists, plain for_each_archetype), distinct+take (for_each_archetype_find + decs_buf + decs_seen + decs_taken counters), distinct_by+to_array (unique_key wrapping the key invocation), distinct+sum (no decs_buf, decs_acc declared+folded). +17 tests (122 → 139 in file).Test plan
mcp__daslang__lint+format_filecleanutils/lint/main.das) cleantests/linq/1327/1327 +tests/decs/240/240 greentest_aotrebuild): same 1327 + 240 greendistinct_count97→28 ns/op,distinct_take34→0 ns/opast_dump --mode sourceontarget_unroll5c_select_distinct_take2_foldshowsfor_each_archetype_find+ take-cap return true + unique_key + single key_exists + insert+push_clone+taken++ on fresh-key🤖 Generated with Claude Code