linq_fold: plan_decs_unroll Slice 5b — take_while/skip_while on decs #2772

Merged: borisbat merged 2 commits into master from bbatkin/linq-fold-decs-unroll-slice5b on May 21, 2026

@borisbat
Collaborator

Summary

Wave 3 Slice 5b extends the decs splice planner with predicate-driven ranges (take_while/skip_while), building on Slice 5a's DecsRangeInfo scaffolding (#2770).

  • extract_decs_ranges gains skipWhileCond/takeWhileCond recognition. Canonical suffix order: skip → skip_while → take_while → take. Bails when a select appears in the chain prefix (mirrors array-side seenSelect bail at daslib/linq_fold.das:1615/1623). Predicates peel against the source tuple (tupName).
  • wrap_inner_for_with_decs_ranges + decs_range_prelude add skippingName (one-way flag, hoisted at invoke scope so it persists across archetypes). Per-element guard order matches array-side wrap_with_ranges: take-cap → skip-counter → skip_while flag → take_while break → takeC++.
  • All 4 emit fns (accumulator / early_exit / min_max_by / to_array) switch the outer dispatch to for_each_archetype_find when takeExpr != null OR takeWhileCond != null (both trigger inner return true).
  • useExplicitState in emit_decs_early_exit extends correspondingly, so any/all/contains + take_while still distinguish "real match" from "take_while-stop" via explicit foundName state. This is the same disambiguation as the Slice 5a take + any/all/contains case caught in the Copilot review of #2770.
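
The canonical-order rule in the first bullet can be modeled with a small Python sketch. This is an illustration only, not the daslang code in daslib/linq_fold.das: the function name `extract_ranges`, the list-of-strings chain representation, and the assumption that each range op appears at most once are all hypothetical. The terminator (count, any, and so on) is assumed to be handled separately and is not part of the chain.

```python
# Hypothetical model of the canonical suffix-order check.
# Names and data shapes are illustrative, not taken from linq_fold.das.
CANONICAL = ["skip", "skip_while", "take_while", "take"]


def extract_ranges(chain):
    """Split a list of op names into (prefix, range_suffix), or return None to bail.

    Assumed rules: range ops form a suffix of the chain, appear in canonical
    order (each at most once), and no select precedes them.
    """
    # Index of the first range op; everything before it is the prefix.
    first = next((i for i, op in enumerate(chain) if op in CANONICAL), len(chain))
    prefix, suffix = chain[:first], chain[first:]
    if not suffix:
        return prefix, []      # no range ops, nothing to extract
    if "select" in prefix:
        return None            # select in the prefix: predicates could not peel against the source tuple
    rank = -1
    for op in suffix:
        if op not in CANONICAL:
            return None        # non-range op inside the suffix, e.g. select after take_while
        if CANONICAL.index(op) <= rank:
            return None        # out of canonical order (or repeated)
        rank = CANONICAL.index(op)
    return prefix, suffix
```

Under these assumptions, `["where", "skip_while", "take_while"]` is accepted with `where` as the prefix, while `["take_while", "select"]` and `["select", "take_while"]` both bail.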

Bench

| benchmark | shape | m1 sql | m3 | m3f (array splice) | m4 (old, eager bridge) | m4 (new, splice) | m3f→m4 gap | win vs baseline |
|---|---|---|---|---|---|---|---|---|
| take_while_match | `._take_while(_.id < 50K).count()` | 7 | 23 | 2 | 55 | 8 | | 6.9× |

ast_dump --mode source confirms for_each_archetype_find with if (!(decs_tup.id < 50000)) return true else ++decs_acc. m4 lands close to m3f (8 vs 2 ns/op — gap is known Wave 4 multi-component get_ro overhead).
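
As a rough illustration of the emitted shape, the following Python sketch models a count under take_while. It assumes that `for_each_archetype_find` stops visiting archetypes as soon as its block returns true; the Python helper of the same name, the list-of-lists archetype layout, and `count_take_while` are stand-ins invented for this sketch, not the decs API.

```python
def for_each_archetype_find(archetypes, block):
    """Visit archetypes in order; stop as soon as block returns True (assumed semantics)."""
    for arch in archetypes:
        if block(arch):
            return True
    return False


def count_take_while(archetypes, pred):
    """Model of `_take_while(pred).count()` spliced into the archetype loop."""
    acc = 0  # plays the role of decs_acc, hoisted above the archetype loop

    def block(arch):
        nonlocal acc
        for tup in arch:
            if not pred(tup):
                return True  # take_while stop: abandon this and all later archetypes
            acc += 1
        return False         # archetype exhausted, continue with the next one

    for_each_archetype_find(archetypes, block)
    return acc


archetypes = [[{"id": 1}, {"id": 2}], [{"id": 60000}, {"id": 3}], [{"id": 4}]]
result = count_take_while(archetypes, lambda t: t["id"] < 50000)
```

The point of the sketch is that the first failing element ends the whole traversal, not just the current archetype, which is why a plain `for_each_archetype` would not be enough here.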

Coverage

21 new tests in tests/linq/test_linq_from_decs.das (60 → 81 in file):

  • Parity: take_while, skip_while, skip_while+take_while, where+take_while, take_while+sum, take_while+first, take_while+to_array
  • Edge cases: always-true / always-false predicates, skip+take_while, skip_while+take
  • Explicit-state regression guards: take_while + any (match-in-window vs after-window), all (pass vs fail), contains (hit vs miss) — verifies take_while-stop return true doesn't false-positive on any/all/contains terminators
  • AST shape gates: take_while-present → for_each_archetype_find; skip_while-only → for_each_archetype (no early-exit)
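
The false positive that the explicit-state regression guards protect against can be reproduced in a small Python model. It assumes the same early-exit semantics for `for_each_archetype_find` as described above; `naive_any`, `explicit_any`, and the integer element layout are hypothetical names and shapes chosen for the sketch, with `found` standing in for the foundName state.

```python
def for_each_archetype_find(archetypes, block):
    """Visit archetypes in order; stop as soon as block returns True (assumed semantics)."""
    for arch in archetypes:
        if block(arch):
            return True
    return False


def naive_any(archetypes, take_pred, any_pred):
    """Buggy variant: treats every early exit as a match."""
    def block(arch):
        for v in arch:
            if not take_pred(v):
                return True   # take_while stop
            if any_pred(v):
                return True   # real match, indistinguishable from the stop above
        return False
    return for_each_archetype_find(archetypes, block)


def explicit_any(archetypes, take_pred, any_pred):
    """Explicit-state variant: a separate flag records whether a match occurred."""
    found = False  # hoisted state, analogous to foundName

    def block(arch):
        nonlocal found
        for v in arch:
            if not take_pred(v):
                return True   # take_while stop, found stays False
            if any_pred(v):
                found = True
                return True   # real match
        return False

    for_each_archetype_find(archetypes, block)
    return found              # the answer comes from the flag, not from the exit signal


archs = [[0, 1], [5, 7]]
in_window = explicit_any([[0, 1], [2, 9]], lambda v: v < 3, lambda v: v == 2)
after_window = explicit_any(archs, lambda v: v < 3, lambda v: v == 7)
false_positive = naive_any(archs, lambda v: v < 3, lambda v: v == 7)
```

Here 7 sits after the take_while window closes at 5, so the explicit-state variant reports no match while the naive variant reports one.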

Pre-existing STYLE015 in emit_decs_count_archsize condensed inline (3-line block → 1 line; from PR #2768).

Test plan

  • mcp__daslang__compile_check on linq_fold.das + test file: clean
  • mcp__daslang__lint clean
  • CI lint (utils/lint/main.das) clean
  • mcp__daslang__format_file: already formatted
  • Interp: 1268/1268 across tests/linq/
  • AOT: 1647/1647 across tests/linq/, tests/decs/, tests/aot/
  • AST shape verified via ast_dump --mode source (take_while, skip_while, combined)
  • Bench take_while_match.das shows m4=8 ns/op (was ~55 via eager bridge)

🤖 Generated with Claude Code

Slice 5b extends DecsRangeInfo (added in #2770) with predicate-driven
ranges (skipWhileCond/takeWhileCond). extract_decs_ranges now walks the
suffix in canonical order skip→skip_while→take_while→take, bails when
a select appears in the prefix (mirrors array-side seenSelect bail at
linq_fold.das:1615/1623). Predicates peel against the source tuple
tupName, so the post-where stream is visible but selects are forbidden.

wrap_inner_for_with_decs_ranges + decs_range_prelude gain skippingName
(one-way flag, hoisted at invoke scope so it persists across archetypes).
Per-element guard order matches array-side wrap_with_ranges:
take-cap → skip-counter → skip_while flag → take_while break → takeC++.
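
A Python model of that guard order, assuming the five steps run once per element in the order listed and that all counters and the skipping flag live outside the archetype loop. The function `run_ranges` and its parameters are illustrative, not the generated daslang. Unlike the real emit paths, this model always uses a breakable loop, even when neither take nor take_while is present.

```python
def run_ranges(archetypes, skip=0, skip_while=None, take_while=None, take=None):
    """Apply skip, skip_while, take_while, take to elements spread over archetypes."""
    out = []
    skipped = 0                         # skip counter
    skipping = skip_while is not None   # one-way flag, hoisted so it survives archetype boundaries
    taken = 0                           # take counter

    def block(arch):
        nonlocal skipped, skipping, taken
        for x in arch:
            if take is not None and taken >= take:
                return True             # 1. take cap reached: stop everything
            if skipped < skip:
                skipped += 1            # 2. skip counter
                continue
            if skipping:
                if skip_while(x):
                    continue            # 3. still inside the skip_while run
                skipping = False        #    flag flips once and never resets
            if take_while is not None and not take_while(x):
                return True             # 4. take_while break
            taken += 1                  # 5. bump the take counter and keep the element
            out.append(x)
        return False

    for arch in archetypes:
        if block(arch):
            break
    return out


archs = [[1, 2, 3], [1, 4, 5], [9, 1]]
both = run_ranges(archs, skip_while=lambda x: x < 3, take_while=lambda x: x < 9)
capped = run_ranges(archs, skip=1, skip_while=lambda x: x < 3, take=2)
```

The second archetype starts with 1, which satisfies the skip_while predicate, yet it is kept: the flag was already cleared in the first archetype, which is the behavior hoisting is meant to preserve.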

All 4 emit fns (accumulator, early_exit, min_max_by, to_array) extend
the outer dispatch to switch to for_each_archetype_find when takeExpr
!= null OR takeWhileCond != null (both trigger inner `return true`).
useExplicitState in emit_decs_early_exit extends correspondingly so
any/all/contains + take_while still distinguish "real match" from
"take_while-stop" via explicit foundName state.

Bench: take_while_match m4 55 → 8 ns/op (6.9× win over eager bridge).
Coverage: 21 new tests (60 → 81 in test_linq_from_decs.das) — including
AST shape gates (take_while → _find; skip_while-only → for_each_archetype),
explicit-state regression guards (take_while + any/all/contains), and
edge cases (always-true/always-false predicates, skip+take_while,
skip_while+take, where+take_while). 1268/1268 linq interp + 1647/1647
AOT (linq+decs+aot). Pre-existing STYLE015 in emit_decs_count_archsize
condensed inline.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 20, 2026 23:10
Contributor

Copilot AI left a comment

Pull request overview

Extends plan_decs_unroll (decs splice planner) to recognize predicate-driven range operators (skip_while / take_while) and emit the correct archetype iteration form (for_each_archetype vs for_each_archetype_find) when early-exit is required.

Changes:

  • Added skipWhileCond / takeWhileCond to DecsRangeInfo, parsing them in extract_decs_ranges, and emitting corresponding per-element guards + hoisted state (skipping flag).
  • Updated all decs unroll emit paths to switch to for_each_archetype_find when take_while can trigger inner return true, and extended explicit-state handling for any/all/contains accordingly.
  • Added a new batch of decs-linq tests and updated the M4 decs-expansion benchmark notes/results.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| daslib/linq_fold.das | Adds decs-side take_while/skip_while range extraction + codegen, including hoisted state and early-exit routing. |
| tests/linq/test_linq_from_decs.das | Adds parity + AST-shape tests for take_while/skip_while on decs unroll. |
| benchmarks/sql/M4_DECS_EXPANSION.md | Documents the Slice 5b update and benchmark numbers for take_while on decs. |

Comment thread tests/linq/test_linq_from_decs.das Outdated
…ally exercise the splice

Copilot caught that the take_while+_select+terminator tests (any/all/contains/
sum/first/to_array variants) were falling through to the tier-2 eager bridge,
NOT exercising the new explicit-state routing in emit_decs_early_exit. Root
cause: canonical chain order forbids _select after a range op
(extract_decs_ranges bails at the non-range op in suffix walk), and the array
side enforces the same rule (linq_fold.das:1623).

Fix:
- any/all rewritten to predicate over the source tuple directly:
  `_take_while(_.val<3)._any(_.val==2)` instead of
  `_take_while(_.val<3)._select(_.val)._any(_==2)`. Now splices through
  emit_decs_early_exit's useExplicitState path (verified via ast_dump).
- sum/first/to_array: sum-after-take_while is unsplicable on either array or
  decs side (need _select to project, but select-after-while bails). Dropped
  sum variant; replaced first/to_array to operate on source tuples (test field
  access on the returned tuple). Added min_by variant — exercises
  emit_decs_min_max_by under take_while routing through for_each_archetype_find.
- contains: source-tuple equality not defined; dropped (any/all already
  exercise the same useExplicitState branch).
- New `test_unroll_take_while_any_splice_shape` AST gate asserts:
  to_sequence count == 0 (splice fires, not tier-2),
  for_each_archetype_find count == 1 (take_while-stop early-exit dispatch),
  decs_found token count == 3 (explicit-state routing emits var+set+return).

Coverage: 60 → 80 (was 81 pre-rewrite, -1 for dropped contains, +2 for
min_by/splice_shape, net +20). 1267 linq interp + 1512 AOT+decs green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

Comment on lines +987 to +993
// take_while + select is REJECTED by extract_decs_ranges (canonical order forbids
// select-after-range; array side bails at linq_fold.das:1623). So for sum/first/
// to_array splice coverage we drop select and operate on the source tuple
// directly. first/to_array yield source tuples; sum needs a scalar, which needs a
// _select, and _select-BEFORE-take_while is also forbidden (skip_while/take_while
// bail on seenSelect), so sum-after-take_while can't splice on either side. The
// variants here keep the chain in shapes that DO splice.

m4 lands close to m3f (8 vs 2 ns/op — within Wave 4 known multi-component get_ro overhead). Splice fires; `ast_dump --mode source` confirms `for_each_archetype_find` with `if !(decs_tup.id < 50000) return true else ++decs_acc`.

**Coverage:** take_while, skip_while, skip_while+take_while, where+take_while, take_while+sum, take_while+first, take_while+to_array, take_while always-true (no break) / always-false (immediate break), skip_while always-true (drops all) / always-false (immediate done), skip+take_while, skip_while+take, take_while+any/all/contains (regression guards for explicit-state routing under take_while), AST shape gates for take_while→`_find` routing + skip_while-only→`for_each_archetype` routing. +21 tests (60 → 81 in file).

borisbat merged commit 2554891 into master on May 21, 2026
31 checks passed