Skip to content

linq_fold Phase 2C: take/skip splice + chained-select := clone#2697

Merged
borisbat merged 4 commits into
masterfrom
bbatkin/linq-fold-phase-2c-take-skip
May 17, 2026
Merged

linq_fold Phase 2C: take/skip splice + chained-select := clone#2697
borisbat merged 4 commits into
masterfrom
bbatkin/linq-fold-phase-2c-take-skip

Conversation

@borisbat
Copy link
Copy Markdown
Collaborator

Summary

Phase 2C of the _fold splice-mode buildout. Two rings + one small lint-driven gensym tweak.

Ring 3 — take(N) / skip(N) splice in all four lanes (eac77b8dd)

_fold now recognizes [where_*][select*][skip?][take?] |> terminator and emits per-element work wrapped with bounded-loop counters. Skip/take guards splice into the per-match block (after the optional where filter, before lane-specific work). Counters live alongside the lane's accumulator in the outer invoke.

  • Planner state machine on (seenSelect, seenSkip, seenTake) enforces canonical chain order; reversed forms (where-after-select, etc.) fall through to plain linq.
  • take/skip as trailing operator (no explicit to_array) routes to ARRAY lane with implicit to_array.
  • Range-form take(start..end) falls through (different semantics — slice operator, future PR).
  • Buffer-required marker arms for order_by/order_descending, distinct/distinct_by, reverse, group_by/group_by_lazy, zip, join/left_join/group_join — each gets a named recognition arm with a // TODO Phase 2X: <FutureMode> comment so future buffer-mode emission grafts in without re-walking the chain-recognition logic.
  • Reserve hint refinement: when take(N) is present, array-lane reserve tightens from length(src) to min(N, length(src)) (inline ?: — no math::min dep at call site).
  • Take-increment placement: BEFORE lane work, so early-exit terminators with return X don't trigger LINT001 unreachable.

Benchmarks (100K rows, INTERP):

Benchmark Shape m3f_old m3f (Phase 2C)
take_count take(N).to_array 0 0 (parity; structural splice win)
skip_take skip(K).take(N).to_array 23 0
take_sum_aggregate (new) select.take(N).sum 14 0
take_count_filtered (new) where.take(N).count 11 0

Sub-ns/op reflects bounded-loop nature: actual loop runs ≤ K+N (≤ 1010 here vs 100K source).

qmatch gensym tweak (10e6cdca1)

Switch daslib/ast_match.das::next_var from qm_{n} to qm`{n}. Backtick is the established compiler-generated-name marker in daslib (linq_fold gensyms etc.), and daslib/lint.das:152 already skips backtick-bearing names from LINT004. Eliminates collision risk with any user identifier (qm_5 is a valid user identifier; qm`5 is not).

Ring 4 — chained-select clone via := (088d80ed0)

Drops the prevWorkhorse=false → return null Phase 2B placeholder guard at the chained-select arm. Correctness observation: := is safe on every type (byte copy on workhorse, deep-clone on non-workhorse). Single emission shape var $i(bind) := $e(projection) covers both cases. Chained selects of any type now splice through one path.

Concretely: each(arr)._select(ComplexType(a = [_*2]))._select(_.a[0]).sum() previously fell through to plain linq; now splices to a single fused for-loop.

Workhorse-branch audit — after Ring 4, two workhorse branches remain in linq_fold.das, both intentional:

  1. fold_select_where static_if (typeinfo is_workhorse(...))_old_fold's frozen baseline path via g_foldSeq. Changing it would invalidate the m3f_old benchmark column.
  2. min_max_compare workhorse < / > vs _::less — perf-critical (~2× per-element on int columns; PR linq_fold Phase 2B: aggregates + early-exit terminators (+ ICE 50609 defuse) #2696 numbers).

Planned: fail-loudly contract

Documented in benchmarks/sql/LINQ.md as the next planned PR. Current contract — _fold silently falls through to plain linq when it can't splice — is temporary. Planned switch: macro_error("_fold: cannot splice — <reason>") on any unsupported shape, mirroring sqlite_linq's "splice or error" contract. Buffer-required marker arms are pre-wired for this.

Test plan

  • tests/linq/test_linq_fold.das — 167 tests pass (interpreter; +4 new test funcs covering counter/accumulator/early-exit/array lane take/skip + chained non-workhorse selects).
  • tests/linq/test_linq_fold_ast.das — 71 tests pass (interpreter; +5 splice-shape assertions including order_by/distinct fall-through markers).
  • Full tests/linq/ AOT suite — 734/734 pass with bin/test_aot -use-aot dastest/dastest.das -- --use-aot --test tests/linq.
  • tests/ast_match/* regression — 69 tests pass (qmatch gensym tweak).
  • mcp__daslang__format_file + mcp__daslang__lint clean on all 8 changed files.
  • Dupe-detect run on changed .das (corpus: daslib + tests/linq) — only finding is a noise-level token-shape match between is_buffer_required_op and macro_boost::is_mutation_op2 (both are string → bool membership checks; different domains, no shared abstraction worth extracting).

Plan file: ~/.claude/plans/foamy-leaping-karp.md. Builds on PR #2696.

🤖 Generated with Claude Code

borisbat and others added 3 commits May 16, 2026 22:07
Recognizes `[where_*][select*][skip?][take?] |> terminator` chains and
emits bounded-loop counters spliced into the per-element work across
counter, accumulator, early-exit, and array lanes. Trailing take/skip
(no explicit aggregator) routes to ARRAY lane with implicit to_array.

New helpers (linq_fold.das):
- append_skip_take_prelude — emits `var skipRem = K` / `var taken = 0`
  alongside the lane accumulator.
- wrap_with_skip_take — prepends take-limit break + skip-counter continue
  + take-counter increment to the per-match block. Take-increment placed
  BEFORE the per-match body so early-exit terminators (return-from-block)
  don't make it unreachable (LINT001).
- is_buffer_required_op — recognizes order_by/distinct/reverse/group_by/
  zip/join/left_join/group_join by name; planner returns null with
  per-op `// TODO Phase 2X: <FutureMode>` markers, leaving room for
  future BufferTopN / BufferDistinct / MultiSourceZip / BufferedJoin /
  BufferGroupBy / BufferReverse emit modes.

Planner extensions:
- ChainStage recognition state machine on (seenSelect, seenSkip, seenTake)
  rejects reverse-order shapes (e.g. where-after-skip). At most one skip
  and one take per chain in Phase 2C.
- Range-form `take(start..end)` / `skip(start..end)` (slice operator)
  falls through — different semantics from int-form.
- Count-shaped length shortcut + any-empty shortcut both gate on
  `noLimits` (skip/take truncate the length).
- classify_terminator: take/skip terminal → ARRAY lane (implicit to_array
  after `to_array` strip via linqCalls `skip = true`).
- Array-lane reserve tightens to inline `min(N, length(src))` when take
  is present (no math::min dep at the call site).

Tests:
- test_linq_fold.das: 28 new functional cases (4 lanes × {where, skip,
  take, where.skip.take, edge cases like take(0), skip(huge)}). Covers
  correctness against expected values; for accumulator long_count uses
  `let r` (drops typename assertion since `let` adds const).
- test_linq_fold_ast.das: 5 new AST tests with count_break_continue
  helper — assert splice form (one fused for-loop + break for take +
  continue for skip) for counter/accumulator/array lanes, and
  fall-through (no invoke wrapper) for order_by.take and distinct.

Benchmarks:
- New take_sum_aggregate.das (accumulator-lane take coverage) and
  take_count_filtered.das (counter-lane take + where).
- skip_take, take_sum_aggregate, take_count_filtered all drop to 0
  ns/op at 100K rows (bounded loop O(K+N) vs source O(100K)). take_count
  unchanged (m3f_old already iterator-fused).

Lint cleanup (Boris's "lint clean even in generated code"):
- daslib/ast_match.das: rename `next_var`-emitted gensyms from `_qm_N`
  to `qm_N` — used-downstream, the underscore prefix mis-signaled
  unused and triggered LINT004 in every qmatch_function consumer.
- daslib/ast_match.das: switch 4 specific qmacro_expr emissions from
  `var $i(...)` to `let $i(...)` (the gensyms aren't reassigned in
  generated code; LINT003 was firing across consumers).
- daslib/linq.das: zip_impl (predicate variant) and chunk_impl gain
  pre-loop reserve hints — both were emitting PERF006 from push_clone-
  in-loop-without-reserve when instantiated from test code.

LINQ.md gains a Phase 2C Ring 3 section with deltas + emission-shape
sketch, the buffer-required marker arms inventory, and a "Planned:
fail-loudly contract" subsection documenting the future PR that will
upgrade silent fallback to `macro_error("_fold: cannot splice — ...")`
per Boris's sqlite_linq-style "splice or error" design directive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backtick is the established compiler-generated-name marker in daslib
(linq_fold gensyms, etc.) and lint.das:152 already skips backtick-bearing
names. Eliminates collision risk with any user identifier — `qm_5` is a
valid user identifier; `qm\`5` is not.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…horse guard)

The planner rejected any chained `_select|_select|...` whose previous
projection had a non-workhorse type — the `prevWorkhorse=false → return null`
guard at the `select` arm in plan_loop_or_count. Stated reason: `<-` (move)
corrupts source for lvalue projections like `_._field`. The guard was a
Phase 2B placeholder.

Correctness observation: `:=` is safe on every type — byte copy on workhorse,
deep-clone on non-workhorse — so a single emission shape (`var $i(bind) := ...`)
covers both cases. The workhorse/non-workhorse branch in chained-bind emission
is removed; chained selects of any type now splice through one path.

Audit result — two workhorse-type branches remain in linq_fold.das, both
intentional:

  1. fold_select_where (line ~392, `static_if (typeinfo is_workhorse(...))`)
     — used exclusively by `_old_fold`'s frozen baseline path via g_foldSeq.
     Changing it would alter the frozen baseline output and invalidate the
     m3f_old benchmark column.

  2. min_max_compare (line ~746) + caller (line ~933)
     — perf-critical. Workhorse types use `<` / `>` directly
     (single-instruction compare); non-workhorse falls back to `_::less`
     so user/tuple comparator overloads still apply. Boris's design
     directive 2026-05-16 mandates keeping this branch.

No further workhorse branches in the splice path.

New test: tests/linq/test_linq_fold.das::test_chained_non_workhorse_select
covers three chain shapes that previously fell through:

  - int → ComplexType → int → sum
  - where + int → ComplexType → int → sum (where-before-selects, canonical)
  - workhorse → ComplexType → workhorse → max

LINQ.md gets a new Phase 2C Ring 4 section documenting the change + the
two-branch audit. Phase status table updated.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 17, 2026 05:31
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Phase 2C extends _fold's splice planner to recognize [where_*][select*][skip?][take?] |> terminator chains across all four lanes (counter / array / accumulator / early-exit), emitting bounded-loop counters spliced into the per-match block. Adds marker arms for buffer-required ops (order_by, distinct, reverse, group_by, zip, joins) that deliberately fall through with future-mode TODOs. Also drops the Phase 2B workhorse-only restriction on chained _select by using := (clone) for all intermediate binds — enabling chained non-workhorse projections to splice. Includes a small qmatch gensym tweak (_qm_{n}qm`{n}) using the backtick compiler-generated marker, plus minor reserve() hints in chunk_impl/zip_impl.

Changes:

  • Splice take/skip in 4 lanes via new helpers append_skip_take_prelude + wrap_with_skip_take; planner state machine on (seenSelect, seenSkip, seenTake) enforces canonical chain order.
  • Switch chained _select intermediate bind from var = ... to var := ... (clone), removing the workhorse-only Phase 2B guard; add is_buffer_required_op marker arm.
  • qmatch gensym uses backtick marker; some varlet for read-only locals.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
daslib/linq_fold.das Core: take/skip splice across lanes, chained-select := clone, buffer-required marker arm.
daslib/ast_match.das qmatch gensym rename to backtick form; varlet for read-only helpers.
daslib/linq.das Minor pre-reserve hints in chunk_impl and zip_impl.
tests/linq/test_linq_fold.das +4 test funcs for take/skip lanes + chained non-workhorse select; widespread nolint:LINT003 annotations on existing var decls.
tests/linq/test_linq_fold_ast.das +5 AST shape assertions (take splices break/continue; order_by/distinct fall through).
benchmarks/sql/take_sum_aggregate.das New 100K benchmark for accumulator-lane take splice.
benchmarks/sql/take_count_filtered.das New 100K benchmark for counter-lane take + where splice.
benchmarks/sql/LINQ.md Phase 2C status + delta tables + planned fail-loudly contract.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread daslib/linq_fold.das
Copilot review on PR #2697 (#2697 (comment))
caught a silent semantic divergence in the take-limit guard.

Pre-fix: `if (takenCount == N) break` — for negative N, `takenCount` (which
starts at 0 and only ever increments) never satisfies equality, so the
spliced loop takes ALL elements. Reference iterator semantics in
daslib/linq.das treat non-positive N as "take nothing":

  take_impl     (line 858): `if (total <= 0) break`
  take_to_array (line 879): `if (total <= 0) break`

Post-fix: `if (takenCount >= N) break`. Identical to `==` for positive N
(since `takenCount` increments by 1 from 0), short-circuits on the first
iteration for N <= 0.

Skip is unaffected — the existing emission
`if (skipRem > 0) { skipRem--; continue }` is structurally identical to
`skip_impl` (linq.das:770-773) and naturally inert for non-positive K
(condition false → all elements pass through → skip nothing).

Added 4 pinning tests covering take(-1), take(0), skip(-1), skip(0) so
the non-positive semantics are documented and the regression can't slip
back in.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

@borisbat borisbat merged commit 1402167 into master May 17, 2026
31 of 32 checks passed
pull Bot pushed a commit to forksnd/daScript that referenced this pull request May 17, 2026
Three linq_fold cards from Phase 2C (PR GaijinEntertainment#2697) that didn't make it into
the merged PR:
  - chained-select splice: bind via clone-assign universal (drops the
    prevWorkhorse-only guard)
  - macro planner: named-marker arms leave room for future modes
  - splice macro: bounded loop guard for take/skip non-positive N

Two dasImgui cards from this session (HDPI PR borisbat/dasImgui#42):
  - HDPI plumbing pattern (glfwGetWindowContentScale + ScaleAllSizes +
    GLFW_SCALE_TO_MONITOR hint; float* binding gotcha)
  - Local daspkg install dance (out-of-tree dasImgui build + --global
    --force re-install to propagate edits; -project_root won't substitute)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
borisbat added a commit that referenced this pull request May 18, 2026
Adds `options _comment_hygiene = true` so daslib/style_lint enforces
STYLE014 (3-line cap on module/public //) and STYLE015 (1-line cap on
private //) at compile time. Stale comment blocks accumulated across
the three-tier cascade landings (PRs #2687, #2697, #2712, #2714, #2721).

Cleanup performed by utils/hygiene/ for //! private-docstring stripping,
then a lint-driven script that keeps the first WHY line of each
multi-line `//` block flagged by STYLE014/STYLE015 and drops the rest.

Net: 383 lines removed, 10 added — file shrunk 2294 → 1919 lines.

All 867 tests/linq still pass (INTERP + AOT).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants