linq_fold Phase 2C: take/skip splice + chained-select := clone#2697
Conversation
Recognizes `[where_*][select*][skip?][take?] |> terminator` chains and
emits bounded-loop counters spliced into the per-element work across
counter, accumulator, early-exit, and array lanes. Trailing take/skip
(no explicit aggregator) routes to ARRAY lane with implicit to_array.
New helpers (linq_fold.das):
- append_skip_take_prelude — emits `var skipRem = K` / `var taken = 0`
alongside the lane accumulator.
- wrap_with_skip_take — prepends take-limit break + skip-counter continue
+ take-counter increment to the per-match block. Take-increment placed
BEFORE the per-match body so early-exit terminators (return-from-block)
don't make it unreachable (LINT001).
- is_buffer_required_op — recognizes order_by/distinct/reverse/group_by/
zip/join/left_join/group_join by name; planner returns null with
per-op `// TODO Phase 2X: <FutureMode>` markers, leaving room for
future BufferTopN / BufferDistinct / MultiSourceZip / BufferedJoin /
BufferGroupBy / BufferReverse emit modes.
Planner extensions:
- ChainStage recognition state machine on (seenSelect, seenSkip, seenTake)
rejects reverse-order shapes (e.g. where-after-skip). At most one skip
and one take per chain in Phase 2C.
- Range-form `take(start..end)` / `skip(start..end)` (slice operator)
falls through — different semantics from int-form.
- Count-shaped length shortcut + any-empty shortcut both gate on
`noLimits` (skip/take truncate the length).
- classify_terminator: take/skip terminal → ARRAY lane (implicit to_array
after `to_array` strip via linqCalls `skip = true`).
- Array-lane reserve tightens to inline `min(N, length(src))` when take
is present (no math::min dep at the call site).
Tests:
- test_linq_fold.das: 28 new functional cases (4 lanes × {where, skip,
take, where.skip.take, edge cases like take(0), skip(huge)}). Covers
correctness against expected values; for accumulator long_count uses
`let r` (drops typename assertion since `let` adds const).
- test_linq_fold_ast.das: 5 new AST tests with count_break_continue
helper — assert splice form (one fused for-loop + break for take +
continue for skip) for counter/accumulator/array lanes, and
fall-through (no invoke wrapper) for order_by.take and distinct.
Benchmarks:
- New take_sum_aggregate.das (accumulator-lane take coverage) and
take_count_filtered.das (counter-lane take + where).
- skip_take, take_sum_aggregate, take_count_filtered all drop to 0
ns/op at 100K rows (bounded loop O(K+N) vs source O(100K)). take_count
unchanged (m3f_old already iterator-fused).
Lint cleanup (Boris's "lint clean even in generated code"):
- daslib/ast_match.das: rename `next_var`-emitted gensyms from `_qm_N`
to `qm_N` — used-downstream, the underscore prefix mis-signaled
unused and triggered LINT004 in every qmatch_function consumer.
- daslib/ast_match.das: switch 4 specific qmacro_expr emissions from
`var $i(...)` to `let $i(...)` (the gensyms aren't reassigned in
generated code; LINT003 was firing across consumers).
- daslib/linq.das: zip_impl (predicate variant) and chunk_impl gain
pre-loop reserve hints — both were emitting PERF006 from push_clone-
in-loop-without-reserve when instantiated from test code.
LINQ.md gains a Phase 2C Ring 3 section with deltas + emission-shape
sketch, the buffer-required marker arms inventory, and a "Planned:
fail-loudly contract" subsection documenting the future PR that will
upgrade silent fallback to `macro_error("_fold: cannot splice — ...")`
per Boris's sqlite_linq-style "splice or error" design directive.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backtick is the established compiler-generated-name marker in daslib (linq_fold gensyms, etc.) and lint.das:152 already skips backtick-bearing names. Eliminates collision risk with any user identifier — `qm_5` is a valid user identifier; `qm\`5` is not. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…horse guard)
The planner rejected any chained `_select|_select|...` whose previous
projection had a non-workhorse type — the `prevWorkhorse=false → return null`
guard at the `select` arm in plan_loop_or_count. Stated reason: `<-` (move)
corrupts source for lvalue projections like `_._field`. The guard was a
Phase 2B placeholder.
Correctness observation: `:=` is safe on every type — byte copy on workhorse,
deep-clone on non-workhorse — so a single emission shape (`var $i(bind) := ...`)
covers both cases. The workhorse/non-workhorse branch in chained-bind emission
is removed; chained selects of any type now splice through one path.
Audit result — two workhorse-type branches remain in linq_fold.das, both
intentional:
1. fold_select_where (line ~392, `static_if (typeinfo is_workhorse(...))`)
— used exclusively by `_old_fold`'s frozen baseline path via g_foldSeq.
Changing it would alter the frozen baseline output and invalidate the
m3f_old benchmark column.
2. min_max_compare (line ~746) + caller (line ~933)
— perf-critical. Workhorse types use `<` / `>` directly
(single-instruction compare); non-workhorse falls back to `_::less`
so user/tuple comparator overloads still apply. Boris's design
directive 2026-05-16 mandates keeping this branch.
No further workhorse branches in the splice path.
New test: tests/linq/test_linq_fold.das::test_chained_non_workhorse_select
covers three chain shapes that previously fell through:
- int → ComplexType → int → sum
- where + int → ComplexType → int → sum (where-before-selects, canonical)
- workhorse → ComplexType → workhorse → max
LINQ.md gets a new Phase 2C Ring 4 section documenting the change + the
two-branch audit. Phase status table updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Phase 2C extends _fold's splice planner to recognize [where_*][select*][skip?][take?] |> terminator chains across all four lanes (counter / array / accumulator / early-exit), emitting bounded-loop counters spliced into the per-match block. Adds marker arms for buffer-required ops (order_by, distinct, reverse, group_by, zip, joins) that deliberately fall through with future-mode TODOs. Also drops the Phase 2B workhorse-only restriction on chained _select by using := (clone) for all intermediate binds — enabling chained non-workhorse projections to splice. Includes a small qmatch gensym tweak (_qm_{n} → qm`{n}) using the backtick compiler-generated marker, plus minor reserve() hints in chunk_impl/zip_impl.
Changes:
- Splice take/skip in 4 lanes via new helpers
append_skip_take_prelude+wrap_with_skip_take; planner state machine on(seenSelect, seenSkip, seenTake)enforces canonical chain order. - Switch chained
_selectintermediate bind fromvar = ...tovar := ...(clone), removing the workhorse-only Phase 2B guard; addis_buffer_required_opmarker arm. - qmatch gensym uses backtick marker; some
var→letfor read-only locals.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| daslib/linq_fold.das | Core: take/skip splice across lanes, chained-select := clone, buffer-required marker arm. |
| daslib/ast_match.das | qmatch gensym rename to backtick form; var→let for read-only helpers. |
| daslib/linq.das | Minor pre-reserve hints in chunk_impl and zip_impl. |
| tests/linq/test_linq_fold.das | +4 test funcs for take/skip lanes + chained non-workhorse select; widespread nolint:LINT003 annotations on existing var decls. |
| tests/linq/test_linq_fold_ast.das | +5 AST shape assertions (take splices break/continue; order_by/distinct fall through). |
| benchmarks/sql/take_sum_aggregate.das | New 100K benchmark for accumulator-lane take splice. |
| benchmarks/sql/take_count_filtered.das | New 100K benchmark for counter-lane take + where splice. |
| benchmarks/sql/LINQ.md | Phase 2C status + delta tables + planned fail-loudly contract. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Copilot review on PR #2697 (#2697 (comment)) caught a silent semantic divergence in the take-limit guard. Pre-fix: `if (takenCount == N) break` — for negative N, `takenCount` (which starts at 0 and only ever increments) never satisfies equality, so the spliced loop takes ALL elements. Reference iterator semantics in daslib/linq.das treat non-positive N as "take nothing": take_impl (line 858): `if (total <= 0) break` take_to_array (line 879): `if (total <= 0) break` Post-fix: `if (takenCount >= N) break`. Identical to `==` for positive N (since `takenCount` increments by 1 from 0), short-circuits on the first iteration for N <= 0. Skip is unaffected — the existing emission `if (skipRem > 0) { skipRem--; continue }` is structurally identical to `skip_impl` (linq.das:770-773) and naturally inert for non-positive K (condition false → all elements pass through → skip nothing). Added 4 pinning tests covering take(-1), take(0), skip(-1), skip(0) so the non-positive semantics are documented and the regression can't slip back in. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three linq_fold cards from Phase 2C (PR GaijinEntertainment#2697) that didn't make it into the merged PR: - chained-select splice: bind via clone-assign universal (drops the prevWorkhorse-only guard) - macro planner: named-marker arms leave room for future modes - splice macro: bounded loop guard for take/skip non-positive N Two dasImgui cards from this session (HDPI PR borisbat/dasImgui#42): - HDPI plumbing pattern (glfwGetWindowContentScale + ScaleAllSizes + GLFW_SCALE_TO_MONITOR hint; float* binding gotcha) - Local daspkg install dance (out-of-tree dasImgui build + --global --force re-install to propagate edits; -project_root won't substitute) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `options _comment_hygiene = true` so daslib/style_lint enforces STYLE014 (3-line cap on module/public //) and STYLE015 (1-line cap on private //) at compile time. Stale comment blocks accumulated across the three-tier cascade landings (PRs #2687, #2697, #2712, #2714, #2721). Cleanup performed by utils/hygiene/ for //! private-docstring stripping, then a lint-driven script that keeps the first WHY line of each multi-line `//` block flagged by STYLE014/STYLE015 and drops the rest. Net: 383 lines removed, 10 added — file shrunk 2294 → 1919 lines. All 867 tests/linq still pass (INTERP + AOT). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Phase 2C of the
_foldsplice-mode buildout. Two rings + one small lint-driven gensym tweak.Ring 3 —
take(N)/skip(N)splice in all four lanes (eac77b8dd)_foldnow recognizes[where_*][select*][skip?][take?] |> terminatorand emits per-element work wrapped with bounded-loop counters. Skip/take guards splice into the per-match block (after the optionalwherefilter, before lane-specific work). Counters live alongside the lane's accumulator in the outer invoke.(seenSelect, seenSkip, seenTake)enforces canonical chain order; reversed forms (where-after-select, etc.) fall through to plain linq.take/skipas trailing operator (no explicitto_array) routes to ARRAY lane with implicitto_array.take(start..end)falls through (different semantics — slice operator, future PR).order_by/order_descending,distinct/distinct_by,reverse,group_by/group_by_lazy,zip,join/left_join/group_join— each gets a named recognition arm with a// TODO Phase 2X: <FutureMode>comment so future buffer-mode emission grafts in without re-walking the chain-recognition logic.take(N)is present, array-lane reserve tightens fromlength(src)tomin(N, length(src))(inline?:— no math::min dep at call site).return Xdon't trigger LINT001 unreachable.Benchmarks (100K rows, INTERP):
take(N).to_arrayskip(K).take(N).to_arrayselect.take(N).sumwhere.take(N).countSub-ns/op reflects bounded-loop nature: actual loop runs ≤ K+N (≤ 1010 here vs 100K source).
qmatch gensym tweak (
10e6cdca1)Switch
daslib/ast_match.das::next_varfromqm_{n}toqm`{n}. Backtick is the established compiler-generated-name marker in daslib (linq_fold gensyms etc.), anddaslib/lint.das:152already skips backtick-bearing names from LINT004. Eliminates collision risk with any user identifier (qm_5is a valid user identifier;qm`5is not).Ring 4 — chained-select clone via
:=(088d80ed0)Drops the
prevWorkhorse=false → return nullPhase 2B placeholder guard at the chained-selectarm. Correctness observation::=is safe on every type (byte copy on workhorse, deep-clone on non-workhorse). Single emission shapevar $i(bind) := $e(projection)covers both cases. Chained selects of any type now splice through one path.Concretely:
each(arr)._select(ComplexType(a = [_*2]))._select(_.a[0]).sum()previously fell through to plain linq; now splices to a single fused for-loop.Workhorse-branch audit — after Ring 4, two workhorse branches remain in
linq_fold.das, both intentional:fold_select_wherestatic_if (typeinfo is_workhorse(...))—_old_fold's frozen baseline path viag_foldSeq. Changing it would invalidate them3f_oldbenchmark column.min_max_compareworkhorse</>vs_::less— perf-critical (~2× per-element on int columns; PR linq_fold Phase 2B: aggregates + early-exit terminators (+ ICE 50609 defuse) #2696 numbers).Planned: fail-loudly contract
Documented in
benchmarks/sql/LINQ.mdas the next planned PR. Current contract —_foldsilently falls through to plain linq when it can't splice — is temporary. Planned switch:macro_error("_fold: cannot splice — <reason>")on any unsupported shape, mirroring sqlite_linq's "splice or error" contract. Buffer-required marker arms are pre-wired for this.Test plan
tests/linq/test_linq_fold.das— 167 tests pass (interpreter; +4 new test funcs covering counter/accumulator/early-exit/array lane take/skip + chained non-workhorse selects).tests/linq/test_linq_fold_ast.das— 71 tests pass (interpreter; +5 splice-shape assertions including order_by/distinct fall-through markers).tests/linq/AOT suite — 734/734 pass withbin/test_aot -use-aot dastest/dastest.das -- --use-aot --test tests/linq.tests/ast_match/*regression — 69 tests pass (qmatch gensym tweak).mcp__daslang__format_file+mcp__daslang__lintclean on all 8 changed files..das(corpus:daslib+tests/linq) — only finding is a noise-level token-shape match betweenis_buffer_required_opandmacro_boost::is_mutation_op2(both arestring → boolmembership checks; different domains, no shared abstraction worth extracting).Plan file:
~/.claude/plans/foamy-leaping-karp.md. Builds on PR #2696.🤖 Generated with Claude Code