linq_fold: plan_decs_unroll Slice 5f: terminal splice for last/single/aggregate/element_at #2822
Merged
borisbat merged 2 commits on May 23, 2026
…e/aggregate/element_at

Closes the 4 outliers identified in PR #2812's m4 bench snapshot. These terminators were falling through to `to_sequence` materialization because plan_decs_unroll only recognized first/any/all/contains/min_by/max_by plus the accumulator family. Wave 5 extends the splice path to cover walk-all single-return terminators + element_at counter early-exit.

emit_decs_walk_lane (last/last_or_default/single/single_or_default/aggregate):
- State (found flag, retained element, accumulator) hoisted at invoke scope so it persists across archetype boundaries
- single_or_default uses for_each_archetype_find for early-exit on 2nd match (sets stop flag, returns true from inner lambda); others use plain for_each_archetype unless take/take_while forces bool-lambda dispatch
- aggregate mirrors emit_accumulator_lane's workhorse/non-workhorse seed branch (`=` vs `<-`) and peel-or-invoke fallback for the binary lambda

emit_decs_element_at (element_at/element_at_or_default):
- idx + counter + found + result hoisted at invoke scope
- perElement compares cnt == idx, writes result + returns true on match
- Pre-loop idx<0 panic (element_at) or `return default<T>` (or_default)
- Tail handles "not found" = out-of-range = idx >= effective length

Bench (INTERP, 100K rows, ns/op):
- last_match m4: 82 → 17 (4.8×)
- single_match m4: 80 → 14 (5.7×)
- aggregate_match m4: 82 → 7 (11.7×) beats m3f at 5
- element_at_match m4: 34 → 0 (matches m3f)

Per-op allocs drop 84 B → 1 B across all four (no more array<tuple>).

Tests: 16 new in test_linq_from_decs.das (8 parity + 4 splice-shape gates + 2 range-interaction + 2 multi-match edge cases). 185 tests in file (up from 169), 1376 linq + 245 decs + 371 ast_match suites all green interp + AOT.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
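The walk-all lane described in this commit message can be modeled outside daslang. The Python sketch below is not the emitted code: the nested lists standing in for archetypes, the two Python iteration helpers (named after the decs functions they imitate), and the rule that a second match yields the default are all assumptions made for illustration. It shows why the state has to live outside the per-archetype loop, and how a bool-returning body lets `single_or_default` stop at the second match.

```python
from typing import Callable, List

Archetypes = List[List[int]]  # hypothetical stand-in for decs archetype storage


def for_each_archetype(archetypes: Archetypes, body: Callable[[int], None]) -> None:
    """Model of a plain walk: visit every element, no early exit."""
    for arch in archetypes:
        for value in arch:
            body(value)


def for_each_archetype_find(archetypes: Archetypes, body: Callable[[int], bool]) -> bool:
    """Model of a find-style walk: stop as soon as body returns True."""
    for arch in archetypes:
        for value in arch:
            if body(value):
                return True
    return False


def last_or_default(archetypes: Archetypes, pred: Callable[[int], bool], default: int) -> int:
    # State is declared once, outside the walk, so it survives archetype boundaries.
    found = False
    kept = default

    def per_element(value: int) -> None:
        nonlocal found, kept
        if pred(value):
            found, kept = True, value

    for_each_archetype(archetypes, per_element)
    return kept if found else default


def single_or_default(archetypes: Archetypes, pred: Callable[[int], bool], default: int) -> int:
    matches = 0
    kept = default

    def per_element(value: int) -> bool:
        nonlocal matches, kept
        if not pred(value):
            return False
        matches += 1
        if matches == 2:
            return True  # second match: request early exit
        kept = value
        return False

    for_each_archetype_find(archetypes, per_element)
    # Assumption: more than one match falls back to the default.
    return kept if matches == 1 else default
```

With `[[0, 1], [2, 3, 4]]` as input, `last_or_default` with an always-true predicate keeps overwriting the retained element across both inner lists, while `single_or_default` stops visiting elements once it has seen two matches.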
Pull request overview
Extends the from_decs unroll/splice path in daslib/linq_fold.das to cover previously non-spliced terminal operations (last*, single*, aggregate, element_at*), avoiding fallback to_sequence materialization and improving performance parity with other terminators.
Changes:
- Add new decs splice emitters for walk-all terminators (`emit_decs_walk_lane`) and index-based early-exit (`emit_decs_element_at`).
- Update `plan_decs_unroll` terminator detection to route these ops through the splice path.
- Add parity + AST-shape gate tests for the new terminators in `tests/linq/test_linq_from_decs.das`.
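To make the difference between the `to_sequence` fallback and the splice path concrete, here is a language-neutral model in Python. It is a sketch under assumptions: nested lists stand in for archetypes, and both function names are hypothetical. The first version builds an intermediate buffer before folding, the second folds while walking, which is the shape the overview describes for the spliced `aggregate`.

```python
from typing import Callable, List

Archetypes = List[List[int]]  # hypothetical stand-in for decs archetype storage


def aggregate_materialized(
    archetypes: Archetypes,
    pred: Callable[[int], bool],
    seed: int,
    fn: Callable[[int, int], int],
) -> int:
    """Fallback shape: collect matching rows into a buffer, then fold the buffer."""
    rows = [value for arch in archetypes for value in arch if pred(value)]
    acc = seed
    for value in rows:
        acc = fn(acc, value)
    return acc


def aggregate_spliced(
    archetypes: Archetypes,
    pred: Callable[[int], bool],
    seed: int,
    fn: Callable[[int, int], int],
) -> int:
    """Splice shape: the accumulator lives outside the walk, no buffer is built."""
    acc = seed
    for arch in archetypes:
        for value in arch:
            if pred(value):
                acc = fn(acc, value)
    return acc
```

Both functions return the same value for any input; only the spliced one avoids allocating the intermediate list.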
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `daslib/linq_fold.das` | Adds new decs unroll emission paths for last*/single*/aggregate/element_at* and wires them into plan_decs_unroll. |
| `tests/linq/test_linq_from_decs.das` | Adds parity and splice-shape tests validating the new terminator splice coverage. |
Copilot flagged that `last_or_default` and `single_or_default` splice their user-supplied default via plain `let dv = arg` + `return dv`, which breaks parity with daslib/linq.das's `static_if (typeinfo can_copy(defaultValue)) return defaultValue else return <- clone_to_move(defaultValue)` shape. Non-copyable types (e.g. `array<int>`) compile-error at the prelude bind.

Add two small helpers:
- emit_default_bind(name, argExpr, canCopy) → let / var <- clone_to_move
- emit_default_return(name, canCopy) → return / return <- clone_to_move

Apply at all 6 default-return sites (3 decs + 3 array). The array-side has the same pre-existing latent bug since none of the existing tests exercised non-copyable defaults; mirror per the PR review's parity scope.

element_at_or_default in both paths has NO user-supplied default: switch from `return default<T>` to `var t : T; return <- t` (linq.das line 2547's move-init shape, works for both copyable and non-copyable element types).

Out of scope: element_at's success path emits `return $i(valueName)` (plain) which would still break for non-copyable element types; pre-existing on both paths, separate fix needs the same can_copy probe applied to element-type returns across first/element_at success arms.

Tests: new test_linq_fold_non_copyable_default.das with 3 sub-tests exercising last_or_default + single_or_default with array<int> defaults. Standalone file (not extending test_linq_fold.das) so it doesn't pull in pre-existing PERF022 hits from concat_impl/append_impl/prepend_impl that surface via template instantiation through test_concat.

1380 linq + 245 decs suites green interp + AOT. Bench unchanged (no-op for copyable types).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
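The two helpers named in this commit message choose between a copy shape and a clone-and-move shape. The Python sketch below models that choice by returning source-text fragments. This is an assumption made for illustration: the real helpers live in `daslib/linq_fold.das` and presumably build AST nodes rather than strings, and the snake_case parameter names are hypothetical. The emitted fragments follow the shapes quoted in the commit message.

```python
def emit_default_bind(name: str, arg_expr: str, can_copy: bool) -> str:
    """Model of the prelude bind: plain `let` for copyable types, clone-and-move otherwise."""
    if can_copy:
        return f"let {name} = {arg_expr}"
    return f"var {name} <- clone_to_move({arg_expr})"


def emit_default_return(name: str, can_copy: bool) -> str:
    """Model of the default-return arm: plain return for copyable types, clone-and-move otherwise."""
    if can_copy:
        return f"return {name}"
    return f"return <- clone_to_move({name})"


if __name__ == "__main__":
    # Copyable default (for example an int) keeps the original shape.
    print(emit_default_bind("dv", "arg", can_copy=True))
    print(emit_default_return("dv", can_copy=True))
    # Non-copyable default (for example array<int>) takes the clone-and-move shape.
    print(emit_default_bind("dv", "arg", can_copy=False))
    print(emit_default_return("dv", can_copy=False))
```

Keeping the decision in two small functions means every default-return site can share one `can_copy` probe instead of repeating the branch.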
Comment on lines +2922 to +2934

```
[test]
def test_unroll_last_parity(t : T?) {
    fixture_unroll2(5)
    // vals 0..4 → last is 4
    t |> equal(target_unroll_last_fold(), 4, "last splice parity")
}

[test]
def test_unroll_last_or_default_empty(t : T?) {
    fixture_unroll2(5)
    // val>1000 → none → default -9
    t |> equal(target_unroll_last_or_default_fold(), -9, "last_or_default empty source")
}
```
Summary
Closes the 4 outliers identified in PR #2812's m4 bench snapshot. These terminators were falling through to `to_sequence` materialization because `plan_decs_unroll` only recognized `first`/`any`/`all`/`contains`/`min_by`/`max_by` plus the accumulator family. Wave 5 extends the splice path to cover walk-all single-return terminators + `element_at` counter early-exit.

`emit_decs_walk_lane` (`last`/`last_or_default`/`single`/`single_or_default`/`aggregate`):
- State (found flag, retained element, accumulator) hoisted at invoke scope so it persists across archetype boundaries.
- `single_or_default` uses `for_each_archetype_find` for early-exit on 2nd match (sets stop flag, returns `true` from inner lambda); others use plain `for_each_archetype` unless `take`/`take_while` forces bool-lambda dispatch.
- `aggregate` mirrors `emit_accumulator_lane`'s workhorse / non-workhorse seed branch (`=` vs `<-`) and peel-or-invoke fallback for the binary lambda.

`emit_decs_element_at` (`element_at`/`element_at_or_default`):
- `idx` + counter + `found` + result hoisted at invoke scope.
- `perElement` compares `cnt == idx`, writes result + returns `true` on match (uses `for_each_archetype_find` for the counter early-exit).
- Pre-loop `idx<0` panic (`element_at`) or `return default<T>` (`_or_default`).
- Tail handles "not found" = out-of-range = `idx >= effective length`.

Bench (INTERP, 100K rows, ns/op)
- `last_match` m4: 82 → 17 (4.8×)
- `single_match` m4: 80 → 14 (5.7×)
- `aggregate_match` m4: 82 → 7 (11.7×)
- `element_at_match` m4: 34 → 0 (matches m3f)

Per-op allocs drop 84 B → 1 B across all four: eliminated the `array<tuple>` materialization.

Test plan
- 16 new tests in `tests/linq/test_linq_from_decs.das` (8 parity + 4 splice-shape gates + 2 range-interaction + 2 multi-match edge cases)
- `tests/linq` suite: interp + AOT green
- `tests/decs` suite: interp + AOT green
- 371 `tests/ast_match` + 10 `tests/template` + 9 `tests/macro_call` + 27 `tests/macro_boost` green
- … (`count_aggregate_m4`) unchanged at 6 ns/op
- … `daslib/linq_fold.das`

🤖 Generated with Claude Code
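The `element_at` counter early-exit from the summary can also be modeled in Python. This is a sketch under stated assumptions, not the code `emit_decs_element_at` generates: nested lists stand in for archetypes, `_find_at` is a hypothetical helper, `IndexError` stands in for the panic, and the `zero` parameter stands in for a default-initialized `T`.

```python
from typing import Callable, List, Optional, Tuple

Archetypes = List[List[int]]  # hypothetical stand-in for decs archetype storage


def _find_at(archetypes: Archetypes, pred: Callable[[int], bool], idx: int) -> Tuple[bool, Optional[int]]:
    # idx, counter, found and result all live outside the walk.
    cnt = 0
    found = False
    result: Optional[int] = None

    def per_element(value: int) -> bool:
        nonlocal cnt, found, result
        if not pred(value):
            return False
        if cnt == idx:
            found, result = True, value
            return True  # match: stop the walk
        cnt += 1
        return False

    # any() short-circuits, which models the find-style early exit.
    any(per_element(value) for arch in archetypes for value in arch)
    return found, result


def element_at(archetypes: Archetypes, pred: Callable[[int], bool], idx: int) -> int:
    if idx < 0:
        raise IndexError("element_at: negative index")  # pre-loop check
    found, result = _find_at(archetypes, pred, idx)
    if not found:
        raise IndexError("element_at: index out of range")  # tail: idx >= effective length
    assert result is not None
    return result


def element_at_or_default(archetypes: Archetypes, pred: Callable[[int], bool], idx: int, zero: int = 0) -> int:
    if idx < 0:
        return zero  # pre-loop check returns the default instead of failing
    found, result = _find_at(archetypes, pred, idx)
    return result if found and result is not None else zero
```

The counter only advances on elements that pass the upstream filter, so "effective length" here means the number of matching elements, not the raw row count.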