[pull] master from GaijinEntertainment:master#1010
Merged
Conversation
Three new planners slot into the _fold cascade between plan_order_family
and plan_loop_or_count:
- plan_reverse: src [|> where_(p)]* [|> select(proj)]? |> reverse
[|> take(N)]? [|> {to_array, count, first, first_or_default}]
Buffer-required terminators emit single prefilter pass into buffer +
reverse_inplace + optional resize(min(N, length(buf))). count elides the
buffer (counter loop only); first / first_or_default emit a last-survivor
tracker.
- plan_distinct: src [|> where_(p)]* [|> select(proj)]? |> distinct[_by(key)]
[|> take(N)]? [|> terminator]
Streaming dedup via table<unique_key> integrated into the prefilter loop.
take(N) → break-after-N early-exit (guard placed BEFORE the consume site
so non-positive N short-circuits). count terminator elides the buffer and
returns length(seen); sum / long_count fold inline.
- plan_group_by: src |> group_by[_lazy](key) |> select(group_proj)
[|> {to_array, count}]?
Incremental-aggregate fusion via table<unique_key; tuple<KT; AccT>>.
Recognized reducers: _._1 |> {length, count, sum, long_count}. Bare +
named-tuple wrap (K=_._0, V=_._1|>reducer) both supported; named-tuple
preserves user field-name hints by using group_proj._type as the table
value type. Bails to tier 2 for inner-select-sum, multi-reducer, having_,
side-effectful keys.
Headlines (100K rows INTERP):
- distinct_count m3f: 33 → 15 ns/op (2.2× over prev; beats SQL m1=41)
- distinct_take m3f: new — 0 ns/op (early-exit at 3 unique brands)
- groupby_count m3f: 76 → 33 ns/op (2.3× over prev; 4.3× over SQL m1=141)
- reverse_take m3f: regression vs m3 (lazy reverse iterator wins for small N)
- groupby_sum m3f: unchanged (inner-select-sum deferred — see follow-up plan)
is_buffer_required_op trimmed to {zip, join, left_join, group_join} —
the named-marker arm in plan_loop_or_count is now pure defense-in-depth
because the three new planners catch their ops first.
New benchmarks: benchmarks/sql/reverse_take.das, benchmarks/sql/distinct_take.das.
LINQ.md updated with phase status + new section + coverage table.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `options _comment_hygiene = true` so daslib/style_lint enforces STYLE014 (3-line cap on module/public //) and STYLE015 (1-line cap on private //) at compile time. Stale comment blocks accumulated across the three-tier cascade landings (PRs #2687, #2697, #2712, #2714, #2721). Cleanup performed by utils/hygiene/ for //! private-docstring stripping, then a lint-driven script that keeps the first WHY line of each multi-line `//` block flagged by STYLE014/STYLE015 and drops the rest. Net: 383 lines removed, 10 added — file shrunk 2294 → 1919 lines. All 867 tests/linq still pass (INTERP + AOT). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds four small helpers used by plan_reverse / plan_distinct / plan_group_by:
- merge_where_cond(curr, pred) — collapses the 5-line "if-null-else-&&"
pattern that fused chained where_ predicates into a one-liner.
- strip_const_ref(t) — clears constant + ref flags on a cloned TypeDecl;
collapses the 3-line guarded assignment pattern (was repeated 5×).
- finalize_emission(top, srcName, at, body) — wraps a single body
Expression in the canonical `invoke($($i(src) : T) { $e(body) }, src)`
emission. Used by plan_reverse where the body is a qmacro_block.
- finalize_emission_stmts(top, srcName, at, stmts) — same wrap but
splices a stmts array via $b. Used by plan_distinct + plan_group_by
(their nested stmt construction is parse-fragile inside qmacro_block).
- buffer_return(bufName, isIterator) — emits the `return <- buf` /
`return <- buf.to_sequence_move()` tail; collapses the 7-line
if/else pattern (was repeated 3×).
plan_reverse also gets a structural refactor to qmacro_block bodies
per branch (count, first/first_or_default, to_array), reading top-to-
bottom as the emission shape instead of N × "stmts |> push <| qmacro_expr".
plan_distinct + plan_group_by keep their stmts-array structure (cross-
qmacro $i tag resolution fails on nested qmacro_block in those shapes —
the inner-block tag scope doesn't unify with the outer block's), but
adopt the shared helpers where applicable.
Net: 180 lines removed, 102 added — file shrunk 1919 → 1841 lines.
All 867 tests/linq pass interpret + AOT.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fixes Addresses Copilot review on PR #2721: 1. plan_distinct: select(impure) → distinct evaluated projection twice per source element (once for the unique_key, once for buffer push or += acc), firing observable side effects twice vs once in the original LINQ chain. Fix: when projection has side effects, bind projected value once per element to a fresh `pv` local; key + push/acc share the bind. Pure projections still inline (matches plan_loop_or_count's discipline). 2. plan_distinct: take(N) limit was cloned into the consume body, so a side-effecting take arg fired once per fresh distinct key. Fix: bind the limit once at outer scope (alongside `taken = 0`); guard reads the local. Same single-eval discipline as the projection bind. 3. LINQ.md: corrected stale text claiming `is_buffer_required_op` was trimmed to {zip, join, left_join, group_join} — it retains all buffer- required names, but distinct/reverse/group_by are now defense-in-depth (caught earlier by dedicated planners). 4. test_linq_fold_ast.das: trimmed stale comment claiming named-tuple field names collapse to positional access — splice DOES preserve K/N field names via the user's group_proj body type as table value. New test: test_select_distinct_impure_correctness pins the single-eval behavior (projection fires N times for N source elements, not 2N). All 869 tests/linq pass interpret + AOT (+2 new vs prior). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ening Bundle addressing Copilot review pass on PR #2721: - Formatter fix (3 lines in linq_fold.das): `- const - &` → `-const -&` (no spaces). darwin15 extended_checks formatter verify pass was failing CI. - Per-key group_by test assertions: `test_group_by_count_named_tuple_preserves_keys` and `test_group_by_sum_named_tuple_parity` now check per-bucket count/sum (not just total). Catches wrong-bucket regressions that preserve aggregate totals. - LINQ.md doc tightening: - distinct terminator list: drop `first`/`first_or_default` (not implemented; those cascade to tier 2) - distinct pseudo-code: take-guard BEFORE seen.insert/buf.push_clone (matches actual emission order — `take(0)` doesn't mutate seen-table or buffer) - reverse_take regression explanation: rewrite root cause. Both m3 and m3f materialize the full source buffer; m3's iterator-form `reverse()` yields backwards lazily via index so `take(N)` only triggers N yields; m3f does a full reverse_inplace before resize(N). For N << length the lazy approach wins. Future optimization: backward-index loop bounded to last N — deferred. - reverse_take.das header comment: drop "headline win" wording (it's a regression vs m3, not a win) - BufferGroupBy phase row + bail list: narrow to `src |> group_by_lazy(...)` (no upstream where_/select* fusion in v1 — deferred). Also clarify that side-effectful key bodies are NOT a bail — the emitted code evaluates the key once per element (same as plain group_by_lazy), so no re-eval hazard exists. Verification: 95/95 AST tests, 234/234 functional tests, 329/329 AOT, lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d comments Round-2 Copilot review on PR #2721 — 4 doc/comment-only fixes: - LINQ.md:341 (distinct buffer-not-required terminators): clarify long_count is count-shaped (returns int64(length(seen))), not an inline accumulator — only sum folds inline at the freshly-inserted site with acc += <projection>. - LINQ.md:345 (plan_group_by recognizer description): drop reference to a non-existent `try_recognize_group_reducer` rewriter and return tuple. Actual recognizer is is_bucket_reducer_call (linq_fold.das:1666), inline accumulator update — no separate two-pass rewriter. - linq_fold.das:1694 — finish truncated "yields bucket" comment ("yields raw buckets (no fusion)"). - linq_fold.das:1755 — finish truncated "UniqKeyType wraps with" comment ("key is typedecl(keyBlock(default<elemT>)) wrapped in _::unique_key for the hash slot"). No code changes. Lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…d comment Round-3 Copilot review on PR #2721 — one real bug + one trailing nit: - plan_group_by previously assigned entry._0 = k on EVERY element. Reference group_by_lazy_impl (linq.das:1946-1950) only stores the key when the bucket is first created. For non-workhorse keys where unique_key (= "{a}") happens to collapse two observationally-different objects to the same string, the splice was storing the LAST key per bucket instead of the FIRST — divergent from reference semantics. Fix: gate the assignment on !key_exists before the tab[uk] subscript materializes the entry. One extra hash lookup per element, but correctness over speed for the rare-collision case. For workhorse keys (int/float/etc. — the canonical _ % N idiom) unique_key is a bijection so the assignment was harmless overwriting; behavior unchanged. - linq_fold.das:1817 — finish truncated "(it's already" comment ("already shaped as the named-tuple / reducer result"). Verification: 234 functional + 95 AST + 329 AOT all pass; lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-wins regression
Per Boris's suggestion: replace `key_exists(uk) + tab[uk]` (2 hash ops per element)
with `tab?[uk] ?? dummy` (1 hash op per element) + addr-compare for new-bucket detection.
Pattern:
var dummy : <tab value type> # OUT OF LOOP
for it in src:
let k = invoke(keyBlk, it)
let uk = _::unique_key(k)
unsafe:
var entry & = tab?[uk] ?? dummy # one hash op: ref to existing entry OR dummy
<accUpdate> # entry._1++ or entry._1 += it
if addr(entry) == addr(dummy): # dummy path = new bucket
dummy._0 = k
tab[uk] = dummy
dummy = default<...>
For canonical _ % N int key with small bucket count (groupby_count's 5 brands in 100K
rows = ~99.995% hits), this is ~1 hash op/element vs the prior 2 ops/element.
Headlines (100K rows INTERP):
groupby_count m3f: 43 ns/op (prev fix) -> 37 ns/op (this fix)
groupby_sum m3f: 105 -> 108 (within noise; sum cascades to tier 2 via inner-select-sum)
distinct_count m3f: 15 -> 16 (within noise)
Still slightly above the pre-correctness-fix 33 ns/op baseline due to the addr-compare,
the predictable if-branch, and the dummy reset on miss. Net: 3.8x over SQL m1 (141),
1.9x over m3 (71). All 234 functional + 95 AST + 329 AOT tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…h clamp
CI lint chain follows my new test_linq_fold.das generic instantiations into
linq.das:1071-1072 and trips PERF015 ("ternary min — use 'min(a, b)' from
math module"). Pre-existing on master but per the standing "fix all lint
issues, even pre-existing" rule, swap the clamping ternaries to math::clamp
which is exactly equivalent (max(lo, min(hi, value))). linq.das already
requires math.
Unblocks extended_checks (linux, 64).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address Copilot round-3 review on PR #2721: - distinct_take.das: m3 and m3f now BOTH use each(arr) (iterator source) for apples-to-apples comparison. Prior bench had m3 on arr (eager distinct_impl) vs m3f on arr.to_sequence() (lazy distinct generator). The "0 ns/op" m3f win partly reflected lazy iterator stop-on-take semantics, not the splice. Updated numbers: m3 30 -> m3f 0 ns/op. The real win is splice eliminating per-yield generator-dispatch overhead in the lazy chain. - LINQ.md:35 BufferDistinct row: drop trailing "..." from terminator list; spell out exact spliced set {to_array, count, sum, long_count} and note that first/first_or_default/min/max/average cascade to tier 2. - LINQ.md:343 plan_distinct section: add new "Intentional divergence" note explaining that distinct[_by] |> take(N) is streaming (splice breaks on first N distinct keys) vs the runtime's eager materialize-then-take. For pure code observationally identical; for side-effecting keys/upstream, splice fires side-effects on fewer source elements. This is by design — the splice gives array sources the same iterator-form semantics already provided by linq.das:668 (the iterator-form distinct() is a lazy generator that stops under take). - LINQ.md headline table: refresh distinct_take (0 ns/op, "splice eliminates generator-dispatch overhead"), groupby_count (37 ns/op from Boris's dummy-bind pattern), and benchmark "Shape" column to show each(arr) source. 234 functional tests still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…treaming exit Per-element take-guard was previously inside the `if (!seen.key_exists(uk))` fresh-key branch, which meant the loop only checked `taken >= takeLim` when scanning a fresh key. For inputs like `[0,1,2,0,0,0,...,3]` (first N unique, then many duplicates of seen brands, then an (N+1)th fresh at the end), the loop kept firing key_exists / where / projection on every duplicate until the next fresh key triggered the inner break. Move the guard to the top of the for-loop. Now `taken >= takeLim` fires at the start of any iteration past the Nth distinct consume — duplicates after the Nth distinct skip the key computation entirely. True streaming-take semantics with O(N)-source upper bound. Headline benchmark numbers unchanged (distinct_take.das uses BRAND_COUNT=5 / TAKE_N=3 with cycling brands, so the Nth fresh and (N+1)th fresh are back-to-back at source positions 2 and 3 — both inner and outer guards fire at position 3). The fix is for the worst case (many duplicates between Nth and (N+1)th fresh). Also in this commit: - LINQ.md: "should never fire" wording — corrected to describe `is_buffer_required_op` as a fallback guard for shapes the dedicated planners don't cover (e.g. `distinct().first()`, `order_by(...).first()` cascade via the marker since the spliced terminator set is finite). - linq_fold.das: two truncated "Inner-select" / "Named-tuple wrap" comments finished. - distinct_take.das: header comment synced with the each(arr) source change from db3c392 (was still describing the old eager-array shape). 234 functional + 95 AST + 329 AOT tests pass. Lint clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…uffer-distinct-groupby-reverse linq_fold: buffer-required splice arms — reverse, distinct, group_by
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
See Commits and Changes for more details.
Created by
pull[bot] (v2.0.0-alpha.4)
Can you help keep this open source service alive? 💖 Please sponsor : )