Skip to content

test: make by-group assertions order-independent (fix coverage-build flakiness)#234

Merged
singaraiona merged 2 commits into
masterfrom
fix/order-dependent-groupby-tests
Jun 9, 2026
Merged

test: make by-group assertions order-independent (fix coverage-build flakiness)#234
singaraiona merged 2 commits into
masterfrom
fix/order-dependent-groupby-tests

Conversation

@ser-vasilich

Copy link
Copy Markdown
Collaborator

Problem

A make coverage run failed two tests — agg/pearson_corr.rfl and agg/per_group_holistic.rfl — that pass cleanly under make test (debug + ASan). Both pinned a specific group's value at a fixed result index of a by-group select.

A by-group select returns groups in the hash table's bucket order, which is unspecified — exactly like SQL / DuckDB GROUP BY without ORDER BY (DuckDB's parallel hash aggregate gives no order guarantee either). That order legitimately differs between build configs (debug+ASan vs coverage clang -O0, different memory layout) and across runs, so a position-pinned assertion passes under one ordering and fails under another. The engine is correct — group order was never contractual; this is purely test hygiene.

What

Swept every position-pinned by-group assertion in the suite:

  • 537 (at (at (select {… by:…}) …) N) candidates → 112 after excluding the safe majority (single-group where: (== k v) filters and asc:/desc:/take:-ordered queries) → the genuinely fragile multi-group ones.

Each was classified and either left SAFE (single group, deterministic ordering, or identical value across all groups) or rewritten order-independently while preserving the canonically-correct value (recomputed from the table data, never fudged to pass):

  • pin the intended group with where: (== key val) so index 0 is determinate, or
  • assert the sorted column / multiset — (asc (at sel 'm)) -- […], (asc (raze (at sel 't))) for top/bot LIST cells.

13 test files changed; no engine code touched.

Verification — under THREE distinct group orderings

  • make test gcc debug+ASan/UBSan: 3244/3246, 0 failed
  • make coverage clang -O0 full suite: 3245/3247, 0 failed (was aborting on the 2 flaky tests before)
  • per-file under the coverage binary during the sweep

Coverage after: Regions 84.8% · Functions 98.5% · Lines 89.4% · Branches 68.6%.

🤖 Generated with Claude Code

ser-vasilich and others added 2 commits June 8, 2026 19:00
pearson_corr.rfl and per_group_holistic.rfl pinned a specific by-group emit
position; group order is hash-bucket order (unspecified, like SQL/DuckDB) and
flips between ASan and coverage builds.  Assert order-independently (min/max,
sorted set).  More candidates swept in follow-up commits.
A by-group select returns groups in hash-bucket order, which is unspecified
(like SQL / DuckDB GROUP BY without ORDER BY) and legitimately differs between
build configs (debug+ASan vs coverage clang -O0) and runs. Many tests pinned a
specific group's value at a fixed result index, so they passed under one
ordering and failed under another — a `make coverage` run surfaced two such
failures (pearson_corr, per_group_holistic) that pass under debug+ASan.

Swept every position-pinned by-group assertion (537 candidates → 112 after
excluding single-group `where:` filters and `asc:`/`desc:`/`take:` ordered
queries → the genuinely fragile multi-group ones). Each was either confirmed
SAFE (single group, deterministic ordering, or identical value across groups)
or rewritten order-independently while preserving the canonically-correct
value computed from the data:
  - pin the intended group with `where: (== key val)` so index 0 is determinate;
  - or assert the SORTED column / multiset, e.g. `(asc (at sel 'm)) -- [...]`,
    `(asc (raze ...))` for top/bot LIST cells.

Verified under BOTH group orderings: full suite green under gcc debug+ASan
(3244/3246, 0 failed) and clang coverage (per-file). The engine is unchanged —
group order was never contractual; this is purely test hygiene.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@singaraiona singaraiona merged commit b0f88c2 into master Jun 9, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants