fix(array_agg): reverse ordering_values in state() when accumulator is reversed #22597
Open
ologlogn wants to merge 2 commits into
…s reversed
When the physical optimizer detects that the input to a partial aggregate is pre-sorted ASC and an ARRAY_AGG requires DESC order, it calls reverse_expr() to flip the ordering requirement to ASC and sets is_input_pre_ordered=true and reverse=true on the accumulator.
In this configuration, state() called evaluate() (which reverses the values list to DESC) but also called evaluate_orderings(), which returned the ordering keys in their original ASC order. This mismatch caused the final accumulator's merge_batch to pair each value with the wrong ordering key, silently producing an incorrect sort order: no panic, just wrong results.
Fix: reverse the ordering_values iteration in evaluate_orderings() when self.reverse is true, keeping values and ordering keys consistently ordered for merge_batch.
gabotechs
reviewed
May 29, 2026
Adds an SLT test that exercises the exact optimizer path where get_finer_aggregate_exprs_requirement reverses the DESC accumulator (is_reversed=true, ordering_req=[ASC]). Without the fix the DESC column returns [1,2,...,10] instead of [10,9,...,1].
gabotechs
approved these changes
May 29, 2026
Contributor
I'll leave this here for a day to give others a chance to chime in, and otherwise will merge on Monday.
Which issue does this PR close?
No existing issue. Discovered while investigating incorrect results from ARRAY_AGG(x ORDER BY y DESC) in a multi-partition query.
Rationale for this change
Bug description
When multiple ARRAY_AGG expressions with conflicting ORDER BY directions (ASC and DESC) appear in the same query, results for the DESC variant are silently wrong.
How the optimizer creates the bug path
The bug is triggered deterministically by a well-defined optimizer pipeline. Take this query:
Step 1: get_finer_aggregate_exprs_requirement (runs at AggregateExec construction)
This function iterates the aggregate expressions to find a single common ordering requirement that the input sort can satisfy:
- common = [c1 ASC]
- [c1 DESC] conflicts with [c1 ASC]
- reverse(DESC) = ASC → [c1 ASC] satisfies it
- aggr_expr[1] in-place: flips ARRAY_AGG(c1 DESC) → ARRAY_AGG(c1 ASC, is_reversed=true)
- AggregateExec::required_input_ordering is set to [c1 ASC] (soft requirement)
The DESC aggregate is already reversed to ASC before any other rule runs.
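
The walk-through in Step 1 can be sketched as a toy model. This is a minimal illustration under stated assumptions, not DataFusion's implementation: `Dir`, `ToyAggregate`, and `toy_finer_requirement` are hypothetical names, and the ordering is reduced to a single key so that a reversed requirement always satisfies the common one.

```rust
/// Sort direction of a single ordering key (simplified: one column only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Dir {
    Asc,
    Desc,
}

impl Dir {
    /// Returns the opposite direction.
    pub fn reverse(self) -> Dir {
        match self {
            Dir::Asc => Dir::Desc,
            Dir::Desc => Dir::Asc,
        }
    }
}

/// Hypothetical stand-in for an order-sensitive aggregate expression.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ToyAggregate {
    pub order_by: Dir,
    pub is_reversed: bool,
}

/// Picks one common input ordering for all aggregates.
/// The first aggregate sets the common ordering; any later aggregate that
/// conflicts with it is flipped in place and marked as reversed.
/// Returns `None` when the slice is empty.
pub fn toy_finer_requirement(aggs: &mut [ToyAggregate]) -> Option<Dir> {
    let mut common: Option<Dir> = None;
    for agg in aggs.iter_mut() {
        let current = common;
        match current {
            // First aggregate: its ordering becomes the common requirement.
            None => common = Some(agg.order_by),
            // Direct match: nothing to change.
            Some(dir) if dir == agg.order_by => {}
            // Conflict: with a single key, the reversed requirement matches.
            Some(_) => {
                agg.order_by = agg.order_by.reverse();
                agg.is_reversed = !agg.is_reversed;
            }
        }
    }
    common
}
```

Calling `toy_finer_requirement` on `[ASC, DESC]` yields a common ordering of ASC and marks the second aggregate as reversed; on `[DESC, ASC]` the roles swap, which mirrors the observation that the bug follows whichever aggregate gets reversed.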
Step 2: EnsureRequirements optimizer
Sees required_input_ordering = [c1 ASC] → inserts SortExec [c1 ASC] before the partial aggregate.
Step 3: OptimizeAggregateOrder optimizer
Runs on the partial aggregate (input mode = Raw). Input is now sorted [c1 ASC]. For each aggregate:
- ARRAY_AGG(c1 ASC): direct match → is_input_pre_ordered=true, reverse=false ✓
- ARRAY_AGG(c1 ASC, is_reversed=true) (already mutated): direct match → is_input_pre_ordered=true, reverse=true ← bug path
The DESC accumulator ends up with ordering_req=[c1 ASC], is_input_pre_ordered=true, reverse=true.
Root cause
In OrderSensitiveArrayAggAccumulator::state():
evaluate() reverses self.values (ASC input → DESC output), but evaluate_orderings() always iterated self.ordering_values forward. The partial state emits a mismatched pair: values in DESC order, ordering keys in ASC order.
The final accumulator's merge_batch uses merge_ordered_arrays with the ordering keys to decide k-way merge priority. Because the keys are paired with the wrong values, the merge produces the wrong order silently, with no error or panic.
Note: if DESC is listed first and ASC second, the roles are swapped: the ASC aggregate gets reversed and its result is wrong instead. The bug always hits whichever aggregate get_finer_aggregate_exprs_requirement reverses.
What changes are included in this PR?
Single change in evaluate_orderings() inside OrderSensitiveArrayAggAccumulator:
When reverse=true, ordering keys are iterated in reverse to match the reversed values emitted by evaluate(), so merge_batch receives correctly paired (value, ordering_key) entries.
Are these changes tested?
Unit test: desc_order_partial_final_merge_correct in array_agg.rs directly exercises the partial→final merge path with a reversed accumulator (is_input_pre_ordered=true, reverse=true). Before fix: [3, 4, 5, 0, 1, 2]. After fix: [5, 4, 3, 2, 1, 0].
SQL logic test: regression test in aggregate.slt runs the exact bug-triggering query against a real 10-row table:
The EXPLAIN confirms SortExec [c1 ASC] + Partial/Final stages (the optimizer path that triggers the bug). The result assertion catches the wrong output:
Are there any user-facing changes?
Yes: ARRAY_AGG(x ORDER BY y DESC) now returns correct results when the query uses multi-partition execution and the optimizer reverses the partial accumulator.
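
For readers who want to see the root cause in isolation, the mismatch described in the Root cause section can be reproduced with a toy model. This is a sketch under assumptions, not the code in array_agg.rs: `ToyAccumulator`, `evaluate_orderings_before`, `evaluate_orderings_after`, and `state_pairs` are hypothetical names, values and keys are plain `i64`, and the merge step itself is left out. It only shows which (value, ordering_key) pairs the partial state hands to the next stage.

```rust
/// Toy model of the partial accumulator's buffers (names are illustrative).
/// `values[i]` belongs with `ordering_values[i]`.
pub struct ToyAccumulator {
    pub values: Vec<i64>,
    pub ordering_values: Vec<i64>,
    pub reverse: bool,
}

impl ToyAccumulator {
    /// Values as emitted in the partial state: reversed when `reverse` is set.
    pub fn evaluate(&self) -> Vec<i64> {
        if self.reverse {
            self.values.iter().rev().copied().collect()
        } else {
            self.values.clone()
        }
    }

    /// Ordering keys as emitted before the fix: always iterated forward.
    pub fn evaluate_orderings_before(&self) -> Vec<i64> {
        self.ordering_values.clone()
    }

    /// Ordering keys as emitted after the fix: reversed together with the values.
    pub fn evaluate_orderings_after(&self) -> Vec<i64> {
        if self.reverse {
            self.ordering_values.iter().rev().copied().collect()
        } else {
            self.ordering_values.clone()
        }
    }
}

/// Zips emitted values with emitted keys, the way a consumer of the
/// partial state would pair them up by position.
pub fn state_pairs(values: Vec<i64>, keys: Vec<i64>) -> Vec<(i64, i64)> {
    values.into_iter().zip(keys).collect()
}
```

With `values = [10, 20, 30]`, `ordering_values = [1, 2, 3]`, and `reverse = true`, the pre-fix state pairs value 30 with key 1 and value 10 with key 3, while the post-fix state keeps every value with its own key. When `reverse` is false, both variants emit the same pairs.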