
fix(substrait): dedupe names of aggregate measures, not just grouping…#126

Merged: LiaCastaneda merged 1 commit into branch-53 from lia/bring-schema-fix-again on May 22, 2026


@LiaCastaneda

Cherry-pick of apache#22453.

apache#22453)

## Which issue does this PR close?


- Closes #.

## Rationale for this change

When the Substrait consumer encounters an `Aggregate` with two identical
measures (e.g. `sum(a)` appearing twice), planning fails with `Schema
contains duplicate unqualified field name`. Substrait carries column
names at the plan root rather than on the measures themselves, so the
measures reach `Aggregate` schema construction without aliases, and two
identical expressions produce two identical field names. PR apache#20539
fixed the `NameTracker` to dedupe duplicate names in the consumer, but
the tracker was applied only to grouping expressions, not to the measures.

The planner sees:

```
field 1: (qualifier: None, name: "sum(data.a)")
field 2: (qualifier: None, name: "sum(data.a)")
```

which is rejected when constructing the Aggregate's output schema.
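To make the failure mode concrete, here is a minimal standalone sketch of a duplicate-name check over unqualified field names. The helper `first_duplicate_name` is hypothetical and written only for illustration; it is not DataFusion's actual schema validation, which works on full fields with qualifiers.

```rust
use std::collections::HashSet;

/// Returns the first unqualified field name that occurs more than once.
/// Hypothetical helper for illustration; not DataFusion's real schema check.
fn first_duplicate_name<'a>(names: &[&'a str]) -> Option<&'a str> {
    let mut seen = HashSet::new();
    // `insert` returns false when the name was already present,
    // so `find` stops at the first repeated name.
    names.iter().copied().find(|name| !seen.insert(*name))
}
```

With the two fields shown above, the check would report `sum(data.a)` as the offending name, which mirrors the `DuplicateUnqualifiedField` error.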

## What changes are included in this PR?

Run aggregate measures through the same `NameTracker` as the grouping
expressions in `from_aggregate_rel`.
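The idea can be sketched without any DataFusion types. The code below is a simplified stand-in, not the consumer's real `NameTracker`: the `unique_name` method, the `dedupe_aggregate_names` function, and the `__temp__N` suffix scheme are all assumptions made for illustration, and names are plain strings rather than expressions.

```rust
use std::collections::HashSet;

/// Simplified, hypothetical stand-in for the consumer's name tracker.
#[derive(Default)]
struct NameTracker {
    seen: HashSet<String>,
}

impl NameTracker {
    /// Returns `name` unchanged the first time it is seen; on later
    /// occurrences, appends a numeric suffix until the result is unused.
    /// The suffix format is an assumption for this sketch.
    fn unique_name(&mut self, name: &str) -> String {
        if self.seen.insert(name.to_string()) {
            return name.to_string();
        }
        let mut counter = 0usize;
        loop {
            let candidate = format!("{name}__temp__{counter}");
            if self.seen.insert(candidate.clone()) {
                return candidate;
            }
            counter += 1;
        }
    }
}

/// Runs grouping expressions and measures through one shared tracker,
/// so that names are unique across the whole aggregate output.
fn dedupe_aggregate_names(group_exprs: &[&str], measures: &[&str]) -> Vec<String> {
    let mut tracker = NameTracker::default();
    group_exprs
        .iter()
        .chain(measures.iter())
        .copied()
        .map(|name| tracker.unique_name(name))
        .collect()
}
```

Sharing a single tracker between grouping expressions and measures matters: a measure whose name collides with a grouping expression is renamed too, not only a measure that collides with another measure.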

## Are these changes tested?

Yes -- added a roundtrip test `aggregate_identical_measures`. Without
the fix, it fails with `Error: SchemaError(DuplicateUnqualifiedField {
name: "sum(data.a)" }, Some(""))`.

## Are there any user-facing changes?

No.

(cherry picked from commit 097efae)
@datadog-official

datadog-official Bot commented May 22, 2026

Pipelines


⚠️ Warnings

🚦 1 Pipeline job failed

Rust | build and run with wasm-pack (GitHub Actions)

🔄 Retry job. This looks flaky and may succeed on retry. Failed to install wasm-pack due to 404 error when accessing release URL.

🔗 Commit SHA: dc7810f

@LiaCastaneda LiaCastaneda merged commit 6692f6f into branch-53 May 22, 2026
54 of 56 checks passed