Conversation

@kumarUjjawal
Contributor

Which issue does this PR close?

rewrite_sort_cols_by_agg_alias was rewriting ORDER BY min(c2) to a column named just min(t.c2). The rewritten sort expression didn’t match the schema entry, so the assertion failed.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Ran both the module level and project level tests.

Are there any user-facing changes?

No

Member

@martin-g martin-g left a comment

Maybe add some new unit test(s).

}

fn qualify_column(column: Column) -> Column {
if column.relation.is_some() {
Member

Suggested change
- if column.relation.is_some() {
+ if column.relation.is_some() || !column.name.contains('.') {

alias.expr = Box::new(ensure_column_qualifiers(*alias.expr));
Expr::Alias(alias)
}
other => other,
Member

What about expressions that contain nested columns?

@kumarUjjawal
Contributor Author

@martin-g Please check. Thanks!

@alamb
Contributor

alamb commented Nov 17, 2025

Thank you @kumarUjjawal 🙏

Can we please figure out why CI was not failing? It seems like we have a testing gap, and even if this PR fixes the issue, the testing gap could allow it to start failing again.

@kumarUjjawal
Contributor Author

> Thank you @kumarUjjawal 🙏
>
> Can we please figure out why CI was not failing? It seems like we have a testing gap, and even if this PR fixes the issue, the testing gap could allow it to start failing again.

I will investigate further.

@kumarUjjawal
Contributor Author

@alamb My understanding is that the rewrite_sort_cols_by_agg_alias test never actually asserted the problematic shape, that is, a column whose name itself contains dots, like the literal alias "min(t.c2)". It built its expectation with col("min(t.c2)"), and col(...) normalizes the input into relation/name. That meant we never had a test that expected a flat alias like "min(t.c2)".
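To make the shape problem concrete, here is a minimal sketch of what goes wrong when a dotted alias is split into relation/name. The `Column` struct and the `split_on_dot` and `flat` helpers are hypothetical stand-ins written for illustration; they are not DataFusion's `Column` type or its `col(...)` implementation, which may parse identifiers differently.

```rust
// Simplified stand-in for a column reference. NOT DataFusion's `Column` type.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    relation: Option<String>,
    name: String,
}

// Hypothetical helper: keep the whole string as an unqualified column name.
fn flat(s: &str) -> Column {
    Column {
        relation: None,
        name: s.to_string(),
    }
}

// Hypothetical helper: treat everything before the last '.' as a qualifier.
// This models a naive "split on dot" normalization, assumed for illustration.
fn split_on_dot(s: &str) -> Column {
    match s.rsplit_once('.') {
        Some((relation, name)) => Column {
            relation: Some(relation.to_string()),
            name: name.to_string(),
        },
        None => flat(s),
    }
}

fn main() {
    let parsed = split_on_dot("min(t.c2)");
    println!("{:?}", parsed);

    // The alias is torn apart at the dot inside the parentheses.
    assert_eq!(parsed.relation.as_deref(), Some("min(t"));
    assert_eq!(parsed.name, "c2)");

    // The split column no longer equals the flat alias a schema entry would carry.
    assert_ne!(parsed, flat("min(t.c2)"));
}
```

Under these assumptions, an expectation built through a splitting constructor can never equal a flat "min(t.c2)" column, so a test written that way does not exercise the flat-alias case.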

assert_eq!(
parsed.relation,
Some(TableReference::Bare {
table: "min(t".into()
Member

Is min( expected here?!
I'd expect just t

table: "min(t".into()
})
);
assert_eq!(parsed.name, "c2)");
Member

And here I'd expect just c2 without the trailing )

Contributor Author

I’ve dropped the whole “split on dot” logic, so we no longer try to invent qualifiers like min(t / c2). The rewrite now just reuses whatever alias string the projection produced (e.g. the flat "min(t.c2)"), and the unit test asserts that exact name using a derived_col("min(t.c2)"). So the expectations you highlighted are gone.
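As a rough illustration of "reuse the projection's alias string", the sketch below looks up the sort expression's display name among the projection's output names and references the match verbatim. Everything here is a toy model: `Column`, `rewrite_sort_by_alias`, and this version of `derived_col` are hypothetical and only assume that the helper builds an unqualified column from the exact string it is given.

```rust
// Simplified stand-in for a column reference. NOT DataFusion's `Column` type.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    relation: Option<String>,
    name: String,
}

// Hypothetical stand-in for the `derived_col` test helper: an unqualified
// column whose name is the given string, with no splitting on dots.
fn derived_col(name: &str) -> Column {
    Column {
        relation: None,
        name: name.to_string(),
    }
}

// Toy rewrite: if the sort expression's display name matches one of the
// projection's output names, reference that output by its exact name.
fn rewrite_sort_by_alias(sort_expr: &str, projection_names: &[&str]) -> Option<Column> {
    projection_names
        .iter()
        .copied()
        .find(|name| *name == sort_expr)
        .map(derived_col)
}

fn main() {
    let projection = ["c1", "min(t.c2)"];

    // The rewritten sort column carries the flat alias, dots and all.
    let rewritten = rewrite_sort_by_alias("min(t.c2)", &projection);
    assert_eq!(rewritten, Some(derived_col("min(t.c2)")));

    // Expressions that are not in the projection are left alone.
    assert_eq!(rewrite_sort_by_alias("max(t.c2)", &projection), None);
}
```

The point of the design is that the name is never re-parsed, so no qualifier can be invented from text inside the parentheses.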

Member

Let's see what others think about this.
For me the parsing is broken if min( appears in the table (reference) name and ) appears in the plain column name.

Contributor Author

I think you might have reviewed an earlier version of my code. I have reverted to simply reusing the projection’s alias string, so we no longer invent table references like min(t).

Member

Indeed! It seems I was talking about a previous state of this PR.

Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
@Jefffrey
Contributor

> @alamb My understanding is that the rewrite_sort_cols_by_agg_alias test never actually asserted the problematic shape, that is, a column whose name itself contains dots, like the literal alias "min(t.c2)". It built its expectation with col("min(t.c2)"), and col(...) normalizes the input into relation/name. That meant we never had a test that expected a flat alias like "min(t.c2)".

But how does this fit with CI and cargo test as a whole succeeding, while running cargo test -p datafusion-expr on its own causes this error? 🤔

Labels

logical-expr Logical plan and expressions

Development

Successfully merging this pull request may close these issues.

rewrite_sort_cols_by_agg_alias test failing when running cargo test on datafusion-expr

4 participants