-
Notifications
You must be signed in to change notification settings - Fork 993
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: shouldn't pass alias through into subquery. #4141
Conversation
@@ -778,7 +778,7 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> { | |||
let normalized_alias = alias.as_ref().map(|a| normalize_ident(&a.name)); | |||
let logical_plan = self.query_to_plan_with_alias( | |||
*subquery, | |||
normalized_alias.clone(), | |||
None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix bug here, because we shouldn't pass the alias
into subquery inside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went through the plans carefully -- thank you @jackwener
I can't quite figure out if this could cause an error / incorrect results but it certainly looks better
I am a little worried about the new column projection in datafusion/core/tests/sql/explain_analyze.rs but maybe that is a preexisting bug that is simply uncovered by this change.
@@ -746,7 +746,7 @@ async fn test_physical_plan_display_indent_multi_children() { | |||
" ProjectionExec: expr=[c2@0 as c2]", | |||
" ProjectionExec: expr=[c1@0 as c2]", | |||
" RepartitionExec: partitioning=RoundRobinBatch(9000)", | |||
" CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1]", | |||
" CsvExec: files=[ARROW_TEST_DATA/csv/aggregate_test_100.csv], has_header=true, limit=None, projection=[c1, c2]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem right to me.
The query is only selecting c1
from the table -- so why is this scan now selecting c1 and c2?
let sql = "SELECT c1 \
FROM (select c1 from aggregate_test_100) AS a \
JOIN\
(select c1 as c2 from aggregate_test_100) AS b \
ON c1=c2\
";
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is confusing the alias c2
with the column c2
, so I am concerned that this could also introduce correctness issues. It would be good to understand this better before we merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be caused by project_pushdown rule
, I will look into it.
" Projection: t3.t1_id, t3.t1_name, t3.t1_int, alias=t3 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", | ||
" Projection: t1.t1_id, t1.t1_name, t1.t1_int, alias=t3 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N]", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it sure looks better to have not have this double alias
@@ -3170,10 +3170,10 @@ mod tests { | |||
) AS a | |||
) AS b"; | |||
let expected = "Projection: b.fn2, b.last_name\ | |||
\n Projection: b.fn2, b.last_name, b.birth_date, alias=b\ | |||
\n Projection: a.fn1 AS fn2, a.last_name, a.birth_date, alias=b\ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah looking at these old plans now this selecting a just to alias them as b
is a little strange
I took a first pass through this, and it seems to make sense, but I was confused by some of the expected results. I'll take another look tomorrow. |
@alamb @andygrove For example:
when we meet It's a future ticket, I have debug this rule, I will try to fix it. |
@jackwener Thanks for tracking down the bug and filing an issue. |
Thanks @jackwener |
Benchmark runs are scheduled for baseline = 9de893e and contender = d91ce06. d91ce06 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #4123.
Rationale for this change
What changes are included in this PR?
This PR has fixed a bug(We can see bug in UT and test case, but no one noticed it).
Original code passed the outside
Alias
intoSubquery
inside, it's wrong.Are these changes tested?
Are there any user-facing changes?