Minor: Avoid cloning as many Ident during SQL planning
#4534
// Normalize an identifier to a lowercase string unless the identifier is quoted.
pub(crate) fn normalize_ident(id: &Ident) -> String {
    match id.quote_style {
        Some(_) => id.value.clone(),
Here the value is always cloned, which is not necessary in most cases, where we already have an owned String.
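A minimal sketch of the by-value alternative this comment points toward. The `Ident` struct below is a reduced stand-in for the sqlparser type (only the two fields used here), and the lowercasing call in the unquoted branch is an assumption, not a copy of the PR's code.

```rust
/// Stand-in for the sqlparser `Ident` type, reduced to the two fields
/// used in this sketch (an assumption for illustration).
#[derive(Debug, Clone)]
pub struct Ident {
    pub value: String,
    pub quote_style: Option<char>,
}

/// Takes the identifier by value, so a quoted identifier's `String`
/// is moved out instead of cloned.
pub fn normalize_ident(id: Ident) -> String {
    match id.quote_style {
        // Quoted: keep the original spelling and reuse the allocation.
        Some(_) => id.value,
        // Unquoted: fold to lowercase (this branch allocates a new String).
        None => id.value.to_ascii_lowercase(),
    }
}
```

With this signature, a caller that still needs the `Ident` afterwards writes `normalize_ident(id.clone())`, so the cost of the copy is visible at the call site rather than hidden inside the helper.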
d4f1c9c to
895b4cd
Compare
for cte in with.cte_tables {
    // A `WITH` block can't use the same name more than once
-   let cte_name = normalize_ident(&cte.alias.name);
+   let cte_name = normalize_ident(cte.alias.name.clone());
Previously normalize_ident always cloned. Now it only clones in a few places, and most of the time it can reuse the String from the sqlparser AST directly.
    .any(|x| x.option == ColumnOption::Null);
fields.push(Field::new(
-   &normalize_ident(&column.name),
+   &normalize_ident(column.name),
It is unfortunate to simply drop the String immediately, but Field::new requires a &str (it can't take the String). Filed upstream: https://github.com/apache/arrow-rs/pull/3288/files
Agreed. It looks like we could make an improvement here like the one above for SubqueryAlias::try_new().
alamb
left a comment
@jackwener I wonder if you have time or interest to review this PR?
datafusion/sql/src/planner.rs
Outdated
    }
}

fn apply_expr_alias(plan: LogicalPlan, idents: &Vec<Ident>) -> Result<LogicalPlan> {
This function took a reference, but the caller immediately drops the actual Vec. This PR reuses it.
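To illustrate the difference, here is a rough sketch comparing a borrowing helper with one that consumes the Vec. The `Ident` struct, `normalize_ident`, and both `alias_names_*` functions are hypothetical stand-ins written for this example; they are not the functions in planner.rs.

```rust
// Reduced stand-in for the sqlparser `Ident` type (assumption for illustration).
#[derive(Debug, Clone)]
struct Ident {
    value: String,
    quote_style: Option<char>,
}

// By-value normalization, assumed to lowercase unquoted identifiers.
fn normalize_ident(id: Ident) -> String {
    match id.quote_style {
        Some(_) => id.value,
        None => id.value.to_ascii_lowercase(),
    }
}

// Borrowing version: every Ident must be cloned before it can be consumed,
// even if the caller throws the original collection away right afterwards.
fn alias_names_borrowed(idents: &[Ident]) -> Vec<String> {
    idents.iter().map(|id| normalize_ident(id.clone())).collect()
}

// Owning version: the Vec is consumed, so each Ident is moved, not cloned.
fn alias_names_owned(idents: Vec<Ident>) -> Vec<String> {
    idents.into_iter().map(normalize_ident).collect()
}
```

Both helpers return the same names; the owning one simply avoids one clone per identifier when the caller no longer needs the Vec.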
- pub fn try_new(plan: LogicalPlan, alias: &str) -> datafusion_common::Result<Self> {
+ pub fn try_new(
+     plan: LogicalPlan,
+     alias: impl Into<String>,
This change allows SubqueryAlias::try_new to take a String if the caller has one or a &str that will be copied into a new String if needed
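A small sketch of the `impl Into<String>` pattern described here. `SubqueryAliasSketch` is a hypothetical type invented for this example; the real constructor also takes a `LogicalPlan` and can fail, which is omitted.

```rust
// Hypothetical stand-in for the real struct; only the alias field is modeled.
#[derive(Debug)]
struct SubqueryAliasSketch {
    alias: String,
}

impl SubqueryAliasSketch {
    // Accepts anything convertible into a String.
    // - Passing a `String` moves it in without copying.
    // - Passing a `&str` allocates a new String inside `into()`.
    // Error paths of the real constructor are elided; this always returns Ok.
    fn try_new(alias: impl Into<String>) -> Result<Self, String> {
        Ok(Self {
            alias: alias.into(),
        })
    }
}
```

Callers can then write either `try_new("t")` or `try_new(owned_name)`; only the first form pays for a copy.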
Nice implementation 👍, it copies only when called with a &str.
jackwener
left a comment
I reviewed them carefully; it makes sense to me.
Thanks @alamb.
BTW, I think we can also avoid clone in some method in
Thank you for the review @jackwener
Benchmark runs are scheduled for baseline = 4ecf3e7 and contender = 2457ce4. 2457ce4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Draft as it builds on #4530
Which issue does this PR close?
N/A
Rationale for this change
I noticed a bunch of redundant copying while working on #4530 but wanted to keep that PR smaller
What changes are included in this PR?
Remove redundant cloning
Are these changes tested?
covered by existing tests
Are there any user-facing changes?