Optimize the concat_ws function
#3869
Signed-off-by: remzi <13716567376yh@gmail.com>
alamb
left a comment
This is beautiful @HaoYang670
        }
    }

    impl Literal for &String {
"TIL" lit() 👍
    for arg in args {
        match arg {
            // filter out null args
            Expr::Literal(ScalarValue::Utf8(None) | ScalarValue::LargeUtf8(None)) => {}
            Expr::Literal(ScalarValue::Utf8(Some(v)) | ScalarValue::LargeUtf8(Some(v))) => {
                match contiguous_scalar {
                    None => contiguous_scalar = Some(v.to_string()),
                    Some(mut pre) => {
                        pre += delimiter;
                        pre += v;
                        contiguous_scalar = Some(pre)
                    }
                }
            }
            Expr::Literal(s) => return Err(DataFusionError::Internal(format!("The scalar {} should be casted to string type during the type coercion.", s))),
            // If the arg is not a literal, we should first push the current `contiguous_scalar`
            // to the `new_args` and reset it to None.
            // Then pushing this arg to the `new_args`.
            arg => {
                if let Some(val) = contiguous_scalar {
                    new_args.push(lit(val));
                }
                new_args.push(arg.clone());
                contiguous_scalar = None;
            }
        }
    }
    if let Some(val) = contiguous_scalar {
        new_args.push(lit(val));
    }
This pattern of creating the contiguous scalar is so similar -- I wonder if it could be extracted out into a function -- perhaps as a follow on PR
Thank you for reviewing @alamb
The logic for concat and concat_ws is a little different, because in concat_ws we must consider the delimiter and we can't ignore the empty string literals. I will try to find a way to refactor them.
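The difference described above can be sketched on a toy model. This is not DataFusion's code: `Arg`, `merge_concat`, and `merge_concat_ws` are hypothetical names for illustration, and the `concat` behavior (dropping empty literals) is assumed from the comment above rather than copied from the source.

```rust
/// Hypothetical stand-in for DataFusion's `Expr`, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Arg {
    /// A string literal; `None` models a SQL NULL.
    Lit(Option<String>),
    /// Any non-literal expression, e.g. a column reference.
    Col(String),
}

/// Merge runs of adjacent literals for a `concat`-like function.
/// NULL and empty literals add nothing to the result, so both are dropped.
fn merge_concat(args: &[Arg]) -> Vec<Arg> {
    let mut out = Vec::new();
    let mut run = String::new();
    for arg in args {
        match arg {
            Arg::Lit(None) => {}
            Arg::Lit(Some(v)) => run.push_str(v),
            other => {
                // Flush the pending literal run before the non-literal.
                if !run.is_empty() {
                    out.push(Arg::Lit(Some(std::mem::take(&mut run))));
                }
                out.push(other.clone());
            }
        }
    }
    if !run.is_empty() {
        out.push(Arg::Lit(Some(run)));
    }
    out
}

/// Merge runs of adjacent literals for a `concat_ws`-like function with a
/// literal delimiter. Only NULLs are dropped: an empty literal still
/// produces a delimiter, so it has to stay in the run.
fn merge_concat_ws(delimiter: &str, args: &[Arg]) -> Vec<Arg> {
    let mut out = Vec::new();
    let mut run: Option<String> = None;
    for arg in args {
        match arg {
            Arg::Lit(None) => {}
            Arg::Lit(Some(v)) => {
                run = Some(match run.take() {
                    None => v.clone(),
                    Some(mut pre) => {
                        pre.push_str(delimiter);
                        pre.push_str(v);
                        pre
                    }
                });
            }
            other => {
                if let Some(val) = run.take() {
                    out.push(Arg::Lit(Some(val)));
                }
                out.push(other.clone());
            }
        }
    }
    if let Some(val) = run {
        out.push(Arg::Lit(Some(val)));
    }
    out
}
```

For the arguments 'a', '', NULL, 'b', followed by a column c, `merge_concat` yields the literal 'ab' and then c, while `merge_concat_ws` with delimiter "," yields 'a,,b' and then c. The doubled comma is why the empty literal cannot be skipped, and the `Option<String>` accumulator is what distinguishes "no literal seen yet" from "an empty literal seen".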
                args: new_args,
            }
        }
    } => simpl_concat(args)?,
❤️
    // the delimiter is not a literal
    {
        let expr = concat_ws(col("c"), vec![lit("a"), null.clone(), lit("b")]);
        let expected = concat_ws(col("c"), vec![lit("a"), lit("b")]);
so cool!
Thanks @HaoYang670
Signed-off-by: remzi <13716567376yh@gmail.com>
Which issue does this PR close?
Closes #3856.
Closes #3857.
Rationale for this change
Simplify the concat_ws expression:
- Simplify to null if the delimiter is null
- Filter out null arguments
- Use concat to replace concat_ws if the delimiter is an empty string
What changes are included in this PR?
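As a sanity check on the three rules in the rationale, here is a minimal reference evaluation, assuming PostgreSQL-style semantics for both functions. The `concat_ws` and `concat` functions below are hypothetical plain-Rust models, not DataFusion APIs; `None` stands in for SQL NULL.

```rust
/// Hypothetical reference model of `concat_ws`; `None` models SQL NULL.
fn concat_ws(delimiter: Option<&str>, args: &[Option<&str>]) -> Option<String> {
    // Rule 1: a NULL delimiter makes the whole result NULL.
    let delimiter = delimiter?;
    // Rule 2: NULL arguments are skipped and add no delimiter.
    let parts: Vec<&str> = args.iter().flatten().copied().collect();
    Some(parts.join(delimiter))
}

/// Hypothetical reference model of `concat`: NULL arguments are skipped.
fn concat(args: &[Option<&str>]) -> String {
    args.iter().flatten().copied().collect()
}
```

Under this model, `concat_ws(Some(""), args)` always equals `Some(concat(args))`, which is what makes the third rewrite (an empty-string delimiter becomes `concat`) safe.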
Are there any user-facing changes?