-
Notifications
You must be signed in to change notification settings - Fork 1.8k
chore: Refactor with assert_or_internal_err!() in datafusion/spark. #18674
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
CC : @2010YOUY01 |
comphead
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @codetyri0n I'm thinking we need to use exec_err or plan_err instead of internal_err. The last one used for something which is not really expected and classifiable.
| assert_eq_or_internal_err!( | ||
| args.len(), | ||
| 2, | ||
| "Spark `date_add` function requires 2 arguments" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| "Spark `date_add` function requires 2 arguments" | |
| "Spark `date_add` function requires 2 arguments, got {}" |
This is useful while debugging. Please keep it!
| return internal_err!( | ||
| "Spark `date_add` function requires 2 arguments, got {}", | ||
| args.len() | ||
| assert_eq_or_internal_err!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO this change does not improve it. We are already sure that the args length is not 2, so there is no need to check it again.
Also now this requires using unreachable!() at line 98
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be using take_function_args for patterns like this (unpacking expected no. of arguments), for reference:
datafusion/datafusion/spark/src/function/map/map_from_arrays.rs
Lines 66 to 72 in 377c0fc
| fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> { | |
| let [key_type, value_type] = take_function_args("map_from_arrays", arg_types)?; | |
| Ok(map_type_from_key_value_types( | |
| get_element_type(key_type)?, | |
| get_element_type(value_type)?, | |
| )) | |
| } |
| assert_or_internal_err!( | ||
| matches!( | ||
| other, | ||
| ColumnarValue::Array(_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the idea here ?
ColumnarValue::Array(arr) is already handled at https://github.com/codetyri0n/datafusion/blob/fcd703b784d3ec4d9976c47a55b52d22c0c08757/datafusion/spark/src/function/datetime/make_interval.rs#L532
| matches!( | ||
| other, | ||
| ColumnarValue::Array(_) | ||
| | ColumnarValue::Scalar(ScalarValue::IntervalMonthDayNano(_)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| | ColumnarValue::Scalar(ScalarValue::IntervalMonthDayNano(_)) | |
| | ColumnarValue::Scalar(ScalarValue::IntervalMonthDayNano(None)) |
?!
Because Some(iv) is also already handled at https://github.com/codetyri0n/datafusion/blob/fcd703b784d3ec4d9976c47a55b52d22c0c08757/datafusion/spark/src/function/datetime/make_interval.rs#L557
| return internal_err!( | ||
| "Spark `sha1` function requires 1 argument, got {}", | ||
| args.len() | ||
| assert_eq_or_internal_err!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above - not really an improvement
| if args.len() != 1 { | ||
| return internal_err!("`factorial` expects exactly one argument"); | ||
| } | ||
| assert_eq_or_internal_err!(args.len(), 1, "`factorial` expects exactly one argument"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use take_function_args here
| return internal_err!( | ||
| "Spark `date_add` function requires 2 arguments, got {}", | ||
| args.len() | ||
| assert_eq_or_internal_err!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be using take_function_args for patterns like this (unpacking expected no. of arguments), for reference:
datafusion/datafusion/spark/src/function/map/map_from_arrays.rs
Lines 66 to 72 in 377c0fc
| fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> { | |
| let [key_type, value_type] = take_function_args("map_from_arrays", arg_types)?; | |
| Ok(map_type_from_key_value_types( | |
| get_element_type(key_type)?, | |
| get_element_type(value_type)?, | |
| )) | |
| } |
| assert_or_internal_err!( | ||
| matches!(days_arg.data_type(), DataType::Int8 | DataType::Int16 | DataType::Int32), | ||
| "Spark `date_sub` function: argument must be int8, int16, int32, got {:?}", | ||
| days_arg.data_type() | ||
| ); | ||
| unreachable!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should keep the original instead of introducing unreachable!()
assert_or_internal_err should be replacing asserts, not existing internal_errs
|
thanks @Jefffrey @martin-g realised that using this macro is not an improvement for return types and etc. |
| args.len() | ||
| ); | ||
| }; | ||
| let [date_arg, days_arg] = take_function_args("date_add", args)?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handled this with take_function_args as suggested. Noticed a few functions that have not used this, thinking ill include those in a separate PR.
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of these refactors should use take_function_args as it is much more ergonomic and handles the error for us anyway
| internal_err!("if should have been simplified to case") | ||
| assert_or_internal_err!(false, "if should have been simplified to case"); | ||
| unreachable!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is little benefit to this refactor if we introduce an unreachable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, my apologies I forgot to revert this change (and the one below as well 🤦🏽 ). : )
| assert_eq_or_internal_err!( | ||
| args.len(), | ||
| 1, | ||
| "Spark `last_day` function requires 1 argument" | ||
| ); | ||
| unreachable!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be fixed to use take_function_args as well
| if args.len() != 1 { | ||
| return internal_err!("`factorial` expects exactly one argument"); | ||
| } | ||
| assert_eq_or_internal_err!(args.len(), 1, "`factorial` expects exactly one argument"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use take_function_args here
| return internal_err!( | ||
| "Spark `crc32` function requires 1 argument, got {}", | ||
| args.len() | ||
| assert_eq_or_internal_err!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll refactor this in #18662 anyway
| if args.len() != 1 { | ||
| return internal_err!("hex expects exactly one argument"); | ||
| } | ||
| assert_eq_or_internal_err!(args.len(), 1, "hex expects exactly one argument"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use take_function_args
|
this is good to go (CI failure due to |
Which issue does this PR close?
Part of #18613
Rationale for this change
What changes are included in this PR?
refactoring conditional checks with assert_or_internal_err!() or assert_eq_or_internal_err!()
Are these changes tested?
Are there any user-facing changes?