-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Add Decimal128 support to Ceil and Floor #18979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Jefffrey
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for picking this up, left some comments (applicable to floor too)
| fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> { | ||
| match arg_types[0] { | ||
| DataType::Float32 => Ok(DataType::Float32), | ||
| DataType::Decimal128(precision, scale) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should include support for all decimal types, not just 128
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking of creating a shared module for this, would that be okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think this signature can be simplified to:
if arg_types[0].is_null() {
Ok(DataType::Float64)
else {
Ok(arg_types[0].clone())
}| # ceil with decimal argument | ||
| query RRRR | ||
| select | ||
| ceil(arrow_cast(1.23,'Decimal128(10,2)')), | ||
| ceil(arrow_cast(-1.23,'Decimal128(10,2)')), | ||
| ceil(arrow_cast(123.00,'Decimal128(10,2)')), | ||
| ceil(arrow_cast(-123.00,'Decimal128(10,2)')); | ||
| ---- | ||
| 2 -1 123 -123 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should check the output type using arrow_typeof to ensure we're actually getting decimals and not just floats
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming it returns decimals: Why the numbers after the decimal point (.00) are not printed ?
I'd expect them to be printed for decimals but maybe it has been decided in the past to not do it for .0
| # ceil with decimal argument | ||
| query RRRR | ||
| select | ||
| ceil(arrow_cast(1.23,'Decimal128(10,2)')), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would the following will be valid:
ceil(arrow_cast(9.23,'Decimal128(3,2)')),
?
It would return 10.00 which is above the requested precision of 3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should raise a compute error. I will handle this case.
| # ceil with decimal argument | ||
| query RRRR | ||
| select | ||
| ceil(arrow_cast(1.23,'Decimal128(10,2)')), | ||
| ceil(arrow_cast(-1.23,'Decimal128(10,2)')), | ||
| ceil(arrow_cast(123.00,'Decimal128(10,2)')), | ||
| ceil(arrow_cast(-123.00,'Decimal128(10,2)')); | ||
| ---- | ||
| 2 -1 123 -123 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming it returns decimals: Why the numbers after the decimal point (.00) are not printed ?
I'd expect them to be printed for decimals but maybe it has been decided in the past to not do it for .0
| arrow_typeof(ceil(arrow_cast('-9999999999999999999999999999999999.01','Decimal256(38,2)'))), | ||
| arrow_cast(ceil(arrow_cast('-9999999999999999999999999999999999.01','Decimal256(38,2)')), 'Utf8'); | ||
| ---- | ||
| Decimal256(38, 2) 10000000000000000000000000000000000.00 Decimal256(38, 2) -9999999999999999999999999999999999.00 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martin-g is this the right way to handle this test?
| fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> { | ||
| match arg_types[0] { | ||
| DataType::Float32 => Ok(DataType::Float32), | ||
| DataType::Decimal128(precision, scale) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think this signature can be simplified to:
if arg_types[0].is_null() {
Ok(DataType::Float64)
else {
Ok(arg_types[0].clone())
}| use datafusion_common::{exec_err, Result}; | ||
|
|
||
| /// Operations required to manipulate the native representation of Arrow decimal arrays. | ||
| pub(super) trait DecimalNative: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't help but feel this trait is unnecessary; is there some other existing code (traits) we could use instead of implementing this? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can use ArrowNativeTypeOp.
| Ok(Arc::new(result)) | ||
| } | ||
|
|
||
| fn decimal_scale_factor<T>(scale: i8, fn_name: &str) -> Result<T::Native> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we should just inline this function; doing so would make it clear that scale was already checked for being non-negative
| fn output_ordering(&self, input: &[ExprProperties]) -> Result<SortProperties> { | ||
| Ok(input[0].sort_properties) | ||
| } | ||
|
|
||
| fn evaluate_bounds(&self, inputs: &[&Interval]) -> Result<Interval> { | ||
| let data_type = inputs[0].data_type(); | ||
| Interval::make_unbounded(&data_type) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah maybe I was wrong about inlining for ordering & bounds actually 🤔
Because bounds was generic to all the other functions, and for ordering we have tests against it so inlining here leaves that ceil_order function dangling 🤔
| query TTTTTTTT | ||
| select | ||
| arrow_typeof(ceil(arrow_cast(9.01,'Decimal32(7,2)'))), | ||
| arrow_cast(ceil(arrow_cast(9.01,'Decimal32(7,2)')), 'Utf8'), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think this cast to utf8 is necessary
Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
Which issue does this PR close?
Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?