Skip to content

Conversation

@kumarUjjawal
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

  • Added dedicated ceil/floor UDF implementations that keep existing float/int behavior but operate directly on Decimal128 arrays, including overflow checks and metadata preservation.
  • Updated the math module wiring plus sqllogictest coverage so decimal cases are executed and validated end to end.

Are these changes tested?

  • All existing tests pass
  • Added new tests for the changes

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Nov 28, 2025
Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking this up, left some comments (applicable to floor too)

fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
match arg_types[0] {
DataType::Float32 => Ok(DataType::Float32),
DataType::Decimal128(precision, scale) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should include support for all decimal types, not just 128

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking of creating a shared module for this, would that be okay.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think this signature can be simplified to:

if arg_types[0].is_null() {
    Ok(DataType::Float64)
else {
    Ok(arg_types[0].clone())
}

Comment on lines +320 to +328
# ceil with decimal argument
query RRRR
select
ceil(arrow_cast(1.23,'Decimal128(10,2)')),
ceil(arrow_cast(-1.23,'Decimal128(10,2)')),
ceil(arrow_cast(123.00,'Decimal128(10,2)')),
ceil(arrow_cast(-123.00,'Decimal128(10,2)'));
----
2 -1 123 -123
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should check the output type using arrow_typeof to ensure we're actually getting decimals and not just floats

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming it returns decimals: Why the numbers after the decimal point (.00) are not printed ?
I'd expect them to be printed for decimals but maybe it has been decided in the past to not do it for .0

# ceil with decimal argument
query RRRR
select
ceil(arrow_cast(1.23,'Decimal128(10,2)')),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the following will be valid:

ceil(arrow_cast(9.23,'Decimal128(3,2)')),

?
It would return 10.00 which is above the requested precision of 3

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should raise a compute error. I will handle this case.

Comment on lines +320 to +328
# ceil with decimal argument
query RRRR
select
ceil(arrow_cast(1.23,'Decimal128(10,2)')),
ceil(arrow_cast(-1.23,'Decimal128(10,2)')),
ceil(arrow_cast(123.00,'Decimal128(10,2)')),
ceil(arrow_cast(-123.00,'Decimal128(10,2)'));
----
2 -1 123 -123
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming it returns decimals: Why the numbers after the decimal point (.00) are not printed ?
I'd expect them to be printed for decimals but maybe it has been decided in the past to not do it for .0

arrow_typeof(ceil(arrow_cast('-9999999999999999999999999999999999.01','Decimal256(38,2)'))),
arrow_cast(ceil(arrow_cast('-9999999999999999999999999999999999.01','Decimal256(38,2)')), 'Utf8');
----
Decimal256(38, 2) 10000000000000000000000000000000000.00 Decimal256(38, 2) -9999999999999999999999999999999999.00
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martin-g is this the right way to handle this test?

fn return_type(&self, arg_types: &[DataType]) -> Result<DataType> {
match arg_types[0] {
DataType::Float32 => Ok(DataType::Float32),
DataType::Decimal128(precision, scale) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think this signature can be simplified to:

if arg_types[0].is_null() {
    Ok(DataType::Float64)
else {
    Ok(arg_types[0].clone())
}

use datafusion_common::{exec_err, Result};

/// Operations required to manipulate the native representation of Arrow decimal arrays.
pub(super) trait DecimalNative:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't help but feel this trait is unnecessary; is there some other existing code (traits) we could use instead of implementing this? 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use ArrowNativeTypeOp.

Ok(Arc::new(result))
}

fn decimal_scale_factor<T>(scale: i8, fn_name: &str) -> Result<T::Native>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel we should just inline this function; doing so would make it clear that scale was already checked for being non-negative

Comment on lines +161 to +168
fn output_ordering(&self, input: &[ExprProperties]) -> Result<SortProperties> {
Ok(input[0].sort_properties)
}

fn evaluate_bounds(&self, inputs: &[&Interval]) -> Result<Interval> {
let data_type = inputs[0].data_type();
Interval::make_unbounded(&data_type)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah maybe I was wrong about inlining for ordering & bounds actually 🤔

Because bounds was generic to all the other functions, and for ordering we have tests against it so inlining here leaves that ceil_order function dangling 🤔

query TTTTTTTT
select
arrow_typeof(ceil(arrow_cast(9.01,'Decimal32(7,2)'))),
arrow_cast(ceil(arrow_cast(9.01,'Decimal32(7,2)')), 'Utf8'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think this cast to utf8 is necessary

kumarUjjawal and others added 2 commits November 29, 2025 12:15
Co-authored-by: Jeffrey Vo <jeffrey.vo.australia@gmail.com>
@github-actions github-actions bot added the execution Related to the execution crate label Nov 29, 2025
@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation execution Related to the execution crate functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add Decimal128 support to Ceil and Floor

3 participants