Description
Is your feature request related to a problem or challenge?
Math UDFs in the datafusion-functions crate support integer and floating types, but not Decimals. This epic is dedicated to adding and improving decimal support for UDFs. It is a follow-up epic to adding Decimal support to the DataFusion core #3523.
So far, decimal support has been implemented for the binary UDFs `log` and `power` and for the unary UDFs `round` and `ceil`. That work showed that more code should be moved into helper functions, so that UDFs stay lean and are abstracted from details such as scalar vs. array cases and casting.
Describe the solution you'd like
There are the following primary directions:
- Adding support for well-known `Decimal128` and `Decimal256` to existing functions
- Adding support for new `Decimal32` and `Decimal64`, which are not yet fully supported
- Refining coercion rules to work with mixtures of floats/decimals
- Ensuring it would work properly with the new `parse_float_as_decimal` flag, forcing floats to be decimals after SQL parsing
- Improving tests to validate correct behaviour for floats/decimals and corner cases
- Moving some core support to the Arrow libraries
I welcome thoughts and discussions about these directions.
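To make the first direction concrete, here is a minimal sketch of what "native" decimal support for a function like `ceil` could look like: the value stays an unscaled integer with a fixed scale and is never converted to a float. The function name `decimal_ceil` is hypothetical and this is not the DataFusion implementation. It assumes a `Decimal128`-style value stored as an `i128` with a non-negative scale, and it ignores the declared precision.

```rust
/// Hypothetical helper (not a DataFusion API): ceil of a decimal stored as an
/// unscaled `i128` with `scale` fractional digits. The result keeps the same
/// scale. Returns `None` if the computation overflows `i128`.
fn decimal_ceil(unscaled: i128, scale: u32) -> Option<i128> {
    // 10^scale is the unscaled representation of 1.
    let factor = 10i128.checked_pow(scale)?;
    // Euclidean division floors towards negative infinity for a positive
    // divisor, so the remainder is always in 0..factor.
    let whole = unscaled.div_euclid(factor);
    let frac = unscaled.rem_euclid(factor);
    // Any non-zero fractional part pushes the result up to the next integer.
    let rounded = if frac > 0 { whole.checked_add(1)? } else { whole };
    rounded.checked_mul(factor)
}

fn main() {
    // 123.45 -> 124.00 (scale 2)
    assert_eq!(decimal_ceil(12345, 2), Some(12400));
    // -123.45 -> -123.00 (scale 2)
    assert_eq!(decimal_ceil(-12345, 2), Some(-12300));
    // 123.00 is already an integer
    assert_eq!(decimal_ceil(12300, 2), Some(12300));
    // Overflow is reported instead of wrapping
    assert_eq!(decimal_ceil(i128::MAX, 2), None);
}
```

A real UDF would also need to handle arrays vs. scalars, nulls, and the result precision, which is the kind of detail the helper functions mentioned above are meant to hide.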
Describe alternatives you've considered
The approach of coercing decimals to floats could work, but it loses precision and data and doesn't match the behaviour of existing SQL engines (Postgres, Spark). Decimals should be first-class citizens.
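The precision loss can be illustrated with a small standalone sketch. It assumes a decimal stored as an unscaled `i128` plus a scale, as in `Decimal128`. The helper names `decimal_to_f64` and `f64_to_decimal` are hypothetical and only stand in for a float-based coercion path. An `f64` carries 53 bits of mantissa (roughly 15 to 17 significant decimal digits), so a 29-digit decimal cannot survive the round trip.

```rust
/// Hypothetical helper: convert an unscaled decimal to `f64`.
fn decimal_to_f64(unscaled: i128, scale: i32) -> f64 {
    unscaled as f64 / 10f64.powi(scale)
}

/// Hypothetical helper: convert an `f64` back to an unscaled decimal.
fn f64_to_decimal(value: f64, scale: i32) -> i128 {
    (value * 10f64.powi(scale)).round() as i128
}

fn main() {
    // 12345678901234567890.123456789 with scale 9 (29 significant digits).
    let original: i128 = 12_345_678_901_234_567_890_123_456_789;
    let round_tripped = f64_to_decimal(decimal_to_f64(original, 9), 9);
    // The low-order digits are lost on the way through f64.
    assert_ne!(original, round_tripped);
    println!("original:      {original}");
    println!("round-tripped: {round_tripped}");

    // A short value such as 123.45 (scale 2) still round-trips.
    let small: i128 = 12_345;
    assert_eq!(f64_to_decimal(decimal_to_f64(small, 2), 2), small);
}
```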
Additional context
Related tickets:
Function support:
- ROUND support decimal #17054
- feat: support decimal for math functions: pow #18031
- Native decimal 32/64/256 bit support for log #17555
- decimal support for agg function #1545
- Add Decimal128 support to Ceil and Floor #7689
- Decimal128 support for statistical aggregations #3572
Core support:
- Set default value of parse_float_as_decimal to true #14612
- Refactor away usage of `NUMERICS`/`INTEGERS` in `datafusion/expr-common/src/type_coercion/aggregates.rs` #18092
- Replace `TypeSignature::Numeric` with `TypeSignature::Coercible` #14760
- TypeSignature::Coercible for math functions #14763
- Decimal32/64 aren't as well supported as the 128 and 256 bit variants #17489
- Support smaller decimal types through SQL interface #17747
Coercion and type issues: