[EPIC] Support Decimal for User Defined Functions #18889

@theirix

Description

Is your feature request related to a problem or challenge?

Math UDFs in the datafusion-functions crate support integer and floating-point types, but not decimals. This epic is dedicated to adding and improving decimal support for UDFs. It follows up on the epic that added Decimal support to the DataFusion core (#3523).

So far, decimal support is implemented for the binary UDFs log and power and for the unary UDFs round and ceil. That work showed that more code should be moved into helper functions, so that the UDFs stay lean and are abstracted from details such as scalar vs. array cases and casting.
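To illustrate the kind of helper meant here, the following is a minimal sketch in plain Rust, not DataFusion code. The `Columnar` enum, `map_unary`, and `decimal_ceil` are hypothetical names introduced only for this example; the real UDFs work with Arrow arrays and DataFusion's own value types. The sketch assumes a decimal is stored as an unscaled `i128` with a known scale, and it ignores overflow.

```rust
/// Hypothetical stand-in for a columnar value: one scalar or an array of values.
#[derive(Debug, PartialEq)]
enum Columnar {
    Scalar(i128),
    Array(Vec<i128>),
}

/// Hypothetical helper: applies `op` to the scalar or to every array element,
/// so that a UDF body only has to describe the per-value computation.
fn map_unary<F: Fn(i128) -> i128>(input: &Columnar, op: F) -> Columnar {
    match input {
        Columnar::Scalar(v) => Columnar::Scalar(op(*v)),
        Columnar::Array(vs) => Columnar::Array(vs.iter().map(|v| op(*v)).collect()),
    }
}

/// Ceiling of a decimal given as an unscaled integer with `scale` fractional
/// digits. The result keeps the same scale. Overflow is not handled here.
fn decimal_ceil(unscaled: i128, scale: u32) -> i128 {
    let factor = 10_i128.pow(scale);
    // Euclidean division rounds towards negative infinity for a positive divisor.
    let quotient = unscaled.div_euclid(factor);
    let remainder = unscaled.rem_euclid(factor);
    if remainder == 0 {
        unscaled
    } else {
        (quotient + 1) * factor
    }
}
```

With such a helper, the same `decimal_ceil` closure serves both `Columnar::Scalar(1234)` and `Columnar::Array(vec![1234, -1234])` at scale 2, and the computation never leaves integer arithmetic.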

Describe the solution you'd like

The work has the following primary directions:

  1. Adding support for well-known Decimal128 and Decimal256 to existing functions
  2. Adding support for new Decimal32 and Decimal64, which are not yet fully supported
  3. Refining coercion rules to work with mixtures of floats/decimals
  4. Ensuring it works properly with the new parse_float_as_decimal flag, which forces floats to be decimals after SQL parsing
  5. Improving tests to validate correct behaviour for floats/decimals and corner cases
  6. Moving some core support to the Arrow libraries

I welcome thoughts and discussions about these directions.

Describe alternatives you've considered

Coercing decimals to floats could work, but it loses precision (and therefore data) and does not match the behaviour of existing SQL engines (Postgres, Spark). Decimals should be first-class citizens.
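The precision loss can be shown with a small sketch in plain Rust. It assumes a decimal is represented by its unscaled `i128` value, as in Decimal128, and simplifies the coercion to a cast to `f64` and back; the function names are hypothetical and exist only for this illustration. An `f64` has a 53-bit significand, so wide decimal values cannot all be represented exactly.

```rust
/// Round-trips an unscaled decimal value through f64, roughly what a
/// decimal-to-float coercion followed by a cast back would do.
fn round_trip_through_f64(unscaled: i128) -> i128 {
    (unscaled as f64) as i128
}

/// Returns true when the value survives the float round trip unchanged.
fn survives_float_coercion(unscaled: i128) -> bool {
    round_trip_through_f64(unscaled) == unscaled
}
```

A small value such as 12345 survives the round trip, while a 29-digit unscaled value such as 12345678901234567890123456789 (which fits into Decimal128 with precision 38) does not, because it needs more significant bits than an `f64` can hold.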

Additional context

Related tickets:

Function support:

Core support:

Coercion and type issues:

Labels: EPIC (a larger project, actively underway, with sub tasks), enhancement (new feature or request)