Is your feature request related to a problem or challenge?
DataFusion has a default runtime inventory of scalar, higher-order, aggregate, and window functions, but there is no generated Substrait extension YAML (https://github.com/substrait-io/substrait/blob/main/text/simple_extensions_schema.yaml) that describes that function set.
Without generated extension YAML, Substrait function declarations can drift from the functions registered by a default DataFusion session, and downstream projects have no reusable DataFusion API for generating equivalent YAML for custom UDF registries.
Describe the solution you'd like
Add a generator that emits Substrait function YAML from the same default SessionStateDefaults inventory used by DataFusion's function docs generator.
The generator should:
- expose a public API that accepts an explicit function inventory and generator config
- allow custom URNs, output paths, and overrides for signatures or return types that cannot be inferred from DataFusion Signature metadata
- include a binary with stdout, --check, and --write modes
- commit generated scalar, aggregate, and window YAML files under datafusion/substrait/extensions/
- fail if a default function cannot be emitted or explicitly overridden
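To make the requested behavior concrete, here is a rough, dependency-free sketch of what the public API could look like. Every name in it (FunctionEntry, Impl, GeneratorConfig, generate_scalar_yaml, check) is hypothetical and is not an existing DataFusion or Substrait API. The emitted text only loosely follows the simple extensions layout and is not validated against the schema. A real implementation would build the inventory from SessionStateDefaults and derive signatures from DataFusion Signature metadata, rather than taking pre-rendered type strings.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified signature: Substrait type names as plain strings.
#[derive(Clone, Debug, PartialEq)]
pub struct Impl {
    pub args: Vec<String>,
    pub ret: String,
}

/// Hypothetical inventory entry. `inferred` is `None` when no signature
/// could be derived from the function's metadata.
pub struct FunctionEntry {
    pub name: String,
    pub inferred: Option<Impl>,
}

/// Hypothetical generator config: a custom URN plus per-function overrides.
pub struct GeneratorConfig {
    pub urn: String,
    pub overrides: BTreeMap<String, Impl>,
}

/// Emits a simplified scalar-function YAML document.
/// Returns an error if a function has neither an inferred signature
/// nor an explicit override.
pub fn generate_scalar_yaml(
    inventory: &[FunctionEntry],
    config: &GeneratorConfig,
) -> Result<String, String> {
    // Sort by name so the output is deterministic and diff-friendly.
    let mut entries: Vec<&FunctionEntry> = inventory.iter().collect();
    entries.sort_by(|a, b| a.name.cmp(&b.name));

    let mut out = String::new();
    out.push_str(&format!("urn: {}\n", config.urn));
    out.push_str("scalar_functions:\n");
    for entry in entries {
        // An explicit override takes precedence over the inferred signature.
        let imp = config
            .overrides
            .get(&entry.name)
            .or(entry.inferred.as_ref())
            .ok_or_else(|| {
                format!(
                    "function `{}` has no inferable signature and no override",
                    entry.name
                )
            })?;
        out.push_str(&format!("  - name: \"{}\"\n", entry.name));
        out.push_str("    impls:\n");
        if imp.args.is_empty() {
            out.push_str("      - args: []\n");
        } else {
            out.push_str("      - args:\n");
            for arg in &imp.args {
                out.push_str(&format!("          - value: {}\n", arg));
            }
        }
        out.push_str(&format!("        return: {}\n", imp.ret));
    }
    Ok(out)
}

/// Sketch of a `--check` mode: succeeds only if the committed file
/// content matches the freshly generated content byte for byte.
pub fn check(committed: &str, generated: &str) -> Result<(), String> {
    if committed == generated {
        Ok(())
    } else {
        Err("generated YAML is out of date; rerun with --write".to_string())
    }
}
```

In this sketch the binary's three modes would share generate_scalar_yaml: stdout prints the result, --write saves it under the configured output path, and --check passes the committed file and the fresh output to check so CI fails on drift. Returning an error for an entry with no signature and no override matches the last requirement in the list above.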
Describe alternatives you've considered
The main alternative is to maintain the Substrait YAML by hand. That would be more error-prone because the source of truth for available DataFusion functions is the runtime registry, not static YAML files.
Additional context
No response