feat: Use NativeType in get_example_type, information schema #21737

Open
theirix wants to merge 7 commits into apache:main from
theirix:type-signature-native-type

Conversation

@theirix
Contributor

@theirix theirix commented Apr 19, 2026

Which issue does this PR close?

Rationale for this change

Moving away from physical types: get_example_types and the information schema currently use Arrow's DataType, but DataFusion's NativeType is sufficient for both.

Let's introduce a new API, get_representative_types, based on NativeType, and deprecate the public get_example_types API.
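
A minimal sketch of how the two APIs could coexist. It assumes simplified stand-in enums rather than the real Arrow and DataFusion types; the variant set, the single `Exact` signature variant, and the `to_native` helper are all hypothetical and only illustrate the deprecate-and-add pattern:

```rust
// Hypothetical, heavily simplified stand-ins for Arrow's `DataType` and
// DataFusion's `NativeType`; the real enums have many more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum DataType {
    Int32,
    Utf8,
    LargeUtf8,
}

#[derive(Debug, Clone, PartialEq)]
pub enum NativeType {
    Int32,
    String,
}

// Hypothetical signature type with a single variant, for illustration only.
pub enum TypeSignature {
    Exact(Vec<DataType>),
}

impl TypeSignature {
    /// Old, physical-type-based API, kept so existing callers still compile.
    #[deprecated(note = "Use get_representative_types instead")]
    pub fn get_example_types(&self) -> Vec<Vec<DataType>> {
        match self {
            TypeSignature::Exact(types) => vec![types.clone()],
        }
    }

    /// New, logical-type-based API.
    pub fn get_representative_types(&self) -> Vec<Vec<NativeType>> {
        match self {
            TypeSignature::Exact(types) => vec![types.iter().map(to_native).collect()],
        }
    }
}

// Collapse different physical encodings into the same logical type.
fn to_native(dt: &DataType) -> NativeType {
    match dt {
        DataType::Int32 => NativeType::Int32,
        DataType::Utf8 | DataType::LargeUtf8 => NativeType::String,
    }
}

fn main() {
    let sig = TypeSignature::Exact(vec![DataType::Int32, DataType::LargeUtf8]);
    println!("{:?}", sig.get_representative_types());
}
```

Callers of the old method keep compiling and only see a deprecation warning, while new callers such as the information schema can switch to the logical view.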

It is a logical continuation of #15965

What changes are included in this PR?

Are these changes tested?

Tests are passing

Are there any user-facing changes?

  • Deprecation of the TypeSignature::get_example_types API
  • Removal of TypeSignature::get_possible_types (deprecated since v46)

@github-actions github-actions Bot added the logical-expr (Logical plan and expressions) and catalog (Related to the catalog crate) labels Apr 19, 2026
@theirix theirix marked this pull request as ready for review April 19, 2026 20:28
Comment thread datafusion/expr-common/src/signature.rs Outdated

#[deprecated(since = "46.0.0", note = "See get_example_types instead")]
- pub fn get_possible_types(&self) -> Vec<Vec<DataType>> {
+ pub fn get_possible_types(&self) -> Vec<Vec<NativeType>> {
Member

Hm. A public API break for a deprecated method... Maybe it is time to just remove it?

Contributor Author

That's true. I will drop get_possible_types. However, get_example_types is also public - it is used in the information schema internally, and can be used by DF users.

Since it breaks the expr-common API, how about deprecating the get_example_types signature and adding a NativeType-based get_representative_types?

Contributor Author

@martin-g, to avoid a breaking change in get_example_types (my miss), I've introduced a new method and deprecated get_example_types.

The Signature API is heavily DataType-based, and so is get_example_types. Should we let the two methods coexist, or even move the NativeType-based logic to the information schema? cc @jayzhan211

Member

It seems that NativeType is not powerful enough to fully replace DataType here.
For example, https://github.com/apache/datafusion/pull/21737/changes#diff-52d29120f24c2a01793f6d729fdb7898abaa8d2320db75aba5cbb06cd714930eR505-R516 shows that there are no counterparts for Union's UnionMode and Map's keys_sorted, so defaults have to be used, which may lead to confusion.
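
A minimal sketch of the concern, using hypothetical stand-in enums instead of the real Arrow `DataType` and DataFusion `NativeType` (field information is omitted, and the choice of `false` as the default for `keys_sorted` is an assumption made for illustration). Once a physical type has been mapped to its logical counterpart, going back has to invent a value for whatever the logical type does not carry:

```rust
// Hypothetical stand-ins; the real map types also carry field information.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Map { keys_sorted: bool },
}

#[derive(Debug, Clone, PartialEq)]
enum NativeType {
    Int32,
    Map, // logical view: no `keys_sorted` flag
}

// Physical -> logical: the `keys_sorted` flag is dropped.
fn to_native(dt: &DataType) -> NativeType {
    match dt {
        DataType::Int32 => NativeType::Int32,
        DataType::Map { .. } => NativeType::Map,
    }
}

// Logical -> physical: a default must be picked for the dropped flag.
fn to_default_physical(nt: &NativeType) -> DataType {
    match nt {
        NativeType::Int32 => DataType::Int32,
        NativeType::Map => DataType::Map { keys_sorted: false }, // assumed default
    }
}

fn main() {
    let original = DataType::Map { keys_sorted: true };
    let round_tripped = to_default_physical(&to_native(&original));
    println!("{:?} -> {:?}", original, round_tripped);
}
```

The round trip is lossy for `Map` in this sketch, which is the kind of silent defaulting the comment above warns about.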

Contributor Author

Undoubtedly, it's not a replacement right now. The question for this improvement is whether we'd like to change only the information schema, which benefits from native types (in which case we move the code there), or also provide a public API on Signature that supports NativeType alongside DataType (as the PR does now).

At the same time, UDFs are migrating to TypeSignature, abstracting from physical data types.

Comment thread datafusion/expr-common/src/signature.rs Outdated
Comment thread datafusion/catalog/src/information_schema.rs
Comment thread datafusion/catalog/src/information_schema.rs Outdated
Comment thread datafusion/catalog/src/information_schema.rs Outdated
Comment thread datafusion/expr-common/src/signature.rs Outdated
theirix and others added 5 commits April 20, 2026 20:08
Co-authored-by: Martin Grigorov <martin-g@users.noreply.github.com>
Instead, add the new `get_representative_types` API with NativeType.
Deprecated old `get_example_types` and related helpers, left as-is
to avoid breaking API change.
Comment thread datafusion/expr-common/src/signature.rs Outdated
///
/// This is used for `information_schema` and can be used to generate
/// documentation or error messages.
/// Remove with `get_example_types`
Member

This should be a comment (//) instead of rustdoc (///).
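
For context, a small sketch of the distinction, using a hypothetical `greet` function that is not part of the PR: `///` lines become part of the item's rendered documentation, while `//` lines are ignored by rustdoc and therefore suit maintenance notes.

```rust
/// Returns a greeting for `name`.
///
/// Lines starting with `///` are rustdoc: they are rendered in the
/// generated API documentation for this item.
// Lines starting with `//` are plain comments: rustdoc ignores them, so
// maintenance notes such as "remove together with X" fit here.
pub fn greet(name: &str) -> String {
    format!("Hello, {name}!")
}

fn main() {
    println!("{}", greet("DataFusion"));
}
```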

Labels

catalog (Related to the catalog crate), logical-expr (Logical plan and expressions)

Development

Successfully merging this pull request may close these issues.

Return NativeType instead of DataType for get_example_types

2 participants