@ujjwaltwri

Fixes #19171

Problem

The concat scalar function currently reports its result as always nullable at
planning/schema time. However, its runtime semantics are:

  • concat ignores NULL inputs.
  • The result becomes NULL only if all input arguments are NULL.

This mismatch causes incorrect nullability in inferred schemas and can affect
optimizer behavior.
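
As an illustration only, the runtime semantics described above can be sketched in plain Rust. `concat_as_described` is a hypothetical helper, not DataFusion's implementation, and it models SQL NULL as `None`. It encodes the behavior stated in this description and has not been checked against either the DataFusion or the Spark `concat`.

```rust
/// Hypothetical model of the semantics stated in this description:
/// NULL (`None`) inputs are skipped, and the result is NULL only when
/// every input is NULL.
fn concat_as_described(inputs: &[Option<&str>]) -> Option<String> {
    // No non-NULL input at all (including an empty argument list) yields NULL.
    if inputs.iter().all(|v| v.is_none()) {
        return None;
    }
    // Otherwise join the non-NULL inputs in order, dropping the NULLs.
    Some(inputs.iter().flatten().copied().collect())
}
```

Under this model, `concat_as_described(&[Some("a"), None, Some("b")])` yields `Some("ab")`, while an all-`None` input yields `None`.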

What this PR does

Implements return_field_from_args for ConcatFunc so that:

  1. The return DataType is derived using the existing return_type logic.
  2. The return field’s nullability follows the runtime rule above: the field is
     nullable only if every argument field is nullable.

(If there are no argument fields, the return is considered nullable defensively.)
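
A minimal sketch of the nullability rule implied by the semantics above, assuming the result is nullable only when every argument field is nullable. `ArgField` and `concat_result_nullable` are hypothetical stand-ins for illustration; the actual change lives in `return_field_from_args` and would work on the argument fields passed to it.

```rust
/// Hypothetical stand-in for an argument field; only nullability matters here.
struct ArgField {
    nullable: bool,
}

/// Assumed rule: the concat result can be NULL only if every argument can be NULL.
fn concat_result_nullable(arg_fields: &[ArgField]) -> bool {
    // Defensive default from the description: no argument fields means nullable.
    if arg_fields.is_empty() {
        return true;
    }
    // A single non-nullable argument guarantees a non-NULL result.
    arg_fields.iter().all(|f| f.nullable)
}
```

With one non-nullable argument the result is reported as non-nullable; with an empty argument list the sketch falls back to nullable.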

This aligns schema-time nullability with runtime behavior and matches the
semantics used by other SQL engines.

Tests

  • All existing unit tests for concat pass.
  • Parquet & CSV tests verified locally after initializing test submodules.
  • Runtime concatenation behavior is unchanged; only planner-side metadata changes.

Notes

If CI finds any compatibility adjustments required across DataFusion crates,
I will update this PR accordingly.

@ujjwaltwri
Author

Happy to adjust anything if reviewers prefer a different nullability rule or want
the implementation placed elsewhere. The change is planner-only and does not
modify runtime behavior.

@github-actions (bot) added the "functions" (Changes to functions implementation) label on Dec 7, 2025

@Jefffrey
Contributor

Jefffrey commented Dec 9, 2025

This seems to be modifying the DF version instead of the Spark version?

https://github.com/apache/datafusion/blob/83736efc4ad8865019b0809ac9d87e63eabbe0a8/datafusion/spark/src/function/string/concat.rs

Development

Successfully merging this pull request may close these issues.

spark concat need to have custom nullability