[SPARK-56840][SQL] Avoid unresolved NullIf type lookup by sunchao · Pull Request #55838 · apache/spark

sunchao · 2026-05-12T22:01:17Z

Why are the changes needed?

NULLIF builds its replacement expression before analysis has resolved all child expressions.
For nested field references, the existing implementation can read the left operand's data type
too early while constructing the null branch, which can fail analysis even though the SQL shape
is valid.

SPARK-56840 tracks this analyzer failure.

What changes were proposed in this PR?

- Build the NULLIF null branch with a lazy typed-null placeholder so construction does not eagerly
  read the unresolved left operand type, while NullIf.replacement.dataType remains valid once the
  operand type is available.
- Make that placeholder RuntimeReplaceable, so ReplaceExpressions restores an ordinary typed
  Literal(null, ...) before later optimizer rules run and existing null-literal simplifications
  continue to apply.
- Add focused regressions for:
  - nested struct-field nullif(c.provider, lower(...)) analysis in both
    ALWAYS_INLINE_COMMON_EXPR modes;
  - NullIf replacement type reporting before type coercion;
  - optimizer replacement back to a normal null literal;
  - explain output avoiding exposure of the internal helper name.
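
The idea behind the first two bullets can be sketched with a toy model. This is not Spark code: `Attr`, `Literal`, `LazyTypedNull`, `eager_null_branch`, and `lazy_null_branch` are hypothetical names invented for illustration, and the sketch mutates the column in place to simulate analysis, whereas Catalyst rewrites immutable trees. It only shows why deferring the type read avoids a failure at construction time, and how the placeholder can later be swapped for an ordinary typed null literal.

```python
from dataclasses import dataclass
from typing import Optional


class UnresolvedError(Exception):
    """Raised when a data type is read from an unresolved expression."""


@dataclass
class Attr:
    """Toy column reference; resolved_type stays None until 'analysis' sets it."""
    name: str
    resolved_type: Optional[str] = None

    @property
    def data_type(self) -> str:
        if self.resolved_type is None:
            raise UnresolvedError(f"unresolved attribute: {self.name}")
        return self.resolved_type


@dataclass(frozen=True)
class Literal:
    """Toy typed literal."""
    value: object
    data_type: str


class LazyTypedNull:
    """Hypothetical placeholder: keeps the operand and reads its type on demand."""

    def __init__(self, operand: Attr) -> None:
        self.operand = operand

    @property
    def data_type(self) -> str:
        # Deferred: only evaluated when someone asks for the type.
        return self.operand.data_type

    def replacement(self) -> Literal:
        # Mirrors the idea of restoring a plain typed null before optimization.
        return Literal(None, self.data_type)


def eager_null_branch(left: Attr) -> Literal:
    # Reads the operand type during construction; fails if left is unresolved.
    return Literal(None, left.data_type)


def lazy_null_branch(left: Attr) -> LazyTypedNull:
    # Construction never touches the operand type.
    return LazyTypedNull(left)


col = Attr("c.provider")  # still unresolved

try:
    eager_null_branch(col)
    eager_failed = False
except UnresolvedError:
    eager_failed = True

branch = lazy_null_branch(col)   # succeeds even though col is unresolved
col.resolved_type = "string"     # simulate analysis resolving the column
lit = branch.replacement()       # placeholder becomes an ordinary typed null
```

In this model the eager variant raises while the column is unresolved, and the lazy variant reports the operand's type only after resolution, which is the behavior the description attributes to the new placeholder.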

Does this PR introduce any user-facing change?

Yes. Valid NULLIF expressions over unresolved nested field references that could fail during
analysis now resolve and execute successfully.

How was this patch tested?

build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.expressions.NullExpressionsSuite -- -z "NullIf replacement preserves its data type before type coercion"'
build/sbt 'catalyst/testOnly org.apache.spark.sql.catalyst.optimizer.OptimizerSuite -- -z "NullIf typed null branch is replaced with a null literal"'
build/sbt 'sql/testOnly org.apache.spark.sql.DataFrameFunctionsSuite -- -z "nullif function"'
build/sbt 'sql/testOnly org.apache.spark.sql.ExplainSuite -- -z "explain for these functions; use range to avoid constant folding"'

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Codex (GPT-5.5)

dongjoon-hyun

+1, LGTM. Thank you, @sunchao.

cc @peter-toth.

Closes #55838 from sunchao/dev/chao/codex/oss-nullif-unresolved.

Authored-by: Chao Sun <sunchao@apache.org>
Signed-off-by: Peter Toth <peter.toth@gmail.com>
(cherry picked from commit 5949ab3)
Signed-off-by: Peter Toth <peter.toth@gmail.com>

peter-toth · 2026-05-14T11:01:52Z

@sunchao, sorry, I think I merged this PR a bit too early. Can you please double-check the repro test? It seems to pass without the fix on master.

I think the issue is real, but we probably need a proper repro in a follow-up PR.

Also, I couldn't merge this to branch-3.5 due to conflicts. Can you please open a backport PR as well?

sunchao · 2026-05-14T16:36:37Z

Thanks @peter-toth. Let me take a look at the repro test. And yes, I can also open a PR against branch-3.5.

sunchao added 3 commits May 12, 2026 14:14

[SQL] Avoid unresolved NullIf type lookup

490a4f8

[SQL] Simplify NullIf null branch typing

0cbeb53

[SQL] Preserve NullIf typed null replacement

3444f07

dongjoon-hyun approved these changes May 13, 2026

peter-toth approved these changes May 13, 2026

peter-toth closed this in 5949ab3 May 14, 2026

sunchao wants to merge 3 commits into apache:master from sunchao:dev/chao/codex/oss-nullif-unresolved