[SPARK-56045][SQL] Add flag for ignoring Parquet UNKNOWN type annotation and revert to old behavior by ZiyaZa · Pull Request #54870 · apache/spark

ZiyaZa · 2026-03-17T17:20:10Z

What changes were proposed in this pull request?

This PR introduces a new flag spark.sql.parquet.reader.respectUnknownTypeAnnotation.enabled for the Parquet reader to control its behavior when it reads an external file with an UNKNOWN logical type annotation:

- (Default) When false, we infer the Spark type based on the physical type used in the Parquet file, as we did before Spark 4.1.
- When true, we use NullType as the Spark type.
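
The two modes can be sketched with a small self-contained model. This is not Spark's reader code: `ParquetColumn`, `PhysicalType`, `SparkType`, `fromPhysical`, and `resolveType` are hypothetical names invented for illustration, and the physical-to-Spark mapping is a simplified assumption. Only the flag key and its false-by-default semantics come from the description above.

```scala
// Toy model of the flag's semantics; none of these types are Spark's real classes.
sealed trait SparkType
case object NullType extends SparkType
case object IntegerType extends SparkType
case object LongType extends SparkType
case object BinaryType extends SparkType

sealed trait PhysicalType
case object Int32 extends PhysicalType
case object Int64 extends PhysicalType
case object ByteArray extends PhysicalType

// A column as described in a Parquet footer: a physical type plus a marker
// for whether it carries the UNKNOWN logical type annotation.
final case class ParquetColumn(physical: PhysicalType, annotatedUnknown: Boolean)

object UnknownAnnotationModel {
  // Flag key taken from the PR description.
  val FlagKey = "spark.sql.parquet.reader.respectUnknownTypeAnnotation.enabled"

  // Simplified, assumed mapping used when the annotation is ignored.
  def fromPhysical(p: PhysicalType): SparkType = p match {
    case Int32     => IntegerType
    case Int64     => LongType
    case ByteArray => BinaryType
  }

  // An absent key means false, matching the documented default.
  def respectUnknown(conf: Map[String, String]): Boolean =
    conf.get(FlagKey).exists(_.toBoolean)

  // NullType only when the column is annotated UNKNOWN and the flag is on;
  // otherwise fall back to the physical type.
  def resolveType(col: ParquetColumn, conf: Map[String, String]): SparkType =
    if (col.annotatedUnknown && respectUnknown(conf)) NullType
    else fromPhysical(col.physical)
}
```

In this model, an INT32 column annotated UNKNOWN resolves to `IntegerType` with an empty configuration and to `NullType` once the key is set to `"true"`; columns without the annotation are unaffected by the flag.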

Why are the changes needed?

To fix the regression introduced by #52922, as we have been reading files differently since then.

Does this PR introduce any user-facing change?

Yes. With the default flag value, when we read a Parquet file written by an external engine:

- Before, we inferred NullType.
- Now, we infer a type based on the physical type (e.g. IntegerType).

How was this patch tested?

Added tests.

Was this patch authored or co-authored using generative AI tooling?

No.

cloud-fan · 2026-03-18T03:48:40Z

LGTM if CI is green, please create a new JIRA ticket as the original commit is already released.

ZiyaZa · 2026-03-18T11:49:36Z

> LGTM if CI is green, please create a new JIRA ticket as the original commit is already released.

CI is green, linked the new ticket in the title.

…ion and revert to old behavior

### What changes were proposed in this pull request?

This PR introduces a new flag `spark.sql.parquet.reader.respectUnknownTypeAnnotation.enabled` for Parquet reader to control the behavior when it reads an external file with `UNKNOWN` logical type annotation:

- (Default) When false, we infer the Spark type based on the physical type used in the Parquet file, as we did before Spark 4.1.
- When true, we use NullType as the Spark type.

### Why are the changes needed?

To fix the regression introduced by #52922, as we have been reading files differently since then.

### Does this PR introduce _any_ user-facing change?

Yes. With default flag value, when we read a Parquet file written by an external engine:

- Before, we inferred NullType
- Now, we'll infer a type based on the physical type (e.g. IntegerType)

### How was this patch tested?

Added tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #54870 from ZiyaZa/unknown-type-flag.

Authored-by: Ziya Mukhtarov <ziya5muxtarov@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 50514c5)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

cloud-fan · 2026-03-18T13:43:43Z

thanks, merging to master/4.1!

dongjoon-hyun · 2026-03-18T16:34:29Z

sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala

+        "inference and infers NullType. When disabled, ignores the UNKNOWN annotation " +
+        "and uses the physical type instead.")
+      .version("4.1.2")
+      .withBindingPolicy(ConfigBindingPolicy.SESSION)


Hi, @ZiyaZa and @cloud-fan .

This broke branch-4.1. Let me revert this from branch-4.1 only for now.

https://github.com/apache/spark/actions/runs/23247640394/job/67580471010

[error] /home/runner/work/spark/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:1627:8: value withBindingPolicy is not a member of org.apache.spark.internal.config.ConfigBuilder
[error] possible cause: maybe a semicolon is missing before `value withBindingPolicy`?
[error]       .withBindingPolicy(ConfigBindingPolicy.SESSION)
[error]        ^
[error] /home/runner/work/spark/spark/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:1627:26: not found: value ConfigBindingPolicy
[error]       .withBindingPolicy(ConfigBindingPolicy.SESSION)
[error]                          ^
[error] two errors found

ZiyaZa · Mar 18, 2026

Hi, it seems withBindingPolicy doesn't exist in 4.1, so deleting that line should solve it. I can create a PR deleting that line, or re-apply this PR as a whole after your revert. Either way works.

dongjoon-hyun · 2026-03-18T16:44:32Z

Yes, it's already reverted. Please make a new backporting PR to branch-4.1 now to make sure that CI passes, @ZiyaZa .

dongjoon-hyun · 2026-03-18T16:44:46Z

BTW, thank you for the fix, @ZiyaZa .

ZiyaZa · 2026-03-18T16:46:22Z

Created a new PR here: #54885

Thanks for letting me know.

Add flag for ignoring Parquet UNKNOWN type annotation and revert to old behavior

f974953

Add config binding policy

4cf20e9

ZiyaZa changed the title ~~[SPARK-54220][SQL][FOLLOWUP] Add flag for ignoring Parquet UNKNOWN type annotation and revert to old behavior~~ [SPARK-56045][SQL] Add flag for ignoring Parquet UNKNOWN type annotation and revert to old behavior Mar 18, 2026

cloud-fan closed this in 50514c5 Mar 18, 2026

dongjoon-hyun reviewed Mar 18, 2026

ZiyaZa wants to merge 2 commits into apache:master from ZiyaZa:unknown-type-flag
