[SPARK-55681][SQL] Fix singleton DataType equality after deserialization #54475
Closed
timlee0119 wants to merge 1 commit into apache:master
cloud-fan
approved these changes
Feb 25, 2026
Contributor
thanks, merging to master!
What changes were proposed in this pull request?
Override equals() and hashCode() on 14 singleton DataType classes so that non-singleton instances compare equal to the case object singletons:
For each type:
getSimpleName is used because Scala's auto-generated hashCode for 0-arity case objects returns productPrefix.hashCode (the hash of the simple class name). This preserves the exact same hash values, avoiding any change in hash-dependent code paths.
Other DataTypes did not need this change:
=>) across the codebase
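The override described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: DemoDataType, DemoBinaryType, and EqualityDemo are hypothetical stand-ins for Spark's classes, calling getSimpleName on classOf[...] is an assumption about how the name is obtained, and the reflective constructor call only imitates a deserializer that skips readResolve().

```scala
// Hypothetical stand-ins; these are NOT the Spark classes touched by the PR.
abstract class DemoDataType

class DemoBinaryType private () extends DemoDataType {
  // Every instance of this class is equal to every other one, singleton included.
  override def equals(obj: Any): Boolean = obj.isInstanceOf[DemoBinaryType]

  // Assumption: hash the simple class name, which is the same string the
  // compiler-generated case object hashCode would hash (productPrefix).
  override def hashCode(): Int = classOf[DemoBinaryType].getSimpleName.hashCode
}

// The singleton inherits the overrides above instead of the synthesized hashCode.
case object DemoBinaryType extends DemoBinaryType

object EqualityDemo {
  // Imitates a deserializer that instantiates the class directly.
  def newInstanceBypassingSingleton(): DemoBinaryType = {
    val ctor = classOf[DemoBinaryType].getDeclaredConstructor()
    ctor.setAccessible(true)
    ctor.newInstance()
  }

  // A stable identifier pattern is tested with ==, i.e. equals().
  def describe(dt: DemoDataType): String = dt match {
    case DemoBinaryType => "binary"
    case _              => "unmatched"
  }

  def main(args: Array[String]): Unit = {
    val copy = newInstanceBypassingSingleton()
    println(copy eq DemoBinaryType) // distinct object, so reference equality fails
    println(copy == DemoBinaryType) // equal through the overridden equals()
    println(describe(copy))
  }
}
```

Using isInstanceOf keeps equality symmetric between the singleton and any other instance, and hashing the class name keeps the hash identical to productPrefix.hashCode, which is the property the description relies on.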
Why are the changes needed?
Scala case object pattern matching (e.g., case BinaryType =>) relies on equals(), which for case objects defaults to reference equality. If a non-singleton instance of a DataType class is created at runtime, through any serialization framework that bypasses readResolve(), every case BinaryType => match in the codebase silently falls through, leading to errors like (code pointer):
Although the constructors are private, this is a compile-time guard only: serialization frameworks bypass constructors at runtime, so non-singleton instances can be created.
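The failure mode can be reproduced in isolation with a small sketch. The names below (PlainDataType, PlainBinaryType, FallThroughDemo) are hypothetical and not part of Spark, and the reflective constructor call is only an assumed stand-in for a serialization framework that creates instances without going through readResolve().

```scala
// Hypothetical stand-ins; no equals()/hashCode() overrides, as before the fix.
abstract class PlainDataType

class PlainBinaryType private () extends PlainDataType

case object PlainBinaryType extends PlainBinaryType

object FallThroughDemo {
  // The private constructor is only a compile-time guard; reflection
  // (like some deserializers) can still create a second instance.
  def newInstanceBypassingSingleton(): PlainBinaryType = {
    val ctor = classOf[PlainBinaryType].getDeclaredConstructor()
    ctor.setAccessible(true)
    ctor.newInstance()
  }

  // `case PlainBinaryType =>` compares with ==, which here falls back to
  // reference equality because equals() is not overridden.
  def describe(dt: PlainDataType): String = dt match {
    case PlainBinaryType => "binary"
    case _               => "unmatched"
  }

  def main(args: Array[String]): Unit = {
    println(describe(PlainBinaryType))                 // the singleton matches
    println(describe(newInstanceBypassingSingleton())) // the copy falls through
  }
}
```

The second instance has the same class as the singleton but is a different object, so the match silently takes the default branch.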
Does this PR introduce any user-facing change?
Yes. Before this change, if a non-singleton DataType instance was created through deserialization, pattern matches like case BinaryType => would silently fail, leading to non-deterministic runtime errors. After this change, non-singleton instances are correctly recognized as equal to the singleton, and pattern matching works as expected.
How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.6)