@panbingkun
Contributor

@panbingkun panbingkun commented Sep 28, 2023

What changes were proposed in this pull request?

This PR replaces TreeNode.productHash with MurmurHash3.productHash from the Scala standard library.
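
To illustrate the shape of the change, here is a minimal sketch, assuming Scala 2.13. SketchNode, Leaf, and Branch are hypothetical stand-ins and are not Spark's TreeNode; the sketch only shows a Product-based node delegating its hash code to scala.util.hashing.MurmurHash3.productHash instead of carrying its own copy of the algorithm.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical stand-in for a tree node base class; this is NOT Spark's TreeNode.
abstract class SketchNode extends Product {
  // Delegate to the standard library rather than a hand-copied productHash.
  // The value is cached because tree nodes are immutable and hashed often.
  private lazy val cachedHash: Int = MurmurHash3.productHash(this)

  // Because a concrete hashCode is inherited from this base class,
  // the compiler does not synthesize one for the case classes below.
  override def hashCode(): Int = cachedHash
}

// Hypothetical node types used only for this illustration.
final case class Leaf(value: Int) extends SketchNode
final case class Branch(left: SketchNode, right: SketchNode) extends SketchNode

object SketchNodeDemo {
  def main(args: Array[String]): Unit = {
    val a = Branch(Leaf(1), Leaf(2))
    val b = Branch(Leaf(1), Leaf(2))
    // Structurally equal trees hash to the same value.
    println(a.hashCode == b.hashCode)
  }
}
```

Caching the hash in a lazy val is a design choice of this sketch; it pays off only when nodes are immutable.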

Why are the changes needed?

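1. Spark 4.0 no longer supports Scala 2.12, so the copy of productHash kept in TreeNode as a Scala 2.12 workaround is no longer needed.

2. Using the standard library's MurmurHash3.productHash directly reduces code redundancy: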
https://github.com/scala/scala/blob/v2.13.11/src/library/scala/util/hashing/MurmurHash3.scala#L343
https://github.com/scala/scala/blob/v2.13.11/src/library/scala/util/hashing/MurmurHash3.scala#L64-L81
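
For context, a small sketch of the behavior the copied code was guarding, assuming Scala 2.13 where productHash accepts an ignorePrefix parameter. Foo and Bar are hypothetical case classes with identical fields; the helper names are made up for this example. With the prefix ignored (which, as far as I understand scala/bug#10495, mirrors the Scala 2.12 behavior) the two hash the same, while the default 2.13 call mixes in productPrefix and tells them apart.

```scala
import scala.util.hashing.MurmurHash3

// Hypothetical case classes that differ only by name (productPrefix).
final case class Foo(a: Int, b: String)
final case class Bar(a: Int, b: String)

object ProductHashSketch {
  // Default Scala 2.13 behavior: productPrefix is mixed into the hash.
  def hashWithPrefix(p: Product): Int =
    MurmurHash3.productHash(p)

  // Same algorithm with the prefix skipped, so only the elements contribute.
  def hashIgnoringPrefix(p: Product): Int =
    MurmurHash3.productHash(p, MurmurHash3.productSeed, ignorePrefix = true)

  def main(args: Array[String]): Unit = {
    val foo = Foo(1, "x")
    val bar = Bar(1, "x")
    println(s"ignoring prefix, equal: ${hashIgnoringPrefix(foo) == hashIgnoringPrefix(bar)}")
    println(s"with prefix, equal:     ${hashWithPrefix(foo) == hashWithPrefix(bar)}")
  }
}
```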

Does this PR introduce any user-facing change?

No.

How was this patch tested?

1. Pass GitHub Actions (GA).
2. Manual test:

(base) panbingkun:~/Developer/spark/spark-community$ ./build/sbt "catalyst/testOnly org.apache.spark.sql.catalyst.expressions.CanonicalizeSuite -- -t \"SPARK-30847: Take productPrefix into account in MurmurHash3.productHash\""

[info] CanonicalizeSuite:
[info] - SPARK-30847: Take productPrefix into account in MurmurHash3.productHash (249 milliseconds)
[info] Run completed in 1 second, 135 milliseconds.
[info] Total number of tests run: 1
[info] Suites: completed 1, aborted 0
[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
[info] All tests passed.
[success] Total time: 230 s (03:50), completed Sep 28, 2023, 12:54:06 PM

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Sep 28, 2023
@LuciferYang
Contributor

cc @HyukjinKwon FYI

@LuciferYang
Contributor

@panbingkun Could you resolve the conflict?

@panbingkun
Contributor Author

> @panbingkun Could you resolve the conflict?

@LuciferYang Done.

Contributor

@beliefer beliefer left a comment

Looks good.

// Copied from Scala 2.13.1
// github.com/scala/scala/blob/v2.13.1/src/library/scala/util/hashing/MurmurHash3.scala#L56-L73
// to prevent the issue https://github.com/scala/bug/issues/10495
// TODO(SPARK-30848): Remove this once we drop Scala 2.12.
Member

Contributor Author

Yeah, I was looking for this PR to learn about history. Thank you for the good record at that time! haha

@panbingkun panbingkun changed the title from "[SPARK-45366][SQL] Remove productHash from TreeNode" to "[SPARK-30848][SQL] Remove productHash from TreeNode" Oct 20, 2023
Contributor

@LuciferYang LuciferYang left a comment

+1, LGTM
