Spark: Remove ImmutableMap from SparkPartition for Kryo SerDe #3667
Force-pushed from 7308de4 to 396cfdb.
Hmmm, it seems that when I rebased, it might have removed the co-authorship from @dubeme. Before we merge this in, can we please make sure that I can restore it on my end somehow if need be?
Given this is a one-line change, could you also fix the other versions?
spark/v3.2/spark/src/test/java/org/apache/iceberg/spark/TestSparkPartitionSerialization.java
Sure thing. I almost did, but I wasn’t sure where exactly we’re drawing the line. This is small, though, and if people need it, they need it for all versions (or can edit one or more out). EDIT: I have covered Spark 2.4, 3.0, and 3.1 and added basic tests for each.
…(spark 3.2) Co-authored-by: Dubem Enyekwe <dubem@dubemenyekwe.com>
Force-pushed from 5d0a14b to 44af3f0.
looks good to me, will merge on Monday if no one objects.
Thanks for getting this finished, @kbendick!
Thanks @kbendick for stepping in in my absence.
Co-authored-by: Dubem Enyekwe <dubem@dubemenyekwe.com>
This PR replaces #3597, as @dubeme has not responded about some minor changes and we need this patch prior to the upcoming 0.13.0 release.
I have added them as a co-author on this PR so that they still get credit for their contribution 😄
Issue #3586 shows a situation where Kryo is unable to serialize the `SparkPartition` object (@rdblue points it out from the stack trace). This commit replaces the `ImmutableMap` with a `HashMap` and adds a test for round-trip Kryo serialization and Java serialization of `SparkPartition` objects.
I also tested this by running the `add_files` test suite with Kryo serialization enabled. The tests failed until this patch was applied. Unfortunately, I couldn't add a test there, as the usual `withSQLConf` method did not pick up the change in `spark.serializer`.
Co-authored-by: Dubem Enyekwe <dubem@dubemenyekwe.com>
cc @rdblue @jackye1995 @dubeme
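For anyone skimming, here is a rough sketch of the Java-serialization half of the round-trip check described above. It is not the test from this PR: `PartitionLike` is a hypothetical stand-in for `SparkPartition`, its fields and the `roundTrip` helper are assumptions made for illustration, and the Kryo half is left out because it needs the Kryo library on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class PartitionRoundTrip {

  // Hypothetical stand-in for SparkPartition (field names are assumptions).
  static class PartitionLike implements Serializable {
    private static final long serialVersionUID = 1L;

    private final Map<String, String> values;
    private final String uri;
    private final String format;

    PartitionLike(Map<String, String> values, String uri, String format) {
      // Copy into a plain HashMap instead of an immutable map implementation.
      this.values = new HashMap<>(values);
      this.uri = uri;
      this.format = format;
    }

    Map<String, String> values() {
      return values;
    }

    String uri() {
      return uri;
    }

    String format() {
      return format;
    }
  }

  // Serialize to a byte array with Java serialization, then read the object back.
  @SuppressWarnings("unchecked")
  static <T extends Serializable> T roundTrip(T obj) throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(obj);
    }
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (T) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Map<String, String> values = new HashMap<>();
    values.put("id", "1");
    values.put("dt", "2021-12-01");

    PartitionLike original = new PartitionLike(values, "file:/tmp/table/id=1", "parquet");
    PartitionLike copy = roundTrip(original);

    // The copy should be a distinct object with equal contents.
    if (copy == original || !copy.values().equals(original.values())) {
      throw new AssertionError("round trip did not preserve partition values");
    }
    System.out.println("round trip preserved " + copy.values().size() + " partition values");
  }
}
```

A Kryo version of the same check would follow the same shape (write to a buffer, read back, compare fields), which is roughly what the new tests in this PR are meant to cover.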