[SPARK-21445] Make IntWrapper and LongWrapper in UTF8String Serializable #18660
Is it safe to just ignore them? Maybe we should recover them in
I don't think we're actually trying to ship these values anywhere. We're failing to serialize the task during WholeStageCodegenExec:
cc @cloud-fan and @tejasapatil who have more info based on:
I see. They are static classes.
good catch! LGTM
Test build #79689 has finished for PR 18660 at commit
## What changes were proposed in this pull request?

Making those two classes Serializable will avoid serialization issues like the one below:

```
Caused by: java.io.NotSerializableException: org.apache.spark.unsafe.types.UTF8String$IntWrapper
Serialization stack:
    - object not serializable (class: org.apache.spark.unsafe.types.UTF8String$IntWrapper, value: org.apache.spark.unsafe.types.UTF8String$IntWrapper326450e)
    - field (class: org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToInt$1, name: result$2, type: class org.apache.spark.unsafe.types.UTF8String$IntWrapper)
    - object (class org.apache.spark.sql.catalyst.expressions.Cast$$anonfun$castToInt$1, <function1>)
```

## How was this patch tested?

- [x] Manual testing
- [ ] Unit test

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #18660 from brkyvz/serializableutf8.

(cherry picked from commit 26cd2ca)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
I couldn't write an easy reproduction for the bug :(
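For readers who want to see the failure mode in isolation: the following is not a reproduction of the Spark bug, only a plain-Java sketch of the same class of error. It assumes nothing about Spark internals. `PlainIntWrapper`, `SerializableIntWrapper`, and `CastFunction` are hypothetical stand-ins for a wrapper class and for a serializable closure that holds the wrapper in a field, as the `result$2` field does in the serialization stack above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class WrapperSerializationSketch {

    // Hypothetical stand-in for a mutable result holder that is NOT Serializable.
    static class PlainIntWrapper {
        int value = 0;
    }

    // The same holder, marked Serializable. Because it is a static nested
    // class, it holds no hidden reference to an enclosing instance.
    static class SerializableIntWrapper implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 0;
    }

    // Hypothetical stand-in for a serializable closure that keeps the
    // wrapper in a field.
    static class CastFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object result;

        CastFunction(Object result) {
            this.result = result;
        }
    }

    // Returns true if Java serialization of obj succeeds, and false if it
    // fails with NotSerializableException.
    static boolean canSerialize(Object obj) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // The closure is Serializable, but the field it holds is not.
        System.out.println(canSerialize(new CastFunction(new PlainIntWrapper())));        // prints "false"
        // Once the wrapper is Serializable, the whole object graph serializes.
        System.out.println(canSerialize(new CastFunction(new SerializableIntWrapper()))); // prints "true"
    }
}
```

The point of the sketch is that a serializable object fails to serialize as soon as any non-transient field refers to a non-serializable instance, even if that field's value is never needed on the receiving side.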
thanks, merging to master! @brkyvz I think it's fine, this bug is very obvious.
Also merged to branch-2.2
thanks @cloud-fan
What changes were proposed in this pull request?
Making those two classes Serializable will avoid serialization issues like the `java.io.NotSerializableException` quoted in the commit message above.
How was this patch tested?