Spark: Fix NPEs in spark value converter #4663
Conversation
  return convertedMap;
} catch (NullPointerException npe) {
  // Scala 2.11 fix: Catch NPE as internal value could be null and scala wrapper does not
  // evaluate until iteration.
Can you explain this a bit more? I don't understand what is happening in Scala 2.11. Is it that the object returned by map.entrySet() is non-null, but calling iterator() on it will result in NPE?
Is there an alternative to this? If the value returned by Iterable.iterator() is null, can we catch that directly rather than using a catch block?
Unfortunately, I don't think we can access the underlying map that is null. The line throwing the NPE is https://github.com/scala/scala/blob/2.11.x/src/library/scala/collection/convert/Wrappers.scala#L184 - here underlying.iterator throws NPE because underlying is null, and the same happens for any other Map method call that uses underlying, such as Map#size().
In fact, Map#entrySet returns a valid anonymous set, but the NPE is raised as soon as its iterator() method is called.
I wonder if with reflection you could check underlying for nullity, but I don't think that's a better approach. Open to suggestions. Thanks!
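The lazy failure mode described above can be mimicked in plain Java: a wrapper whose iterator() delegates to a possibly-null underlying collection constructs fine, and only fails once iteration begins. The class below is an illustrative stand-in, not the actual Scala Wrappers code:

```java
import java.util.Iterator;
import java.util.Map;

// Illustrative stand-in for Scala's wrapper-style lazy delegation:
// construction succeeds even when the underlying map is null; the NPE
// surfaces only when iterator() finally dereferences it.
class LazyEntrySet implements Iterable<Map.Entry<String, String>> {
  private final Map<String, String> underlying;

  LazyEntrySet(Map<String, String> underlying) {
    this.underlying = underlying; // no null check, mirroring the wrapper
  }

  @Override
  public Iterator<Map.Entry<String, String>> iterator() {
    // NPE is raised here, not at construction time, if underlying is null
    return underlying.entrySet().iterator();
  }
}

public class LazyNpeDemo {
  public static void main(String[] args) {
    LazyEntrySet set = new LazyEntrySet(null); // succeeds: nothing dereferenced yet
    boolean threw = false;
    try {
      set.iterator(); // fails only now
    } catch (NullPointerException npe) {
      threw = true;
    }
    System.out.println(threw); // prints "true"
  }
}
```

This is why a null check on the entry set itself is not enough: the non-null wrapper defers the dereference until iteration, which is the situation the catch block handles.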
@@ -95,6 +107,10 @@ public static Object convert(Type type, Object object) {
}

private static Record convert(Types.StructType struct, Row row) {
  if (row == null) {
    return null;
  }
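The guard above follows a simple pattern: each conversion method returns null when its input is null, short-circuiting before any field access can dereference it. A minimal sketch of that pattern, with simplified types rather than the actual Iceberg signatures:

```java
import java.util.ArrayList;
import java.util.List;

public class NullSafeConverter {
  // Simplified stand-in for a row converter: a null input short-circuits
  // to null instead of letting a later dereference throw NPE.
  static List<Object> convertRow(Object[] row) {
    if (row == null) {
      return null; // propagate null rather than failing
    }
    List<Object> converted = new ArrayList<>();
    for (Object field : row) {
      converted.add(field); // a real converter would recurse per field type
    }
    return converted;
  }

  public static void main(String[] args) {
    System.out.println(convertRow(null)); // prints "null"
    System.out.println(convertRow(new Object[] {1, "a"})); // prints "[1, a]"
  }
}
```

Applying the guard at each recursive entry point (struct, map, list) keeps a null value null all the way through, instead of surfacing as an NPE deep inside a wrapper.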
This looks reasonable to me.
Overall this looks fine, but I'd like to avoid catching NPE if possible.
Thanks, @edgarRd! I'll mark this for inclusion in 0.13.2 as well.
This PR adds tests for SparkValueConverter across Spark versions and fixes several NPE instances that occurred when a null in a Spark row was converted to an Iceberg record using different schemas, including: