[SPARK-18891][SQL] Support for specific Java List subtypes #18009
@@ -28,6 +28,8 @@ case class SeqClass(s: Seq[Int])

 case class ListClass(l: List[Int])

+case class JListClass(l: java.util.List[Int])
This is intentionally not supported. If we want to support mixing Java classes and Scala classes, we should unify ScalaReflection and JavaTypeInference, which is a lot of effort.
For now I think it's more important to fully support specific collection types in Scala, and then try to unify ScalaReflection and JavaTypeInference.
What do you think?
Alternatively, you can improve JavaTypeInference to support specific Java lists and test it with a Java bean in JavaDatasetSuite.
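For illustration, a Java bean of the kind suggested here could declare its property as a specific `List` subtype. The sketch below uses only the JDK's `java.beans` introspection to show that the declared property type is the concrete `LinkedList` class; the names `BeanListPropertySketch`, `LinkedListBean`, and `propertyTypes` are hypothetical, and nothing here is taken from Spark's `JavaTypeInference` or `JavaDatasetSuite`.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.LinkedList;
import java.util.Map;
import java.util.TreeMap;

public class BeanListPropertySketch {

    /** Hypothetical test bean whose property is declared as a specific List subtype. */
    public static class LinkedListBean {
        private LinkedList<Integer> values = new LinkedList<>();

        public LinkedList<Integer> getValues() { return values; }

        public void setValues(LinkedList<Integer> values) { this.values = values; }
    }

    /** Maps each readable and writable bean property name to its declared class. */
    public static Map<String, Class<?>> propertyTypes(Class<?> beanClass) {
        try {
            // Stop at Object so the synthetic "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            Map<String, Class<?>> types = new TreeMap<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() != null) {
                    types.put(pd.getName(), pd.getPropertyType());
                }
            }
            return types;
        } catch (IntrospectionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The declared type is the concrete LinkedList class, not the List interface.
        System.out.println(propertyTypes(LinkedListBean.class));
    }
}
```

A bean shaped like this gives a test something concrete to check: after a round trip, the property should still hold an instance of the declared subtype rather than some other `List` implementation.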
I did not notice that there is separate inference code for Java classes. It would certainly be nice to unify the code for Java and Scala classes. Is this already being worked on/planned?
I moved the List support to JavaTypeInference and rewrote tests accordingly.
It's planned, but no one is working on it yet. You can start by creating a JIRA ticket if you are interested :)
ping @michalsenkyr
Remove specific Java List support from ScalaReflection
Remove implicit encoder for Java Lists
Add relevant tests to JavaDatasetSuite
Remove tests from ScalaReflectionSuite and DatasetPrimitiveSuite
1edaf85 to 881e636
ok to test
LGTM, pending test
Test build #77879 has finished for PR 18009 at commit
thanks, merging to master!
## What changes were proposed in this pull request?

Add support for specific Java `List` subtypes in deserialization as well as a generic implicit encoder.

All `List` subtypes are supported by using either the size-specifying constructor (one `int` parameter) or the default constructor.

Interfaces/abstract classes use the following implementations:

* `java.util.List`, `java.util.AbstractList` or `java.util.AbstractSequentialList` => `java.util.ArrayList`

## How was this patch tested?

```bash
build/mvn -DskipTests clean package && dev/run-tests
```

Additionally in Spark shell:

```
scala> val jlist = new java.util.LinkedList[Int]; jlist.add(1)
jlist: java.util.LinkedList[Int] = [1]
res0: Boolean = true

scala> Seq(jlist).toDS().map(_.element()).collect()
res1: Array[Int] = Array(1)
```

Author: Michal Senkyr <mike.senkyr@gmail.com>

Closes apache#18009 from michalsenkyr/dataset-java-lists.
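The constructor rule in the description can be sketched in plain Java reflection. This is a minimal illustration, not the code Spark generates: the class `ListInstantiationSketch` and the helper `newListFor` are hypothetical, trying the `int` constructor before the default one is an assumed order, and mapping every interface or abstract class to `ArrayList` is a simplification of the three types the description names.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListInstantiationSketch {

    /** Hypothetical helper: choose a concrete class, then a constructor. */
    @SuppressWarnings("unchecked")
    public static <T> List<T> newListFor(Class<?> requested, int size) {
        // Interfaces and abstract classes cannot be instantiated directly,
        // so this sketch substitutes ArrayList for them (a simplification).
        boolean needsDefault = requested.isInterface()
                || Modifier.isAbstract(requested.getModifiers());
        Class<?> concrete = needsDefault ? ArrayList.class : requested;
        try {
            try {
                // Assumed order: try the size-specifying constructor first.
                return (List<T>) concrete.getConstructor(int.class).newInstance(size);
            } catch (NoSuchMethodException e) {
                // No public (int) constructor: use the default constructor.
                return (List<T>) concrete.getConstructor().newInstance();
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + concrete.getName(), e);
        }
    }

    public static void main(String[] args) {
        List<Integer> fromInterface = newListFor(List.class, 4);     // ArrayList via (int)
        List<Integer> fromLinked = newListFor(LinkedList.class, 4);  // LinkedList via ()
        fromInterface.add(1);
        fromLinked.add(1);
        System.out.println(fromInterface.getClass().getName());
        System.out.println(fromLinked.getClass().getName());
    }
}
```

`ArrayList` exposes a public `(int)` constructor, while `LinkedList` does not, so the two calls in `main` exercise both branches.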