[SPARK-12834] Change ser/de of JavaArray and JavaList #10772
Test build #49459 has finished for PR 10772 at commit
@jkbradley @davies I know why we get the error here. If the element of a I'll change the fix back to my original one.
@jkbradley I have changed the fix. However, there is one tricky part here. If a developer wants to convert an For a complex structure, an example from the current code is in VectorIndexer. Do you have any recommendations for it? How about we add an assertion in the above code: `case Array[Array[_]] => throw new XXXException("You need transform nested array into Java list yourself")`
Test build #49515 has finished for PR 10772 at commit
@jkbradley What do you think of the change? Or I believe that we can just ignore the
Sorry for the wait. I think this fix is fine for now and will hopefully let most Python wrappers use
I'll merge this after tests run again.
Test build #2458 has finished for PR 10772 at commit
Never mind, we can update the ser/de later when necessary. On Monday, January 25, 2016, Apache Spark QA notifications@github.com wrote:
Cheers, Xusen Yin (尹绪森)
LGTM
https://issues.apache.org/jira/browse/SPARK-12834
We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` in Python side. However, there is no need to transform them in such an inefficient way. Instead of it, we can use type conversion to convert them, e.g. `list(JavaArray)` or `list(JavaList)`. What's more, there is an issue to Ser/De Scala Array as I said in https://issues.apache.org/jira/browse/SPARK-12780
Author: Xusen Yin <yinxusen@gmail.com>
Closes apache#10772 from yinxusen/SPARK-12834.
…vaList
Backport of SPARK-12834 for branch-1.6
Original PR: #10772
Original commit message: We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` in Python side. However, there is no need to transform them in such an inefficient way. Instead of it, we can use type conversion to convert them, e.g. `list(JavaArray)` or `list(JavaList)`. What's more, there is an issue to Ser/De Scala Array as I said in https://issues.apache.org/jira/browse/SPARK-12780
Author: Xusen Yin <yinxusen@gmail.com>
Closes #10941 from jkbradley/yinxusen-SPARK-12834-1.6.
https://issues.apache.org/jira/browse/SPARK-12834
We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` on the Python side. However, there is no need to transform them in such an inefficient way. Instead, we can convert them directly with a type conversion, e.g. `list(JavaArray)` or `list(JavaList)`. What's more, there is an issue with the ser/de of Scala `Array`, as I described in https://issues.apache.org/jira/browse/SPARK-12780
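The difference between the two paths in the description above can be sketched in plain Python. This is only an illustration, not the code from this PR: `FakeJavaArray` is a hypothetical stand-in for a Py4J `JavaArray`/`JavaList` (so the snippet runs without a JVM), the standard `pickle` module stands in for the `SerDe.dumps()` / `PickleSerializer` pair, and the two helper function names are made up for this example.

```python
import pickle


class FakeJavaArray:
    """Hypothetical stand-in for a Py4J JavaArray/JavaList (assumption:
    the real wrappers are iterable sequences, as the description implies)."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        return iter(self._items)


def via_pickle_round_trip(java_seq):
    # Older path, simulated: serialize to bytes (stand-in for the JVM-side
    # SerDe.dumps()), then deserialize on the Python side (stand-in for
    # PickleSerializer).
    payload = pickle.dumps(java_seq._items)
    return pickle.loads(payload)


def via_type_conversion(java_seq):
    # Proposed path: iterate the wrapper directly, no intermediate bytes.
    return list(java_seq)


arr = FakeJavaArray([1.0, 2.0, 3.0])
print(via_pickle_round_trip(arr) == via_type_conversion(arr))
```

Both helpers yield the same Python list for a flat sequence; the second simply skips the intermediate byte payload. Note that `list(...)` converts only the outer container, which is presumably why nested arrays came up as a separate concern in the conversation.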