[ML][PYTHON][SPARK-12834][BACKPORT] Change ser/de of JavaArray and JavaList #10941

Closed


jkbradley
Member

Backport of SPARK-12834 for branch-1.6

Original PR: #10772

Original commit message:
We use `SerDe.dumps()` to serialize `JavaArray` and `JavaList` in `PythonMLLibAPI`, then deserialize them with `PickleSerializer` on the Python side. This round trip is unnecessary and inefficient. Instead, we can convert them directly with type conversion, e.g. `list(JavaArray)` or `list(JavaList)`. In addition, there is an issue with ser/de of Scala Array, as described in https://issues.apache.org/jira/browse/SPARK-12780

https://issues.apache.org/jira/browse/SPARK-12834

Author: Xusen Yin <yinxusen@gmail.com>

Closes apache#10772 from yinxusen/SPARK-12834.
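
For readers unfamiliar with the change, here is a minimal sketch of the two conversion paths the commit message contrasts. It does not use Spark or Py4J: `FakeJavaList` is a hypothetical stand-in for a Py4J proxy (assumed here to be iterable and indexable, as `list(JavaList)` implies), and `pickle.dumps` only stands in for the JVM-side `SerDe.dumps()` call. The function names are illustrative and do not come from the patch.

```python
import pickle


class FakeJavaList:
    """Hypothetical stand-in for a Py4J JavaList/JavaArray proxy.

    Assumption: the real proxies can be iterated and indexed from Python,
    which is what makes list(java_obj) possible.
    """

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]

    def __iter__(self):
        return iter(self._items)


def convert_via_pickle(java_obj):
    """Sketch of the old path: serialize to bytes, then unpickle in Python."""
    # pickle.dumps stands in for SerDe.dumps() on the JVM side (assumption).
    payload = pickle.dumps(list(java_obj))
    # pickle.loads stands in for PickleSerializer on the Python side.
    return pickle.loads(payload)


def convert_directly(java_obj):
    """Sketch of the new path: plain type conversion, no byte round trip."""
    return list(java_obj)


if __name__ == "__main__":
    proxy = FakeJavaList([1.0, 2.0, 3.0])
    print(convert_via_pickle(proxy))
    print(convert_directly(proxy))
```

Both functions return the same Python list; the second simply skips the intermediate byte payload, which is the inefficiency the commit message refers to.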
@jkbradley
Member Author

Ping @yinxusen: I realized we should at least backport this to 1.6. I made this PR to make sure tests run. After this is merged, we can backport https://github.com/apache/spark/commit/4db255c7aa756daa224d61905db745b6bccc9173 too.

@jkbradley changed the title from "[ML][PYTHON][SPARK-12834] Change ser/de of JavaArray and JavaList" to "[ML][PYTHON][SPARK-12834][BACKPORT] Change ser/de of JavaArray and JavaList" on Jan 27, 2016
@SparkQA

SparkQA commented Jan 27, 2016

Test build #50172 has finished for PR 10941 at commit 4a621bc.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 27, 2016

Test build #2465 has finished for PR 10941 at commit 4a621bc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yinxusen
Contributor

Sure, LGTM

@jkbradley
Member Author

OK thanks! merging with branch-1.6

asfgit pushed a commit that referenced this pull request Jan 27, 2016
…vaList

Closes #10941 from jkbradley/yinxusen-SPARK-12834-1.6.
@marmbrus
Contributor

marmbrus commented Feb 2, 2016

Can you close this PR? (PRs not against master don't auto close)

@asfgit closed this in 085f510 on Feb 4, 2016
@jkbradley deleted the yinxusen-SPARK-12834-1.6 branch on March 10, 2016 18:21