[SPARK-3971] [MLLib] [PySpark] hotfix: Customized pickler should work in cluster mode #2830

Closed
wants to merge 3 commits into from

davies
Contributor

@davies davies commented Oct 16, 2014

Customized picklers must be registered before unpickling, but on an executor there is no way to register them before the tasks run.

So we need to register the picklers inside the tasks themselves: duplicate javaToPython() and pythonToJava() in MLlib, and call SerDe.initialize() before pickling or unpickling.

@falaki
Contributor

falaki commented Oct 16, 2014

Thanks a lot.

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have started for PR 2830 at commit 0f02050.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have started for PR 2830 at commit 6b94e15.

  • This patch merges cleanly.

@mengxr
Contributor

mengxr commented Oct 16, 2014

LGTM. Thanks for fixing it! Waiting for Jenkins.

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have finished for PR 2830 at commit 0f02050.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21815/

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have started for PR 2830 at commit 0c85fb9.

  • This patch merges cleanly.

@davies
Contributor Author

davies commented Oct 16, 2014

@mengxr @falaki It has passed all the tests; the last two commits are just refactoring. I think it's ready to merge.

@mengxr
Contributor

mengxr commented Oct 16, 2014

Let's wait for the last build to finish.

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have finished for PR 2830 at commit 6b94e15.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21816/

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have finished for PR 2830 at commit 0c85fb9.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21821/

@davies
Contributor Author

davies commented Oct 16, 2014

@mengxr The failed test case is not related to this change.

@mengxr
Contributor

mengxr commented Oct 16, 2014

Yes, that's a known flaky test. I've merged this into master. Thanks!

davies added a commit that referenced this pull request Oct 16, 2014
… in cluster mode

Customized picklers must be registered before unpickling, but on an executor there is no way to register them before the tasks run.

So we need to register the picklers inside the tasks themselves: duplicate javaToPython() and pythonToJava() in MLlib, and call SerDe.initialize() before pickling or unpickling.

Author: Davies Liu <davies.liu@gmail.com>

Closes #2830 from davies/fix_pickle and squashes the following commits:

0c85fb9 [Davies Liu] revert the privacy change
6b94e15 [Davies Liu] use JavaConverters instead of JavaConversions
0f02050 [Davies Liu] hotfix: Customized pickler does not work in cluster
@mengxr
Contributor

mengxr commented Oct 16, 2014

test this please

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have started for PR 2830 at commit 0c85fb9.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Oct 16, 2014

QA tests have finished for PR 2830 at commit 0c85fb9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/21823/

@davies davies closed this Oct 16, 2014