[SPARK-21958][ML] Word2VecModel save: transform data in the cluster #19191

travishegner commented Sep 11, 2017

What changes were proposed in this pull request?

When saving a Word2VecModel, perform the data transformation on distributed data in the cluster instead of on local data in the driver.
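
For readers unfamiliar with the pattern, here is a minimal sketch of the idea rather than the actual diff. It assumes a local Spark session; the `Data` case class, the `SaveSketch` object, the toy vectors, and the output path are all hypothetical names chosen for illustration. The commented-out variant wraps every word vector in a `Data` object on the driver before any data is distributed, while the active variant distributes the raw `(word, vector)` pairs first and applies the same `map` on the executors.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical row type standing in for whatever schema the writer persists.
case class Data(word: String, vector: Array[Float])

object SaveSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a small local session, used only to make the sketch runnable.
    val spark = SparkSession.builder()
      .appName("word2vec-save-sketch")
      .master("local[2]")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-in for the model's word vectors, which start out on the driver.
    val wordVectors: Map[String, Array[Float]] = Map(
      "spark" -> Array(0.1f, 0.2f),
      "scala" -> Array(0.3f, 0.4f))
    val numPartitions = 2
    val dataPath = args.headOption.getOrElse("/tmp/word2vec-save-sketch")

    // Driver-side variant: the map runs locally, so a second full copy of the
    // data (as Data objects) is built in driver memory before distribution.
    // val dataSeq = wordVectors.toSeq.map { case (word, vector) => Data(word, vector) }
    // spark.createDataFrame(dataSeq).repartition(numPartitions).write.parquet(dataPath)

    // Cluster-side variant: distribute the raw pairs, then map on the executors.
    spark.createDataset(wordVectors.toSeq)
      .repartition(numPartitions)
      .map { case (word, vector) => Data(word, vector) }
      .toDF()
      .write
      .mode("overwrite")
      .parquet(dataPath)

    spark.stop()
  }
}
```

In both variants the word vectors originate on the driver; the difference is only where the per-row transformation runs.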

How was this patch tested?

Unit tests for the ML sub-component still pass.
Running this patch against v2.2.0 in a fully distributed production cluster allows a 4.0G model to save and load correctly, where it would not do so without the patch.

MLnick commented Sep 12, 2017

ok to test

SparkQA commented Sep 12, 2017

Test build #81667 has finished for PR 19191 at commit 5f4ce99.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

AmplabJenkins commented

Can one of the admins verify this patch?

MLnick commented Sep 15, 2017

LGTM

MLnick commented Sep 15, 2017

Merged to master. Thanks!

asfgit closed this in 79a4dab on Sep 15, 2017