[SPARK-1743][MLLIB] add loadLibSVMFile and saveAsLibSVMFile to pyspark #672
Conversation
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
python/pyspark/mllib/util.py
Outdated
I believe you should use @param and @return for Epydoc; check pyspark/conf.py for an example. Or have you tried generating the docs with this and seen it work?
Epydoc doesn't work on my Mac. I will try to follow the syntax in conf.py.
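For reference, the Epydoc field syntax the review is pointing at looks roughly like this (a minimal sketch; the signature and parameter descriptions here are illustrative, not the exact util.py code):

```python
def loadLibSVMFile(sc, path, numFeatures=-1, minPartitions=None):
    """
    Load labeled points in the LIBSVM format into an RDD.

    @param sc: Spark context
    @param path: file or directory path on any Hadoop-supported file system
    @param numFeatures: number of features; inferred from the data if -1
    @param minPartitions: minimum number of partitions for the resulting RDD
    @return: an RDD of LabeledPoint
    """
```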
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14749/
Jenkins, retest this please.
Merged build triggered.
Merged build started.
Merged build finished.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14750/
Jenkins, retest this please.
Merged build triggered.
Merged build started.
Merged build finished. All automated tests passed.
All automated tests passed.
Make loading/saving labeled data easier for pyspark users. Also changed type check in `SparseVector` to allow numpy integers.

Author: Xiangrui Meng <meng@databricks.com>

Closes #672 from mengxr/pyspark-mllib-util and squashes the following commits:

2943fa7 [Xiangrui Meng] format docs
d61668d [Xiangrui Meng] add loadLibSVMFile and saveAsLibSVMFile to pyspark

(cherry picked from commit 3188553)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
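A minimal usage sketch of the two new helpers, assuming a local-mode SparkContext and a hypothetical file data.txt in LIBSVM format:

```python
from pyspark import SparkContext
from pyspark.mllib.util import MLUtils

sc = SparkContext("local", "libsvm-demo")

# Each line of a LIBSVM file is "label index1:value1 index2:value2 ...";
# loadLibSVMFile parses it into an RDD of LabeledPoint with sparse features.
examples = MLUtils.loadLibSVMFile(sc, "data.txt")
print(examples.first())

# saveAsLibSVMFile writes the labeled points back out in the same format.
MLUtils.saveAsLibSVMFile(examples, "libsvm_output")
```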
Make loading/saving labeled data easier for pyspark users. Also changed type check in `SparseVector` to allow numpy integers.
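A sketch of what the relaxed SparseVector type check enables: indices produced by numpy (e.g. numpy.int64, as returned by np.nonzero or array slicing) are accepted where previously only plain Python ints passed; the concrete values below are illustrative:

```python
import numpy as np
from pyspark.mllib.linalg import SparseVector

# Indices with a numpy integer dtype, not plain Python ints.
indices = np.array([1, 3])
values = np.array([3.0, 4.0])

sv = SparseVector(4, indices, values)
print(sv)  # (4,[1,3],[3.0,4.0])
```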