[SPARK-13590] [ML] [Doc] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference #12731

Closed
wants to merge 5 commits into from

Conversation

yanboliang
Contributor

@yanboliang yanboliang commented Apr 27, 2016

What changes were proposed in this pull request?

When fitting LinearRegressionModel (with the "l-bfgs" solver) and LogisticRegressionModel without an intercept on a dataset with a constant nonzero column, spark.ml produces the same model as R glmnet but a different one from LIBSVM.

When fitting AFTSurvivalRegressionModel without an intercept on a dataset with a constant nonzero column, spark.ml produces a different model from R survival::survreg.

We should output a warning message and clarify this condition in the documentation.
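
As an illustration of the condition described above, here is a minimal sketch of how such a check could look. It is plain Python rather than Spark code; the function names `constant_nonzero_columns` and `warn_if_constant_nonzero` and the warning text are hypothetical and are not taken from this patch.

```python
import warnings
from typing import List, Sequence


def constant_nonzero_columns(rows: Sequence[Sequence[float]]) -> List[int]:
    """Return indices of columns holding the same nonzero value in every row."""
    if not rows:
        return []
    first = rows[0]
    return [
        j
        for j, value in enumerate(first)
        if value != 0.0 and all(row[j] == value for row in rows)
    ]


def warn_if_constant_nonzero(
    rows: Sequence[Sequence[float]], fit_intercept: bool
) -> List[int]:
    """Warn when fitting without an intercept on data with constant nonzero columns.

    Hypothetical helper: the message below is illustrative, not Spark's wording.
    """
    cols = constant_nonzero_columns(rows)
    if cols and not fit_intercept:
        warnings.warn(
            f"Fitting without an intercept on constant nonzero columns {cols}; "
            "the fitted model may differ from other libraries."
        )
    return cols
```

For example, `constant_nonzero_columns([[1.0, 2.0, 0.0], [1.0, 3.0, 0.0]])` flags only column 0: column 1 varies and column 2 is constant but zero.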

How was this patch tested?

Document change, no unit test.

cc @mengxr

@SparkQA

SparkQA commented Apr 27, 2016

Test build #57113 has finished for PR 12731 at commit 1e01735.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 27, 2016

Test build #57116 has finished for PR 12731 at commit 63ed407.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Member

(This is super minor, but I think cc @mengxr should be removed, because the PR description should explain the PR itself and the names of reviewers are not really related to it. The @ will be removed by the Python merge script when merging, but cc and mengxr will still remain, e.g. 4514aeb. I also remember being told so by one of the committers.)

@SparkQA

SparkQA commented May 3, 2016

Test build #57671 has finished for PR 12731 at commit 63ed407.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -62,6 +62,8 @@ For more background and more details about the implementation, refer to the docu

> The current implementation of logistic regression in `spark.ml` only supports binary classes. Support for multiclass regression will be added in the future.

> When fitting LogisticRegressionModel without intercept on dataset with constant nonzero column, Spark ML produce same model as R glmnet but different from LIBSVM.
Contributor


We should be more explicit since we know the corresponding coefficients are zeros.
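
To illustrate why the coefficient of a constant column can come out as exactly zero, here is a minimal sketch in plain Python. It assumes a solver that scales each feature by its standard deviation (without centering) and skips features whose standard deviation is zero. That mechanism is an assumption made for illustration and is not stated in this thread; the function `fit_no_intercept` is hypothetical and is not Spark's implementation.

```python
import math
from typing import List, Sequence


def fit_no_intercept(
    rows: Sequence[Sequence[float]],
    labels: Sequence[float],
    lr: float = 0.1,
    steps: int = 200,
) -> List[float]:
    """Least squares without intercept, by gradient descent on scaled features.

    Features with zero standard deviation are skipped entirely, so their
    coefficients are never updated and stay at 0.0.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)
    ]
    coef = [0.0] * d  # coefficients in the scaled feature space
    for _ in range(steps):
        grad = [0.0] * d
        for r, y in zip(rows, labels):
            pred = sum(coef[j] * r[j] / stds[j] for j in range(d) if stds[j] != 0.0)
            err = pred - y
            for j in range(d):
                if stds[j] != 0.0:  # constant columns contribute nothing
                    grad[j] += err * r[j] / stds[j] / n
        coef = [c - lr * g for c, g in zip(coef, grad)]
    # Map back to the original feature scale; constant columns get 0.0.
    return [c / s if s != 0.0 else 0.0 for c, s in zip(coef, stds)]


coefficients = fit_no_intercept(
    [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]], [1.0, 2.0, 3.0]
)
```

Here the first column is constant and nonzero, so under the stated assumption its coefficient stays at zero, while the second coefficient converges to the ordinary no-intercept least-squares slope for that column alone.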

@SparkQA

SparkQA commented May 20, 2016

Test build #58994 has finished for PR 12731 at commit 6fb5844.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented May 20, 2016

@yanboliang Please rename "Spark ML" to "MLlib". "Spark ML" is not an official name of the component. Thanks!

@SparkQA

SparkQA commented May 21, 2016

Test build #59062 has finished for PR 12731 at commit ea52bbf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yanboliang
Contributor Author

ping @mengxr

@mengxr
Contributor

mengxr commented Jun 7, 2016

LGTM

asfgit pushed a commit that referenced this pull request Jun 7, 2016
…ession behavior difference

## What changes were proposed in this pull request?
When fitting `LinearRegressionModel` (with the "l-bfgs" solver) and `LogisticRegressionModel` without an intercept on a dataset with a constant nonzero column, spark.ml produces the same model as R glmnet but a different one from LIBSVM.

When fitting `AFTSurvivalRegressionModel` without an intercept on a dataset with a constant nonzero column, spark.ml produces a different model from R survival::survreg.

We should output a warning message and clarify this condition in the documentation.

## How was this patch tested?
Document change, no unit test.

cc mengxr

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #12731 from yanboliang/spark-13590.

(cherry picked from commit 6ecedf3)
Signed-off-by: Yanbo Liang <ybliang8@gmail.com>
@asfgit asfgit closed this in 6ecedf3 Jun 7, 2016
@yanboliang yanboliang deleted the spark-13590 branch June 8, 2016 04:35