[SPARK-13590] [ML] [Doc] Document spark.ml LiR, LoR and AFTSurvivalRegression behavior difference #12731
…with constant nonzero column
Test build #57113 has finished for PR 12731 at commit
Test build #57116 has finished for PR 12731 at commit
(This is a super minor but I think
Test build #57671 has finished for PR 12731 at commit
@@ -62,6 +62,8 @@ For more background and more details about the implementation, refer to the docu

> The current implementation of logistic regression in `spark.ml` only supports binary classes. Support for multiclass regression will be added in the future.

> When fitting LogisticRegressionModel without intercept on dataset with constant nonzero column, Spark ML produce same model as R glmnet but different from LIBSVM.
We should be more explicit since we know the corresponding coefficients are zeros.
Test build #58994 has finished for PR 12731 at commit
@yanboliang Please rename "Spark ML" to "MLlib". "Spark ML" is not an official name of the component. Thanks!
Test build #59062 has finished for PR 12731 at commit
ping @mengxr
LGTM
…ession behavior difference

## What changes were proposed in this pull request?

When fitting `LinearRegressionModel` (by "l-bfgs" solver) and `LogisticRegressionModel` w/o intercept on dataset with constant nonzero column, spark.ml produce same model as R glmnet but different from LIBSVM.

When fitting `AFTSurvivalRegressionModel` w/o intercept on dataset with constant nonzero column, spark.ml produce different model compared with R survival::survreg.

We should output a warning message and clarify in document for this condition.

## How was this patch tested?

Document change, no unit test.

cc mengxr

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #12731 from yanboliang/spark-13590.

(cherry picked from commit 6ecedf3)
Signed-off-by: Yanbo Liang <ybliang8@gmail.com>
What changes were proposed in this pull request?
When fitting `LinearRegressionModel` (with the "l-bfgs" solver) and `LogisticRegressionModel` without an intercept on a dataset with a constant nonzero column, spark.ml produces the same model as R glmnet but a different one from LIBSVM.

When fitting `AFTSurvivalRegressionModel` without an intercept on a dataset with a constant nonzero column, spark.ml produces a different model from R survival::survreg.

We should output a warning message and clarify this condition in the documentation.
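To make the difference concrete, here is a toy sketch in plain Python. It is not Spark, glmnet, or LIBSVM code. It fits least squares without an intercept in two ways: treating the constant column as an ordinary feature, or forcing its coefficient to zero, which follows the review note that the corresponding coefficients are zeros. The helper names `solve` and `fit_no_intercept`, the flag `zero_constant_columns`, and the data are all hypothetical, and the "plain" mode is only an assumption about how a solver that keeps the constant column would behave.

```python
def solve(a, b):
    """Solve the linear system a * w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_no_intercept(rows, y, zero_constant_columns):
    """Least squares through the origin (hypothetical helper, not a Spark API).

    If zero_constant_columns is True, columns whose values never change are
    left out of the fit and reported with a coefficient of 0.0.
    """
    n_features = len(rows[0])
    cols = list(range(n_features))
    if zero_constant_columns:
        cols = [j for j in cols if len({r[j] for r in rows}) > 1]
    # Normal equations restricted to the kept columns: (X^T X) w = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in cols] for i in cols]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in cols]
    coef = [0.0] * n_features
    for j, value in zip(cols, solve(xtx, xty)):
        coef[j] = value
    return coef


# Made-up data: y = 3 + 2 * x, with a second column that is constantly 2.0.
rows = [[1.0, 2.0], [2.0, 2.0], [3.0, 2.0], [4.0, 2.0]]
y = [5.0, 7.0, 9.0, 11.0]

plain = fit_no_intercept(rows, y, zero_constant_columns=False)
zeroed = fit_no_intercept(rows, y, zero_constant_columns=True)
print("constant column kept as a feature:", plain)
print("constant column coefficient zeroed:", zeroed)
```

In the first fit the constant column absorbs the offset and acts like an intercept. In the second, the remaining coefficient has to compensate, so the two models differ even though the data and the "no intercept" setting are the same.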
How was this patch tested?
Document change, no unit test.
cc @mengxr