[SPARK-17138][ML][MLib] Add Python API for multinomial logistic regression #14852

Closed
WeichenXu123 wants to merge 3 commits into apache:master from WeichenXu123:add_MLOR_python

WeichenXu123 (Contributor) commented Aug 28, 2016

What changes were proposed in this pull request?

Add Python API for multinomial logistic regression.

  • Add the family param to the Python API.
  • Expose coefficientMatrix and interceptVector for LogisticRegressionModel.
  • Add a Python-side test case for multinomial logistic regression.
  • Update the Python docs.
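
For readers unfamiliar with the two attributes named above, here is a minimal plain-Python sketch of how a per-class coefficient matrix and intercept vector could turn one feature vector into class probabilities under a softmax model. The helper name `softmax_probabilities` and the toy numbers are illustrative assumptions; this is not code from the patch and not Spark's implementation.

```python
import math


def softmax_probabilities(coefficient_matrix, intercept_vector, features):
    """Return class probabilities for one feature vector.

    coefficient_matrix: one row of weights per class (numClasses x numFeatures).
    intercept_vector: one intercept per class (length numClasses).
    Hypothetical helper for illustration; not Spark's implementation.
    """
    # Margin for class k is dot(row_k, features) + intercept_k.
    margins = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(coefficient_matrix, intercept_vector)
    ]
    # Subtract the largest margin before exponentiating for numerical stability.
    top = max(margins)
    exps = [math.exp(m - top) for m in margins]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 3-class, 2-feature model (made-up numbers).
coefficient_matrix = [[1.0, -1.0], [0.0, 0.5], [-1.0, 1.0]]
intercept_vector = [0.1, 0.0, -0.1]
probs = softmax_probabilities(coefficient_matrix, intercept_vector, [2.0, 1.0])

# The predicted class is the index with the highest probability.
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

The point of the sketch is the shape: a binomial model can be described by a single coefficient vector and a scalar intercept, while a multinomial model needs one row and one intercept per class, which is why matrix and vector accessors are exposed.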

How was this patch tested?

Existing and added doctests.

SparkQA commented Aug 28, 2016

Test build #64543 has finished for PR 14852 at commit 2ddb948.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class MultinomialLogisticRegression(JavaEstimator, HasFeaturesCol, HasLabelCol, HasPredictionCol, HasMaxIter,
    • class MultinomialLogisticRegressionModel(JavaModel, JavaClassificationModel, JavaMLWritable, JavaMLReadable):

sethah (Contributor) commented Sep 20, 2016

Now that #14834 has been merged, we can update the Python API. There is no new interface to implement, but it would be great if this PR could update the Python side to reflect that LOR now supports multiclass classification.

WeichenXu123 changed the title from "[WIP][SPARK-17138][ML][MLib] Add Python API for multinomial logistic regression" to "[SPARK-17138][ML][MLib] Add Python API for multinomial logistic regression" on Sep 21, 2016
SparkQA commented Sep 21, 2016

Test build #65721 has finished for PR 14852 at commit 0e9b423.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

WeichenXu123 (Contributor, Author) commented

cc @sethah @yanboliang thanks!

yanboliang (Contributor) commented

@WeichenXu123 We should also expose coefficientMatrix and interceptVector for LogisticRegressionModel. Please also update the Python docs to cover multinomial logistic (softmax) regression. Thanks.

WeichenXu123 (Contributor, Author) commented

Done. thanks! @yanboliang

SparkQA commented Sep 23, 2016

Test build #65828 has finished for PR 14852 at commit d7b9967.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

sethah (Contributor) left a comment

Looking good, only minor comments. Thanks!

"""
Logistic regression.
- Currently, this class only supports binary classification.
+ This class supports binary and multinomial classification.

"This supports multinomial logistic (softmax) and binomial logistic regression."

return self._call_java("intercept")

@property
@since("2.1.0")

We should document how coefficients and intercept behave in the multinomial case: they throw an error. See the Scaladoc for reference.
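
To make the behavior in this comment concrete, here is a small plain-Python sketch of a model object whose `coefficients` and `intercept` accessors refuse to answer for a multinomial model and point the caller at `coefficientMatrix` and `interceptVector` instead. `ToyLogisticRegressionModel` is a hypothetical stand-in, not the PySpark class; the `ValueError` type, the message wording, and the rule "more than one coefficient row means multinomial" are assumptions made for illustration.

```python
class ToyLogisticRegressionModel:
    """Hypothetical stand-in for a fitted model; not the PySpark class."""

    def __init__(self, coefficient_matrix, intercept_vector):
        # One row of coefficients and one intercept per modeled class.
        self.coefficientMatrix = coefficient_matrix
        self.interceptVector = intercept_vector

    @property
    def _is_multinomial(self):
        # Assumption: a single row of coefficients means a binomial model.
        return len(self.coefficientMatrix) > 1

    @property
    def coefficients(self):
        """Coefficient vector; only defined for binomial models."""
        if self._is_multinomial:
            raise ValueError(
                "Multinomial models contain a matrix of coefficients; "
                "use coefficientMatrix instead."
            )
        return self.coefficientMatrix[0]

    @property
    def intercept(self):
        """Scalar intercept; only defined for binomial models."""
        if self._is_multinomial:
            raise ValueError(
                "Multinomial models contain a vector of intercepts; "
                "use interceptVector instead."
            )
        return self.interceptVector[0]


# Toy models with made-up numbers.
blor_model = ToyLogisticRegressionModel([[5.5]], [-2.68])
mlor_model = ToyLogisticRegressionModel([[0.1], [0.2], [-0.3]], [0.0, 0.1, -0.1])

try:
    mlor_model.coefficients
    multinomial_access_failed = False
except ValueError:
    multinomial_access_failed = True
```

A docstring on each accessor that states this restriction is what the comment asks for; the matrix and vector accessors remain valid for both model kinds in this sketch.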

DenseVector([5.5...])
>>> model.intercept
-2.68...
>>> lrm = LogisticRegression(maxIter=5, regParam=0.01, weightCol="weight",

Let's use mlor/mlorModel as the variable names for the multinomial case and blor/blorModel for the binomial case.

-2.68...
>>> lrm = LogisticRegression(maxIter=5, regParam=0.01, weightCol="weight",
... family="multinomial")
>>> modelm = lrm.fit(df)

It might be nice to fit this to an actual multiclass dataset; that way we can show that it works and produces a model with numClasses > 2.
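
One way this suggestion could look is sketched below, assuming Spark 2.1.0 or later and an existing SparkSession passed in as `spark`. The dataset `ROWS` and the helpers `distinct_labels` and `fit_mlor` are hypothetical names with made-up values; this is not the doctest that was committed in this PR. The PySpark imports are done inside `fit_mlor` so the data check can run without Spark installed.

```python
# Toy three-class data as (label, weight, features). Values are made up.
ROWS = [
    (0.0, 1.0, [0.0, 1.0]),
    (1.0, 1.0, [1.0, 0.0]),
    (2.0, 1.0, [1.0, 1.0]),
    (0.0, 2.0, [0.1, 0.9]),
    (1.0, 2.0, [0.9, 0.2]),
    (2.0, 2.0, [0.8, 1.1]),
]


def distinct_labels(rows):
    """Return the sorted distinct labels, to confirm the data is multiclass."""
    return sorted({label for label, _, _ in rows})


def fit_mlor(spark, rows):
    """Fit a multinomial logistic regression on `rows` with a SparkSession."""
    # Imported lazily so this file can be loaded without PySpark installed.
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.linalg import Vectors

    df = spark.createDataFrame(
        [(label, weight, Vectors.dense(feats)) for label, weight, feats in rows],
        ["label", "weight", "features"],
    )
    mlor = LogisticRegression(maxIter=5, regParam=0.01, weightCol="weight",
                              family="multinomial")
    return mlor.fit(df)
```

With a running session, `mlorModel = fit_mlor(spark, ROWS)` should produce a model whose `numClasses` is 3, and `mlorModel.coefficientMatrix` and `mlorModel.interceptVector` should carry one row or entry per class.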

WeichenXu123 (Contributor, Author) commented

Done. Thanks for the careful review :) @sethah

SparkQA commented Sep 25, 2016

Test build #65885 has finished for PR 14852 at commit c47ac07.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • This class supports multinomial logistic (softmax) and binomial logistic regression.

sethah (Contributor) commented Sep 26, 2016

LGTM. @yanboliang what do you think?

yanboliang (Contributor) commented

LGTM2, merged into master. Thanks! @WeichenXu123 @sethah

asfgit closed this in 7f16aff on Sep 27, 2016
WeichenXu123 deleted the add_MLOR_python branch on April 24, 2019 21:19