New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-21241][MLlib]- Add setIntercept to StreamingLinearRegressionWi… #18457
Conversation
…thSGD in Pyspark. StreamingLinearRegressionWithSGD class in PySpark is missing the setIntercept Method which offers the possibility to turn on/off the intercept value. API parity is not respected between Python and Scala. We add the setIntercept Method to StreamingLinearRegressionWithSGD class which calls setIntercept Method in LinearRegressionModel class in order to turn on/off the intercept. A big thanks to Matthieu CANEILL for his precious help in solving the issue. This patch was tested by running all tests with ./dev/run-tests and by manual tests.
So we're only really doing bug fixes on the old MLlib stuff, but I guess we haven't finalized the new ML streaming stuff so I'm not sure what we want to do here. cc @MLnick . |
In the meantime Jenkins OK to test. |
Jenkins, test this plase. |
err Jenkins test this please. |
Test build #83986 has finished for PR 18457 at commit
|
ping @SoulGuedria. Seems we should fix the python style. |
Yes i will do it very soon :) Thanks |
So MLlib is even further into maintenance mode, do we want to take improvement parity patches like these? cc @MLnick @jkbradley @HyukjinKwon ? I'm leaning towards no and focusing on ML but it's a mild lean. |
cc @HyukjinKwon can you take a look at this please. Thanks :) |
No, we should not merge this. The Model class should not let you set the intercept. This doesn't exist in Scala either. The algorithm implementation does though. |
Sounds like we're not going to change this @SoulGuedria but we'd love your contributions in Spark ML where things are actively being developed. |
Would you be OK closing this PR @SoulGuedria |
Can one of the admins verify this patch? |
…thSGD in Pyspark.
What changes were proposed in this pull request?
StreamingLinearRegressionWithSGD class in PySpark is missing the setIntercept Method which offers the possibility to turn on/off the intercept value. API parity is not respected between Python and Scala. We added the setIntercept Method to StreamingLinearRegressionWithSGD class which calls setIntercept Method in LinearRegressionModel class in order to turn on/off the intercept. A big thanks to Matthieu CANEILL for his precious help in solving this issue.
How was this patch tested?
This patch was tested by running all tests with ./dev/run-tests.