Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-13686][MLLIB][STREAMING] Add a constructor parameter reqParam to (Streaming)LinearRegressionWithSGD #11527

Closed
wants to merge 1 commit into from

Conversation

dongjoon-hyun
Copy link
Member

What changes were proposed in this pull request?

LinearRegressionWithSGD and StreamingLinearRegressionWithSGD does not have regParam as their constructor arguments. They just depends on GradientDescent's default reqParam values.
To be consistent with other algorithms, we had better add them. The same default value is used.

How was this patch tested?

Pass the existing unit test.

@@ -141,7 +143,7 @@ object LinearRegressionWithSGD {
stepSize: Double,
miniBatchFraction: Double,
initialWeights: Vector): LinearRegressionModel = {
new LinearRegressionWithSGD(stepSize, numIterations, miniBatchFraction)
new LinearRegressionWithSGD(stepSize, numIterations, 0.0, miniBatchFraction)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same way with LogisticRegressionWithSGD. The signature of train function is not changed.

@srowen
Copy link
Member

srowen commented Mar 5, 2016

I think that's good, but would like @jkbradley or @mengxr to approve

@srowen
Copy link
Member

srowen commented Mar 5, 2016

Jenkins test this please

@SparkQA
Copy link

SparkQA commented Mar 5, 2016

Test build #52514 has finished for PR 11527 at commit 4dd34c4.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @srowen .
I want to fix MiMa failures, but ./dev/mima passes locally whenever I run it locally.
So far, I didn't touch project/MimaExcludes.scala yet.
How can I check MiMa properly?

Jenkins, retest this please.

@dongjoon-hyun
Copy link
Member Author

Oh, sorry. I found that. Thanks. :)

@dongjoon-hyun
Copy link
Member Author

retest this please

2 similar comments
@dongjoon-hyun
Copy link
Member Author

retest this please

@dongjoon-hyun
Copy link
Member Author

retest this please

@srowen
Copy link
Member

srowen commented Mar 6, 2016

Jenkins add to whitelist

@srowen
Copy link
Member

srowen commented Mar 6, 2016

Jenkins test this please

@dongjoon-hyun
Copy link
Member Author

Thank you, @srowen !

@SparkQA
Copy link

SparkQA commented Mar 6, 2016

Test build #52530 has finished for PR 11527 at commit 92be84f.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member Author

Hi, @mengxr and @jkbradley .
Could you review this PR, please?

@dongjoon-hyun
Copy link
Member Author

Hi, @mengxr .
Could you review this PR, please?
After #11519 , I found that the missing reqParam argument in some APIs.

@dongjoon-hyun
Copy link
Member Author

Rebased.

@SparkQA
Copy link

SparkQA commented Mar 11, 2016

Test build #52870 has finished for PR 11527 at commit 2b1b8a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Mar 13, 2016

@mengxr @jkbradley was the intent that you just set the reg param on the optimizer object? or do you agree this should be exposed as a setter? I am not sure enough to know this should be added.

@dongjoon-hyun
Copy link
Member Author

Right, as you said, we can use optimizer object directly. I thought that's the main reason why this PR doesn't get any response from ML part committers.

At the first time, what I focused was the consistency of API among algorithms. Other algorithms having reqParam provides in this manner and also maintains its own reqParam value . You may remember the list of reqParam values I made before. So, I want to make it complete by adding two missing part.

But, what I'm not sure here is Spark MLLib's direction. As you said, the use of optimizer is more recommended way now and in the future. I think this PR should be closed. If I can get some clear advice for this before closing this PR, I'll really be happy. :)

@SparkQA
Copy link

SparkQA commented Mar 13, 2016

Test build #53030 has finished for PR 11527 at commit 6b6a11f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

…` to (Streaming)LinearRegressionWithSGD

`LinearRegressionWithSGD` and `StreamingLinearRegressionWithSGD` does not have `regParam` as their constructor arguments. They just depends on GradientDescent's default reqParam values.
To be consistent with other algorithms, we had better add them.
@SparkQA
Copy link

SparkQA commented Mar 13, 2016

Test build #53031 has finished for PR 11527 at commit f0deb4a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@mengxr
Copy link
Contributor

mengxr commented Mar 14, 2016

We discussed this during the pipeline API design and agreed that we shouldn't separate model hyper-parameters from training parameters. So this is the right place to set regParam. I'm going to merge this PR.

@@ -318,6 +318,9 @@ object MimaExcludes {
ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.mllib.evaluation.MultilabelMetrics.this"),
ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.classification.LogisticRegressionSummary.predictions"),
ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.ml.classification.LogisticRegressionSummary.predictions")
) ++ Seq(
// [SPARK-13686][MLLIB][STREAMING] Add a constructor parameter `reqParam` to (Streaming)LinearRegressionWithSGD
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: reg instead of req.

@mengxr
Copy link
Contributor

mengxr commented Mar 14, 2016

Merged into master. Thanks! Btw, it should be regParam instead of reqParam. It only appears in the PR title and MiMa excludes. So I thought it is a minor issue and merged this PR directly.

@dongjoon-hyun
Copy link
Member Author

Oh, thank you so much for your attention/reviewing/merging, @mengxr !

@asfgit asfgit closed this in a48296f Mar 14, 2016
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Mar 17, 2016
…` to (Streaming)LinearRegressionWithSGD

## What changes were proposed in this pull request?

`LinearRegressionWithSGD` and `StreamingLinearRegressionWithSGD` does not have `regParam` as their constructor arguments. They just depends on GradientDescent's default reqParam values.
To be consistent with other algorithms, we had better add them. The same default value is used.

## How was this patch tested?

Pass the existing unit test.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#11527 from dongjoon-hyun/SPARK-13686.
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
…` to (Streaming)LinearRegressionWithSGD

## What changes were proposed in this pull request?

`LinearRegressionWithSGD` and `StreamingLinearRegressionWithSGD` does not have `regParam` as their constructor arguments. They just depends on GradientDescent's default reqParam values.
To be consistent with other algorithms, we had better add them. The same default value is used.

## How was this patch tested?

Pass the existing unit test.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes apache#11527 from dongjoon-hyun/SPARK-13686.
@dongjoon-hyun dongjoon-hyun deleted the SPARK-13686 branch March 29, 2016 05:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants