[SPARK-9005][MLlib]Fix RegressionMetrics computation of explainedVariance #7361

feynmanliang
Contributor

Fixes the implementation of explainedVariance and r2 so that they are consistent with their definitions, as described in SPARK-9005.

@SparkQA

SparkQA commented Jul 13, 2015

Test build #37104 has finished for PR 7361 at commit 4c4e56f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 13, 2015

Test build #37106 has finished for PR 7361 at commit bde9761.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 13, 2015

Test build #37109 has finished for PR 7361 at commit db8605a.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jul 13, 2015

Test build #37118 has finished for PR 7361 at commit 08a0e1b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val yMean = summary.mean(0)
predictionAndObservations.map {
  case (prediction, _) => math.pow(prediction - yMean, 2)
}.reduce(_ + _)
Member

.sum?

Contributor Author

Thanks!
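
For readers following the `.sum?` suggestion above, here is a minimal sketch of the two spellings side by side. It uses a plain Scala `Seq` as a hypothetical stand-in for the RDD in the patch. The sample pairs and the object name `SumVersusReduceSketch` are made up for illustration, and `yMean` is assumed to be the mean of the observations, as the variable name in the reviewed snippet suggests.

```scala
object SumVersusReduceSketch {
  // Hypothetical (prediction, observation) pairs standing in for the RDD.
  val predictionAndObservations: Seq[(Double, Double)] =
    Seq((2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0))

  // Mean of the observations (assumed role of summary.mean(0) in the patch).
  val yMean: Double =
    predictionAndObservations.map(_._2).sum / predictionAndObservations.size

  // Form under review: add up the squared deviations with an explicit reduce.
  val ssRegReduce: Double = predictionAndObservations.map {
    case (prediction, _) => math.pow(prediction - yMean, 2)
  }.reduce(_ + _)

  // Suggested form: the same squared deviations, added up with sum.
  val ssRegSum: Double = predictionAndObservations.map {
    case (prediction, _) => math.pow(prediction - yMean, 2)
  }.sum

  def main(args: Array[String]): Unit =
    println(s"reduce: $ssRegReduce, sum: $ssRegSum")
}
```

On a non-empty collection both forms add the same terms. The practical difference is the empty case: `reduce` throws on an empty Scala collection, while `sum` returns 0.0. To my understanding Spark's RDD API behaves the same way (`reduce` fails on an empty RDD, `sum` on an `RDD[Double]` returns 0.0), but that is worth checking against the Spark version in use.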

@SparkQA

SparkQA commented Jul 13, 2015

Test build #37140 has finished for PR 7361 at commit 1a3d098.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@feynmanliang
Contributor Author

jenkins retest this please

@SparkQA

SparkQA commented Jul 13, 2015

Test build #37149 has finished for PR 7361 at commit 1a3d098.

  • This patch fails Python style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

* explainedVariance = 1 - variance(y - \hat{y}) / variance(y)
* Reference: [[http://en.wikipedia.org/wiki/Explained_variation]]
* Returns the variance explained by regression.
* @see [[https://en.wikipedia.org/wiki/Fraction_of_variance_unexplained]]
Member

Could you put the formula here please?
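
A sketch of the kind of formula being asked for, under an assumption this thread does not confirm: that explainedVariance is meant as the mean squared deviation of the predictions from the mean observation. That reading matches the `math.pow(prediction - yMean, 2)` sum in the snippet reviewed earlier. Here y_i are the observations, \hat{y}_i the predictions, \bar{y} the mean of the observations, and n the number of pairs. The label SS_reg is introduced here for illustration only.

```latex
\mathrm{SS}_{\mathrm{reg}} = \sum_{i=1}^{n} \left(\hat{y}_i - \bar{y}\right)^2,
\qquad
\mathrm{explainedVariance} = \frac{\mathrm{SS}_{\mathrm{reg}}}{n}
```

Under this reading the result is a variance-like quantity in squared units of y. The doc line quoted above, `1 - variance(y - \hat{y}) / variance(y)`, is instead a unitless ratio.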

@jkbradley
Member

LGTM pending tests (except for missing doc)

@SparkQA

SparkQA commented Jul 14, 2015

Test build #37174 has finished for PR 7361 at commit f1112fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Member

srowen commented Jul 15, 2015

LGTM

@SparkQA

SparkQA commented Jul 15, 2015

Test build #1070 has finished for PR 7361 at commit f1112fc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jkbradley
Member

@feynmanliang Could you please modify the title and description so they just say that explainedVariance is being fixed? (r2 is not changing, right?)

@feynmanliang changed the title from "[SPARK-9005][MLlib]Fix RegressionMetrics computation of explainedVariance and r2" to "[SPARK-9005][MLlib]Fix RegressionMetrics computation of explainedVariance" on Jul 15, 2015
@jkbradley
Member

Merging with master
Thanks!

@asfgit closed this in 536533c on Jul 15, 2015
@feynmanliang deleted the SPARK-9005-RegressionMetrics-bugs branch on July 19, 2015 07:09