Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-7279] Removed diffSum which is theoretical zero in LinearRegression and coding formating #5809

Closed
wants to merge 2 commits into from

Conversation

dbtsai
Copy link
Member

@dbtsai dbtsai commented Apr 30, 2015

No description provided.

@dbtsai
Copy link
Member Author

dbtsai commented Apr 30, 2015

Jenkins, please test again.

@mengxr
Copy link
Contributor

mengxr commented Apr 30, 2015

test this please

@@ -350,7 +345,6 @@ private class LeastSquaresAggregator(
}
}
lossSum += diff * diff / 2.0
diffSum += diff
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Basically you're saying this isn't read at all, which seems correct. I don't know if it's 0? (Haven't thought about it.) The rest looks like style changes which seems OK, but why remove the Array type, but keep the Long type, when you are making the literal a long explicitly?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I started to implement this code, I thought diffSum will not be zero. @mengxr caught that I don't add the correction terms back to the gradient correctly, but I still get the right models compared with R that made me go through the math again, and I found diffSum is always zero theoretically. Here is the proof added in the comment
15995c8

Array type is obvious since it's Array.ofDim. I did consider to remove the Long type, but in the end, I worry about someone can mistakenly remove L from 0L, so I decided to keep it.

@SparkQA
Copy link

SparkQA commented Apr 30, 2015

Test build #31455 has finished for PR 5809 at commit 03511c7.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@SparkQA
Copy link

SparkQA commented Apr 30, 2015

Test build #31456 has finished for PR 5809 at commit 6904eed.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
  • This patch does not change any dependencies.

@mengxr
Copy link
Contributor

mengxr commented Apr 30, 2015

LGTM. Merged into master. Thanks!

@asfgit asfgit closed this in 1c3e402 Apr 30, 2015
@dbtsai dbtsai deleted the format branch April 30, 2015 23:57
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
…ssion and coding formating

Author: DB Tsai <dbt@netflix.com>

Closes apache#5809 from dbtsai/format and squashes the following commits:

6904eed [DB Tsai] triger jenkins
9146e19 [DB Tsai] initial commit
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
…ssion and coding formating

Author: DB Tsai <dbt@netflix.com>

Closes apache#5809 from dbtsai/format and squashes the following commits:

6904eed [DB Tsai] triger jenkins
9146e19 [DB Tsai] initial commit
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015
…ssion and coding formating

Author: DB Tsai <dbt@netflix.com>

Closes apache#5809 from dbtsai/format and squashes the following commits:

6904eed [DB Tsai] triger jenkins
9146e19 [DB Tsai] initial commit
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants