SPARK-2748 [MLLIB] [GRAPHX] Loss of precision for small arguments to Math.exp, Math.log #1659

Closed
wants to merge 1 commit into from

Conversation

srowen
Member

@srowen srowen commented Jul 30, 2014

In a few places in MLlib, an expression of the form log(1.0 + p) is evaluated. When p is so small that 1.0 + p == 1.0 in floating point, the result is 0.0. However, the correct answer is very near p, since log(1 + p) ≈ p for small p. This is why Math.log1p exists.

The same applies to one instance of exp(m) - 1 in GraphX, for which there is a dedicated Math.expm1 method.

Although the errors occur only for very small arguments, such arguments are entirely possible given how these expressions are used in machine learning algorithms.
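To make the precision loss concrete, here is a minimal, self-contained sketch in plain Java. It is not the code changed by this PR: the class name `PrecisionDemo` and the helpers `naiveLog1p`, `naiveExpm1`, and `relErr` are hypothetical and exist only for illustration. The only library calls assumed are the standard `java.lang.Math` methods `log`, `exp`, `log1p`, and `expm1`.

```java
// Illustrative sketch only; names below are hypothetical, not from Spark.
final class PrecisionDemo {

    // Naive form: 1.0 + p rounds to exactly 1.0 once p is far below
    // double-precision machine epsilon, so the information in p is lost.
    static double naiveLog1p(double p) {
        return Math.log(1.0 + p);
    }

    // Naive form: exp(m) is within rounding of 1.0 for tiny m, so the
    // subtraction cancels almost every significant digit.
    static double naiveExpm1(double m) {
        return Math.exp(m) - 1.0;
    }

    // Relative error of an approximation against a reference value.
    static double relErr(double approx, double reference) {
        return Math.abs(approx - reference) / Math.abs(reference);
    }

    public static void main(String[] args) {
        double tiny = 1e-20;

        // For tiny x, log(1 + x) and exp(x) - 1 both agree with x to far
        // more digits than a double holds, so x serves as the reference.
        System.out.println("naive log rel. error: " + relErr(naiveLog1p(tiny), tiny));
        System.out.println("log1p     rel. error: " + relErr(Math.log1p(tiny), tiny));
        System.out.println("naive exp rel. error: " + relErr(naiveExpm1(tiny), tiny));
        System.out.println("expm1     rel. error: " + relErr(Math.expm1(tiny), tiny));
    }
}
```

The naive forms lose essentially all significant digits for an argument of this size, while `Math.log1p` and `Math.expm1` return a value very close to the argument itself.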

Also note the related PR for Python: #1652

@SparkQA

SparkQA commented Jul 30, 2014

QA tests have started for PR 1659. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17444/consoleFull

@SparkQA

SparkQA commented Jul 30, 2014

QA results for PR 1659:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17444/consoleFull

@mengxr
Contributor

mengxr commented Jul 30, 2014

LGTM. Merged into master. Thanks!!

@asfgit asfgit closed this in ee07541 Jul 30, 2014
@srowen srowen deleted the SPARK-2748 branch July 30, 2014 16:45
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
…Math.exp, Math.log

Author: Sean Owen <srowen@gmail.com>

Closes apache#1659 from srowen/SPARK-2748 and squashes the following commits:

c5926d4 [Sean Owen] Use log1p, expm1 for better precision for tiny arguments