SPARK-2748 [MLLIB] [GRAPHX] Loss of precision for small arguments to Math.exp, Math.log #1659

srowen · 2014-07-30T11:40:33Z

In a few places in MLlib, an expression of the form log(1.0 + p) is evaluated. When p is so small that 1.0 + p == 1.0, the result is 0.0. However, the correct answer is very nearly p. This is why Math.log1p exists.
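To make the log case concrete, here is a minimal sketch in plain Java. The class and method names (Log1pDemo, naiveLog1p, preciseLog1p) are hypothetical and are not taken from the MLlib code touched by this patch; only Math.log and Math.log1p are real JDK calls.

```java
// Hypothetical illustration only; not the MLlib code changed in this PR.
final class Log1pDemo {

    // Naive form: 1.0 + p rounds to exactly 1.0 when |p| is below roughly 1.1e-16,
    // so the information in p is lost before the logarithm is taken.
    static double naiveLog1p(double p) {
        return Math.log(1.0 + p);
    }

    // Math.log1p evaluates log(1 + p) without first forming the sum 1.0 + p,
    // so it stays accurate for tiny p.
    static double preciseLog1p(double p) {
        return Math.log1p(p);
    }

    public static void main(String[] args) {
        double p = 1e-20; // small enough that 1.0 + p == 1.0 in double precision
        System.out.println("log(1.0 + p) = " + naiveLog1p(p));
        System.out.println("log1p(p)     = " + preciseLog1p(p));
    }
}
```

For p = 1e-20 the naive form loses p entirely, while log1p returns a value very close to p.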

The same applies to one instance of exp(m) - 1 in GraphX, for which there is a dedicated Math.expm1 method.
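The exp case can be sketched the same way. Again, the names (Expm1Demo, naiveExpm1, preciseExpm1) are hypothetical and do not come from the GraphX code in this patch; only Math.exp and Math.expm1 are real JDK calls.

```java
// Hypothetical illustration only; not the GraphX code changed in this PR.
final class Expm1Demo {

    // Naive form: for tiny m, exp(m) rounds to (or within one ulp of) 1.0,
    // so the subtraction cancels almost every significant digit of the result.
    static double naiveExpm1(double m) {
        return Math.exp(m) - 1.0;
    }

    // Math.expm1 evaluates exp(m) - 1 directly and avoids that cancellation.
    static double preciseExpm1(double m) {
        return Math.expm1(m);
    }

    public static void main(String[] args) {
        double m = 1e-20; // far below the spacing of doubles near 1.0
        System.out.println("exp(m) - 1 = " + naiveExpm1(m));
        System.out.println("expm1(m)   = " + preciseExpm1(m));
    }
}
```

For m = 1e-20 the true value of exp(m) - 1 is very close to m, which expm1 preserves and the naive form does not.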

The errors occur only for very small arguments, but given how these expressions are used in machine learning algorithms, such arguments are entirely possible.

Also note the related PR for Python: #1652

SparkQA · 2014-07-30T11:44:00Z

QA tests have started for PR 1659. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17444/consoleFull

SparkQA · 2014-07-30T12:34:24Z

QA results for PR 1659:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17444/consoleFull

mengxr · 2014-07-30T15:55:26Z

LGTM. Merged into master. Thanks!!

…Math.exp, Math.log

In a few places in MLlib, an expression of the form `log(1.0 + p)` is evaluated. When p is so small that `1.0 + p == 1.0`, the result is 0.0. However the correct answer is very near `p`. This is why `Math.log1p` exists.

Similarly for one instance of `exp(m) - 1` in GraphX; there's a special `Math.expm1` method.

While the errors occur only for very small arguments, given their use in machine learning algorithms, this is entirely possible.

Also note the related PR for Python: apache#1652

Author: Sean Owen <srowen@gmail.com>

Closes apache#1659 from srowen/SPARK-2748 and squashes the following commits:

c5926d4 [Sean Owen] Use log1p, expm1 for better precision for tiny arguments

Use log1p, expm1 for better precision for tiny arguments

c5926d4

srowen mentioned this pull request Jul 30, 2014

Avoid numerical instability #1652

Closed

asfgit closed this in ee07541 Jul 30, 2014

srowen deleted the SPARK-2748 branch July 30, 2014 16:45
