Avoid numerical instability #1652
This avoids basically doing 1 - 1, for example:

    >>> from math import exp
    >>> margin = -40
    >>> 1 - 1 / (1 + exp(margin))
    0.0
    >>> exp(margin) / (1 + exp(margin))
    4.248354255291589e-18
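The cancellation can be reproduced outside the REPL with a short sketch. The function names below are hypothetical and are not taken from the patch; the second function also branches on the sign of `margin`, an extra precaution assumed here so that `exp` never receives a large positive argument.

```python
from math import exp

def naive_complement(margin):
    # 1 - 1/(1 + e^margin): once 1 + e^margin rounds to 1.0,
    # this collapses to 1 - 1 = 0.0.
    return 1 - 1 / (1 + exp(margin))

def stable_sigmoid(margin):
    # Hypothetical helper (not from the patch): algebraically the same
    # quantity, e^margin / (1 + e^margin), with no subtraction of
    # nearly equal numbers.
    if margin >= 0:
        # exp(-margin) <= 1 here, so exp cannot overflow.
        return 1 / (1 + exp(-margin))
    e = exp(margin)
    return e / (1 + e)

print(naive_complement(-40))  # 0.0
print(stable_sigmoid(-40))    # 4.248354255291589e-18
```

Both functions agree to within rounding for moderate margins; they only diverge once `exp(margin)` falls below the spacing of doubles near 1.0.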
Can one of the admins verify this patch?
Y'know, there's a similar issue in
For -40, this gives 0.0, when really it should be about math.exp(-40) = 4.248354255291589e-18, since log(1+x) ~= x for very small x. This one can be fixed up with
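The expressions this comment refers to are missing from this page, so the following is only a generic illustration of the `log(1+x) ~= x` point, using Python's standard `math.log1p`. The variable names are made up for the example.

```python
from math import exp, log, log1p

x = exp(-40)           # about 4.25e-18, far below the spacing of doubles near 1.0

naive = log(1 + x)     # 1 + x rounds to 1.0, so this is log(1.0)
accurate = log1p(x)    # computes log(1 + x) without forming 1 + x

print(naive)           # 0.0
print(accurate)        # very close to x
```

`log1p` exists precisely for this case: it keeps the information in `x` that is lost as soon as `1 + x` is rounded.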
I'll have a look for other instances beyond the 4 I see and open a JIRA? I could mention this PR too to bring it under one umbrella.
See also https://issues.apache.org/jira/browse/SPARK-2748 and #1659. This could be considered part of SPARK-2748.
Jenkins, add to whitelist.
Jenkins, test this please.
QA tests have started for PR 1652. This patch merges cleanly.
…Math.exp, Math.log

In a few places in MLlib, an expression of the form `log(1.0 + p)` is evaluated. When p is so small that `1.0 + p == 1.0`, the result is 0.0. However the correct answer is very near `p`. This is why `Math.log1p` exists.

Similarly for one instance of `exp(m) - 1` in GraphX; there's a special `Math.expm1` method.

While the errors occur only for very small arguments, given their use in machine learning algorithms, this is entirely possible.

Also note the related PR for Python: #1652

Author: Sean Owen <srowen@gmail.com>

Closes #1659 from srowen/SPARK-2748 and squashes the following commits:

c5926d4 [Sean Owen] Use log1p, expm1 for better precision for tiny arguments
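The `exp(m) - 1` case from the commit message can be illustrated the same way. This sketch uses Python's `math.expm1` as a stand-in for Java's `Math.expm1`; the value of `m` is an arbitrary tiny argument chosen for the example, not one from the GraphX code.

```python
from math import exp, expm1

m = 1e-20              # arbitrary tiny argument for illustration

naive = exp(m) - 1     # exp(m) rounds to 1.0, so the subtraction gives 0.0
accurate = expm1(m)    # computes e^m - 1 directly; approximately m for tiny m

print(naive)           # 0.0
print(accurate)        # very close to m
```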
QA results for PR 1652:
LGTM. Merged into master. Thanks! |
Awesome, thank you! :-) |
This avoids basically doing 1 - 1, for example:

```python
>>> from math import exp
>>> margin = -40
>>> 1 - 1 / (1 + exp(margin))
0.0
>>> exp(margin) / (1 + exp(margin))
4.248354255291589e-18
>>>
```

Author: Naftali Harris <naftaliharris@gmail.com>

Closes apache#1652 from naftaliharris/patch-2 and squashes the following commits:

0d55a9f [Naftali Harris] Avoid numerical instability