
MAINT: Use expm1(x) instead of exp(x) - 1 #15346

Closed
wants to merge 5 commits

Conversation

gaurav1086

>>> np.exp(1e-99) - 1
0.0
>>> np.expm1(1e-99)
1e-99

@eric-wieser
Member

@bashtage, can you comment on stream compatibility here?

@bashtage
Contributor

Good idea in distributions, but legacy should not change unless something isn't compiling. It doesn't really affect the stream except that in very rare cases one might see a different number. I'm not sure that standard exponential would ever produce a value where the precision difference would appear, since there are only roughly 2**53 distinct values possible.

@charris changed the title from "Use expm1(x) instead of exp(x) - 1 for precision" to "MAINT: Use expm1(x) instead of exp(x) - 1" on Jan 19, 2020
@bashtage
Contributor

Are there numerical differences at reasonable sample sizes? As in, are any different values actually generated?

@miccoli
Contributor

miccoli commented Jan 19, 2020

Doesn't really affect the stream except that in very rare cases one might see a different number. I'm not sure that standard exponential would ever produce a value where the precision difference would appear

I do not fully understand what this means, but in double precision, almost 50% of the samples drawn by pareto(a=1.1) would differ by 1 ULP or more, with respect to the old implementation.
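
That estimate can be reproduced with a short script along these lines (a sketch, assuming the Pareto variate is generated as exp(E/a) - 1 from a standard-exponential draw E, which is the transform this PR replaces with expm1(E/a)):

import numpy as np

rng = np.random.default_rng(12345)  # seed chosen arbitrarily
a = 1.1
e = rng.standard_exponential(1_000_000)  # E ~ Exp(1)

old = np.exp(e / a) - 1.0  # current formulation
new = np.expm1(e / a)      # proposed formulation

# Fraction of samples that change by at least 1 ULP (i.e. are not bit-equal).
print(np.mean(old != new))  # miccoli reports this is close to 50% for a = 1.1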

@bashtage
Contributor

I do not fully understand what this means, but in double precision, almost 50% of the samples drawn by pareto(a=1.1) would differ by 1 ULP or more, with respect to the old implementation.

I didn't have any idea of which values make expm1(x) and exp(x) - 1 differ. The example at the top (1e-99) doesn't seem useful for random pareto, although it could matter for large values of a. I did a quick check, and for a=1 the difference is there, so this seems like a good idea to me in distributions (but not for legacy_distributions).
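
A minimal way to see where the two expressions actually disagree (a sketch, just scanning a few magnitudes of x):

import numpy as np

x = np.logspace(-20, 2, 12)
old = np.exp(x) - 1.0
new = np.expm1(x)
rel = np.abs(old - new) / new  # new > 0 for x > 0, so this is safe
for xi, ri in zip(x, rel):
    print(f"x = {xi:9.2e}   relative difference = {ri:.2e}")

# The cancellation in exp(x) - 1 dominates for small x and fades out as x
# grows. For a Pareto draw the argument is x = E / a, so small x corresponds
# to a large shape parameter a or a small exponential variate E.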

@seberg
Member

seberg commented Jan 21, 2020

@gaurav1086 would it be possible to add a test that would notice the loss of numerical precision if we changed things back?

@gaurav1086
Author

@seberg, sure, I think that would be a good idea.

@seberg
Member

seberg commented Jan 28, 2020

@gaurav1086 could you add the small test? Would be nice to finish this up, since it is an obvious improvement. Let me know if you need pointers for doing that.

@gaurav1086
Author

@seberg, sorry for the delay. Will add it today. Thank you.

@gaurav1086
Author

@seberg, added the test in numpy/numpy/random/tests/test_random.py:

def test_pareto_expm1(self):
    assert_(np.random.pareto(1e99) > 0.0)

Currently,

>>> np.random.pareto(1e99)
0.0

After the change, it should be non-zero but of very small magnitude. Please correct me if I am wrong. Thank you.
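
The underflow can be reproduced directly from the transform (a sketch; the exact exponential variate drawn internally will of course differ):

import numpy as np

# pareto draws E ~ Exp(1) and returns exp(E / a) - 1 (legacy) or
# expm1(E / a) (this PR). With a = 1e99 the argument E / a is around 1e-99.
e_over_a = 1.0 / 1e99           # a representative value of E / a
print(np.exp(e_over_a) - 1.0)   # 0.0    -- exp(1e-99) rounds to exactly 1.0
print(np.expm1(e_over_a))       # 1e-99  -- strictly positive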

@seberg
Member

seberg commented Jan 30, 2020

@gaurav1086 thanks, the test actually fails on master. On the upside, it is supposed to fail, since np.random.pareto is the legacy implementation, which you hopefully did not change. You should test rng = np.random.default_rng() (or whatever is commonly used in the tests; it may be a different test file).
Also make sure that the test can never fail spuriously: if there is a chance that 1e99 actually does return 0, please set a seed.
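
Something along these lines would exercise the new code path (a sketch of the suggestion above; the test name, file, and seed are placeholders):

import numpy as np
from numpy.testing import assert_

def test_pareto_expm1():
    # Use the Generator API, which goes through the updated distributions
    # code, rather than the legacy np.random.pareto. The fixed seed keeps
    # the draw deterministic so the assertion cannot fail intermittently.
    rng = np.random.default_rng(1234)
    assert_(rng.pareto(1e99) > 0.0)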

@gaurav1086
Author

@seberg, I changed random_pareto() in distributions.c. The legacy path was a separate change, which I reverted.

@seberg
Member

seberg commented Feb 3, 2020

@gaurav1086 yes, but your test tests the legacy one, so it fails. Can you fix the test?

@mattip
Member

mattip commented Feb 6, 2020

You should try running tests locally before pushing. Your fix does not work:
SeedSequence expects int or sequence of ints for entropy not 1e+99
You can run tests locally with:

python3 -mvenv /tmp/py_to_test
/tmp/py_to_test/bin/python -mpip install -r test_requirements.txt
/tmp/py_to_test/bin/python runtests.py

@gaurav1086 closed this Feb 17, 2020