Numerical issue with random walk for Dirichlet #12

xukai92 · 2019-01-03T01:07:28Z

I've noticed a numeric issue when using this package with random walk algorithm. The code below is the minimal example to produce the bug:

using Distributions, Bijectors
dim = 10
dist = Dirichlet(dim, 2.0)
stepsize = 1e10
# Below gives DomainError with high probability
[logpdf_with_trans(dist, invlink(dist, link(dist, rand(dist)) .+ randn(dim) .* stepsize), true) for _ in 1:1_000]

I don't know if a stepsize of 10 is too high or not but it seems to be possible to have this kind of values during adaptation.

yebai · 2019-01-03T15:54:21Z

stepsize of 10 is quite big actually, did you look at adapted epsilon in HMC?

xukai92 · 2019-01-03T17:17:44Z

Just checked - this happens when the step size is tuned to something like 5.75.

yebai · 2019-01-03T17:24:09Z

I think NUTS adaption needs some further improvement. Before that happens, perhaps we can heuristically upper bound step size, e.g. < 2?

Dirichlet distribution is a very nasty one for transforms, I suspect it'll be hard to make it stable in all cases. If it occasionally throws off the edges, we can try to reject a Hamiltonian simulation.

xukai92 · 2019-01-03T17:38:49Z

I think NUTS adaption needs some further improvement. Before that happens, perhaps we can heuristically upper bound step size, e.g. < 2?

2 still seems to be too high. 1 works.

Dirichlet distribution is a very nasty one for transforms, I suspect it'll be hard to make it stable in all cases. If it occasionally throws off the edges, we can try to reject a Hamiltonian simulation.

Then we need to do a catch-exception which is slow. Alternatively we can make logpdf_with_trans return -Inf by checking terms inside log before evaluation?

yebai · 2019-01-03T17:50:22Z

Then we need to do a catch-exception which is slow. Alternatively, we can make logpdf_with_trans return -Inf by checking terms inside log before evaluation?

Sounds good to me!

mohamed82008 · 2019-01-04T11:10:45Z

We can just do log(max(0, x)) everywhere there is log(x). The output may then be -Inf or NaN if we divide 2 Infs by each other for some reason. So outside we can check the output with isnan and isinf.

yebai · 2019-01-09T22:04:26Z

This sounds good to me. Perhaps give it a try and see whether it can fix TuringLang/Turing.jl#621?

xukai92 · 2019-01-10T14:23:48Z

@yebai I will take a try today

xukai92 · 2019-01-11T02:34:55Z

@mohamed82008 @yebai great it works

mohamed82008 · 2019-01-12T08:22:00Z

Ok the root cause of this error AFAICT is https://github.com/TuringLang/Bijectors.jl/blob/master/src/Bijectors.jl#L221. The StatsFuns.logistic(x) function is way too small even for x = -1e2, and is pretty much 0 for -1e4 and higher. The easiest fix is to offset it by eps. I can try that and see if it replaces the need for the workaround in #13.

mohamed82008 · 2019-01-12T16:24:04Z

Ok, so after trying so many things, I came to a realization that we cannot support all of the Eucledian space with the invlink transformation while keeping the inverse relationship between link and invlink. As of #9 , we have been using ϵ to help stay away from the ugly -Inf and Inf of logit. This inevitably means that we are mapping all valid points in the support of the Dirichlet distribution to a finite subspace of the Eucledian space. So then what happens when you try to walk back from a point very far in the Eucledian space to the support of the distribution? The unthinkable. If that point is far beyond the friendly part of the Eucledian space, the logistic function starts yelling at us by returning 0 and 1. This is alarming, because inverting that with logit will give us the ugly -Inf and Inf back. In order words, one fix to the error in this issue is to truncate the output of logistic to be between ϵ and 1 - ϵ. And we have to truncate, not recenter, if we want to keep the inverse relationship between link and invlink valid for the friendly part of the Eucledian space. If we are willing to make the inverse property of link and invlink approximate, then we can recenter the output of logistic using z = z * (1 - 2ϵ) + ϵ. More directly, and I found that it works better in practice, is to truncate all x[k]s to be between 0 and 1. This condition is a direct consequence of sticking to the friendly part of the Eucledian space, but can be violated when using values outside the friendly region. Excuse my layman terms.

I will make a PR to get your feedback.

Fix numerical error in #12

xukai92 mentioned this issue Jan 3, 2019

Add more tests for single distributions TuringLang/Turing.jl#621

Merged

3 tasks

xukai92 added a commit that referenced this issue Jan 11, 2019

implement mohamed's solution (#12#issuecomment-451416440)

521c9aa

xukai92 mentioned this issue Jan 11, 2019

implement mohamed's solution (#12#issuecomment-451416440) #13

Closed

mohamed82008 mentioned this issue Jan 12, 2019

Fix numerical error in #12 #14

Merged

yebai closed this as completed in ecb7e94 Jan 15, 2019

yebai added a commit that referenced this issue Jan 15, 2019

Merge pull request #14 from TuringLang/mt/fix_12

dbdac12

Fix numerical error in #12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Numerical issue with random walk for Dirichlet #12

Numerical issue with random walk for Dirichlet #12

xukai92 commented Jan 3, 2019

yebai commented Jan 3, 2019

xukai92 commented Jan 3, 2019

yebai commented Jan 3, 2019 •

edited

Loading

xukai92 commented Jan 3, 2019

yebai commented Jan 3, 2019 •

edited

Loading

mohamed82008 commented Jan 4, 2019

yebai commented Jan 9, 2019 •

edited

Loading

xukai92 commented Jan 10, 2019

xukai92 commented Jan 11, 2019

mohamed82008 commented Jan 12, 2019 •

edited

Loading

mohamed82008 commented Jan 12, 2019 •

edited

Loading

Numerical issue with random walk for Dirichlet #12

Numerical issue with random walk for Dirichlet #12

Comments

xukai92 commented Jan 3, 2019

yebai commented Jan 3, 2019

xukai92 commented Jan 3, 2019

yebai commented Jan 3, 2019 • edited Loading

xukai92 commented Jan 3, 2019

yebai commented Jan 3, 2019 • edited Loading

mohamed82008 commented Jan 4, 2019

yebai commented Jan 9, 2019 • edited Loading

xukai92 commented Jan 10, 2019

xukai92 commented Jan 11, 2019

mohamed82008 commented Jan 12, 2019 • edited Loading

mohamed82008 commented Jan 12, 2019 • edited Loading

yebai commented Jan 3, 2019 •

edited

Loading

yebai commented Jan 3, 2019 •

edited

Loading

yebai commented Jan 9, 2019 •

edited

Loading

mohamed82008 commented Jan 12, 2019 •

edited

Loading

mohamed82008 commented Jan 12, 2019 •

edited

Loading