Dual Averaging (with default setting suggested by NUTS paper) doesn't work well in transformed space #182
Comments
@xukai92 Is it possible to isolate the numerical issue for transformed variables first? This would allow us to fix the transform first, i.e.

```julia
r = link(x, Dirichlet(...))        # apply transform
# perform some changes to `r`, e.g. simulating Hamiltonian dynamics on `r`
r2 = ...
x2 = invlink(r2, Dirichlet(...))   # some elements of `x2` are `NaN`
```
Yes. I guess if we solve the numerical problem here then we are fine for the moment. I looked further into the issue and found that the numerical issue only happens to the dual part of dual numbers. Below is the function:

```julia
function invlink(d::SimplexDistribution, y::Vector)
  K = length(y)
  T = typeof(y[1])
  z = [invlogit(y[k] + log(1 / (K - k))) for k in 1:K-1]
  x = Vector{T}(K)
  for k in 1:K-1
    x[k] = (1 - sum(x[1:k-1])) * z[k]
  end
  x[K] = 1 - sum(x[1:K-1])
  x
end
```

My guess is that …
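To see how this stick-breaking inverse transform can produce exact zeros in floating point, here is a Python sketch of the same arithmetic (the function names `invlogit` and `invlink_simplex` are mine; the steps mirror the Julia `invlink` above, with `y` holding the K−1 unconstrained coordinates):

```python
import math

def invlogit(x):
    """Logistic function, computed without overflow for negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For x < 0, use exp(x)/(1+exp(x)); exp(x) underflows to exactly 0.0
    # for very negative x, so the result becomes exactly 0.0.
    e = math.exp(x)
    return e / (1.0 + e)

def invlink_simplex(y):
    """Stick-breaking inverse transform, mirroring the Julia `invlink` above."""
    K = len(y) + 1
    z = [invlogit(y[k] + math.log(1.0 / (K - 1 - k))) for k in range(K - 1)]
    x = []
    for k in range(K - 1):
        x.append((1.0 - sum(x)) * z[k])   # remaining stick times break fraction
    x.append(1.0 - sum(x))                # last entry takes what is left
    return x

# Moderate unconstrained values are fine, but extreme ones underflow:
# some simplex entries become exactly 0, which Dirichlet then rejects.
print(invlink_simplex([-1000.0, -1000.0]))  # -> [0.0, 0.0, 1.0]
```

This is exactly the `[1, 0, 0, 0]`-style degenerate probability vector described later in the thread.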
I further located the problem; it should originate from …
The original cause of this issue is that between `link` and `invlink` the constrained variable loses some precision, which can make the probability vector something like `[1, 0, 0, 0]`. This is actually invalid for `Dirichlet`, because there shouldn't be any 0s in the probability vector of a Dirichlet. When we call …
```julia
using Turing, Distributions
# NOTE: `logsumexp` is assumed to be available here, e.g. from StatsFuns.jl

# Our version of invlogit
invlogit(x) = 1.0 ./ (exp(-x) + 1.0)

# log invlogit
loginvlogit_unstable(x) = log(invlogit(x))

# Numerically stable version of log invlogit
# See e.g. https://lingpipe-blog.com/2012/02/16/howprevent-overflow-underflow-logistic-regression/
loginvlogit(x) = -logsumexp([0, -x])

# julia> loginvlogit_unstable(-750)
# -Inf
# julia> loginvlogit(-750)
# -750.0

function loginvlink(d::Dirichlet, y::Vector)
  K = length(y)
  T = typeof(y[1])
  z = [loginvlogit(y[k] - log(K - k)) for k in 1:K-1]
  x = Vector{T}(K)
  for k in 1:K-1
    x[k] = log(-expm1(logsumexp(x[1:k-1]))) + z[k]
  end
  x[K] = log(-expm1(logsumexp(x[1:K-1])))
  x
end

# julia> loginvlink(Dirichlet([1/3, 1/3, 1/3]), [-1000.0, -1000, -1000])
# 3-element Array{Float64,1}:
#  -1000.69
#  -1000.0
#      0.0
# julia> log(Turing.invlink(Dirichlet([1/3, 1/3, 1/3]), [-1000.0, -1000, -1000]))
# 3-element Array{Float64,1}:
#  -Inf
#  -Inf
#    0.0
```
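The log-sum-exp trick used by the stable `loginvlogit` can be checked in any language. Below is a Python sketch (names `logsumexp`, `loginvlogit`, and `invlogit_naive` are mine; `logsumexp` is hand-rolled rather than imported):

```python
import math

def logsumexp(xs):
    """log(sum(exp(x))) via the standard max-shift trick, stable for extreme inputs."""
    m = max(xs)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(x - m) for x in xs))

def invlogit_naive(x):
    """Naive logistic; underflows to exactly 0.0 for very negative x."""
    e = math.exp(x)
    return e / (1.0 + e)

def loginvlogit(x):
    """-log(1 + exp(-x)), computed as -logsumexp([0, -x]); never hits log(0)."""
    return -logsumexp([0.0, -x])

print(invlogit_naive(-750.0))  # -> 0.0  (so its log would be -inf)
print(loginvlogit(-750.0))     # -> -750.0
```

The key point is that the unstable path computes `exp` first (losing everything below the smallest float) and takes `log` afterwards, while the stable path stays in log space throughout.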
Looks great! But we still need to …
NOTE: The `logpdf` of the Dirichlet distribution from the Distributions package:

```julia
function _logpdf{T<:Real}(d::Dirichlet, x::AbstractVector{T})
    a = d.alpha
    s = 0.
    for i in 1:length(a)
        @inbounds s += (a[i] - 1.0) * log(x[i])
    end
    return s - d.lmnB
end
```
It's OK to take the … We might also need a special version of the …
I see. I guess another reason here is that we have the … If there is a …, then in `s += (a[i] - 1.0) * log(x[i])` we will have …
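The failure mode in that accumulation loop can be reproduced directly. Here is a Python sketch of a single term of the sum (the helper name `dirichlet_logpdf_term` is mine); what `(a - 1) * log(x)` becomes at `x == 0` depends on `a`:

```python
import math

def dirichlet_logpdf_term(a_i, x_i):
    """One term of the Dirichlet log-density sum: (a[i] - 1) * log(x[i]).

    Using IEEE semantics: log(0) is -inf, and 0 * -inf is nan.
    (Python's math.log raises on 0, so we substitute -inf explicitly.)
    """
    log_x = math.log(x_i) if x_i > 0 else float("-inf")
    return (a_i - 1.0) * log_x

print(dirichlet_logpdf_term(2.0, 0.0))  # -inf  (a > 1: density vanishes)
print(dirichlet_logpdf_term(0.5, 0.0))  # inf   (a < 1: density diverges)
print(dirichlet_logpdf_term(1.0, 0.0))  # nan   (0 * -inf)
```

So any exact zero produced by the lossy `invlink` poisons the log density with `±Inf` or `NaN`, which is consistent with the `NaN`s seen in the gradients.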
Closed by PR #199
The default setting for HMC with the Dual Averaging method seems not to work well with constraints. 49b90e9 adds a file, `test/hmcda_nb.jl`, to reproduce the situation.

To see it, the simplest way is to change line 110 of `samplers/hmcda.jl` to a normal print statement

```julia
println("leapfrog for $τ steps with step size $ϵ")
```

and run the test with `julia test/hmcda_nb.jl`. It will then show the step size before each leapfrog loop, i.e. the one before the error. It can be seen that the first adapted step size is always around 2.7, which is controlled by the initial Dual Averaging parameters and the target accept rate of 0.95 (which is already high enough that it should push the step size down).