Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

invlink for simplex distributions #17

Closed
trappmartin opened this issue Feb 22, 2019 · 9 comments
Closed

invlink for simplex distributions #17

trappmartin opened this issue Feb 22, 2019 · 9 comments

Comments

@trappmartin
Copy link
Member

Hi,

I've played around with the invlink function for simplex distributions and checked the correctness of the current implementation against the code used in Stan. (https://github.com/stan-dev/math/blob/502487c511594ccb93eb979df5b8fe163becb417/stan/math/rev/mat/fun/simplex_constrain.hpp)

The reimplementation I used is:

function invlink_simplex(y::AbstractVector{T}) where {T<:Real}
    N = length(y)
    stick = one(T)
    x = zeros(T, length(y))
    for k in 1:(N-1)
        lk = map(T, log(N - k))
        z = logistic(y[k] - lk)
        x[k] = stick * z
        stick -= x[k]
    end
    x[N] = stick
    return x
end

and I found quite a bit of speed improvement if this implementation is used...

K = 1000
y = rand(K)
d = Dirichlet(K, 1.0)
​
@btime invlink_simplex(y);
@btime invlink(d, y);
  77.077 μs (1 allocation: 7.94 KiB)
  369.305 μs (1 allocation: 7.94 KiB)

Should I make a PR for this or do we think there are stability issues with the code?

@cpfiffer
Copy link
Member

cpfiffer commented Feb 22, 2019

It looks to me like it's stable, but Mohamed the Master of Type Stability probably has a far better eye for that than me.

@mohamed82008
Copy link
Member

mohamed82008 commented Feb 22, 2019

Hi @trappmartin !

Your implementation looks type stable but I think we will run into numerical issues when inverting this. The main numerical stability problems were from link, invlink was just modified accordingly to be its inverse. I do think however there is a performance issue in Bijectors. I looked into it, and it seems the @debug sentences are causing some type instability. When removing them, and benchmarking I get:

julia> @btime invlink_simplex($y);
39.384 μs (1 allocation: 7.94 KiB)

julia> @btime invlink($d, $y);
43.395 μs (1 allocation: 7.94 KiB)

I assume the extra time is from the extra epsilon arithmetic that is done. The real difference is probably more when you use @inbounds. So I guess what it boils down to is if we are willing to throw away the inverse property between link and invlink, then we can gain some extra performance with your implementation of invlink.

@mohamed82008
Copy link
Member

I will also make a PR to fix the @debug issues.

@trappmartin
Copy link
Member Author

trappmartin commented Feb 23, 2019

Sounds good. I wasn’t aiming for performance with my implementation but was surprised that the invlink is so much slower.

@trappmartin
Copy link
Member Author

Maybe as a side note, it might be good to have internal functions that do not require to pass a distribution object. This would allow a knowledgeable user to use the transformations without instantiating a Distributions object. Similar to the StatsFuns package.

@trappmartin
Copy link
Member Author

trappmartin commented Feb 23, 2019

One last note. I think the invlink should be at least as fast as my code as I don’t optimise anything and invlink looks rather tuned. Your test shows that even after removing the @debug the invlink is still slower. I guess we should think about improving the code as this gets called frequently during sampling.

@mohamed82008
Copy link
Member

@trappmartin The slowdown is from the epsilons to make invlink a proper inverse of the numerically stable link. Are you proposing making invlink as fast as possible even if it is not a good inverse of the numerically stable link?

@trappmartin
Copy link
Member Author

No, I propose we try to get to code faster while keeping it numerical stable.

@mohamed82008
Copy link
Member

mohamed82008 commented Feb 24, 2019

Done, #20. The main improvement came from replacing log(1/x) with -log(x) thus eliminating an unnecessary division. New benchmarks:

julia> @btime invlink_simplex($y);
39.384 μs (1 allocation: 7.94 KiB)

julia> @btime invlink($d, $y);
39.020 μs (1 allocation: 7.94 KiB)                                                                                                                                                                                                                                                                                          

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants