Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Composed improvements #46

Closed
torfjelde opened this issue Sep 15, 2019 · 6 comments
Closed

Composed improvements #46

torfjelde opened this issue Sep 15, 2019 · 6 comments

Comments

@torfjelde
Copy link
Member

torfjelde commented Sep 15, 2019

I've re-visted some ideas I had earlier on but didn't have quite the knowledge nor time to explore. As a result I think I've come up with some really nice improvements to the Composed bijector.

The changes suggested here already exists on a local branch of mine and tests are passing without issues. So if people seem happy with this I'll make a PR asap.

Nested compositions is bleh

We're currently not doing anything to "flatten" the compositions, e.g.

julia> using Bijectors

julia> b = Bijectors.Exp()
Bijectors.Exp()                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                               
julia> map(typeof, (b  b  b).ts)
(Bijectors.Exp, Composed{Tuple{Bijectors.Exp,Bijectors.Exp}})

Notice that it we get nested compositions here.

Personally I'd like to encourage a flat structure, so we instead get:

julia> map(typeof, (b  b  b).ts)
(Bijectors.Exp, Bijectors.Exp, Bijectors.Exp)

The above was accomplished by simply introducing the following lines:

(b1::Bijector, b2::Bijector) = composel(b2, b1)  # <= already exists
(b1::Composed, b2::Bijector) = composel(b2, b1.ts...)       # new 
(b1::Bijector, b2::Composed) = composel(b2.ts..., b1)       # new
(b1::Composed, b2::Composed) = composel(b2.ts..., b1.ts...) # new

If the user really wants to use nested compositions rather than this flat structure, they can call composel and/or composer explicitly. I struggle to see a case where you'd like to group bijectors together but wouldn't already have access to this grouping outside of the composition but I'd still like to allow both possibilities. This solution does exactly that.

Other than just being "nice", it has two advantages over nesting:

  1. Dispatching on different compositions is easier since you only need to consider Composed{Tuple{B1, B2, B3}} rather than all combinations of nestedness.
  2. Helps with performance

Recursive implementations ain't so good

Currently we're using a recursive implementation for the following methods for Composed:

  • (cb::Composed)(x)
  • forward(cb::Composed, x)
  • logabsdetjac(cb::Composed, x)

For small compositions, this isn't really an issue but still sub-optimal. Especially in forward and logabsdetjac where we need intermediate values to split apart the forward result. This can't be done recursively without memory allocation for these intermediate values.

One alternative, which is further improved by encouraging a flat structure as described earlier, is to unroll the recursion using @generated:

@generated function forward_generated(cb::Composed{T}, x) where {T<:Tuple}
    expr = Expr(:block)
    push!(expr.args, :((y, logjac) = forward(cb.ts[1], x)))
    for i = 2:length(T.parameters)
        push!(expr.args, :(res = forward(cb.ts[$i], y)))
        push!(expr.args, :(y = res.rv))
        push!(expr.args, :(logjac += res.logabsdetjac))
    end
    push!(expr.args, :(return (rv = y, logabsdetjac = logjac)))

    return expr
end

For the code

b = Bijectors.Exp()
forward(b  b  b, randn())

we get the generated code

quote
    (y, logjac) = forward(cb.ts[1], x)
    res = forward(cb.ts[2], y)
    y = res.rv
    logjac += res.logabsdetjac
    res = forward(cb.ts[3], y)
    y = res.rv
    logjac += res.logabsdetjac
    return (rv = y, logabsdetjac = logjac)
end

Comparing this to the current recursive implementation to new forward_generated:

julia> using Revise, Bijectors, BenchmarkTools

julia> using Bijectors: Log, Exp

julia> b = Bijectors.Exp()
Exp()

julia> cb = inv(b)  b
Composed{Tuple{Exp,Log}}((Exp(), Log()))

julia> x = randn()
1.59358111703819

julia> @btime Bijectors.forward($cb, $x)
  28.426 ns (0 allocations: 0 bytes)
(rv = 1.59358111703819, logabsdetjac = 0.0)

julia> @btime Bijectors.forward_gen($cb, $x)
  28.788 ns (0 allocations: 0 bytes)
(rv = 1.59358111703819, logabsdetjac = 0.0)

So not much gained here. But if we consider deeply nested compositions the difference becomes quite singificant:

julia> cb = foldl(, [inv(b)  b for i = 1:50])
Composed{Tuple{Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log,Exp,Log}}((Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log(), Exp(), Log()))

julia> @btime Bijectors.forward($cb, $x)
  431.207 μs (201 allocations: 6.25 KiB)
(rv = 1.59358111703819, logabsdetjac = 0.0)

julia> @btime Bijectors.forward_gen($cb, $x)
  1.934 μs (0 allocations: 0 bytes)
(rv = 1.59358111703819, logabsdetjac = 0.0)

I've also consider @inline for the recursive implementation but for large cases compile-times are quite bad. So I think it's pretty clear which is the winner:)

Neat composition rules for extra neatness

# For extra neat-ness for a lot of bijectors:)
(b1::B, b2::Inversed{B}) where {B<:Bijector} = Identity()
(b1::Inversed{B}, b2::B) where {B<:Bijector} = Identity()

# Can also implement on case-by-case basis where we don't use `Inversed`
macro inverses(B1, B2)
    expr = Expr(:block)
    push!(expr.args, :(Base.:(b1::$B1, b2::$B2) = Identity()))
    push!(expr.args, :(Base.:(b1::$B2, b2::$B1) = Identity()))
    return expr
end

@inverses Log Exp  # super-easy to declare the inverses

# And these
(::Identity, ::Identity) = Identity()
(::Identity, b::Bijector) = b
(b::Bijector, ::Identity) = b

then

julia> using Bijectors: Log, Exp

julia> Log()  Exp()
Bijectors.Identity()

EDIT: forgot to use @btime forward($cb, $x), etc. as @mohamed82008 pointed out.

@torfjelde torfjelde changed the title Composed design Composed improvements Sep 15, 2019
@mohamed82008
Copy link
Member

When you time with @btime use $, e.g. @btime Bijectors.forward($cb, $x)

@xukai92
Copy link
Member

xukai92 commented Sep 15, 2019

Looks great to me.

BTW can someone explain a bit on the benchmark time difference with and without $?

@willtebbutt
Copy link
Member

Nice work @torfjelde . Could you provide some benchmarks for vector-valued transformations? I would imagine that these are where we're most likely to see deeply nested structures, so it would be interesting to know if the performance gains persist here or not.

@torfjelde
Copy link
Member Author

It does but actual transformation-cost of course reduces the difference:

julia> using Bijectors: PlanarLayer

julia> b = PlanarLayer(10);

julia> x = randn(10);

julia> @btime Bijectors.forward($cb, $x)
  2.912 μs (47 allocations: 4.63 KiB)
(rv = [-0.226818, -5.05146, -1.45391, 2.3923, 2.71439, 1.76895, -0.988026, 1.12392, 3.00605, -2.77136], logabsdetjac = -0.019665352458356412)

julia> @btime Bijectors.forward_gen($cb, $x)
  2.922 μs (47 allocations: 4.63 KiB)
(rv = [-0.226818, -5.05146, -1.45391, 2.3923, 2.71439, 1.76895, -0.988026, 1.12392, 3.00605, -2.77136], logabsdetjac = -0.019665352458356412)

julia> @btime Bijectors.forward_it($cb, $x)
  3.305 μs (49 allocations: 4.73 KiB)
(rv = [-0.226818, -5.05146, -1.45391, 2.3923, 2.71439, 1.76895, -0.988026, 1.12392, 3.00605, -2.77136], logabsdetjac = -0.019665352458356412)

and for deep ones where recursion is no longer inlined:

julia> cb = foldl(, [b for i = 1:50]);

julia> @btime Bijectors.forward($cb, $x)
  116.340 μs (1299 allocations: 128.05 KiB)
(rv = [0.56373, -8.38962, -2.83878, 3.42109, 4.44716, 2.13511, -0.910896, 2.02499, 4.82258, -3.82649], logabsdetjac = -168.86605036964167)

julia> @btime Bijectors.forward_gen($cb, $x)
  70.403 μs (1151 allocations: 114.88 KiB)
(rv = [0.56373, -8.38962, -2.83878, 3.42109, 4.44716, 2.13511, -0.910896, 2.02499, 4.82258, -3.82649], logabsdetjac = -168.86605036964167)

julia> @btime Bijectors.forward_it($cb, $x)
  75.723 μs (1153 allocations: 115.84 KiB)
(rv = [0.56373, -8.38962, -2.83878, 3.42109, 4.44716, 2.13511, -0.910896, 2.02499, 4.82258, -3.82649], logabsdetjac = -168.86605036964167)

A further test, also including an extreme case of Stacked from #36 (which also uses @generated for forward to get type-stability)

julia> using Bijectors, BenchmarkTools

julia> using Bijectors: Logit

julia> D = 100
100

julia> sb = Stacked(tuple([Logit(0.0, 1.0) for i = 1:D]...));

julia> bs = [
           ("PlanarLayer($D)", PlanarLayer(D)),
           ("RadialLayer($D)", RadialLayer(D)),
           ("Stacked{Logit ∘ inv(Logit), $D}", sb  inv(sb))
       ];

julia> x = randn(D);

julia> for (name, b) in bs
           @info "$(name): b ∘ b"
           cb = b  b
           @btime Bijectors.forward($cb, $x)
           @btime Bijectors.forward_gen($cb, $x)
           @btime Bijectors.forward_it($cb, $x)

           N = 50
           @info "$(name): b ∘ b ∘ ... ∘ b ($N times)"
           cb = foldl(, [b for i = 1:N])
           @btime Bijectors.forward($cb, $x)
           @btime Bijectors.forward_gen($cb, $x)
           @btime Bijectors.forward_it($cb, $x)
       end
[ Info: PlanarLayer(100): b  b
  4.746 μs (47 allocations: 14.69 KiB)
  4.758 μs (47 allocations: 14.69 KiB)
  5.091 μs (49 allocations: 14.80 KiB)
[ Info: PlanarLayer(100): b  b  ...  b (50 times)
  162.648 μs (1299 allocations: 379.61 KiB)
  116.781 μs (1151 allocations: 366.44 KiB)
  119.156 μs (1153 allocations: 367.41 KiB)
[ Info: RadialLayer(100): b  b
  3.263 μs (35 allocations: 8.53 KiB)
  3.287 μs (35 allocations: 8.53 KiB)
  3.642 μs (37 allocations: 8.64 KiB)
[ Info: RadialLayer(100): b  b  ...  b (50 times)
  131.128 μs (999 allocations: 225.70 KiB)
  82.406 μs (851 allocations: 212.53 KiB)
  85.703 μs (853 allocations: 213.50 KiB)
[ Info: Stacked{Logit  inv(Logit), 100}: b  b
  69.882 μs (2009 allocations: 194.28 KiB)
  70.363 μs (2009 allocations: 194.28 KiB)
  82.146 μs (2028 allocations: 260.78 KiB)
[ Info: Stacked{Logit  inv(Logit), 100}: b  b  ...  b (50 times)
  9.063 ms (55549 allocations: 35.88 MiB)
  1.830 ms (50201 allocations: 4.74 MiB)
  7.016 ms (50700 allocations: 35.88 MiB)

@torfjelde
Copy link
Member Author

Also, we'll still keep the "simple" iterative implementation. So we have two different ways of constructing a Composed; using a Tuple or Array. Using Tuple gives type-stable implementations using @generated, using Array gives unstable iterative implementations.

torfjelde added a commit to torfjelde/Bijectors.jl that referenced this issue Nov 7, 2019
xukai92 pushed a commit that referenced this issue Nov 28, 2019
* converted types of vals into instances

* added TruncatedBijector as suggested in #40

* improved compositions as discussed in #46

* added a bunch of type-stability testing throughout

* added a bunch of type-stability tests and fixed some instabilities

* fixed type-stability issue with inv for Stacked

* dropped generated inv for Stacked in favour of using map

* added type-stable implementations of inv for Stacked and Composed

* forget to reverse order previously

* improved docstring for Composed

* restructuring of project structure

* forgot a file in last commit

* combined forward and inverse jacobians for ADBijector

* fixed typo thanks to @kai

* fixed Scale custom adjoints and added proper testing for these

* Operations involving Composed with array now returns Composed using array, etc.

* now throwing error when composing stable and unstable Composed
@torfjelde
Copy link
Member Author

Closed by #52

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants