Implementing a custom bijector is a hassle: solve by adding macro? #137

Open
torfjelde opened this issue Sep 12, 2020 · 3 comments

@torfjelde
Member

Currently there are a couple of annoyances when implementing a new Bijector:

  1. Difficult to share implementation between bijectors, mostly because callable types (b::Bijector)(x::T) cannot be implemented for abstract types on Julia <1.3. This means that we have to implement batch-computations on a case-by-case basis, which is both annoying and sometimes difficult to do in an AD-friendly + type-stable way (we have a bunch of mapvcat and eachcolmaphcat methods to do this, which is an unnecessary complication for a newcomer).
     • When we started out with the re-design, we were also considering using a transform(b, x) method as the "evaluation" method, as this would allow us to have more generic implementations for batching, etc. But we decided not to do that, as it also felt clunky.
  2. forward(b::Bijector, x) is supposed to allow the user to share computation between the evaluation, i.e. (b::MyBijector)(x), and logabsdetjac(b, x). Buuut it's annoying to have to first implement (b::MyBijector)(x) and logabsdetjac(b, x), which are mandatory, and then have to go through these methods to figure out what is shared and then copy-paste certain parts to a transform method, etc.

Since we're in Julia, my first idea is of course to throw a macro at the problem! I'm thinking we introduce transform but make it super-easy for the user to define everything in one go. I.e. something along the lines of:

struct MyBijector <: Bijector{0} end

@bijector function forward(b::MyBijector, x::Real)
    # Shared computation
    z = exp(x)
    
    rv = begin
        # `(b::Bijector)(x)` goes here
        z
    end
    logabsdetjac = begin
        # `logabsdetjac(b, x)` goes here
        log(z)
    end
end

which is then transformed into something along the lines of

quote
    function (Bijectors).transform(b::MyBijector, x::Real)
        z = exp(x)
        return z
    end
    (b::MyBijector)(x) = (Bijectors).transform(b::MyBijector, x::Real)
    function (Bijectors).logabsdetjac(b::MyBijector, x::Real)
        z = exp(x)
        return log(z)
    end
    function (Bijectors).forward(b::MyBijector, x::Real)
        z = exp(x)
        rv = z
        logabsdetjac = log(z)
        return (rv = rv, logabsdetjac = logabsdetjac)
    end
end

Then the only thing that is left for the user to implement is the inverse evaluation.
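For readers who want to see how such a rewrite could work, here is a minimal sketch in plain Julia (no MacroTools). It is not the implementation from the gist linked in this issue: the names AbstractBijector, split_body, @bijector_sketch and ExpSketch are hypothetical, and transform, logabsdetjac and forward are local stand-ins rather than the Bijectors.jl functions. It assumes the body contains exactly one `rv = ...` and one `logabsdetjac = ...` assignment at the top level, and treats every other statement as shared computation.

```julia
# Hypothetical stand-ins for the package's type and generic functions.
abstract type AbstractBijector end
function transform end
function logabsdetjac end
function forward end

# Split a function body into shared statements and the two labelled expressions.
function split_body(body::Expr)
    shared, rv, ladj = Any[], nothing, nothing
    for ex in body.args
        if ex isa Expr && ex.head == :(=) && ex.args[1] == :rv
            rv = ex.args[2]
        elseif ex isa Expr && ex.head == :(=) && ex.args[1] == :logabsdetjac
            ladj = ex.args[2]
        else
            push!(shared, ex)  # includes line-number nodes, which are harmless
        end
    end
    return shared, rv, ladj
end

# Generate `transform`, `logabsdetjac` and `forward` from one definition.
macro bijector_sketch(fdef)
    sig, body = fdef.args        # `function name(args...) body end`
    args = sig.args[2:end]       # drop the function name, keep the arguments
    shared, rv, ladj = split_body(body)
    esc(quote
        function transform($(args...))
            $(shared...)
            return $rv
        end
        function logabsdetjac($(args...))
            $(shared...)
            return $ladj
        end
        function forward($(args...))
            $(shared...)
            rv = $rv
            logabsdetjac = $ladj
            return (rv = rv, logabsdetjac = logabsdetjac)
        end
    end)
end

struct ExpSketch <: AbstractBijector end

@bijector_sketch function forward(b::ExpSketch, x::Real)
    z = exp(x)  # shared computation
    rv = begin
        z
    end
    logabsdetjac = begin
        log(z)
    end
end

# Callable defined on the concrete type, so this also works on Julia <1.3.
(b::ExpSketch)(x) = transform(b, x)
```

Note that the shared statements are copied into all three generated methods, so transform still evaluates everything in the shared part even when only rv is needed; only forward actually reuses the work across both results.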

Also, I do have a somewhat "dirty" implementation ready (from which the above output was generated + MacroTools.prettify): https://gist.github.com/torfjelde/8675bba686afdf693476ae1c70f516d3.

This would then allow us to easily transition to transform, thus ensuring compatibility with Julia <1.3 but still using more generic methods, i.e. transform(b::Bijector{0}, x::AbstractVector) = b.(x). It would make it super-easy to share computation in forward. Finally, we could start thinking about adding in complementary in-place methods, e.g. transform!(b::Bijector, x, out), logabsdetjac!(b::Bijector, x, out), etc., as a next step.
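A small sketch of what the generic batching and in-place methods mentioned above could look like. SketchBijector and ExpB are hypothetical stand-ins for the package's types, and the fallback calls transform directly instead of broadcasting b.(x), so it does not depend on callable abstract types.

```julia
# Hypothetical stand-in: N is the dimensionality of a single input.
abstract type SketchBijector{N} end

struct ExpB <: SketchBijector{0} end

# The only method a user would write for a 0-dimensional bijector.
transform(b::ExpB, x::Real) = exp(x)

# Generic batching shared by every 0-dimensional bijector.
transform(b::SketchBijector{0}, x::AbstractVector) = map(xi -> transform(b, xi), x)

# In-place variant that writes into a preallocated `out`.
function transform!(b::SketchBijector{0}, x::AbstractVector, out::AbstractVector)
    map!(xi -> transform(b, xi), out, x)
    return out
end

out = zeros(2)
transform!(ExpB(), [0.0, 0.0], out)
```

Because the batch methods dispatch on the abstract type, a new 0-dimensional bijector only needs the scalar method to get both of them.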

The only question is: are we overcomplicating things here? Is there an easier way of achieving what we want?

@devmotion
Member

IMO a macro seems too complicated and leads to un-julian syntax. I also don't think it is necessarily easier for users to figure out how to write the macro than just implementing the three functions currently. I'm not sure if forward is part of the API or just an implementation detail, and how often it happens that . In any case, I think maybe the following could work:

(f::Bijector)(x) = unthunk(forward(f, x).rv)
logabsdetjac(f::Bijector, x) = unthunk(forward(f, x).logabsdetjac)

(BTW I'm not sure about these names, maybe just make it a tuple or use something else than rv?)

  • Let users (mainly) implement
function Bijectors.forward(f::MyCoolBijector, x)
	....
	return (rv = ..., logabsdetjac = ...)
end

possibly using @thunk
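To make the suggestion concrete, here is a self-contained sketch without thunks. FwdBijector and Scale are hypothetical names, and forward and logabsdetjac are defined locally rather than taken from Bijectors.jl. The callable method on the abstract type requires Julia 1.3 or later.

```julia
# Hypothetical stand-in for the package's abstract type.
abstract type FwdBijector end

# Generic fallbacks: both are derived from `forward`.
(f::FwdBijector)(x) = forward(f, x).rv
logabsdetjac(f::FwdBijector, x) = forward(f, x).logabsdetjac

# A user-defined bijector only implements `forward`.
struct Scale <: FwdBijector
    a::Float64
end

forward(f::Scale, x::Real) = (rv = f.a * x, logabsdetjac = log(abs(f.a)))
```

Without something like @thunk, each fallback computes both fields of forward and discards one of them, which is presumably why lazy evaluation is mentioned as an option.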

@torfjelde
Member Author

IMO a macro seems too complicated and leads to un-julian syntax.

But the "un-Julian syntax" is mostly due to the fact that we're dropping the return statement, right? If so, we could just make the user add it manually, i.e. return (rv = rv, logabsdetjac = logabsdetjac), or make this optional. Other than that, there's not much un-julian about the macro, IMO.

I also don't think it is necessarily easier for users to figure out how to write the macro than just implementing the three functions currently.

That's true but the goal here isn't to make it easier to understand, but easier to go from "I want this bijector" to "I have this bijector". Using a macro we could make it so that there is a minimal amount of work on the user, in addition to getting the most efficient implementation for all the necessary functions. E.g. RadialLayer and PlanarLayer would require waaay less code in addition to being more efficient than the current implementation (the fact that these unnecessarily compute logabsdetjac in the (b::Bijector)(x) method, kind of proves the point that people aren't bothered to implement all the different methods, haha).

I'm not sure if forward is part of the API or just an implementation detail, and how often it happens that .

It's part of the API 👍 And there are cases where it's definitely worth it, e.g. RadialLayer for high-dimensional input. It's not particularly useful for stuff like Exp, but once you start working with 500-dimensional normalizing flows this becomes very important.

Define all methods that apply bijectors to multiple inputs (i.e., vectors, matrices, etc.) generically for all bijectors since we do not support Julia < 1.3 anymore.

I'm potentially for this. But it's worth noting that Bijectors.jl still works for Julia <1.3, it's just that we don't test properly + certain AD-backends don't work. This introduction would completely break Bijectors.jl for Julia <1.3.

Use something like @thunk in ChainRulesCore

You're thinking along the lines of

function _forward(b, x)
    rv = @thunk ...
    logabsdetjac = @thunk ...
    return (rv = rv, logabsdetjac = logabsdetjac)
end

forward(b, x) = unthunk(_forward(b, x))
(b::Bijector)(x) = unthunk(_forward(b, x).rv)
logabsdetjac(b, x) = unthunk(_forward(b, x).logabsdetjac)

right?

I'd argue that this is both a) more complicated for the user to understand, and b) way worse for performance, as closures have comparatively significant overhead.
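For reference, the lazy pattern under discussion can be sketched with a hand-rolled wrapper. Lazy, force, CALLS, expensive_logabsdetjac, evaluate and ladj are hypothetical names used only for illustration; this is not ChainRulesCore's @thunk/unthunk. The sketch only demonstrates the laziness (the unused field is never computed) and does not measure the closure overhead.

```julia
# Hand-rolled lazy value: wraps a zero-argument closure.
struct Lazy{F}
    f::F
end
force(l::Lazy) = l.f()

# Counter used to observe whether the log-Jacobian is actually computed.
const CALLS = Ref(0)
expensive_logabsdetjac(z) = (CALLS[] += 1; log(z))

function _forward(x::Real)
    z = exp(x)  # shared computation, always performed
    return (rv = Lazy(() -> z),
            logabsdetjac = Lazy(() -> expensive_logabsdetjac(z)))
end

evaluate(x) = force(_forward(x).rv)
ladj(x) = force(_forward(x).logabsdetjac)

y = evaluate(0.0)            # does not touch the log-Jacobian
calls_after_eval = CALLS[]
l = ladj(0.0)                # forces the log-Jacobian once
calls_after_ladj = CALLS[]
```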

Add implementations of a reduced logabsdetjac version for and loglikelihood_with_trans (see #120) that avoid the intermediate allocations of arrays in logabsdetjac

I'm in favour of the suggestion, but it seems like a slightly different issue, no?

@torfjelde
Member Author

(BTW I'm not sure about these names, maybe just make it a tuple or use something else than rv?)

Agree, but also a separate issue. We discussed renaming rv to res (#41) but the issue sort of lost its momentum and we ended up getting stuck with it. And I'm not certain about just making it a tuple, since NamedTuple means you can access it both using indexing and .varname.
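As a quick illustration of that last point, a NamedTuple supports field access, positional indexing and destructuring alike (the values here are arbitrary):

```julia
res = (rv = 2.0, logabsdetjac = 0.5)

by_name  = res.rv        # access by field name
by_index = res[1]        # access by position, as with a plain tuple
y, ladj  = res           # destructuring also works positionally
```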
