Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add norm flows #1

Merged
merged 25 commits into from
Aug 22, 2019
Merged

Add norm flows #1

merged 25 commits into from
Aug 22, 2019

Conversation

sharanry
Copy link

@sharanry sharanry commented Jul 23, 2019

@sharanry
Copy link
Author

@torfjelde could you give me push access to this fork?

@torfjelde
Copy link
Owner

Nice work man; looks good!
Maybe it's a good idea to keep the PR for the interface and the PR for the normalizing flows separate? Also, do we actually want to put normalizing flows in Bijectors.jl? Might be better for it to be its own package which depends on the Bijectors.jl package. Seems like a natural approach for this. This also means that Bijectors doesn't have to depend on the entirety of Flux.

What do you think? Want to ask the rest of the team what they think?

Also, is this "finished"? I see there's no implementation of Inversed{<: PlanarFlow}; Is it not necessary to implement the inverse? And if it's finished and you want someone to review the code, feel free to ask me :)

@sharanry
Copy link
Author

sharanry commented Jul 23, 2019

Thanks!

Maybe it's a good idea to keep the PR for the interface and the PR for the normalizing flows separate? Also, do we actually want to put normalizing flows in Bijectors.jl? Might be better for it to be its own package which depends on the Bijectors.jl package. Seems like a natural approach for this. This also means that Bijectors doesn't have to depend on the entirety of Flux.

Not sure about this. Separate repository seems easier to maintain during initial development? @xukai92 what do you think?

Also, is this "finished"? I see there's no implementation of Inversed{<: PlanarFlow}; Is it not necessary to implement the inverse? And if it's finished and you want someone to review the code, feel free to ask me :)

There is no analytical solution to the inverse of planar flows. But we know it exists because we make sure of it with a "maintenance step". We should be able to solve it using an iterative solution. But this is not a priority. And we do not need inverse step for VI (This is a summary of my discussion with @xukai92 regarding this issue)
I think the plan was to review this together with TuringLang#27, hence the PR to the PR.

It would be great if you could review it too.

Here is a demonstration of planar flows, https://gist.github.com/sharanry/79d646fdc2f1677dbf141db649442e85

@xukai92
Copy link

xukai92 commented Jul 23, 2019

Re. "seperate package"

Normalizing flows are nothing but a base distribution + some invertible transformations - the latter of which should be fit into the Bijectors.jl package. I think it's OK to also have a flow type defined in Bijectors.jl for now (e.g. struct Flow base::Distributions transformation::InvertibleTransformation end). I blieve this would make our life easier because doing the development in two packages usually takes more time (on sync stuff, etc).

In the long term, when we come to a stage where we implement more "utility functions" for normalizing flows specifically (e.g. maximum likelihood fit, some popular normalizing flows that stacking specific transfomrations, etc), we could do that in a seprate package and shift some codes there.

Re. "inverse no implemented"

@sharanry is right here. Basically the inverse for this transformation does not have closed form. We could do an interative version but this is not required for this guy to be used in a VI setting. This is simply, consdiering the transformation is doing: u -> z, all the p(z) we need to evaluate in VI are those we transform from u, and we do not need to evaluate arbitary z.

Re. "why this PR is here"

So as this transformation is dependent on @torfjelde's current interface change, it seems to be the best way is to merging this into interface PR. @sharanry's job is to make sure the transformation correctly implemented in the new interface and tested. We will need some test based on ForwardDiff.jl to make sure the logabsdetjacob is correctly implemented.

Does it sound like a plan?

@xukai92
Copy link

xukai92 commented Jul 23, 2019

TODOs

@sharanry

  • add tests
    • Minimal is using ForwardDiff.jl to test your logabsdetjacob implementation. Let me know if you need any help for this.

@torfjelde

  • review the code when the tests are ready and passed
    • Just make sure the interface fit your current PR.

src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
@torfjelde
Copy link
Owner

Re: everything @xukai92 said

Great! Sounds like a plan 👍 Tell me when you want me to review @sharanry, and I'll do that and we'll merge after:)

@sharanry
Copy link
Author

sharanry commented Jul 25, 2019

@torfjelde I think it should be ready now for you to review.

The logabsdetjacob tests on radial flows fails at the moment. I am able to find the root cause of this yet.

@sharanry
Copy link
Author

@xukai92 @torfjelde
I have added an iterative inverse implementation for planar flow. Here is a notebook demonstrating the same. Not sure if it is the fault of the root solver, but it isn't giving a perfect inverse.

Copy link
Owner

@torfjelde torfjelde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice work!

There's a some things we need to fix, but otherwise it's mainly just style-comments 👍 The main thing being a typo in forward for PlanarFlow and the failing test for the RadialFlow (I think it's more than a numerical issue; see my comments).

It would also be awesome if inside the implementations of the transform and inv if you leave a comment specifying the equation number in the paper; makes it so much easier to understand:)

Finally, if you could compare numerically to some other established packages implementing these transforms, that would be awesome. Just take some other package in Python or whatever, run the forward pass on a particular array of values, and then you test your implementation on the same values. If you then put that in a @test, that would be perfect!

src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
test/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
@xukai92
Copy link

xukai92 commented Jul 31, 2019

run the forward pass on a particular array of values, and then you test your implementation on the same values. If you then put that in a @test, that would be perfect!

This sounds like a good test. And the gist @sharanry post is already doing so.

I left a comment on the gist on maybe trying different orders.

src/Bijectors.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
test/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
test/norm_flows.jl Outdated Show resolved Hide resolved
test/norm_flows.jl Outdated Show resolved Hide resolved
test/norm_flows.jl Outdated Show resolved Hide resolved
test/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
src/norm_flows.jl Outdated Show resolved Hide resolved
@xukai92
Copy link

xukai92 commented Aug 19, 2019

@sharanry Please take a look at sharanry#1 on how do I fix the two remaining issues in this PR.

For the transform thing, I define a _transform which is only for internal use. This function returns the intermediate variables as well. transform then is just a wrapper that returns the transformed variable and ignore other variables, and _transform can be used for the computation of logabsdetjac which needs intermediate variables.

For the randomised tests, I added a rtol for the approx. equality check, and also fixed the random seed. This should be fine for now. One can figure out how to make the optimisation for the inverse accurate and stable (apprently some times it doesn't converge even) later.

* add dep and update deps

* update test and fix the randomness

* replace transpose(x) with x'

* use tab to align comments

* replace transpose(x) with x'

* unify transform code for RadialLayer

* unify transform code for PlanarLayer

* improve code style and add comments
@sharanry
Copy link
Author

@torfjelde Could you do a final review and merge it?

Copy link
Owner

@torfjelde torfjelde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once again, really nice work! Just a couple of more points and this will be merged. See in particular my point on the inv functions.

I'd also suggest just merging my latest commit. Those changes does not really affect your branch and I can do this upon merge, so it's optional:)

@@ -162,6 +170,11 @@ function forward(cb::Composed{<:Bijector}, x)
return res
end

function rand(flow::Composed, dims::Integer, shape::Integer=1)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite certain why this is necessary. Is it supposed to be used as the "source distribution" for the transformation? Is there a reason why this cannot be done with a MultivariateTransformed <: Distribution?

Even if we have to keep this in, this is not supported by any Bijector but is specific to flows, right? If so, I'd suggest adding an abstract type Flows and let the normalizing flows subtype from this. This way we could make this method dispatch on Composed{<: AbstractArray{<: Flow}} or something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rand for the transformed distribution is a requirement of Normalised Flows VI.

Wouldn't this in general work for any composition of bijections?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I said it wouldn't work, I was referring to the fact that transform is no longer supported.
But I agree, with the change below, it would work not just for any composition, but for any Bijector I believe.

But I think it makes more sense to make this more general, allowing one to attach any Distribution to a Bijector, composed or not. The MultivariateTransformed is the approach I've taken for this, and it seems to work nicely. Your function would then simply be a Composed bijector attached to a MvNormal.

With your approach you might run into an issue where a function dispatches on d::Distribution, but PlanarFlow() isa Distribution is false. I think what I'm proposing with the transformed distributions is more modular and will "just work" in a lot of packages that uses Distributions.jl. For example, if you look at my PR for Bijectors.jl you can see an example where because of this MultivariateTransformed I could simply plug a transformed distribution into another package I'd implemend ages ago, and it just worked :)

Does this make sense?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we don't need rand in Bijectors.jl

function rand(flow::Composed, dims::Integer, shape::Integer=1)
dims = [dims]
append!(dims, shape)
return transform(flow, randn(dims...))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we keep this in; should this be flow(randn(dims...)) now?

src/interface.jl Outdated
@@ -151,6 +151,14 @@ function transform(cb::Composed{<: Bijector}, x)
return res
end

function inv(cb::Composed{<: Bijector}, y)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sould this be Composed{<: AbstractArray{<: Bijector}}?
Also, I'd suggest just dispatching on Composed if so, as this also allows using Tuple as the container.

EDIT: I see I've probably previously made the same mistake below! This has been fixed in the more recent commits.

src/interface.jl Outdated
function inv(cb::Composed{<: Bijector}, y)
res = y
for b ∈ reverse(cb.ts)
res = inv(b, res)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This inv(b, res) does not exist for other bijectors. Just add something like

inv(b::Bijector, y) = inv(b)(y)

as a default implementation just before this function.

EDIT: See my comment regarding the inv for the flows. If that is true, then we don't even need this function.

return (rv=transformed, logabsdetjac=log_det_jacobian) # from eq(10)
end

function inv(flow::PlanarLayer, y)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we cannot do this using Inverted{<: PlanarLayer}? Seems to me that we could just implement this as

function (ib::Inversed{<: PlanarLayer})(y)
    flow = ib.orig
    # same as in this inv(flow, y) ...
end

return (rv=transformed, logabsdetjac=log_det_jacobian)
end

function inv(flow::RadialLayer, y)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See other comment on inv; same applies here.

@xukai92
Copy link

xukai92 commented Aug 22, 2019

I'd suggest we merge this PR first and @torfjelde can work on fixing the interface part directly in his branch, which would be more efficient.

@torfjelde
Copy link
Owner

torfjelde commented Aug 22, 2019

I'd suggest we merge this PR first and @torfjelde can work on fixing the interface part directly in his branch, which would be more efficient.

Sounds like a good idea 👍 I'll do that then

@torfjelde torfjelde merged commit d903d18 into torfjelde:tor/interface Aug 22, 2019
yebai pushed a commit to TuringLang/Bijectors.jl that referenced this pull request Aug 29, 2019
* initial implementation of new interface

* fixed the testing

* now adheres to the style guide

* cant have type templates on next line...

* made more similar to the interface suggested by @xukai92

* fixed typo in error message

* added composition of Bijectors

* improved composing compositions and added overloading of circ

* composition with its own inverse now results in Identity

* made a constant IdentityBijector

* added Logit bijector and specialization for Beta distribution

* fixed a typo in transform of inv(Logit)

* fixed typo

* added jacobian function for bijectors, with AD implementations

* added jacobian using AD for inverses of ADBijector

* added simple AD-types and removed dep on Turing

* Add norm flows

* minor changes

* fix bugs

* fix more bugs

* fix bugs and follow style guide

* add tests on logabsdetjacob

* fix spaces

* defaults to using Tuple but allow any containers

* fixing style-issues

* add iterative norm for planar flows

* fix minor bug

* adhere to stylecode, add radial inverse, remove tracker dependency, restructure code

* fix radius bug

* fix param dependency

* fix inv() for radial flow; follow style guidelines

* minor change to test

* minor fix

* add ref to paper for each equation

* fix forward and remove redundant

* remove update_u_hat!() requirement

* update tests and ifx bug

* update tests

* implement Bijector call. We can now transform using BijectorName(x)

* Add inv and rand functions for composed bijectors

* Add direct calls for norm flows

* added docstrings and moved away from transform to callables

* Fix the two remaining issues in torfjelde#1 (#1)

* add dep and update deps

* update test and fix the randomness

* replace transpose(x) with x'

* use tab to align comments

* replace transpose(x) with x'

* unify transform code for RadialLayer

* unify transform code for PlanarLayer

* improve code style and add comments

* adapted flows to new interface

* removed random Revise import

* introduced TransformedDistribution and MatrixTransformed as subtype

* fixed typo

* added proper implementation of bijector(d) for UnitDistribution

* added Scale and Shift together with bijector for Truncated

* removed redundant line in logabsdetjac

* fixed a typo and added some more tests

* removed some unecessary commented code

* addded some comments

* added SimplexBijector and forward(flow, ::Vector) now returns vector

* removed left-over commented out code

* updated Manifest

* fixed issue with _transform for RadialLayer when using Tracked

* forgot a sqrt in previous commit

* removed now redundant hack for Dirichlet

* fixed stackoverflow on forward(::RadialLayer, x::AbstractArray) due to typo

* added recursive implementation for forward(cb::Composed, x)

* support for batch computation using forward(b::Bijector, x, logjac)

* edited some comments

* fixed a typo in RadialLayer forward

* replaced forward(b, x, logjac) with fused logjac instead

* initializing recursive forward call using first result

* fixed logabsdetjac of Shift for batch, though it is ambiguous imo

* added test for transform and inverse for univariate transformed

* MvLogMvNormal no longer uses DistributionBijector

* captialized comments

* added my name to a comment

* fixed a typo leading to Kolmogorov being treated as unit-contrained

* added SingularJacobianException to logabsdetjac of ADBijector

* made changes to adhere to style-guide

* mode style-changes

* added a more useful forward(td::TransformedDistribution)

* replaced constant variables log_b and exp_b with Log and Exp

* compose replaced by composel and composer

* fixed sign typo and added AD verification

* added special logpdf_forward for dirichlet in forward

* added docstring to forward(d::Distribution)

* change variable name in forward(d::Distribution)

* increment version number to 0.4.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants