Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: add gradcheck #235

Closed
wants to merge 13 commits into from
Closed

Conversation

Roger-luo
Copy link
Contributor

@Roger-luo Roger-luo commented Jun 18, 2019

This PR provides three functions:

  1. njacobian: calculate the numerical jacobian
  2. jacobian: use Zygote to calculate the analytical jacobian
  3. gradcheck: a torch-like gradcheck it compares the jacobian instead of calculating the summation

TODO:

  • use gradcheck instead of the one in test
  • make some tests for jacobian

@Roger-luo
Copy link
Contributor Author

Roger-luo commented Jun 18, 2019

This PR is implemented based on torch.gradcheck, I'm not sure if we need to check the reentrant of jacobians in Zygote? so I didn't include it here. Anyone has idea on this?

@willtebbutt
Copy link
Member

willtebbutt commented Jun 19, 2019

This is cool. I totally agree that Zygote needs improved testing facilities, but we've already got a similar thing implemented in FDM.jl that we're utilising in ChainRules.jl and Nabla.jl. Is there a reason that Zygote can't just use that?

edit: in particular see here

edit 2: you could be forgiven for not realising this exists as our documentation for this is quite poor it seems. Will try to get this improved.

@Roger-luo
Copy link
Contributor Author

Roger-luo commented Jun 19, 2019

I see. I didn't know there's such implementation in test. However, the jacobian function provides the analytical jacobian calculated by Zygote itself, I think I could use the jacobian function in FDM instead, but it seems the rrule_test only works for ChainRules functions? I think the gradcheck function defined here is not only useful for unit tests for Zygote itself, but also useful for users to test their own gradients/programs which compares a more explicit jacobian matrix directly. We don't have any implementation for jacobians in Zygote at the moment I think.

Edit: Another issue is the numerical jacobian doesn't provide jacobians over multiple variables, e.g

f(2, rand(2, 2)), it requires all the variable to be a vector, which doesn't seems generic enough. The implementation here can deal with arbitrary user defined function by returning each variables' jacobian.

https://github.com/invenia/FDM.jl/blob/574136a76fbc9182fe8e5195b4a17acd686732b0/src/grad.jl#L28

I think maybe I should make PRs to FDM for the numerical part. But I still think it is necessary to provide a gradcheck that check the jacobian of an arbitrary julia function explicitly.

@Roger-luo
Copy link
Contributor Author

What's the plan of using ChainRules for Zygote BTW?

@MikeInnes
Copy link
Member

I think rrule_test is just a usage example for FDM. We can keep the Zygote utilities, which are indeed useful, but should reuse FDM as the numerical gradients engine.

RE chainrules it's in the pipeline but needs someone to bring it in.

@willtebbutt
Copy link
Member

We can keep the Zygote utilities, which are indeed useful, but should reuse FDM as the numerical gradients engine.

@MikeInnes More or less. Which utilities are you referring to here? In particular, we do have a jacobian function, a j′vp utility function which just computes a vector^T-Jacobian product (i.e. approximates reverse-mode AD), and a load of to_vec functionality to make it possible to work with non-vector inputs (see here)

@Roger-luo :

I think the gradcheck function defined here is not only useful for unit tests for Zygote itself, but also useful for users to test their own gradients/programs which compares a more explicit jacobian matrix directly.

The jvp (Jacobian-vector product i.e. forwards-mode AD) and j′vp (Jacobian^T-vector product i.e. reverse-mode AD) in FDM are designed for precisely this, in addition to testing AD implementations. On a separate note, you don't really want to be testing the gradient of e.g. Zygote functions as they're really an edge case of the things that Zygote really computes (j′vps).

f(2, rand(2, 2)), it requires all the variable to be a vector, which doesn't seems generic enough. The implementation here can deal with arbitrary user defined function by returning each variables' jacobian.

I partly take your point here, but this isn't the complete picture in FDM. The jvp and j′vp functionality here definitely does allow for arbitrary numbers of arguments, and the to_vec functionality below makes it possible to work with a reasonably large number of different types. (see its see in the jvp and j′vp functions)

That being said, you are correct that our jacobian functionality does only support single argument functions at the minute, but it would be great to extend this to the multi-argument version (where each argument is a vector) and then utilise the to_vec functionality to make it work straightforwardly with a variety of input types. This would be really quite straightforward to do, and the existing jvp and j′vp functionality provide a good template for doing this. Moreover, we could simplify the j′vp implementation slightly once we had this.

(We didn't originally make it easy to work with jacobians of arbitrary functions because we were building a tool designed specifically to test AD implementations, which literally just compute the jvp and j′vp functions. If you're interested in working with Jacobians in Zygote now, then I can see how you might want additional functionality for testing them though)

In any case, we would be very open to collaborating on this to improve FDM to ensure that it is sufficient for Zygote's needs 🙂.

@MikeInnes
Copy link
Member

MikeInnes commented Jun 20, 2019

Which utilities are you referring to here?

I'm more or less thinking of gradcheck or whatever our equivalent of rrule_test ends up being. We probably want to redesign that for the reasons you've pointed out, but in any case it's handy to have a quick check that a custom adjoint is correct and/or that Zygote is behaving properly.

It probably makes sense to start simple here, just reworking gradcheck and Roger's utilities here to use FDM, and then later making the tests do something smarter.

@Roger-luo
Copy link
Contributor Author

I'm willing to contribute to upstream and make use of FDM here. But I just think in this PR we only to make use of the finite diff from FDM and replace that njacobian function later when a more generic version of jacobian is implemented and tagged.

And one issue for this PR is I was trying to make sure we don't need to check reentrants like torch does in their gradcheck, or do we need to? I'm not sure what is this for.

@willtebbutt
Copy link
Member

All sounds reasonable to me :)

@Roger-luo
Copy link
Contributor Author

Roger-luo commented Jun 21, 2019

Yeah, but anyone has idea on the reentrant thing? I don't quite get it why they check this in torch, is this because of the tape (or high order derivatives?)

Update: in case you want to look at the code: https://github.com/pytorch/pytorch/blob/d4119f8fcbde15569b8f2993973439f2961ea22a/torch/autograd/gradcheck.py#L291

I didn't find any docs for this.

@willtebbutt
Copy link
Member

I don't quite get it why they check this in torch, is this because of the tape (or high order derivatives?)

Is this not just checking for non-determinism / stateful behaviour that changes each time you evaluate the function / do AD? Seems like a sensible thing to check for in any case.

@Roger-luo Roger-luo changed the title add gradcheck WIP: add gradcheck Jun 21, 2019
@Roger-luo
Copy link
Contributor Author

If it is non-deterministic, why just check it only once? and I guess we don't need it at the moment? or there is a use case in Zygote at the moment?

@MikeInnes
Copy link
Member

I think we can probably ignore that for now. There are cases where this might be worth checking (e.g. that back can be called multiple times), but those things are mostly fairly unlikely to go wrong in the first place.

@Roger-luo
Copy link
Contributor Author

@willtebbutt could you tag a new version of FDM so I could finish this PR? thanks

@willtebbutt
Copy link
Member

Done -- JuliaRegistries/General#4055 :)

@Roger-luo
Copy link
Contributor Author

I removed the njacobian and start using the new jacobian in FiniteDifferences. But I'll need to wait JuliaDiff/FiniteDifferences.jl#51 be tagged so things in this branch will pass CI.

@MikeInnes
Copy link
Member

It's fine to have a branch of FD in the manifest, for CI. We could even merge that as long as we're fairly confident FD will get tagged soonish.

@willtebbutt
Copy link
Member

It's fine to have a branch of FD in the manifest, for CI. We could even merge that as long as we're fairly confident FD will get tagged soonish.

I'm very happy to tag as soon as the PR is complete.

@willtebbutt
Copy link
Member

@Roger-luo 0.9.0 of FiniteDifferences is now available 🙂

@Roger-luo Roger-luo changed the title WIP: add gradcheck add gradcheck Nov 11, 2019
@Roger-luo
Copy link
Contributor Author

Roger-luo commented Nov 11, 2019

I think should be good once CI passed. I also fixed exp(::AbstractMatrix) to let it return a real matrix when input is real. A few tests are broken due to JuliaDiff/FiniteDifferences.jl#53 but I think we could fix it in another PR later.

@Roger-luo
Copy link
Contributor Author

err, it seems the test timeout?

@willtebbutt
Copy link
Member

One reason that your tests might be timing out is because gradcheck always works with the jacobian, rather than the vector-jacobian product-like thing that pullbacks are good for. Would it not be sufficient to ensure that

  1. pullback is correct for every sensitivity, by testing that back seeded randomly yields the correct thing when compared to a finite difference estimator of the vjp
  2. jacobian works correctly for each input / output type?

The point is that jacobians are very expensive to compute, and your jacobian function is implemented in terms of pullback anyway. So provided that you can ensure that pullback is implemented correctly, you just have to check that jacobian operates correctly on whatever types you want it to work for.

@Roger-luo
Copy link
Contributor Author

so do you think I should just check say the vjp instead of jacobian then? I use jacobian here because I simply just "copy" the gradcheck in pytorch, I'm not sure why they are using it.

@willtebbutt
Copy link
Member

so do you think I should just check say the vjp instead of jacobian then? I use jacobian here because I simply just "copy" the gradcheck in pytorch, I'm not sure why they are using it.

Yup, that's my opinion on the matter. IMHO the jacobian is the wrong thing to check, as it's essentially just computing a bunch of special cases of a vjp, which can fail to catch bugs. Better just to randomize the input to back, and any edge cases that you happen to feel are important.

@Roger-luo Roger-luo changed the title add gradcheck WIP: add gradcheck Nov 17, 2019
@Roger-luo
Copy link
Contributor Author

the eigen value test still fails with this FDM PR (JuliaDiff/FiniteDifferences.jl#54) somwhow

any idea why?

eigvals(::Hermitian{Complex{Float64},S} where S<:(AbstractArray{#s617,2} where #s617<:Complex{Float64}))

@mcabbott mcabbott mentioned this pull request Mar 27, 2020
@mcabbott mcabbott mentioned this pull request Jan 27, 2021
bors bot added a commit that referenced this pull request Feb 2, 2021
890: Add `jacobian`, at last? r=CarloLucibello a=mcabbott

This adds a Jacobian function. 

Compared to #747 this one:
* has tests
* accepts multiple arguments
* shouldn't fail if `back` returns `nothing`
* inserts `vec` a few more places
* has a method which accepts implicit `Params`, and returns `Grads`. (This was the only part I actually needed in real life.)
* now works on the GPU too!

Compared to #414 this one:
* always inserts `vec`, never makes higher-dimensional arrays
* runs on current Zygote
* has tests.

Compared to #235 this one:
* doesn't try to provide numerical jacobian
* doesn't alter testing infrastructure
* doesn't provide `jacobian!`, nor any code for structured matrices.

This does not address #564's concerns about functions which return a tuple of arrays. Only functions returning an array (or a scalar) are permitted. Similar considerations might give sensible jacobians when the argument of a function is a tuple, or some other struct, but for now these are handled by putting up a giant warning sign.

Nothing in the file `utils.jl` seems to have any tests at all. So I also added tests for `hessian`. 

And, while I was there, made `hessian` actually accept a real number like its docstring promises. (Hence this closes #891.) And, made a version that is reverse-over-reverse, using this `jacobian`, which works less well of course but may as well exist to test things. (See for example #865.) Ideally there would be a pure-Zygote version using its own forward mode, but I didn't write that.

Fixes #51, fixes #98, fixes #413. Closes #747.

Co-authored-by: Michael Abbott <me@escbook>
Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
@CarloLucibello
Copy link
Member

closing as stale and also because I would prefer to rely on FiniteDifferences.jl (as #464 attempts, also stale though)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants