lax.root, a primitive for differentiable root finding #1339

shoyer · 2019-09-12T04:52:12Z

This should solve the issue with closed over variables not being handled properly in _custom_implicit_solve.

Does the implementation look sane? It's a little messier than I would like, because we need to deal with inputs into provided functions.

See the tests for an example of what using this API looks like. There are lots of ways to make this more user-friendly, but this is intended as a low-level API for defining implicit derivatives. The user-facing APIs will be routines like scipy.optimize.root, which won't require providing functions for solve and tangent_solve.

TODOs:

- replace asserts with errors
- more test coverage
  - something checking higher dimensional arrays / nested structures
  - check grads with jtu.check_grads
  - verify the jit works properly

Co-authored-by: Stephan Hoyer <shoyer@google.com>

jax/lax/lax_control_flow.py

shoyer · 2019-09-25T05:49:16Z

jax/lax_linalg.py

@@ -72,6 +72,9 @@ def svd(x, full_matrices=True, compute_uv=True):

 def triangular_solve(a, b, left_side=False, lower=False, transpose_a=False,
                     conjugate_a=False, unit_diagonal=False):
+  # TODO(shoyer): remove this hack!


The need for this suggests that I'm doing something wrong...

Zeros shouldn't escape the AD system, i.e. they should only appear in JVP rules. Maybe a check for ad_util.zero just needs to be added upstream, i.e. in the caller (assuming this is called from a JVP rule).

shoyer · 2019-09-25T05:49:48Z

jax/lax/lax_control_flow.py

+  # F(u(m), m) = 0  # system of equations in m
+  # ∂_0 F(u(m), m) ∂ u(m) + ∂_1 F(u(m), m) = 0
+  # ∂ u(m) = - (∂_0 F(u*, m))^{-1} ∂_1 F(u*, m)
+  unchecked_zeros, f_jvp = api.linearize(


I tried using ad.linearize, but the jaxpr it returns isn't a TypedJaxpr, so I couldn't figure out how to evaluate it.

I think we might want to use ad.jvp_jaxpr here, basically following the logic of scan, though I'm not sure yet if we need the fixed-point logic. I can explain more about that in chat.

Never mind about that fixed-point stuff; that's not relevant, because the variables we're differentiating with respect to are all in the closure of the function being passed in.
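For reference, the derivation in the code comment above can be written out as below. This is only a restatement of those three comment lines, assuming that u* = u(m) is the solution and that ∂_0 F(u*, m) is invertible (the condition required by the implicit function theorem).

```latex
\begin{aligned}
F(u(m), m) &= 0 \\
\partial_0 F(u(m), m)\,\partial u(m) + \partial_1 F(u(m), m) &= 0 \\
\partial u(m) &= -\left(\partial_0 F(u^*, m)\right)^{-1} \partial_1 F(u^*, m)
\end{aligned}
```

The second line is the total derivative of the first with respect to m; the third solves that linear system for ∂u(m) at the solution u*.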

tests/lax_control_flow_test.py

mattjj · 2019-09-25T20:16:41Z

I'm paging this stuff back in, and more excited about it than ever.

Here are some things I want to check with you: my current best understanding is we want a primitive

root :: (a -> a) -> a -> a

where the first argument is a function which we can map to math as f : R^n -> R^n, the second argument is an initial guess x_0 (a point in R^n), and the output x* is a point in R^n satisfying f(x*) = 0 (where the RHS is the zero vector in R^n).

We expect to apply root to functions that have nontrivial closures, and in particular that might close over values involved in differentiation. So while the above is a description at the API level, really we should think of the mathematical function in a closure-converted way as f : R^n x R^p -> R^n, where the ~~first~~second argument is some set of parameters. Then we can say x*(a) solves f(x*(a), a) = 0.

I think we want the JVP rule to look something like

root_jvp f x0 a a_dot =
  let x_star = root (lambda x: f(x, a)) x0
      x_star_dot = linear_solve (∂_0 f (x_star, a)) (∂_1 f(x_star, a)[-a_dot])
  in (x_star, x_star_dot)

where I'm using square brackets to denote the application of the linear function ∂_1 f(x_star, a) to a vector.

Does that sound right?
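A minimal scalar sketch of the proposed JVP rule, in plain Python rather than JAX. Everything here is illustrative and not the PR's implementation: `newton_root` stands in for the user-supplied solver, the partial derivatives `d0f` and `d1f` are written by hand instead of being obtained by automatic differentiation, and the linear solve reduces to a division because n = p = 1.

```python
def newton_root(f, df, x0, steps=50):
    """Hypothetical solver: find x with f(x) = 0 by Newton's method (scalar case)."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x


def root_jvp(f, d0f, d1f, x0, a, a_dot):
    """Return (x_star, x_star_dot) where f(x_star, a) = 0.

    d0f and d1f are the partial derivatives of f with respect to its
    first and second arguments (hand-written here, an assumption of this sketch).
    """
    # Primal: solve f(x, a) = 0 for x with a held fixed.
    x_star = newton_root(lambda x: f(x, a), lambda x: d0f(x, a), x0)
    # Tangent: solve d0f(x*, a) * x_star_dot = d1f(x*, a) * (-a_dot).
    x_star_dot = d1f(x_star, a) * (-a_dot) / d0f(x_star, a)
    return x_star, x_star_dot


# Example: f(x, a) = x**2 - a, whose positive root is sqrt(a).
f = lambda x, a: x * x - a
d0f = lambda x, a: 2.0 * x
d1f = lambda x, a: -1.0

x_star, x_star_dot = root_jvp(f, d0f, d1f, x0=1.0, a=4.0, a_dot=1.0)
```

For this example the tangent should agree with the derivative of sqrt(a), which is 1 / (2 sqrt(a)). Note that the tangent is computed only from derivatives at the solution, so it does not depend on how the solver got there.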

shoyer · 2019-09-25T22:55:13Z

Yes, this looks right to me, with one minor correction:

So while the above is a description at the API level, really we should think of the mathematical function in a closure-converted way as f : R^n x R^p -> R^n, where the ~~first~~second argument is some set of parameters

mattjj

LGTM! Thanks for pushing on this. We've got some more polishing work to do, but this is a big step forward: it actually works!

(We've been discussing extensively offline what follow-up stuff we want to do.)

mattjj · 2019-09-26T20:49:57Z

jax/lax/lax_control_flow.py

+  out = solve(f, initial_guess)
+
+  out_flat, out_tree = tree_flatten(out)
+  if out_tree != tree:


I think this check is redundant because you already checked it when you formed jaxpr. It doesn't hurt to include though, other than taking up precious vertical space :)

We evaluated f() when we formed the jaxpr, but not solve(). So I think we do need this. Actually, I even wrote a test that catches this error message :)
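To illustrate why the check on solve()'s output is separate from the one on f(), here is a small plain-Python sketch. The `flatten` helper is a hypothetical stand-in for a pytree flatten (it is not JAX's `tree_flatten`), and `checked_solve` is an invented name; the point is only that a user-supplied solver can return a structure that differs from `initial_guess` even when f itself is well behaved.

```python
def flatten(tree):
    """Hypothetical stand-in for a pytree flatten: returns (leaves, structure)."""
    if isinstance(tree, (list, tuple)):
        leaves, kids = [], []
        for child in tree:
            child_leaves, child_struct = flatten(child)
            leaves.extend(child_leaves)
            kids.append(child_struct)
        return leaves, (type(tree).__name__, tuple(kids))
    if isinstance(tree, dict):
        leaves, kids = [], []
        for key in sorted(tree):
            child_leaves, child_struct = flatten(tree[key])
            leaves.extend(child_leaves)
            kids.append((key, child_struct))
        return leaves, ("dict", tuple(kids))
    return [tree], "leaf"


def checked_solve(solve, f, initial_guess):
    """Run solve(f, initial_guess) and reject outputs whose structure differs."""
    _, guess_struct = flatten(initial_guess)
    out = solve(f, initial_guess)
    _, out_struct = flatten(out)
    if out_struct != guess_struct:
        raise TypeError(
            "solve() output structure {} does not match initial_guess structure {}"
            .format(out_struct, guess_struct))
    return out


initial_guess = (1.0, {"b": 2.0})
# A solver that preserves the structure of its initial guess passes the check.
ok = checked_solve(lambda f, x: (0.0, {"b": 0.0}), None, initial_guess)
```

A solver that returned, say, a flat list `[0.0, 0.0]` for the same initial guess would raise `TypeError` here, which is the kind of mistake the check in the diff is meant to surface.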

mattjj · 2019-09-27T19:16:38Z

jax/lax/lax_control_flow.py

+      core.jaxpr_as_fun(jaxpr), *(params + solution)
+  )
+
+  params_zeros = tuple(_map(ad_util.zeros_like_jaxval, params))


We think we can avoid instantiating zeros here, and otherwise be more conservative about how much work we do (I currently think we do want to run a fixed-point on ad.jvp_jaxpr!), but we can leave that for follow-up work. Land and iterate!

mattjj and others added 5 commits September 11, 2019 21:32

sketch of root w/ parameterized solvers

0c3e9ce

Co-authored-by: Stephan Hoyer <shoyer@google.com>

Finish filling out lax.root

4952304

Remove jax.api._custom_implicit_solve

12509c9

root docstring

ec192d0

disclaimer about root as a low-level API

d737cc9

googlebot added the cla: yes label Sep 12, 2019

shoyer added 6 commits September 12, 2019 12:22

Fix encoding on Python 2

e7b2037

docstring clarification

9584dbe

Error checking for tree structures in root

0e5b5d3

Tests for error checking in lax.root

5d3910f

Test and fix higher order derivatives in root

db5b922

Relax tolerance (apparently needed on Travis)

bfd70b9

shoyer mentioned this pull request Sep 13, 2019

WIP: use jaxprs for custom_implicit_solve #1256

Closed

shoyer commented Sep 25, 2019

shoyer added 5 commits September 24, 2019 21:00

Add a linear solve test

ac90492

fixes

b9ff208

fix abstract eval rule

0f69e45

Merge branch 'master' into solvers

fd975b6

restore altered jit test

a44d8e9

shoyer commented Sep 25, 2019

WIP: remove solve

7eaca4d

shoyer added 3 commits September 25, 2019 15:43

Revert "WIP: remove solve"

5ca9f0a

This reverts commit 7eaca4d.

revert structure of lax.root

6fda7be

Revert hack from lax_linalg.py

8b1a60e

shoyer added 4 commits September 25, 2019 16:02

notation

854aa28

error messages

760d696

slightly cleaner

4b53ea2

whitespace

41d8a72

This was referenced Sep 26, 2019

BFGS/Quasi-Newton optimizers? #1400

Open

lax.custom_linear_solve primitive #1402

Merged

mattjj self-assigned this Sep 27, 2019

mattjj approved these changes Sep 27, 2019

mattjj merged commit c33e8cb into google:master Sep 27, 2019
