How to compute Jacobian of outputs w.r.t. inputs #224

Closed
smao-astro opened this issue Jun 25, 2021 · 3 comments

Comments

@smao-astro

Hi,

I am new to JAX and Objax, and I would like to compute the "partial derivative" of a model's outputs w.r.t. its inputs. Below is a piece of code:

import objax
import jax.numpy as jnp
import jax

m = objax.nn.Sequential([
    objax.nn.Linear(1, 10),
    objax.functional.elu
])

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (10, 1))

dydx = jax.vmap(jax.jacfwd(m))(x)

The docs advise against mixing JAX and Objax transformations, so my questions are:

  1. I cannot find an API in Objax that does jacfwd or jacrev, so what is the standard way to calculate a Jacobian?
  2. Why is mixing JAX and Objax discouraged? Is it always a bad idea, or is it allowed and beneficial in some cases?
  3. I understand that Objax is more object-oriented and stateful while JAX is stateless, but what is the difference between vmap and objax.Vectorize?

Thanks.

@AlexeyKurakin
Member

  1. Compute Jacobian.

You can reimplement an object-oriented version of jacfwd or jacrev, similar to how objax.GradValues is implemented.

Another option is to combine objax.Vectorize and objax.GradValues, in other words to vectorize the computation of the gradient. The DPSGD gradient module does this to some extent:

def _make_clipped_grad_fn_simple(self, f: Callable, gv: GradValues, vc: VarCollection) -> Callable:
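
The idea of building a Jacobian by vectorizing a gradient computation can be sketched in plain JAX with explicitly passed parameters. This is not Objax code and not the DPSGD implementation: `model`, `params`, and `jacobian_via_grads` are hypothetical names chosen for illustration, and the layer sizes simply mirror the example in the question. In Objax, the corresponding building blocks would be objax.GradValues and objax.Vectorize.

```python
import jax
import jax.numpy as jnp

# Hypothetical stand-in for Linear(1, 10) followed by ELU, with explicit parameters.
def model(params, x):
    w, b = params
    return jax.nn.elu(x @ w + b)  # x: (1,) -> output: (10,)

key = jax.random.PRNGKey(0)
k_w, k_x = jax.random.split(key)
params = (jax.random.normal(k_w, (1, 10)), jnp.zeros(10))
x = jax.random.normal(k_x, (10, 1))  # batch of 10 inputs

def jacobian_via_grads(params, xi):
    # Gradient of the k-th output component w.r.t. the input xi.
    def grad_k(k):
        return jax.grad(lambda z: model(params, z)[k])(xi)
    # Vectorize the gradient over all output components: one Jacobian row each.
    return jax.vmap(grad_k)(jnp.arange(10))  # shape (10, 1)

# Vectorize over the batch; parameters are shared (in_axes=None).
dydx = jax.vmap(jacobian_via_grads, in_axes=(None, 0))(params, x)

# Reference result from jacfwd on the same pure function.
dydx_ref = jax.vmap(jax.jacfwd(model, argnums=1), in_axes=(None, 0))(params, x)
```

Both results have one Jacobian per batch element. The gradient-based variant costs one backward pass per output component, so it is mainly useful when only a gradient transformation is available.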

  2. Why not mix JAX and Objax

You cannot use functional JAX transformations with Objax (jax.vmap, jax.pmap, jax.grad, etc.), in other words those transformations that take a function and return a new function.
You can safely use other JAX operations (for example, everything in jax.numpy.*) with Objax.

All JAX primitives are stateless and purely functional (i.e. they neither have nor assume side effects). Objax provides wrappers for JAX primitives to simplify state management and make it more natural for machine learning applications.

So if you try to mix JAX functional transformations with Objax primitives, it will break the state management, and the code will either not work at all or work incorrectly.
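
A minimal plain-JAX illustration of the "works incorrectly" case, assuming nothing about Objax internals: a function that reads hidden mutable state is traced once by jax.jit, and later changes to that state are silently ignored on cached calls. The names `state`, `scaled`, and `scaled_jit` are hypothetical.

```python
import jax

# Hidden mutable state that the function reads but does not receive as an argument.
state = {"scale": 1.0}

def scaled(x):
    return x * state["scale"]

scaled_jit = jax.jit(scaled)

before = float(scaled_jit(3.0))     # traced with scale == 1.0

state["scale"] = 2.0                # state changes outside of JAX's view

after_jit = float(scaled_jit(3.0))  # cached trace still uses the old scale
after_eager = float(scaled(3.0))    # plain Python call sees the new scale
```

Here `after_jit` stays at the stale value while `after_eager` reflects the update. A wrapper that tracks which variables a function uses can avoid this by passing them in explicitly.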

  3. Difference between JAX and Objax primitives

As I mentioned above, Objax provides wrappers which simplify state management.
So, for example, objax.Vectorize is a wrapper over jax.vmap that handles the state management and enables the use of stateful Objax primitives with stateless JAX.
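
To make "state management around jax.vmap" concrete, here is a rough plain-JAX sketch of what has to happen by hand when there is no wrapper: the state is passed as an explicit argument and marked as shared across the batch. This is only an illustration of the concept, not how objax.Vectorize is implemented; `linear`, `state`, and `batched_linear` are hypothetical names.

```python
import jax
import jax.numpy as jnp

# A "layer" written as a pure function: all state arrives as an argument.
def linear(state, x):
    return x @ state["w"] + state["b"]

state = {"w": jnp.ones((3, 2)), "b": jnp.zeros(2)}

# in_axes=(None, 0): do not batch over the state, batch over axis 0 of x.
batched_linear = jax.vmap(linear, in_axes=(None, 0))

xs = jnp.arange(12.0).reshape(4, 3)  # batch of 4 inputs with 3 features each
ys = batched_linear(state, xs)       # shape (4, 2)
```

With a stateful module the variables live inside the object, so a wrapper has to collect them and feed them through the transformation in a comparable way.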

@smao-astro
Author

I see, thank you for your explanation!

@chaoming0625

> 1. Compute Jacobian.
>
> You can reimplement object-oriented version of jacfwd or jacrev similarly how objax.GradValues is implemented.

It is hard to implement an object-oriented version of jacfwd or jacrev, because jacfwd and jacrev do not support returning auxiliary data. Is there a solution? Thanks!
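
One possible route, assuming a JAX release in which jax.jacfwd and jax.jacrev accept a `has_aux` argument (the version in use when this thread was written may not have had it): the differentiated function returns a pair, and only the first element is differentiated. The function `f` below is a hypothetical example, not Objax code.

```python
import jax
import jax.numpy as jnp

def f(x):
    h = jnp.sin(x)  # intermediate value we want to return alongside the Jacobian
    y = h * x       # the output that is differentiated
    return y, h     # (output, auxiliary data)

x = jnp.array([0.5, 1.0, 2.0])

# With has_aux=True the transformed function returns (jacobian, aux).
jac_fwd, aux_fwd = jax.jacfwd(f, has_aux=True)(x)
jac_rev, aux_rev = jax.jacrev(f, has_aux=True)(x)
```

An object-oriented wrapper could use the auxiliary slot to carry updated state out of the transformed function, which is the role auxiliary outputs play in a gradient wrapper.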


3 participants