Multi-level derivatives (and seeding) #21

kewp · 2019-03-17T10:54:25Z

I'm getting strange behavior when trying to calculate a gradient. I wonder if it's because in my objective function I use calculate derivatives.

To explain: I see the way one calculates involves seeding autodiff with the target variable:

VectorXdual u(5);
u << 1.0, 1.1, 1.2, 1.3, 1.4;

dual w = energy(u);

VectorXd dwdu(5);
int i; for (i=0; i<5; i++)
{
    seed(wrt(u[i]));
    dwdu[i] = energy(u).grad;
    unseed(wrt(u[i]));
}

So if I understand, seeding is a way to (globally?) track the variable of interest. But what happens if my target function also calculates derivatives?

dual f(VectorXdual u)
{
   // ....
}

dual energy(VectorXdual u)
{
  double dfdx = derivative(f, wrt(x), u);
  // ...
}

I see in forward.hpp that derivative() also uses seeding. Do the two clash with one another? Can a nested derivative calculation like this work?

The text was updated successfully, but these errors were encountered:

allanleal · 2019-03-18T07:24:44Z

I think you should avoid this usage. Define your function energy using dual as if you'd use double.

Why do you need to call derivative inside energy? As soon as you start mixing double and dual above, there is a danger of calculating wrong derivatives, since the double variables will not have the introspection mechanism of the dual variable.

kewp · 2019-03-18T07:34:45Z

Because my energy calculations are based on derivatives of other functions.

I’m coding up finite elements. I’m calculating the strain energy of the system, which is based on the strain which is the partial derivative of displacement!

allanleal · 2019-03-18T07:41:57Z

Because my energy calculations are based on derivatives of other functions.

How would you compute energy without automatic differentiation variables? Would you be able to easily compute these internal derivatives by manually implementing them?

I also happen to define functions that compute thermodynamic properties which themselves require derivatives of other properties. I have to implement these "internal" derivatives manually (in general not difficult) to keep a consistent usage of automatic differentiation types.

allanleal · 2019-03-18T07:44:59Z

So if I understand, seeding is a way to (globally?) track the variable of interest.

Forgot to reply this. Seeding set the grad data member of dual to 1.0. This permits the calculation of derivative of a function with respect to that variable (while all other dual arguments in the function are unseeded, with grad equals to 0.0.

kewp · 2019-03-18T07:51:29Z

I can do them manually. I was hoping to not have to :)

So nested derivatives are out in forward mode? In one of your reverse mode examples you calculate a derivative and then a further derivative of that. Does this restriction only apply to forward autodiff?

allanleal · 2019-03-18T08:12:34Z

So nested derivatives are out in forward mode?

Not necessarily - but I would always recommend a consistent usage of dual throughout. Use double only when the value is a constant. If you are computing a dependent variable, make sure you use dual (or var, if using reverse.hpp).

In one of your reverse mode examples you calculate a derivative and then a further derivative of that. Does this restriction only apply to forward autodiff?

You speak about higher order derivatives I guess. This is also possible with dual, but I would suggest you not to go this route now - focus on first order, and later, if absolutely needed, increase the order of derivatives. Reason: higher order derivatives can dramatically increase the computation effort.

kewp · 2019-03-18T08:34:26Z

In your first reply above you said I’m mixing dual and double. Where do I do that in the code I posted?

allanleal · 2019-03-18T08:41:11Z

Here:

dual energy(VectorXdual u)
{
  double dfdx = derivative(f, wrt(x), u);
  // ...
}

Since dfdx depends on u, any usage of dfdx afterwards will not be autodiff-enabled.

kewp · 2019-03-18T08:50:06Z

Do you know newton raphson? It’s a technique for finding the zero of a function.

My purpose is to find the set of degrees of freedom which minimizes my objective function. Here I call the objective energy. My degrees of freedom are my duals(or vars). And to solve I take the derivative in terms of the vars - this gives me a vector. Then I take the derivatives of this vector - giving a matrix. Solving for Ax = b where A is this matrix and b the aforementioned vector, you should get a new vector x which shows how much to modify your original degrees of freedom to get to the minimum energy. Do this iteratively and you have a generic minimization procedure for any function based on a set of variables.

Does this all sound plausible / possible with autodiff? I’m just worried because of our discussing nested derivatives (though what I describe does seem like your example of taking derivatives of a derivative)

kewp · 2019-03-18T08:51:46Z

Since dfdx depends on u, any usage of dfdx afterwards will not be autodiff-enabled.

Ok - I see what you mean here now

allanleal · 2019-03-18T08:59:38Z

Do you know newton raphson? It’s a technique for finding the zero of a function.

It's in the core of everything I do! ;)

So, you have a minimization problem; you know the objective function; and you want to go up to the Hessian of this objective function to perform the minimization with a Newton-based numerical optimization algorithm?

Can you easily implement the gradient of energy function using dual (or var)? If you can, then you can use jacobian method (forward) on this gradient function (defined using dual) to compute the Hessian (matrix).

This is what I do to minimize Gibbs energy when computing chemical equilibrium - the gradient of the Gibbs energy are the chemical potentials, which I can define directly, instead of defining the Gibbs energy, and then using autodiff to compute them.

From the chemical potentials, I can compute their gradient, which is the Hessian of Gibbs energy. Important message -- all of this using only first-order automatic differentiation.

kewp · 2019-03-18T09:02:16Z

Yes, as you describe, I’m doing a minimization. Good to hear you are using it for similar purposes. I just need to calculate my inner derivatives manually and I should be all set.

kewp · 2019-03-18T09:03:27Z

How do you mean “all only first order autodiff” - surely something like the hessian is second order?

kewp · 2019-03-18T09:26:12Z

By the way - I actually typedef dual and var as ‘real’ and use that throughout so that I can switch between forward and reverse easily. Works well.

kewp · 2019-03-18T09:27:01Z

Sorry I meant define, not typedef. Though I guess typedef would works too?

allanleal · 2019-03-18T09:27:08Z

How do you mean “all only first order autodiff” - surely something like the hessian is second order?

Hessian are second-order derivatives of the energy function. However, they are also the first-order derivatives of the gradient of the energy function. If you can implement these gradients using dual, you'll be able to easily construct the Hessian matrix (using first-order automatic differentiation).

allanleal · 2019-03-18T09:27:57Z

By the way - I actually typedef dual and var as ‘real’ and use that throughout so that I can switch between forward and reverse easily. Works well.

This is perfectly fine.

allanleal · 2019-06-19T09:58:25Z

I'll close this issue, but feel free to reopen it in the future if further discussions are still needed. Thanks.

allanleal added the question Further information is requested label Mar 18, 2019

allanleal closed this as completed Jun 19, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-level derivatives (and seeding) #21

Multi-level derivatives (and seeding) #21

kewp commented Mar 17, 2019 •

edited

allanleal commented Mar 18, 2019 •

edited

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 •

edited

allanleal commented Mar 18, 2019

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 •

edited

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 •

edited

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 •

edited

allanleal commented Mar 18, 2019

allanleal commented Jun 19, 2019

Multi-level derivatives (and seeding) #21

Multi-level derivatives (and seeding) #21

Comments

kewp commented Mar 17, 2019 • edited

allanleal commented Mar 18, 2019 • edited

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 • edited

allanleal commented Mar 18, 2019

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 • edited

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 • edited

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

kewp commented Mar 18, 2019

allanleal commented Mar 18, 2019 • edited

allanleal commented Mar 18, 2019

allanleal commented Jun 19, 2019

kewp commented Mar 17, 2019 •

edited

allanleal commented Mar 18, 2019 •

edited

allanleal commented Mar 18, 2019 •

edited

allanleal commented Mar 18, 2019 •

edited

allanleal commented Mar 18, 2019 •

edited

allanleal commented Mar 18, 2019 •

edited