Why is Y_TLM ignored using Adjoint methods #3

sh-cau · 2020-01-20T15:33:10Z

In the continuous adjoint equations, the gradient of a functional w.r.t. the parameters also depends on how the initial states of the system ODE depend on those parameters. If I understand correctly, this is the option/parameter Y_TLM. But why is it ignored in the adjoint case? Is it somehow calculated implicitly?
Thank you

Steven-Roberts · 2020-01-24T22:20:32Z

The correct option is Mu here. Y_TLM is only used for the TLM, while Mu and Lambda are the adjoint variables.

sh-cau · 2020-01-27T10:13:36Z

OK, maybe it's just not clear to me what is meant by what. I'm sorry. Let's see if we are on the same page notation-wise. If I have the functional

subject to the ODE
,
the corresponding adjoint equations become

And thus the sensitivity w.r.t. p is evaluated with
.
Now the following input arguments are clear to me:

Lambda argument:
Mu argument (passed as Jacobian):
DRDY and DRDP are self-explanatory
Jacp => function handle to dfdp
Jacobian => function handle to dfdy
QFun => function handle to r(y,p)
Quadrature => function handle that returns initial value of the integral (typically zeros)

I understand that after the solver call, the output contains

Lambda (value at t=0, i.e. sensitivity of Psi w.r.t. initial states)
Mu (sensitivity of cost functional w.r.t. parameters)

but in essence without the term that I expected would be Y_TLM, namely

so if I get this right, I would have to add this term manually if

to obtain the "true" sensitivity? Unfortunately, I'm not able to validate this, because I can't obtain a sensible output for a minimal working example, see Issue adjoint sensitivitites wrong? #4
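
For readers following along, here is a sketch of the textbook continuous adjoint formulation that the comment above seems to describe. It is an assumption based on the names used in the thread (Psi, r, f, y, p, Lambda), not a statement of what the solver implements; the terminal cost g and the notation y_0(p) for the parameter-dependent initial state are introduced here purely for illustration.

```latex
\begin{aligned}
\Psi(p) &= g\bigl(y(T),p\bigr) + \int_{0}^{T} r\bigl(y(t),p\bigr)\,dt,
&\qquad \dot y &= f(t,y,p), \quad y(0) = y_0(p), \\[4pt]
\dot\lambda &= -\Bigl(\frac{\partial f}{\partial y}\Bigr)^{T}\lambda
              - \Bigl(\frac{\partial r}{\partial y}\Bigr)^{T},
&\qquad \lambda(T) &= \Bigl(\frac{\partial g}{\partial y}\Bigr)^{T}\Big|_{t=T}, \\[4pt]
\frac{d\Psi}{dp} &= \frac{\partial g}{\partial p}\Big|_{t=T}
  + \int_{0}^{T}\Bigl(\frac{\partial r}{\partial p}
  + \lambda^{T}\frac{\partial f}{\partial p}\Bigr)dt
  + \lambda(0)^{T}\,\frac{d y_0}{d p}.
\end{aligned}
```

Under this formulation the last term, \lambda(0)^T dy_0/dp, is nonzero only when the initial state depends on p; it is the term the question is about.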

Steven-Roberts · 2020-01-27T20:33:58Z

I think λ^T(0) dy_0/dp = dΨ/dy_0 * dy_0/dp = dΨ/dp = μ^T. I'm not sure I completely understand your question.

sh-cau · 2020-01-29T10:47:54Z

@Steven-Roberts I am confused. I think I am mixing up the continuous and the discrete adjoint formulations... So this is not really an issue in the code but rather an issue of my comprehension. Though I think the result should be equivalent whether one adjoins the continuous equations and discretizes afterwards, or discretizes the problem and adjoins afterwards?

The mu that I have in mind and the output 'Mu' are therefore not the same? "My" mu is the (constant) Lagrange multiplier obtained by adjoining the system ODE and initial state to the cost functional (with integral/running costs included in Psi as above)

Now, after taking the derivative of L with respect to p and integrating by parts, one has to impose the adjoint ODE with the final value shown above to avoid calculating dy/dp at any time. In the process, one also has to set

to avoid calculating dy/dp at t=0, which leads to the above-mentioned Jacobian/sensitivity, but in continuous time. This still leaves me confused by your comment, though, because lambda(0) (and therefore mu, too) is still the sensitivity w.r.t. the initial state, and inserting lambda(0)=dPsi/dy0 above would not make sense, or at least it would mean that

I understand if you don't have time to look into this too deeply.
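
One way to check by hand whether the initial-state term matters is a scalar toy problem with a known answer. The sketch below is not MATLODE code and does not reproduce its Lambda/Mu outputs; the problem (y' = -p*y, y(0) = y0(p) = p, Psi = y(T)), the helper `rk4`, and the function `adjoint_sensitivity` are all hypothetical choices made for illustration. It integrates the textbook continuous adjoint equations backward in time and compares the result with the analytic derivative of Psi(p) = p*exp(-p*T).

```python
import math


def rk4(f, t0, y0, t1, n):
    """Integrate y' = f(t, y) from t0 to t1 with n classical RK4 steps."""
    h = (t1 - t0) / n
    t, y = t0, list(y0)
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * k for yi, k in zip(y, k2)])
        k4 = f(t + h, [yi + h * k for yi, k in zip(y, k3)])
        y = [yi + h / 6 * (a + 2 * b + 2 * c + d)
             for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]
        t += h
    return y


def adjoint_sensitivity(p, T=1.0, n=2000):
    """Toy problem (assumed): y' = -p*y, y(0) = p, Psi = y(T).

    Returns (lambda(0), mu(0), mu(0) + lambda(0) * dy0/dp).
    """
    # Backward integration in reversed time s = T - t, state z = [lam, mu]:
    #   dlam/dt = -(df/dy) * lam = p * lam   ->  dlam/ds = -p * lam
    #   dmu/dt  = -(df/dp) * lam = y * lam   ->  dmu/ds  = -y * lam
    # with lam(T) = dPsi/dy(T) = 1 and mu(T) = 0.
    def rhs(s, z):
        lam, _mu = z
        y = p * math.exp(-p * (T - s))  # forward solution in closed form
        return [-p * lam, -y * lam]

    lam0, mu0 = rk4(rhs, 0.0, [1.0, 0.0], T, n)
    dy0_dp = 1.0  # because y0(p) = p in this toy problem
    return lam0, mu0, mu0 + lam0 * dy0_dp


if __name__ == "__main__":
    p, T = 0.7, 1.0
    lam0, mu0, total = adjoint_sensitivity(p, T)
    exact = math.exp(-p * T) * (1.0 - p * T)  # d/dp of p*exp(-p*T)
    print(f"mu(0) alone           : {mu0:.10f}")
    print(f"mu(0) + lam(0)*dy0/dp : {total:.10f}")
    print(f"analytic dPsi/dp      : {exact:.10f}")
```

In this toy case the integrated parameter term alone does not match the analytic derivative; the match only appears once lambda(0) * dy0/dp is added. Whether a given solver output already contains that term is a separate question that this sketch does not answer.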
