Computing Derivatives of solutions of optimization Problems #157

Closed

a-jp opened this issue Jul 25, 2022 · 9 comments

Comments

a-jp commented Jul 25, 2022

Hi, I'm new to cppad, but not new to ipopt or the concept of AD. To explain my question I was going to use this example. My question (sorry if the title is not quite right): let's assume that the objective also depends on another scalar:

fg[0] = scalar*x1 * x4 * (x1 + x2 + x3) + x3;

I'd like to understand how to obtain the Jacobian representing d(xi)/d(scalar), but here the xi are only obtained after minimising via ipopt. Is that possible, and does my question make sense? scalar can be any type as needed. In finite-difference terms, I'd compute xi(scalar+eps) via one ipopt call and xi(scalar) via another ipopt call, then compute the FD Jacobian as (xi(scalar+eps) - xi(scalar))/eps. I'm hoping instead to do this by AD...
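
In code terms, what I have in mind is roughly the sketch below; solve_with_ipopt is a hypothetical helper standing in for the full ipopt minimisation:

    // Finite-difference sketch of d(xi)/d(scalar); solve_with_ipopt is a
    // hypothetical helper that runs the full ipopt minimisation for a given scalar.
    #include <vector>

    std::vector<double> solve_with_ipopt(double scalar);  // hypothetical, defined elsewhere

    std::vector<double> fd_jacobian_wrt_scalar(double scalar, double eps)
    {
        std::vector<double> x0 = solve_with_ipopt(scalar);        // xi(scalar)
        std::vector<double> x1 = solve_with_ipopt(scalar + eps);  // xi(scalar + eps)
        std::vector<double> dx(x0.size());
        for (std::size_t i = 0; i < x0.size(); ++i)
            dx[i] = (x1[i] - x0[i]) / eps;                        // d(xi)/d(scalar)
        return dx;
    }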

Many thanks,
Andy

bradbell (Contributor)

The problem here is that the minimizer w.r.t. x will depend on scalar, so if you change the value of scalar, you will have to recompute the minimizer in order to evaluate your objective?

a-jp (Author) commented Jul 26, 2022

Thanks. That's ok. I appreciate it's very expensive, but it lets me solve something without worrying about whether I've chosen an eps for finite differences that overly affects the answer. Would you be able to show me how to do it?

bradbell (Contributor) commented Jul 26, 2022

I am still not sure I understand your problem. I think that you want to solve the problem

minimize f(x) w.r.t. x
subject to L6 <= x6 <= U6

and then compute the partial of f(x) w.r.t. x6. This will just be the positive (negative) of the Lagrange multiplier for x6 at the optimal solution if L6 (U6) is active at the solution.
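
In other words, under the usual sign convention with bound multipliers $\lambda_{L_6}, \lambda_{U_6} \ge 0$, the KKT stationarity condition for that component is

$$ \partial_{x_6} f(x^\ast) = \lambda_{L_6} - \lambda_{U_6} $$

and only the multiplier of the active bound can be nonzero at the solution.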

a-jp (Author) commented Jul 26, 2022

I think what I'm trying to do is the following:

$$ x = {\rm guess}; \quad {\rm minimise} ~ Func(x, R) ~ {\rm w.r.t.} ~ x $$

where $x$ is a vector and $R$ is a constant scalar for any given minimisation.

Once I've solved the minimisation of Func, which is a scalar value, and obtained x, I then need to compute the following using the optimised x:

Given:
$$B = \sum_{i} x_{i}(R) b_{i}(R) $$

Compute:

$$ \frac{\partial}{\partial R} B = \sum_{i} x_{i}(R) \frac{\partial b_{i}(R) }{\partial R} + \sum_{i} b_{i}(R) \frac{\partial x_{i}(R) }{\partial R} $$

I've computed the first partial derivative using CppAD and compared it to a known result, and it's exact; the code to do so is as follows:

        const std::size_t n = 1;
        std::vector<ADNumber> ax(n);
        ax[0] = R; // constant known scalar as noted above

        CppAD::Independent(ax);

        const std::size_t m = 1;
        std::vector<ADNumber> B(m);
        B[0] = 0.0;

        for (std::size_t s = 0; s < size; s++)
        {
            ADNumber b = 0.0;
            Compute_b_term(ax[0], b); // this takes ax[0] as a const&, and b is taken as a non-const& and is assigned inside the function
            // Note! x comes from solution.x (after ipopt call completes) and is simply a std::vector<double>, not an AD type
            B[0] += x[s] * b; // here x is the solution vector coming from ipopt after minimization
        }

        CppAD::ADFun<double> f(ax, B);

        // compute derivative using operation sequence stored in f
        // std::vector<double> jac(m * n); // Jacobian of f (m by n matrix)
        std::vector<double> val(n);        // domain space vector
        val[0] = R;                        // argument value for computing derivative
        const auto &first_partial_derivative = f.Jacobian(val); // Jacobian for operation sequence

I've no idea how to compute the second partial derivative above... could you provide any advice? I assume it requires similar code to the above, but using Func somehow? I don't know whether I can just evaluate Func one more time with the optimised values of x, outside of the optimiser, record this in AD types, and then use similar code to the above but now with B the same size as x? Bit stuck...

bradbell (Contributor) commented Jul 26, 2022

Suppose we are given the unconstrained problem

$$ {\rm minimize} ~ f(x, R) ~ {\rm w.r.t} ~ x $$

Define $x(R)$ as the solution corresponding to $R$; then this function can be defined implicitly by

$$ 0 = \partial_x f[ x(R), R] $$

because the derivative is zero at an unconstrained optimum.

The problem here is computing the derivatives of x(R) w.r.t. R. I suggest you see
https://www.seanet.com/~bradbell/newton_step.htm
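
To make the implicit-function idea concrete: differentiating $0 = \partial_x f[ x(R), R]$ with respect to $R$ gives $x'(R) = - f_{xx}^{-1} f_{xR}$, so the sensitivity comes from second derivatives of $f$ at the optimum. Below is a minimal sketch for the scalar case; the objective $f(x, R) = (x - R)^2$ is a hypothetical stand-in and, since $n = 1$, the linear solve is a division:

    // Sketch: dx/dR for an unconstrained minimum via the implicit function
    // theorem, using one CppAD Hessian of f(x, R) at the optimum.
    // The objective f(x, R) = (x - R)^2 is a hypothetical stand-in.
    #include <cppad/cppad.hpp>
    #include <iostream>
    #include <vector>

    int main()
    {
        using CppAD::AD;

        double R    = 2.0;  // the fixed parameter
        double xopt = 2.0;  // x(R), assumed already returned by the optimizer

        // tape f as a function of the combined vector (x, R)
        std::vector< AD<double> > axR(2);
        axR[0] = xopt;
        axR[1] = R;
        CppAD::Independent(axR);
        std::vector< AD<double> > ay(1);
        ay[0] = (axR[0] - axR[1]) * (axR[0] - axR[1]);
        CppAD::ADFun<double> f(axR, ay);

        // Hessian of f w.r.t. (x, R) at the optimum, row-major 2 x 2:
        // [ f_xx, f_xR, f_Rx, f_RR ]
        std::vector<double> xR = { xopt, R };
        std::vector<double> w(1, 1.0);  // weight on the single range component
        std::vector<double> H = f.Hessian(xR, w);

        // implicit function theorem for 0 = f_x(x(R), R):  dx/dR = - f_xR / f_xx
        double dxdR = - H[1] / H[0];
        std::cout << "dx/dR = " << dxdR << "\n";  // exact value is 1 for this f
        return 0;
    }

For a vector x the same idea applies with a linear solve of f_xx against the columns of f_xR; for a constrained problem, the plain stationarity condition is replaced by the KKT system.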

a-jp (Author) commented Jul 27, 2022

Looks like you're missing double-$ at the end of the line?

Many thanks for this; that's going to take some time to digest. I should have added earlier (I omitted it because I was unaware of where this was going) that my problem is a constrained optimisation problem (equality and inequality constraints).

Can I ask another related question? It came up while setting up the finite-difference version of this which, I should add, seems to work and produces answers that are reported elsewhere for my problem. Using my nomenclature above, I am doing:

$$ \frac{\partial}{\partial R} B \approx \frac{ Func(x, R+eps) - Func(x, R) }{eps} $$

To get this to work, using this example as a starting point, and since I need to call the optimiser twice per derivative and I need those derivatives at a lot of values of R, I moved as much of the code as possible into a one-off setup function and then call the minimum amount when I need a derivative. Ideally no construction and destruction of the big objects was my aim. I found two things out:

  • To vary R, which I must do both for perturbing by eps (~1.0e-08) and for different values of R, I need to ensure retape = true, because within the functions that get called in the optimiser I do things like if (R < R_1). I believe the way to fix this is to use CondExpLt, which I'm working on. Am I correct?
  • This second one confused me. I initialise x_i to a guess, as required, and then, as per the above reasoning, do a lot of one-off setup. Outside of computing derivatives, for normal optimisation calls, I've found previously with my Python implementation using ipopt that I sometimes need to supply a better initial guess. There is also a use case for a better initial guess after setup in the partial derivative above, since the result of the first optimiser call can be used as the initial guess for the second call to reduce the number of iterations in ipopt. I've found that, once the Ipopt::TNLP object is constructed, no further changes to the initial values of x_i are taken into account; the values used for x_i are the values present at construction of the Ipopt::TNLP. Have I figured that out correctly? Is that integral to how it works, or something that can be changed?

Many thanks,

bradbell (Contributor)

  • To vary R, which I must do both for perturbing by eps (~1.0e-08) and for different values of R, I need to ensure retape = true, because within the functions that get called in the optimiser I do things like if (R < R_1). I believe the way to fix this is to use CondExpLt, which I'm working on. Am I correct?

Yes
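
For what it's worth, a minimal sketch of the pattern (the switch point R_1 and both branch expressions are hypothetical placeholders):

    // Sketch: replacing `if (R < R_1)` with CppAD::CondExpLt so the tape
    // remains valid for new values of R (R_1 and the branches are hypothetical).
    #include <cppad/cppad.hpp>

    using ADNumber = CppAD::AD<double>;

    ADNumber piecewise_b(const ADNumber& R)
    {
        ADNumber R_1     = 3.5;          // hypothetical switch point
        ADNumber b_below = 2.0 * R;      // hypothetical branch used when R <  R_1
        ADNumber b_above = R * R - 1.0;  // hypothetical branch used when R >= R_1
        // both branches are recorded on the tape; the comparison is evaluated on replay
        return CppAD::CondExpLt(R, R_1, b_below, b_above);
    }

Because both branches live on the tape, the operation sequence no longer changes with the value of the comparison; whether the whole tape can be reused across different R also depends on R entering the tape as a variable (or dynamic parameter) rather than being baked in as a constant.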

bradbell (Contributor)

  • This second one confused me. I initialise x_i to a guess, as required, and then, as per the above reasoning, do a lot of one-off setup. Outside of computing derivatives, for normal optimisation calls, I've found previously with my Python implementation using ipopt that I sometimes need to supply a better initial guess. There is also a use case for a better initial guess after setup in the partial derivative above, since the result of the first optimiser call can be used as the initial guess for the second call to reduce the number of iterations in ipopt. I've found that, once the Ipopt::TNLP object is constructed, no further changes to the initial values of x_i are taken into account; the values used for x_i are the values present at construction of the Ipopt::TNLP. Have I figured that out correctly? Is that integral to how it works, or something that can be changed?

Are you using ipopt_solve for your optimization?
https://coin-or.github.io/CppAD/doc/ipopt_solve.htm
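
If so, note that with ipopt_solve the initial point xi is an argument of every call to CppAD::ipopt::solve rather than something fixed at construction, so the second (perturbed) solve can be warm started from solution.x of the first. A minimal sketch with a hypothetical one-variable problem:

    // Sketch: two calls to CppAD::ipopt::solve, the second warm started from
    // the first solution. The objective and constraint below are hypothetical.
    #include <cppad/ipopt/solve.hpp>
    #include <string>
    #include <vector>

    using Dvector = std::vector<double>;

    class FG_eval {
    public:
        using ADvector = std::vector< CppAD::AD<double> >;
        double R = 2.0;  // parameter; can be changed between solves
        void operator()(ADvector& fg, const ADvector& x)
        {
            fg[0] = (x[0] - R) * (x[0] - R);  // objective
            fg[1] = x[0];                     // one simple constraint component
        }
    };

    int main()
    {
        std::size_t n = 1;
        Dvector xi(n, 0.0), xl(n, -10.0), xu(n, 10.0);  // initial guess and variable bounds
        Dvector gl(1, -10.0), gu(1, 10.0);              // bounds on g(x) = x
        std::string options = "Integer print_level 0\nString sb yes\n";

        FG_eval fg_eval;
        CppAD::ipopt::solve_result<Dvector> solution;
        CppAD::ipopt::solve<Dvector, FG_eval>(
            options, xi, xl, xu, gl, gu, fg_eval, solution);

        fg_eval.R += 1e-8;  // perturb the parameter for the second solve
        xi = solution.x;    // warm start: this updated guess is used by the next call
        CppAD::ipopt::solve<Dvector, FG_eval>(
            options, xi, xl, xu, gl, gu, fg_eval, solution);
        return 0;
    }

If you are driving Ipopt through your own Ipopt::TNLP instead, the same effect would have to come from your get_starting_point implementation returning the updated guess.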

bradbell changed the title from "New User: Jacobian of independent vector" to "Computing Derivatives of solutions of optimization Problems" on Jul 27, 2022
bradbell (Contributor)

I am going to move this issue to a discussion.

coin-or locked and limited conversation to collaborators on Jul 27, 2022
bradbell converted this issue into discussion #159 on Jul 27, 2022

