Derivative test for objective function #487

Closed
IoannisDadiotis opened this issue Jun 8, 2021 · 3 comments

@IoannisDadiotis

Hi,

I am using ipopt through ifopt and I am doing some derivative tests by setting the options:

    solver->SetOption("jacobian_approximation", "exact"); // or "finite-difference-values"
    solver->SetOption("derivative_test_tol", 1.0e-2);
    solver->SetOption("derivative_test", "first-order");

I am comparing the results for two cases: in the first, the objective function has a single cost term with weight 10^-4; in the second, it has the same cost term with weight 1. The rest of the problem formulation (variables etc.) is the same.

The results from the derivative checker are shown below:
For weight 10^-4:

Starting derivative checker for first derivatives.

* grad_f[        420] =  4.7617540487957744e-02    ~  1.9047019416026542e-01  [ 1.429e-01]
* grad_f[        426] =  4.4739772270356151e-02    ~  1.7895908735936888e-01  [ 1.342e-01]
* grad_f[        432] =  4.7976466917872079e-02    ~  1.9190590299554014e-01  [ 1.439e-01]
* grad_f[        438] =  4.5660384384473472e-02    ~  1.8264157104828069e-01  [ 1.370e-01]
* grad_f[        444] =  4.6179821397722552e-02    ~  1.8471932467127913e-01  [ 1.385e-01]
* grad_f[        450] =  4.6840283915501199e-02    ~  1.8736113198078982e-01  [ 1.405e-01]
* grad_f[        456] =  4.7494238659630185e-02    ~  1.4248273394713876e-01  [ 9.499e-02]
* grad_f[        462] =  4.4754106109401760e-02    ~  1.3426232810525213e-01  [ 8.951e-02]
* grad_f[        468] =  4.5392279538534926e-02    ~  1.3617681808375881e-01  [ 9.078e-02]
* grad_f[        474] =  4.7625979158451258e-02    ~  1.4287794010391297e-01  [ 9.525e-02]
* grad_f[        480] =  4.8275919895357844e-02    ~  1.4482774821407091e-01  [ 9.655e-02]
* grad_f[        486] =  4.6354199334318787e-02    ~  1.3906258204989308e-01  [ 9.271e-02]
* grad_f[        492] =  4.5689516518942243e-02    ~  9.1379033272701263e-02  [ 4.569e-02]
* grad_f[        498] =  4.7955001840006360e-02    ~  9.5909986773661401e-02  [ 4.795e-02]
* grad_f[        504] =  4.5531061003782784e-02    ~  9.1062108653460969e-02  [ 4.553e-02]
* grad_f[        510] =  4.6844609776584459e-02    ~  9.3689218811716560e-02  [ 4.684e-02]
* grad_f[        516] =  4.7861429206997402e-02    ~  9.5722860350577532e-02  [ 4.786e-02]
* grad_f[        522] =  4.4923717375541253e-02    ~  8.9847432021441534e-02  [ 4.492e-02]

Derivative checker detected 18 error(s).

For weight 1.0:

Starting derivative checker for first derivatives.

* grad_f[        420] =  4.7617540487957740e+02    ~  1.9047017383787261e+03  [ 7.500e-01]
* grad_f[        426] =  4.4739772270356150e+02    ~  1.7895908158606067e+03  [ 7.500e-01]
* grad_f[        432] =  4.7976466917872074e+02    ~  1.9190586879693096e+03  [ 7.500e-01]
* grad_f[        438] =  4.5660384384473468e+02    ~  1.8264154810195839e+03  [ 7.500e-01]
* grad_f[        444] =  4.6179821397722549e+02    ~  1.8471929032383664e+03  [ 7.500e-01]
* grad_f[        450] =  4.6840283915501197e+02    ~  1.8736114293194059e+03  [ 7.500e-01]
* grad_f[        456] =  4.7494238659630184e+02    ~  1.4248269610756954e+03  [ 6.667e-01]
* grad_f[        462] =  4.4754106109401755e+02    ~  1.3426231461141804e+03  [ 6.667e-01]
* grad_f[        468] =  4.5392279538534922e+02    ~  1.3617683555543877e+03  [ 6.667e-01]
* grad_f[        474] =  4.7625979158451253e+02    ~  1.4287794166983490e+03  [ 6.667e-01]
* grad_f[        480] =  4.8275919895357845e+02    ~  1.4482774919372546e+03  [ 6.667e-01]
* grad_f[        486] =  4.6354199334318787e+02    ~  1.3906259460704255e+03  [ 6.667e-01]
* grad_f[        492] =  4.5689516518942241e+02    ~  9.1379019537573561e+02  [ 5.000e-01]
* grad_f[        498] =  4.7955001840006355e+02    ~  9.5910005170279624e+02  [ 5.000e-01]
* grad_f[        504] =  4.5531061003782781e+02    ~  9.1062120638615920e+02  [ 5.000e-01]
* grad_f[        510] =  4.6844609776584457e+02    ~  9.3689227043736321e+02  [ 5.000e-01]
* grad_f[        516] =  4.7861429206997400e+02    ~  9.5722857956241751e+02  [ 5.000e-01]
* grad_f[        522] =  4.4923717375541253e+02    ~  8.9847424976072750e+02  [ 5.000e-01]

Derivative checker detected 18 error(s).

As expected, the floating-point values in the second case equal the corresponding values in the first case multiplied by 10^4, since the weight carries through to the derivatives. However, the number in brackets (the difference between the two values) does not follow this trend. Shouldn't the bracketed numbers in the second case also equal those of the first case multiplied by 10^4?

Another observation: I looked into the documentation where it is stated that "The first floating point number is the value given by the user code, and the second number (after "~") is the finite differences estimation. Finally, the number in square brackets is the relative difference between these two numbers."

However, in the example given in the documentation, this rule does not seem to hold for grad_f (the post included a screenshot of the documentation example here):
The difference between -6.5159999999999991e+02 and -6.5559997134793468e+02 is not equal to [ 6.101e-03].

@svigerske
Member

svigerske commented Jun 8, 2021

The error in the bracket is actually absolute if the absolute value of the approximated gradient is below 1:

    Number deriv_approx = (fpert - fref) / this_perturbation;
    Number deriv_exact = grad_f[ivar];
    Number rel_error = std::abs(deriv_approx - deriv_exact) / Max(std::abs(deriv_approx), Number(1.));

So for weight 10^-4 you get absolute difference:

* grad_f[        420] =  4.7617540487957744e-02    ~  1.9047019416026542e-01  [ 1.429e-01]
because 4.7617540487957744e-02 - 1.9047019416026542e-01 = -1.429e-01

For weight 1 you get a relative difference

* grad_f[        420] =  4.7617540487957740e+02    ~  1.9047017383787261e+03  [ 7.500e-01]
because (4.7617540487957740e+02 - 1.9047017383787261e+03)/1.9047017383787261e+03 = -7.500e-01

The same for the other observation:
The relative difference between -6.5159999999999991e+02 and -6.5559997134793468e+02 is

(-651.59999999999991+655.59997134793468)/max(1,|-655.59997134793468|) = 6.101e-03
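
The three calculations above can be reproduced with a small standalone function. This is only a sketch of the quoted formula, not the Ipopt source: the name `bracket_error` is made up for illustration, and plain `double` and `std::max` stand in for Ipopt's `Number` and `Max`.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper mirroring the formula quoted above.
// The denominator is clamped to at least 1, so the result is an
// absolute error when |deriv_approx| < 1 and a relative error otherwise.
double bracket_error(double deriv_exact, double deriv_approx)
{
    return std::abs(deriv_approx - deriv_exact)
           / std::max(std::abs(deriv_approx), 1.0);
}
```

Calling `bracket_error(4.7617540487957744e-02, 1.9047019416026542e-01)` gives roughly 1.429e-01 (absolute, since the approximation is below 1), while `bracket_error(4.7617540487957740e+02, 1.9047017383787261e+03)` gives roughly 7.500e-01 (relative).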

@IoannisDadiotis
Author

Thank you for the immediate explanation.

Why is there a need for the absolute difference? Wouldn't it be more representative to compute the relative difference in all cases, so that the number in the brackets would be the same for both weights?

@svigerske
Member

I think it is quite common not to scale up the absolute difference when computing a relative difference. Also, the case where deriv_approx is 0 needs to be accounted for somehow.

But I don't really mind changing this formula a bit.
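
To illustrate the two points above, here is a rough sketch comparing a purely relative error with the clamped variant. Both function names are hypothetical and not part of Ipopt, and the sketch assumes IEEE 754 floating-point arithmetic.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical: purely relative error with no clamping.
// Under IEEE 754 this returns inf (or NaN for 0/0) when deriv_approx is 0.
double pure_relative_error(double deriv_exact, double deriv_approx)
{
    return std::abs(deriv_approx - deriv_exact) / std::abs(deriv_approx);
}

// Hypothetical: clamped variant, same shape as the formula quoted earlier.
// The denominator never drops below 1, so tiny differences are not scaled up
// and a zero approximation cannot cause a division by zero.
double clamped_error(double deriv_exact, double deriv_approx)
{
    return std::abs(deriv_approx - deriv_exact)
           / std::max(std::abs(deriv_approx), 1.0);
}
```

For example, with an exact derivative of 1e-10 and an approximation of 2e-10, the purely relative error is 0.5 even though the two values differ by only 1e-10, whereas the clamped variant reports 1e-10. With an approximation of exactly 0, the purely relative version divides by zero.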
