Preface
I either have a usage problem or there is something wrong with the source code. Because I'm unsure which is the case, I opened both a Stack Overflow question and this issue. I will just copy the text I've already written on Stack Overflow here.
tl;dr
I want to find the local minimum of a function that depends on 2 variables. My plan was to use the scipy.optimize.minimize function with the "newton-cg" method, because I can calculate the Jacobian and Hessian analytically. However, when my starting guess is on a local maximum, the function terminates successfully in the first iteration step, right on top of the local maximum, even though the Hessian there is negative definite.
Reproducing code
The reproducing code and the wrongly successful termination message can be seen below.
The same outcome occurs if I use hess=None, hess='cs', hess='2-point' or hess='3-point' instead of my custom hess function. If I use other methods like 'dogleg', 'trust-ncg', 'trust-krylov', 'trust-exact' or 'trust-constr', I basically get the same outcome, except that nhev = 1; the result is still wrong, with x = [0, 0].
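For concreteness, a sketch of the kind of comparison I mean. The double-well below is only a stand-in for my actual function; any surface with a local maximum at the starting guess shows the same behaviour:

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in objective (hypothetical): local maximum at (0, 0),
# minima at (+-1, +-1).
def f(x):
    return (x[0]**2 - 1)**2 + (x[1]**2 - 1)**2

def jac(x):
    return np.array([4 * x[0] * (x[0]**2 - 1),
                     4 * x[1] * (x[1]**2 - 1)])

def hess(x):
    return np.array([[12 * x[0]**2 - 4, 0.0],
                     [0.0, 12 * x[1]**2 - 4]])

x0 = np.array([0.0, 0.0])  # exactly on the local maximum

for method in ["newton-cg", "dogleg", "trust-ncg",
               "trust-krylov", "trust-exact", "trust-constr"]:
    res = minimize(f, x0, method=method, jac=jac, hess=hess)
    print(f"{method:12s} x={res.x} nhev={res.nhev}")
```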
Either I am doing something terribly wrong here (definitely very likely), or there is a major problem with the minimize function, specifically the "newton-cg" method (very unlikely?).
Regarding the latter case, I also checked the source code to see if something's wrong there and stumbled upon something that seems a bit weird. However, I don't completely understand the whole code, so I am unsure whether my worries are legitimate:
Taking a look at the source code
When the minimize function is called with method="newton-cg", it jumps into the _minimize_newtoncg function (see source code here). I want to go into detail about what I believe happens there, so you can tell me where I am potentially wrong:
On line 2168, A = sf.hess(xk), the Hessian is first calculated at xk, which initially is the starting guess x0. For my test case the Hessian is of course
A = [[f_xx, f_xy], [f_xy, f_yy]]
with f_ij being the partial derivatives of f with respect to i and j. In my case, f_xy = f_yx also holds.
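This is easy to double-check symbolically, for example with sympy (again using the hypothetical stand-in function, since the exact f doesn't matter for the structure):

```python
import sympy as sp

x, y = sp.symbols("x y")
f = (x**2 - 1)**2 + (y**2 - 1)**2        # hypothetical stand-in function

grad = [sp.diff(f, v) for v in (x, y)]   # analytic Jacobian
H = sp.hessian(f, (x, y))                # analytic Hessian

print(grad)
print(H)
print(sp.simplify(H[0, 1] - H[1, 0]) == 0)  # f_xy == f_yx
```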
Next, on line 2183, Ap = A.dot(psupi), the product of the Hessian A and psupi is calculated. psupi is initially equal to b, which is the negative gradient of f at xk. So Ap = A.dot(psupi) results in
Ap = [f_xx f_x + f_xy f_y, f_xy f_x + f_yy f_y].
Now to the (possible) problem
Next, the curvature curv is calculated on line 2186 by np.dot(psupi, Ap). As explained above, psupi is the negative gradient of f, so this results in
curv = f_xx f_x^2 + 2 f_xy f_x f_y + f_yy f_y^2.
However, all of these derivatives are evaluated at xk, which at first is equal to the starting parameters x0. If the starting parameters are exactly at a local maximum, the derivatives f_x and f_y are equal to 0, and because of this, curv = 0. This triggers a break out of the for loop on the next line, skipping the updates of xsupi, psupi and all the other parameters. Therefore, pk becomes [0, 0] and _line_search_wolfe12 is called with essentially the unchanged start values. This is where my understanding of the source code stops; however, I feel like things have already gone wrong by the time curv = 0 breaks the for loop.
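To make that concrete, here is the chain of values at a stationary point, using the Hessian of the stand-in function at the origin (where it is -4 times the identity):

```python
import numpy as np

b = -np.array([0.0, 0.0])     # b = -grad f(x0): zero at a stationary point
A = np.array([[-4.0, 0.0],
              [0.0, -4.0]])   # Hessian at x0 (negative definite)

psupi = b                     # initial CG search direction is just b
Ap = A.dot(psupi)             # line 2183 -> [0, 0]
curv = np.dot(psupi, Ap)      # line 2186 -> f_xx f_x^2 + 2 f_xy f_x f_y + f_yy f_y^2
print(curv)                   # 0.0, so the for loop breaks and pk stays [0, 0]
```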
My concluding question is: what am I doing wrong that leads to the minimize function getting stuck on a local maximum?
Small update: I now understand that this case of starting at a local maximum and finding a local minimum from there might simply not be implemented. I am unsure whether it would be doable to add in the future; if it were, I'd greatly appreciate it.
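In the meantime, the workaround I can think of is to detect this situation and nudge the starting guess along a direction of negative curvature before calling minimize. A sketch of my own heuristic (not something scipy.optimize provides), with the stand-in definitions repeated for completeness:

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in objective as above (hypothetical).
def f(x):
    return (x[0]**2 - 1)**2 + (x[1]**2 - 1)**2

def jac(x):
    return np.array([4 * x[0] * (x[0]**2 - 1),
                     4 * x[1] * (x[1]**2 - 1)])

def hess(x):
    return np.array([[12 * x[0]**2 - 4, 0.0],
                     [0.0, 12 * x[1]**2 - 4]])

def nudge_off_maximum(x0, jac, hess, step=1e-3, tol=1e-12):
    """If x0 is (numerically) a stationary point and the Hessian has
    negative eigenvalues, step along the negative-curvature directions.
    A heuristic, not a general-purpose escape strategy."""
    if np.linalg.norm(jac(x0)) < tol:
        w, v = np.linalg.eigh(hess(x0))   # ascending eigenvalues
        if w[0] < 0:
            d = v[:, w < 0].sum(axis=1)   # combine all descent directions
            return x0 + step * d / np.linalg.norm(d)
    return x0

x0 = nudge_off_maximum(np.array([0.0, 0.0]), jac, hess)
res = minimize(f, x0, method="newton-cg", jac=jac, hess=hess)
print(res.x)  # should now land near one of the minima at (+-1, +-1)
```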
lucascolley added the scipy.optimize and query (a question or suggestion that requires further information) labels and removed the defect (a clear bug or issue that prevents SciPy from being installed or used as expected) label on Mar 13, 2024.
Reproducing Code Example
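A minimal version of my setup. The double-well below is a stand-in for my actual function, with a local maximum at the origin and minima at (+-1, +-1); any surface with a local maximum at the starting guess shows the same behaviour:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in: local maximum at (0, 0), minima at (+-1, +-1).
def f(x):
    return (x[0]**2 - 1)**2 + (x[1]**2 - 1)**2

def jac(x):                  # analytic Jacobian
    return np.array([4 * x[0] * (x[0]**2 - 1),
                     4 * x[1] * (x[1]**2 - 1)])

def hess(x):                 # analytic Hessian
    return np.array([[12 * x[0]**2 - 4, 0.0],
                     [0.0, 12 * x[1]**2 - 4]])

x0 = np.array([0.0, 0.0])    # start exactly on the local maximum
res = minimize(f, x0, method="newton-cg", jac=jac, hess=hess)
print(res)                   # reports success with x = [0, 0]
```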
Error message (not an error, but the returned OptimizeResult)
SciPy/NumPy/Python version and system information