Failure to converge (again...) #9
It might be a numerical issue. The Gaussian kernel computes the exponential function of the squared distance. For instance, in your example the distance between -4 and 4 is 8, so when Specific to your problem, let me think of a better way to compute the gradient.
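To make the numerical concern concrete, here is a minimal sketch. It assumes the common Gaussian kernel form k(x, x') = exp(-(x - x')^2 / (2 l^2)); the function name `gaussian_kernel` and the length-scale values are only illustrative and are not taken from the code under discussion.

```python
import math

def gaussian_kernel(x1, x2, length_scale):
    """Assumed kernel form: exp(-(x1 - x2)^2 / (2 * length_scale^2))."""
    sq_dist = (x1 - x2) ** 2
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))

# Distance between -4 and 4 is 8, so the squared distance is 64.
# As the length scale shrinks, the kernel value collapses towards zero
# and eventually underflows to exactly 0.0 in double precision.
for length_scale in (2.0, 1.0, 0.5, 0.1):
    k = gaussian_kernel(-4.0, 4.0, length_scale)
    print(f"length_scale={length_scale}: k={k!r}")
```

Once the kernel value is this small (or exactly zero), quantities derived from it, such as gradients with respect to the length scale, can lose all significant digits, which may be enough to confuse a line search.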
I think you might be right. Now I realize that the Python code actually uses numerical gradients (as I don't pass it the derivative), so it might be able to avoid the numerical instabilities at the price of performance? I am also looking at GPy's code to see if I can find a different way to compute the gradient.
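For readers unfamiliar with the term, "numerical gradients" means approximating the derivative from function values alone. The following is a rough sketch using central differences; it is not scipy's actual implementation, and the helper names and the test function (the assumed Gaussian kernel value at distance 8 as a function of the length scale) are hypothetical.

```python
import math

def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation of the gradient of f at x.

    Costs two function evaluations per parameter, which is the
    performance price compared with an analytic derivative.
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad

SQ_DIST = 64.0  # squared distance between -4 and 4

def kernel_value(params):
    # Assumed kernel form: exp(-d^2 / (2 * l^2)) with params = [l]
    (length_scale,) = params
    return math.exp(-SQ_DIST / (2.0 * length_scale ** 2))

def kernel_gradient(params):
    # Analytic derivative with respect to l: k * d^2 / l^3
    (length_scale,) = params
    return [kernel_value(params) * SQ_DIST / length_scale ** 3]

point = [2.0]
print("numerical:", numerical_gradient(kernel_value, point))
print("analytic: ", kernel_gradient(point))
```

Comparing the two values at a few points is also a cheap way to check a hand-written derivative before handing it to an optimizer.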
Hi, sorry again. I'm having trouble with convergence, again using the code you helped me to fix. I have tested the code with larger problems successfully, but I have just found a small case where it fails.
It looks relatively simple, but the minimization process fails due to the line search step becoming too small after 2 iterations. Small variations in the setup make the problem solvable (for example, modifying the initial guess from 1.0, 1.0 to 1.1, 1.1; or adding more points to the training set or modifying some of them).
Is this type of failure a common occurrence with this type of optimizer (as you may have guessed, I'm not really an expert in this field)? Using scipy, with the same algorithm, does not usually seem to have problems. Should I just put some try/catch blocks around the optimizer, and if it fails, try again with different parameters (which I should do anyway to ensure a global min is found, but I'm asking specifically about errors)? Or maybe it's just a bug and I can keep reporting them :)
Thanks again in advance for your help.
The output is:
If the starting point [1.1, 1.1] is used, it works:
Same if the starting point [0.9, 0.9] is used.
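The "try again with different parameters" idea from the question can be sketched as below. This is only an illustration under stated assumptions: `optimize` is a hypothetical callable that returns `(x, fx)` and raises `RuntimeError` on failure (for example when the line search step becomes too small), and `toy_optimize` is a stand-in that fails from exactly `[1.0, 1.0]` to mimic the report; neither is the API of any real library.

```python
import random

def multi_start(optimize, x0, n_restarts=5, scale=0.1, seed=0):
    """Run `optimize` from x0 and from randomly perturbed starts.

    Returns the best (x, fx) found plus a list of failed starts.
    Failed runs are recorded and skipped rather than aborting.
    """
    rng = random.Random(seed)
    starts = [list(x0)]
    for _ in range(n_restarts):
        # Perturb each coordinate by up to +/- scale (relative).
        starts.append([xi * (1.0 + scale * rng.uniform(-1.0, 1.0)) for xi in x0])

    best = None
    failures = []
    for start in starts:
        try:
            x, fx = optimize(start)
        except RuntimeError as err:
            failures.append((start, str(err)))
            continue
        if best is None or fx < best[1]:
            best = (x, fx)

    if best is None:
        raise RuntimeError(f"all {len(starts)} starts failed: {failures}")
    return best, failures

def toy_optimize(start):
    # Hypothetical stand-in for a real optimizer call.
    if list(start) == [1.0, 1.0]:
        raise RuntimeError("line search step became too small")
    x = list(start)
    for _ in range(100):
        # Gradient descent with step 0.25 on sum((xi - 2)^2)
        x = [xi - 0.5 * (xi - 2.0) for xi in x]
    return x, sum((xi - 2.0) ** 2 for xi in x)

best, failures = multi_start(toy_optimize, [1.0, 1.0])
print("best:", best)
print("failed starts:", len(failures))
```

Restarting this way also serves the global-minimum concern mentioned in the question, but it only hides a failure like the one reported; it does not explain why the run from `[1.0, 1.0]` breaks down.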