-
Notifications
You must be signed in to change notification settings - Fork 231
Open
Description
You set an absolute tolerance of g_tol = 1e-8 by default. I don't know how you picked this — it depends on the scaling of the problem, because gradients are dimensionful.
This implicitly assumes that the function is scaled so that the typical derivatives are of order unity. This makes it pretty dangerous to use the default tolerances, in my opinion.
A safer default would have been to set a relative tolerance g_reltol on the gradient, e.g. by setting g_tol = g_reltol * initial_g.
At the very least, the manual should highlight this implicit scaling assumption, and let people know that they are likely to need a problem-specific g_tol. (Also for gradient-free methods, where g_tol means who-knows-what.)
Metadata
Metadata
Assignees
Labels
No labels