
too many function calls? #100

Closed
amerand opened this issue Jan 1, 2019 · 8 comments

Comments


amerand commented Jan 1, 2019

I am starting with Julia, so this might be something obvious I am missing. I rewrote some code I have from Python to Julia, and it involves an LM fit. The results are the same, but all things being equal, scipy.optimize.leastsq calls the fitted function 16 times, whereas curve_fit calls it more than 250 times (!). I need to loosen tolX and tolG a lot to make it converge faster, but then the result is no longer accurate. The code is rather complex, so I cannot easily post it; I was just wondering if there is a known way to minimize the number of function calls.

Edit: I find the same behavior for a simple second-order polynomial fit: curve_fit makes 74 calls, whereas scipy.optimize.leastsq makes 9 calls, with the same accuracy:
testfit_jl_py.txt


pkofod commented Jan 2, 2019

Two things:

  1. is the curve_fit call slower than scipy? If so, by how much?
  2. Are you sure those nfev also count the function evaluations used in the finite-difference approximation to the Jacobian?

If you can check out the master branch of LsqFit, you can try to use the ForwardDiff automatic differentiation package by specifying autodiff=:forward, but your code obviously has to support dual numbers.
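Dual numbers are the mechanism behind ForwardDiff's forward-mode AD: each value carries its derivative along with it, so a generically written function computes an exact derivative as a side effect. A minimal Python sketch of the idea (hypothetical and stripped down, not ForwardDiff's actual implementation):

```python
# Minimal sketch of a dual number: a value paired with a derivative.
# Arithmetic propagates derivatives by the sum/product rules, so any
# function written generically yields an exact derivative for free.
class Dual:
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

def model(x):
    # A generic model: works for plain floats and for Dual alike.
    return 3.0 * x * x + 2.0 * x + 1.0

y = model(Dual(2.0, 1.0))   # seed the input's derivative with 1
print(y.val, y.der)         # 17.0 14.0  (3x^2+2x+1 and 6x+2 at x = 2)
```

A function annotated to accept only a concrete float type rejects these dual inputs outright, which is the usual cause of the long error message mentioned below.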


amerand commented Jan 2, 2019

  1. A single evaluation of the function I am trying to optimize currently runs at about the same speed in Python and Julia, so the Julia fit is nearly 10x slower overall, because curve_fit makes nearly 10 times more calls...
  2. Yes, nfev includes the estimation of the Jacobian. To check, I added a counter, like I did in the Julia code.

The forward algo works for my polynomial example and reduces the number of calls a lot (by about a factor of 4); it ends up making only 2x as many calls as the Python code. I tried autodiff=:forward on my more complex function, but I got a very long error message that I could not decipher. What are "dual numbers"?

PS: by the way, the function standard_error has been renamed to stderror, but this is not reflected in the documentation.


pkofod commented Jan 2, 2019

Re 1 and 2, perfect. I just want to be 100% sure that we're counting the same things.

> The forward algo works for my polynomial example and reduces the number of calls a lot (by about a factor of 4); it ends up making only 2x as many calls as the Python code. I tried autodiff=:forward on my more complex function, but I got a very long error message that I could not decipher. What are "dual numbers"?

Yes, this is because the function is called fewer times once we're no longer approximating the Jacobian with central differences. Note that MINPACK (which scipy calls) uses a forward-difference approach, so you can expect it to make fewer calls for the same number of major iterations of the LM algorithm. I could allow specifying something other than central differences (say, forward differences) for the finite-difference approximation.
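The cost difference is easy to see: for n parameters, a central-difference Jacobian takes 2n model evaluations, while a forward-difference one takes n extra calls on top of a baseline evaluation that is usually already available. A sketch with hypothetical helpers (not LsqFit's or MINPACK's internals), assuming numpy:

```python
import numpy as np

def jacobian_forward(f, p, h=1e-8):
    f0 = f(p)                                   # 1 baseline call, reused
    J = np.empty((f0.size, p.size))
    for i in range(p.size):
        dp = np.zeros_like(p); dp[i] = h
        J[:, i] = (f(p + dp) - f0) / h          # n extra calls in total
    return J

def jacobian_central(f, p, h=1e-6):
    cols = []
    for i in range(p.size):
        dp = np.zeros_like(p); dp[i] = h
        cols.append((f(p + dp) - f(p - dp)) / (2 * h))   # 2n calls in total
    return np.column_stack(cols)

calls = 0
def resid(p):
    global calls
    calls += 1
    return np.polyval(p, np.linspace(-1.0, 1.0, 20))

p0 = np.array([3.0, 2.0, 1.0])
calls = 0; Jf = jacobian_forward(resid, p0); n_fwd = calls
calls = 0; Jc = jacobian_central(resid, p0); n_ctr = calls
print(n_fwd, n_ctr)   # 4 6  for n = 3 parameters
```

Over a full LM run this per-Jacobian gap compounds across every major iteration, which is consistent with the roughly 2x difference reported earlier in the thread.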

The error message you got is telling you that ForwardDiff.jl cannot pass its Dual number type through your function. This can either be because you're doing something ForwardDiff doesn't support, or because you've restricted a type you didn't have to (say, to Float64). I'd need to see the code to be of more help here.

My conclusion: scipy calls your model function fewer times because it uses forward finite differences to approximate the Jacobian, where LsqFit uses central finite differences for better accuracy at the expense of run time. This doesn't matter for simple functions and small numbers of data points, but for complicated examples it may be a problem. We should document this. If you can use ForwardDiff, it will help in some cases. It may also very well be that MINPACK's LM is more efficient (requires fewer major iterations) than ours, or that it terminates further from the optimum. We'll have to look into this a bit more to be sure.

> PS: by the way, the function standard_error has been renamed to stderror, but this is not reflected in the documentation.

Yes, this is because you're looking at the README for the development version. GitHub doesn't let me show the README for the release version.


amerand commented Jan 2, 2019

I removed some types I had forced in the code (Float64, indeed) and it now computes the gradient using ForwardDiff, which reduced the number of calls by a factor of ~5. It is still more than scipy (by a factor of 2), but much better. From the tests I am running, I do not see differences in the optima between LsqFit and scipy... Thanks for the quick feedback!


pkofod commented Jan 2, 2019

> From the tests I am running, I do not see differences in the optima between LsqFit and scipy... Thanks for the quick feedback!

Cool, thanks. I'm just guessing here, as I cannot run your code :) Is it proprietary or protected in some way, or can you share it privately?


amerand commented Jan 4, 2019

Thanks for offering to help, but you would be wasting your time, I am afraid: the code is not particularly neat, as it is my first code in Julia coming from Python... I managed to get the core computation function running faster than the Python one, but I now have to revise the data structures, which are the limiting factor.


pkofod commented Jan 4, 2019

I'm leaving this open because there's both a documentation issue here (that we're using central differences, so expect a lot of f calls with default settings) and a potentially wasteful number of f calls beyond the finite-difference ones.


pkofod commented Mar 19, 2019

8ca89ff fixes this imo

Thanks for bringing this to my attention.

pkofod closed this as completed on Mar 19, 2019