
Performance Improvements #442

Open
dtnguyen2 opened this issue Apr 12, 2018 · 10 comments

Comments

@dtnguyen2
Contributor

Model evaluation

  • Model caching, so that models with frozen parameters are not recomputed
  • Parallelization with intelligent parsing of the model expression

Slowness in fitting X-ray spectra

  • Issues in simultaneous fitting of multiple data sets (more general): do not re-evaluate models with the same parameters multiple times

Parallelization for MonCar and Nelder-Mead methods:

  • Lower priority until the two items above are done
@JohannesBuchner
Contributor

Model evaluation

  • Model caching, so that models with frozen parameters are not recomputed
  • Parallelization with intelligent parsing of the model expression

Slowness in fitting X-ray spectra

  • Issues in simultaneous fitting of multiple data sets (more general): do not re-evaluate models with the same parameters multiple times

These would be great. Another low-hanging fruit improvement would be to not recompute the apec (and other XSPEC model) shapes if only the normalization parameter was changed.

We are fitting a sum of 10 apecs with the temperatures fixed in logarithmic steps, where the normalisations are linked to a formula whose parameters are varied. There is no real reason why this should be slow.

So I'd propose model caching, so that models with unchanged parameters are not recomputed. This can be achieved with memoization.
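As an illustration only, here is a minimal sketch of the memoization idea, independent of how Sherpa implements it (the cached_eval helper and the model callable are hypothetical placeholders, not Sherpa API):

import hashlib
import numpy as np

_cache = {}

def cached_eval(model, pars, xlo, xhi):
    # Evaluate model(pars, xlo, xhi), reusing the stored result when the
    # parameter values and the evaluation grid are unchanged.
    # Hash the parameter values and grid to get a compact cache key.
    key = hashlib.sha256(np.concatenate([pars, xlo, xhi]).tobytes()).hexdigest()
    if key not in _cache:
        _cache[key] = model(pars, xlo, xhi)
    return _cache[key]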

I placed a practical benchmark here: https://github.com/JohannesBuchner/xray-fit-benchmark

@hamogu
Contributor

hamogu commented Nov 24, 2023

In general, models are cached in Sherpa using the modelCacher1d(func) decorator.

I'm not sure what @dtnguyen2 had in mind when he wrote that, but the first bullet should be solved by now.

I agree with

Another low-hanging fruit improvement would be to not recompute the apec (and other XSPEC model) shapes if only the normalization parameter was changed.

Unfortunately, that's a little harder to implement in a general sense: for all XSPEC models, we pass all model parameters to the XSPEC function. There is no special markup indicating which of those is the norm and which require a re-compute. We could make an informed guess that any parameter with the substring "norm" in it may not require a recomputation, but that seems brittle (it would work for APEC, though).
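For illustration, a sketch of what such a guess could look like, splitting a component's parameters into "shape" and "norm-like" groups (split_pars is a hypothetical helper; model.pars and par.name are the standard Sherpa attributes):

def split_pars(model):
    # Heuristic: treat any parameter whose name contains "norm" as a pure
    # scaling and everything else as a shape parameter (brittle, as noted).
    shape, norms = [], []
    for par in model.pars:
        (norms if "norm" in par.name.lower() else shape).append(par)
    return shape, norms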

However, you should be able to achieve the same effect by changing your model definitions; instead of set_source(apec.a1 + apec.a2) you could do

set_source(scale1d.norm1 * apec.a1 + scale1d.norm2 * apec.a2)
a1.norm = 1
a2.norm = 1
# freeze all the apec parameters ...

Because none of the parameters of the apec components a1 and a2 change in the fitting loop now, they should not be evaluated again.

@JohannesBuchner
Contributor

Thanks @hamogu!

Unfortunately, that's a little harder to implement in a general sense: for all XSPEC models, we pass all model parameters to the XSPEC function. There is no special markup indicating which of those is the norm and which require a re-compute. We could make an informed guess that any parameter with the substring "norm" in it may not require a recomputation, but that seems brittle (it would work for APEC, though).

Maybe Sherpa could maintain a list of additive models that have a "norm" parameter for which this is safe, and automatically wrap those, i.e., add a scale1d norm and keep the XSPEC norm fixed at 1.
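A rough sketch of what such automatic wrapping could look like (SAFE_ADDITIVE and wrap_with_norm are hypothetical, not existing Sherpa API; Scale1D is Sherpa's scaling model):

from sherpa.models.basic import Scale1D

# Hypothetical whitelist of additive XSPEC models whose norm is a pure scaling.
SAFE_ADDITIVE = {"xsapec", "xsvapec", "xspowerlaw"}

def wrap_with_norm(component):
    # Keep the internal XSPEC norm fixed at 1 and expose an external scale,
    # so changing the normalisation never invalidates the cached shape.
    if type(component).__name__.lower() not in SAFE_ADDITIVE:
        return component
    component.norm.val = 1
    component.norm.freeze()
    scale = Scale1D(component.name + "_norm")
    return scale * component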

With your workaround, the speed increased from 15 model evaluations per second to 20 model evaluations per second.

I could not apply modelCacher1d to the xspec table model:

Traceback (most recent call last):
  File "generateconst.py", line 28, in <module>
    mymodel = xstbabs.galabso * (scale1d.torusnorm * modelCacher1d(torus) + scale1d.omninorm * modelCacher1d(omni))
  File "/home/user/bin/ciao-4.13/lib/python3.7/site-packages/sherpa-4.13.0-py3.7-linux-x86_64.egg/sherpa/models/model.py", line 86, in modelCacher1d
    cache_model.__name__ = func.__name__
  File "/home/user/bin/ciao-4.13/lib/python3.7/site-packages/sherpa-4.13.0-py3.7-linux-x86_64.egg/sherpa/models/model.py", line 203, in __getattr__
    NoNewAttributesAfterInit.__getattribute__(self, name)
AttributeError: 'XSTableModel' object has no attribute '__name__'

@DougBurke
Contributor

DougBurke commented Nov 24, 2023 via email

@hamogu
Contributor

hamogu commented Nov 25, 2023

@JohannesBuchner The caching for XSPEC models should already be applied without you doing anything. @DougBurke is the master of knowing how it's done, but the goal is that, as a user, you don't have to do anything because we apply sensible caching by default.

@DougBurke
Contributor

So, the cache code is complicated to understand: there are some complexities I don't quite understand and have never had time to track down (the last time I looked at how well the cache was working, the results didn't make sense to me; perhaps it's time to look again).

One thing I will say, and it echoes the suggestion in #442 (comment), is that the current cache code works at the Python model instance level; that is, if you have

mdl1 = XSapec("m1")
mdl2 = XSapec("m2")

then there are separate caches for mdl1 and mdl2, so even if the parameters are exactly the same, the XSPEC function will be called twice. This means that you want to use the same model instance as much as possible (with an extra constant-type term where necessary) to benefit from the cache code. I added the cache_status method for models to give some indication of whether the cache is being used - e.g.

sherpa In [15]: clean()

sherpa In [16]: create_model_component("xsvapec", "m1")
Out[16]: <XSvapec model instance 'xsvapec.m1'>

sherpa In [17]: m1.cache_status()
 xsvapec.m1                 size:    1  hits:     0  misses:     0  check:     0

sherpa In [18]: m1([0.1, 0.2, 0.3], [0.2, 0.3, 0.4])
Out[18]: array([0.32428567, 0.16706778, 0.10965365])

sherpa In [19]: m1.cache_status()
 xsvapec.m1                 size:    1  hits:     0  misses:     1  check:     1

sherpa In [20]: m1([0.1, 0.2, 0.3], [0.2, 0.3, 0.4])
Out[20]: array([0.32428567, 0.16706778, 0.10965365])

sherpa In [21]: m1.cache_status()
 xsvapec.m1                 size:    1  hits:     1  misses:     1  check:     2

sherpa In [22]: m1([0.1, 0.2, 0.3], [0.2, 0.3, 0.4])
Out[22]: array([0.32428567, 0.16706778, 0.10965365])

sherpa In [23]: m1.cache_status()
 xsvapec.m1                 size:    1  hits:     2  misses:     1  check:     3

sherpa In [24]: m1.kt = 2

sherpa In [25]: m1([0.1, 0.2, 0.3], [0.2, 0.3, 0.4])
Out[25]: array([0.53666901, 0.22059353, 0.15235161])

sherpa In [26]: m1.cache_status()
 xsvapec.m1                 size:    1  hits:     2  misses:     2  check:     4

@JohannesBuchner
Contributor

Even with independent caches, since my model keeps all parameters except for the normalisation the same, it should not be so slow.

I cannot find where create_model_component is defined (it does not show up in the GitHub search).

If that wrapped some selected models and handled the norm internally, this would be solved.

Another idea I had is that projection through the response could potentially be cached, if one applied a distributing transformation to the model tree (Resp * (a * ma + b * mb + c * mc) -> a * Resp * ma + b * Resp * mb + c * Resp * mc), which would be trivially fast to compute if only normalisations change. This can then be used in combination with block sampling.
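As an illustration of that idea, here is a minimal sketch assuming the response is linear and the component shapes stay fixed (FoldedSum, resp and components are hypothetical placeholders, not Sherpa API):

class FoldedSum:
    # Exploit linearity of the response: fold each unit-norm component once,
    # then rebuild Resp(sum_i n_i * m_i) = sum_i n_i * Resp(m_i) cheaply.

    def __init__(self, resp, components):
        # resp: callable applying the (linear) response to a model spectrum
        # components: callables returning each unit-norm model spectrum
        self.folded = [resp(m()) for m in components]

    def __call__(self, norms):
        # Only this cheap weighted sum is redone when the norms change.
        return sum(n * f for n, f in zip(norms, self.folded))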

Is there a way to use sherpa in a pedestrian fashion? Can I do something like:

mymodel = get_model(myid)
spec = get_data(myid)
bkg = get_bkg(myid)
statistic = compute_cstat(mymodel.projected_model_spectrum, spec.counts, bkg.counts, bkg.scale)

What is the right way for that last line?
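(For what it's worth, one possible route for that last line via the UI layer would be the existing set_stat/calc_stat calls; this is a hedged sketch, not necessarily the recommended way:)

from sherpa.astro.ui import set_stat, calc_stat

set_stat("cstat")            # or "wstat" to account for the background counts
statistic = calc_stat(myid)  # statistic of data set myid for its current model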

@DougBurke
Contributor

FYI I don't have much time to spend time on so it'll take some time for me to respond.

@hamogu
Contributor

hamogu commented Dec 7, 2023

FYI I don't have much time to spend time on so it'll take some time for me to respond.

It took me some time to count how many times the word "time" can be used in a sentence with < 20 words..

@hamogu
Contributor

hamogu commented Dec 7, 2023

But seriously, there is a trade-off between developer time and run-time. The suggestion in #442 (comment) should work in principle, but it's not simple in any sense.
It requires treating certain model parameters specially, starting with XSPEC model norms. But why stop there? How about norms in Sherpa models? Or in user-defined models?
I'm not saying it won't work, or that it's not worthwhile to do in Sherpa itself, or at least as a documented work-around for users that need that speed-up. What I am saying is that I don't know if that's the most efficient use of limited developer time, or if there are other bottlenecks that we should address first that are either easier to fix or more general and thus bring benefit to more users.

For that, it would be good to actually write down a few end-to-end workflows that we can benchmark and profile to identify where we should focus our efforts. Is it caching as discussed here? Parallelization? Avoiding temporary copies of big arrays? Moving more Python code to C? Avoiding the overhead of calling C by moving more code to Python/Numpy? Just-in-time compilation of Python code with Numba / JAX / ...? Finding better optimizers / using analytical gradients / autograd to reduce the number of model evaluations needed to get a good fit?

Any of those can help in specific cases, but without profiling a set of representative use cases, I don't know which of those are high or low priority or have a high or low likelihood of success.
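As one possible starting point, a representative workflow could be profiled with the standard library profiler; the data file and source expression below are placeholders for a real benchmark case:

import cProfile
import pstats

from sherpa.astro.ui import load_pha, set_source, fit

load_pha("example.pha")                    # placeholder spectrum
set_source("xsphabs.abs1 * xsapec.therm")  # placeholder model expression

# Profile a single fit and report the 20 most expensive calls by cumulative time.
cProfile.run("fit()", "fit.prof")
pstats.Stats("fit.prof").sort_stats("cumulative").print_stats(20)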
