Optimising the computation time for the running of the RG equations #53

Open · Jorge-Alda opened this issue Jan 20, 2023 · 6 comments

@Jorge-Alda

I would be interested in reducing smelli's computation times, and I think that the running of the RG equations and matching could be an easy starting point.

For reference, I am using the following Wilson coefficients:

import wilson

w = wilson.Wilson({'lq1_3323': -1e-7}, scale=1e3, eft="SMEFT", basis="Warsaw")

The computation of .log_likelihood_dict() with the default likelihoods takes 9.61s on my computer.

  • The function wilson.run.smeft.rge.smeft_evolved is called 23 times, taking a total of 1.06s. All of the calls except one had the same arguments, running from the NP scale down to the EW scale (the remaining one ran down to the Higgs mass), so it would be possible to compute the result once and cache it.
  • The function wilson.match.smeft.match_all is called 21 times, taking a total of 0.39s. Again, all the calls have the same arguments, since the matching is always performed at the EW scale, so it would be possible to cache it.
  • The function wilson.run.wet.classes.WETrunner.run is called 217 times, taking a total of 0.83s, but only 10 different scales were used, so caching would also be helpful.

As a rough estimate, assuming that the SMEFT running is computed twice, the matching once, and the WET running ten times, this would be a reduction of 2.13s, about 22% of the total computation time.

@Jorge-Alda (Author)

I haven't tried to implement it yet, but this is more or less what I think could be done (this is mostly a note to myself; I will try it in the coming days):

  • In the likelihood_*.yaml and fast_likelihood_*.yaml files, add a section for the renormalization scale (all observables within each file have the same scale, right?).
  • In the method _log_likelihood(), compute the matching and running of the Wilson coefficients before calling each likelihood's log_likelihood method. The coefficients would be stored in a dict keyed by the renormalization scale, so if two likelihoods share a scale, nothing has to be recomputed (see the sketch after this list).
  • To keep backward compatibility, if a yaml file doesn't include a renormalization scale, the likelihood's log_likelihood should be called with the NP scale.

@Jorge-Alda (Author)

It looks like one can't add new keys to the yaml files without modifying the schema defined in flavio.

@Jorge-Alda (Author)

Actually, most of the calls come from par_dict_np.

@peterstangl (Collaborator)

Hi @Jorge-Alda,

the reduction of the computation time of smelli is indeed an important point, on which I have been working in several directions (reducing the time needed for computing theory predictions as well as for RG running and SMEFT-WET matching), but it will still take some time until these optimizations are finished and made available in smelli.

Concerning your suggestions, please note that wilson already implements caching that should ensure the running to a given scale is only done once. I.e. even if wilson is called multiple times during the computation of the likelihood, most of the time it should just return cached results, which takes essentially no time. So I doubt that implementing additional caching in smelli would significantly reduce the computation time. Did you actually observe such a reduction with your implementation in PR #54?

@Jorge-Alda (Author)

Regarding wilson implementing caching: I see that here there are methods to cache the results of match_run and retrieve the cached values. The problem is that this is a per-instance cache, and in Python, function arguments aren't passed by reference. So if that cache is modified inside a function, the wilson.Wilson instance outside the function won't have an updated cache. Take for example this code:

import wilson
import flavio
w = wilson.Wilson({'lq1_3323': -1e-7}, scale=1e3, eft="SMEFT", basis="Warsaw")
rd_np = flavio.np_prediction("Rtaul(B->Dlnu)", w)
print(w._cache)  # prints {}: the cache of the outer instance is empty

If you use a Python debugger and set a breakpoint here, you will indeed see that wilson computes the running to 91 GeV and to 4.8 GeV only once each and then reuses the values, but only for the wilson.Wilson instance inside flavio.np_prediction(). When that function ends, the cache is lost forever. The _cache of the wilson.Wilson instance outside the function is empty. And if you now want to run

rds_np = flavio.np_prediction("Rtaul(B->D*lnu)", w)

then wilson will have to match and run again.

If you want a working cache at the wilson level, I think that you would have to implement it as a global variable. The code in my PR is a workaround at the smelli level.

I can confirm that my code reduces the computation time of log_likelihood_dict() by approximately 20%.

@DavidMStraub (Contributor) commented Jan 24, 2023

> The problem is that this is a per-instance cache, and in Python, function arguments aren't passed by reference.

Nope, that's wrong.

The problem, I believe, is that, to make flavio (which predates wilson) compatible with wilson.Wilson, we made flavio.physics.eft.WilsonCoefficients a subclass of wilson.Wilson. It copies the cache on instantiation, but that copy is obviously not mirrored back to the original instance.
