
Use cache for predictions in likelihood #80

Merged · 7 commits · Mar 15, 2019

Conversation

peterstangl (Collaborator)

This PR implements caching for predictions of observables inside a MeasurementLikelihood instance.
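The caching idea can be sketched as follows. This is a minimal, self-contained Python sketch, not flavio's actual API: the class and method names (`MeasurementLikelihoodSketch`, `get_predictions`, the injected `predict` callable) are illustrative. The key point is that a `predictions_key` derived from the parameter and Wilson-coefficient hashes decides whether cached predictions can be reused.

```python
class MeasurementLikelihoodSketch:
    """Hypothetical sketch of prediction caching inside a likelihood.

    Predictions for all observables are recomputed only when the
    (parameters, Wilson coefficients) pair changes.
    """

    def __init__(self, observables, predict):
        self.observables = observables
        self.predict = predict       # callable(obs, par_key, wc_key) -> float
        self._cache_key = None       # key under which predictions were cached
        self._cache_value = None     # cached dict {observable: prediction}

    def get_predictions(self, par_key, wc_key):
        # hash of the combined inputs identifies a point in parameter space
        predictions_key = hash((par_key, wc_key))
        if predictions_key != self._cache_key:
            # cache miss: recompute all observables and remember the key
            self._cache_value = {obs: self.predict(obs, par_key, wc_key)
                                 for obs in self.observables}
            self._cache_key = predictions_key
        return self._cache_value
```

A repeated call with the same inputs then skips the observable computation entirely, which is what makes the cost of hashing negligible for likelihoods with many or slow observables.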

@coveralls commented Mar 15, 2019

Coverage decreased (-0.003%) to 94.012% when pulling 7b08a5c on peterstangl:predictions_cache into 5a9b64e on flav-io:master.

@DavidMStraub (Contributor)

That's a brilliant idea!

Do I understand correctly that the idea is that the cost of hashing is negligible since it is done only once and is small compared to the calculation of all observables? Did you check how long it roughly takes (just to understand if it would affect e.g. a MeasurementLikelihood with just a single, very fast observable)?

We should also add a comment to the doc string about the existence of caching and a comment in the code as to what the line defining predictions_key is doing.

Concerning the hashing, I see a potential problem with hashing wc_obj. The problem is that, for historical reasons, WilsonCoefficients was initialized empty and WCs were set afterwards with set_initial. But the default hash of a user class will stay the same over the lifetime of the object, so it won't change when the WCs change.

For Wilson, this would be less of an issue, as WCs are set on instantiation. Nevertheless, it would be better to have a custom __hash__ that also looks at the config dictionary etc. Since this is probably not too relevant for wilson, this could be done at the level of WilsonCoefficients. The question is how to get a fast hash. I will play around a bit.

@peterstangl (Collaborator, Author)

> Do I understand correctly that the idea is that the cost of hashing is negligible since it is done only once and is small compared to the calculation of all observables? Did you check how long it roughly takes (just to understand if it would affect e.g. a MeasurementLikelihood with just a single, very fast observable)?

The hashing takes around 30 μs on my laptop. How fast is the fastest observable? Probably still some orders of magnitude slower?

> We should also add a comment to the doc string about the existence of caching and a comment in the code as to what the line defining predictions_key is doing.

OK, I will add some comments.

> Concerning the hashing, I see a potential problem with hashing wc_obj. The problem is that, for historical reasons, WilsonCoefficients was initialized empty and WCs were set afterwards with set_initial. But the default hash of a user class will stay the same over the lifetime of the object, so it won't change when the WCs change.
>
> For Wilson, this would be less of an issue, as WCs are set on instantiation. Nevertheless, it would be better to have a custom __hash__ that also looks at the config dictionary etc. Since this is probably not too relevant for wilson, this could be done at the level of WilsonCoefficients. The question is how to get a fast hash. I will play around a bit.

It might be good to actually recompute the hash of a wc_obj each time the WCs are changed and save this hash in the wc_obj such that it can be retrieved very fast. It would be nice if the hash were constructed only from the defining wc_dict, the scale and the basis, with

hash((frozenset(wc.dict.items()), wc.basis, wc.scale))

Using such a hash, the hash for two different wc_obj would be the same if they describe the same point in an EFT basis.
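For illustration, a value-based hash along these lines makes two distinct objects describing the same EFT point hash equal. This is a minimal sketch under stated assumptions: `WCPoint` is a hypothetical class, not flavio's or wilson's actual API, and the dictionary values are assumed hashable (e.g. floats or complex numbers).

```python
class WCPoint:
    """Hypothetical Wilson-coefficient container hashed by value."""

    def __init__(self, wc_dict, basis, scale):
        self.dict = wc_dict    # {coefficient name: value}
        self.basis = basis
        self.scale = scale

    def __hash__(self):
        # frozenset makes the dict items order-independent and hashable
        return hash((frozenset(self.dict.items()), self.basis, self.scale))

a = WCPoint({'C9': -1.0, 'C10': 0.5}, 'WET', 4.8)
b = WCPoint({'C10': 0.5, 'C9': -1.0}, 'WET', 4.8)
assert hash(a) == hash(b)  # same EFT point, same hash
```

Note that two objects built independently from the same values hash equal, which is exactly the property needed for the prediction cache.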

@DavidMStraub (Contributor)

So for me, hashing the par_dict takes about 60 µs on my machine, hashing a general wcxf.WC.dict about 230 µs. But there are some fast observables that only take about O(10 µs). However, I guess it is OK to assume wcxf.WC instances to remain unchanged during their lifetime.

@DavidMStraub (Contributor) commented Mar 15, 2019

OK so in conclusion I think we can merge your PR already, with only

  • doc string
  • code comment
  • wc_obj.__hash__() (this is a private method) → hash(wc_obj)

I can then separately implement WilsonCoefficients.__hash__. Assuming wcxf.WC to be immutable, this would essentially amount to

hash((self.wc, frozenset(self._options)))

if I am not mistaken.

@DavidMStraub (Contributor)

... actually, since this is just 3 lines, you can just add this to your PR. In WilsonCoefficients:

    def __hash__(self):
        """Return a hash of the `WilsonCoefficient` instance.
        This assumes that `self.wc` is not modified over its lifetime. The hash only changes when options are modified."""
        return hash((self.wc, frozenset(self._options)))

@DavidMStraub (Contributor)

Sorry, this docstring is misleading, as the attribute self.wc can indeed change, just not the instances themselves. Better:

    """Return a hash of the `WilsonCoefficient` instance.

    The hash changes when Wilson coefficient values or options are modified.
    It assumes that `wcxf.WC` instances are not modified after instantiation."""
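Putting the pieces together, the full method with the corrected docstring might look like the following. This is a hedged sketch, not flavio's actual implementation: `WilsonCoefficientsSketch` is an illustrative stand-in, and hashing `self._options.items()` (rather than just the keys) assumes the option values are hashable.

```python
class WilsonCoefficientsSketch:
    """Illustrative stand-in for WilsonCoefficients (not flavio's actual class)."""

    def __init__(self, wc):
        self.wc = wc        # assumed to behave like an immutable wcxf.WC
        self._options = {}  # set to {} on init so that hashing always works

    def __hash__(self):
        """Return a hash of the instance.

        The hash changes when Wilson coefficient values or options are
        modified. It assumes that `wcxf.WC` instances are not modified
        after instantiation."""
        # items() rather than the bare dict, so that changing an option
        # *value* also changes the hash; assumes option values are hashable
        return hash((self.wc, frozenset(self._options.items())))
```

Initializing `_options` to `{}` sidesteps the problem raised below that hashing fails when the attribute is unset.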

@DavidMStraub (Contributor)

I realized that this solution as of now has a memory leak: it caches all calls, which will quickly eat up all memory in a scan.

So either we need a LRU cache or, much simpler, just cache the last value called. This should be sufficient for our use case.
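The "cache only the last value" option can be sketched like this. The helper name is hypothetical, and the sketch assumes the cache key is hashable and comparable; memory use stays constant no matter how many points a scan visits.

```python
_SENTINEL = object()  # distinct from any user-supplied key, including None

def make_last_value_cache(compute):
    """Wrap `compute` so that only the most recent (key, value) pair is kept."""
    state = {'key': _SENTINEL, 'value': None}

    def cached(key):
        if state['key'] != key:
            # new key: recompute and overwrite the single cached entry
            state['value'] = compute(key)
            state['key'] = key
        return state['value']

    return cached
```

A bounded alternative would be `functools.lru_cache(maxsize=1)` on a function of hashable arguments; the manual version above is shown because it also works when the cached object itself should not be passed through `lru_cache`.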

@peterstangl (Collaborator, Author)

> ... actually, since this is just 3 lines, you can just add this to your PR. In WilsonCoefficients:
>
>     def __hash__(self):
>         """Return a hash of the `WilsonCoefficient` instance.
>         This assumes that `self.wc` is not modified over its lifetime. The hash only changes when options are modified."""
>         return hash((self.wc, frozenset(self._options)))

This actually does not work since self._options is not set on initialization. Should I just set it to None on init?

@peterstangl (Collaborator, Author)

Actually, None does not work either, but I could set it to {}.

@peterstangl (Collaborator, Author)

@DavidMStraub I think this is ready to be merged.

@DavidMStraub DavidMStraub merged commit 48347a1 into flav-io:master Mar 15, 2019
@peterstangl peterstangl deleted the predictions_cache branch March 15, 2019 19:36