[proforma] Use cache on modification resolvers #147
Conversation
Thank you @RalfG! Do you think we can use the old utility function …? @mobiusklein, could you also take a look at this? P.S. Another unnecessary optimization idea: should we auto-wrap all `resolve` methods of …?
Did not realize there was a built-in version of …
Not sure how to implement it like that, but feel free to modify the PR.
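If the built-in being referred to here is `functools.lru_cache` (an assumption on my part, since the identifier was lost in the thread), applying it directly to a method would look like the sketch below. Note the caveat that the cache keys on `self` as well, so decorated instances are retained by the cache:

```python
from functools import lru_cache

class Resolver:
    calls = 0  # class-level counter, just to make cache hits observable

    @lru_cache(maxsize=None)  # keys on (self, name); note: this keeps self alive
    def resolve(self, name):
        Resolver.calls += 1
        return name.lower()

r = Resolver()
r.resolve("Oxidation")  # computed: calls becomes 1
r.resolve("Oxidation")  # cache hit: calls stays 1
```

The instance-retention behaviour is one reason the thread below discusses per-instance caches instead.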
The … Ideally, you'd have the instance manage the cache itself, with as much of the heavy lifting done by the base class:

```python
class ModificationResolver(object):
    def __init__(self, name, **kwargs):
        self.name = name.lower()
        self.symbol = self.name[0]
        self._database = None
        self._cache = {}
        ...

    # Move existing implementations of `resolve` to `_resolve_impl`
    def _resolve_impl(self, name=None, id=None, **kwargs):
        raise NotImplementedError()

    # Manage the cache inside `resolve`
    def resolve(self, name=None, id=None, **kwargs):
        if self._cache is None:
            return self._resolve_impl(name, id, **kwargs)
        cache_key = (name, id, frozenset(kwargs.items()))
        if cache_key in self._cache:
            return self._cache[cache_key]
        value = self._resolve_impl(name, id, **kwargs)
        self._cache[cache_key] = value
        return value
```

This involves no metaprogramming and is just a bit more efficient than … Otherwise, we could add a metaclass to … A third way would be to split the difference with Python's name resolution system and just do this:

```python
class ModificationResolver(object):
    def __init__(self, name, **kwargs):
        self.name = name.lower()
        self.symbol = self.name[0]
        self._database = None
        self.resolve = memoize(self.resolve)
```

Take a reference to the resolved instance method … In terms of compatibility, they have differences in "compatibility": …
Which road seems best? I think letting the instance manage the cache is probably best, but it breaks things. The third option is hacky but cheap. The second option is much more work than the first, and if we're doing that, we may as well implement some kind of instance-aware caching anyway.
Thanks for the detailed answer, @mobiusklein! I don't have any hard preferences for either of the solutions. The first seems the most doable?
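The `memoize` helper assumed by the third option is not shown in the thread; a minimal sketch of such a decorator (hypothetical, assuming all arguments are hashable) might look like:

```python
import functools

def memoize(func):
    """Hypothetical helper: cache results keyed by the call arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = (args, frozenset(kwargs.items()))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    return wrapper
```

Because `self.resolve = memoize(self.resolve)` wraps the already-bound method, the cache lives on the per-instance wrapper and is discarded together with the instance.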
Thank you from me as well, @mobiusklein, your analysis always offers a ton of insight. Although I would enjoy trying to get option 2 to work when I have time (which is next week), I can easily imagine shooting myself (or someone else) in the foot with it in the process. That being said, I don't exactly understand what you mean by cache inheritance that the metaclass code would have to handle. Wouldn't it essentially do what option 1 does, "renaming" (or "de-naming"?) the original …? Overall, I think option 1 is the most straightforward, and it allows creating subclasses that benefit from caching or subclasses that don't use it, so I would not object to just going with that if that seems to be the overall preference.
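To make the metaclass idea (option 2) concrete, here is one possible shape it could take; this is only an illustrative sketch, not code from the PR, and `UnimodResolver` is a hypothetical subclass name:

```python
import functools

class CachingResolverMeta(type):
    """Sketch of option 2: wrap each class's `resolve` with a per-instance cache."""

    def __new__(mcls, name, bases, namespace):
        resolve = namespace.get("resolve")
        if resolve is not None:
            @functools.wraps(resolve)
            def cached_resolve(self, *args, **kwargs):
                # Lazily create the cache on the instance itself, so subclasses
                # never share cached entries through the class
                cache = self.__dict__.setdefault("_resolve_cache", {})
                key = (args, frozenset(kwargs.items()))
                if key not in cache:
                    cache[key] = resolve(self, *args, **kwargs)
                return cache[key]
            namespace["resolve"] = cached_resolve
        return super().__new__(mcls, name, bases, namespace)

class UnimodResolver(metaclass=CachingResolverMeta):  # hypothetical subclass
    def __init__(self):
        self.calls = 0

    def resolve(self, name):
        self.calls += 1
        return name.lower()
```

This wraps `resolve` at class-creation time, which is where the subtleties mentioned above (subclasses overriding `resolve`, inherited versus redefined methods) would have to be handled.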
I had some other changes already in place for ProForma based upon the recent discussion in HUPO-PSI/ProForma#6. These changes and the first approach I proposed are now in #148. |
Thanks everyone, #148 is merged. Closing this one, but feel free to follow up as needed. |
This PR adds caching to proforma modification resolvers.
In one of our use cases - reading PSMs from a search engine and recalculating theoretical masses - the same modifications (but different object instances) are resolved many times. Adding cache decorators to the resolvers massively speeds up this operation.
I assume there are no drawbacks to adding this. If `lru_cache` cannot be imported (Python < 3.2), a dummy decorator is defined instead. For now, I set `maxsize` to `None`, as there should be a finite list of modifications in most realistic use cases.
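An import fallback of the kind described could be sketched as follows; this shows the general pattern rather than the PR's exact code, and `resolve_modification` is a placeholder name:

```python
try:
    from functools import lru_cache  # available since Python 3.2
except ImportError:
    def lru_cache(maxsize=None):
        """Dummy stand-in: no caching on older Pythons."""
        def decorator(func):
            return func
        return decorator

@lru_cache(maxsize=None)  # unbounded: the set of distinct modifications is finite in practice
def resolve_modification(name):
    return name.lower()  # placeholder for the real resolver lookup
```

On Python < 3.2 the decorator is a no-op, so behaviour is unchanged there and only newer interpreters get the speedup.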