Add custom function for filtering cached arguments. #1
Conversation
^ huh, that's unexpected. On local I get:
Left a monster comment discussing the API we want for this.
cache_utils/decorators.py
Outdated
@@ -11,7 +11,8 @@
 logger = logging.getLogger("cache_utils")


-def cached(timeout, group=None, backend=None, key=None):
+def cached(timeout, group=None, backend=None, key=None,
+           filter_args_kwargs=None):
This is a very useful feature 💯 I have some opinions about the API for this, let me know your thoughts!
The goal here is to allow the user to modify the cache key for each call, right? filter_args_kwargs as a concept is slightly more restrictive than it needs to be: we may also want to transform the arguments. (Or, if you want to live dangerously, add something new to the key, like a mutable global variable.)
Further, the return value doesn't have to be in the args-and-kwargs (tuple and dict) form; we just need a serializable value we can use as a cache key. I think the simplest interface would be a function that maps the arguments to any Python value that can be serialized to a concrete cache key.
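As a minimal sketch of that interface, assuming a hypothetical call-key function passed to the decorator (the name call_key and the record argument are invented for this example):

    # Hypothetical call-key function: maps the call's arguments to any Python
    # value that can later be serialized into a concrete cache key.
    def call_key(record, verbose=False):
        # Only the record's id affects the cached result; verbose is
        # deliberately left out of the key.
        return ("article", record.id)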
I would separate out the following things and allow the user to customize each:
- Calculating the function cache key (the prefix)
- Determining the unserialized cache key value from the arguments of a single call
- Serializing the cache key value above (JSON? Pickle? hash? django-cache-utils casts to a string)
Calculating the function cache key
The generated function key in django-cache-utils is something like cacheable_scrape_article[cached]:431. We may want to make the key kwarg accept a callable to facilitate this, e.g.
@cached(..., key=lambda fn: f'{fn.__name__}_version_1')
That would allow the default value of key to be the function that returns cacheable_scrape_article[cached]:431 based on the function.
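A rough sketch of such a default, assuming the key shape above (the exact format django-cache-utils generates isn't shown here, so the name-plus-line-number layout is an assumption based on the example key):

    def default_key(fn):
        # Assumed prefix shape: function name plus "[cached]:<first line number>",
        # mirroring the cacheable_scrape_article[cached]:431 example above.
        return f"{fn.__name__}[cached]:{fn.__code__.co_firstlineno}"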
Determining the cache key for a single call
Specifying the cache key for a call would be facilitated with a keyword argument similar to filter_args_kwargs; I think the most intuitive kwarg name would be key, but that is already used for the function key. Without having to think about backwards compatibility, I would rename key to fn_key and have the call cache key function specified in key.
Serializing the key value for a call
Now that we have the key (Python) value for a call, we can use the user-specified serializer to turn it into a key value usable by the cache backend. django-cache-utils uses a simple string cast, but I think JSON/pickle would be more robust. Then the whole cache key would be serialize(fn_key(fn)) + serialize(call_key(*args, **kwargs)).
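A minimal sketch of how those pieces could compose, assuming fn_key and call_key callables as discussed and JSON as the serializer (all names here are illustrative, not an existing API):

    import json

    def build_cache_key(fn, fn_key, call_key, *args, **kwargs):
        def serialize(value):
            # default=str keeps non-JSON-native values (e.g. dates) serializable
            return json.dumps(value, sort_keys=True, default=str)
        # serialize(fn_key(fn)) + serialize(call_key(*args, **kwargs))
        return serialize(fn_key(fn)) + serialize(call_key(*args, **kwargs))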
Obviously we don't need to make all these changes at once, but it would be useful to have a consistent, robust API in mind as we make improvements.
I agree that renaming key to fn_key, and making it a callable whose default behaves as it does now, would be nice.
Also, now that key is free, we can use it for mapping the parameters to a Python value; I wanted to use that name but it was taken.
Lastly, I guess a nice-to-have would be to add other serialization methods, yep. But this isn't something we can adopt immediately, since we use cached in a bunch of places and changing the serialization to anything other than _args_to_unicode with the [cached] prefix would be equivalent to invalidating all our cache. We could start using different serialization protocols for new code, though.
I've changed the code to support this. At first I was hesitant to make too many changes, and the odd implementation from before encouraged hacks. I like your idea of splitting the conversion of the arguments and the function into serializables and using those. I just used str for now; I didn't look much into what the final output needs to be for memcached/group_backend.py and related code to accept it. I also kept backwards compatibility so our old cache is still hit with this update. Let me know what you think and if you spot any issues 👍.
cache_utils/utils.py
Outdated
""" Construct readable cache key """ | ||
if filter_args_kwargs and inspect.isfunction(filter_args_kwargs): |
Does callable work here?
I guess that's less restrictive, we can go with that.
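For reference, callable accepts key functions that inspect.isfunction rejects, e.g. functools.partial objects and callable class instances:

    import functools
    import inspect

    def base(a, b):
        return a + b

    partial_fn = functools.partial(base, 1)

    print(inspect.isfunction(partial_fn))  # False: a partial is not a plain function
    print(callable(partial_fn))            # True: it can still be called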
cache_utils/decorators.py
Outdated
    and returns the new args and kwargs. Those are
    combined with function name to get the final key.
    E.g.
    def filter_args_kwargs(args, kwargs):
I had some discussion about filter_args_kwargs(args, kwargs) versus filter_args_kwargs(*args, **kwargs) in my other comment, but now I see that this code comment was just outdated and that the actual implementation is filter_args_kwargs(*args, **kwargs), which I think is the better choice. I removed that discussion from my other comment.
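For clarity, the two signature styles being compared (the bodies here are illustrative only):

    # Docstring (outdated) style: receives the raw positional tuple and keyword dict.
    def filter_args_kwargs(args, kwargs):
        return args[:1], {}

    # Actual implementation style: called with the decorated function's own
    # arguments, so it reads like a normal call.
    def filter_args_kwargs(*args, **kwargs):
        return args[:1], {}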
^ The tests don't run well for me, I think because I don't have memcached set up locally. I'm not sure whether they all passed on CircleCI before, but some seem to fail now, so we should sort out the test situation before merging this in. Listing below the ones that failed for me on master, probably because memcached isn't running on my machine.
…m default key, provide legacy key and fn_key functions to allow cache key compatibility with previous versions
Made the following changes:
- Removed support for decorating methods
- Improved fn_key: omitted the line number, used __qualname__ instead of __name__ (is there a case we would still need the line number?)
- Simplified the default implementation for key not to consider serialization
- Added legacy_fn_key and legacy_key to remain compatible with previous cache key generation, and added a test for that. Usage would be like this:

      from cache_utils import cached, legacy_fn_key, legacy_key

      @cached(3600, fn_key=legacy_fn_key, key=legacy_key)
      def foo(...):
          ...
I like that you simplified the code even more and kept an option for the legacy way of doing it. We probably will be using those ourselves for old code.
See the only question I had about None below. Otherwise I think we can merge this in.
cache_utils/decorators.py
Outdated
@@ -46,16 +71,26 @@ def test(*args, **kwargs):
     cache_backend = caches['default']

     def _cached(func):
-        func_type = _func_type(func)
+        fn_key_str = str(
This could become str(None) == "None" (if you explicitly pass fn_key=None for any reason), and then it wouldn't raise an error below.
Do you think we should also add a check above that fn_key is not falsy? Or, if a user provides an argument, maybe they should check the docs? But there aren't any, so they'd have to read the code in this case.
Great catch! We can make a hard requirement for fn_key to either be a callable or a str, and the result of a callable fn_key to be a str, to remove the str casting. If these requirements fail, we raise an exception. This should avoid any unexpected behavior.
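A sketch of what that hard requirement could look like (hypothetical helper, not code from this PR):

    def resolve_fn_key(fn_key, func):
        # fn_key must be a str, or a callable that returns a str.
        if isinstance(fn_key, str):
            return fn_key
        if callable(fn_key):
            result = fn_key(func)
            if not isinstance(result, str):
                raise TypeError("fn_key callable must return a str")
            return result
        raise TypeError("fn_key must be a str or a callable")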
def default_fn_key(func):
    return ".".join([func.__module__, func.__qualname__])
I like this, didn't like lineno either, and qualname is better. But I suppose we'll have to make sure to keep on using the legacy ones for the caches we've already set up.
        # try to get the value from cache
-       key = _get_key(wrapper._full_name, func_type, args, kwargs)
+       key = _get_key(*args, **kwargs)
👍
Also, it makes perfect sense to move all the _get_key and fn_key_str computations inside the _cached context, since we have the function there and can drop one param.
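A simplified sketch of that closure structure (not the PR's actual code; the cache lookup and timeout handling are elided):

    def cached(timeout, fn_key, key):
        def _cached(func):
            # Computed once per decorated function, inside the closure, so the
            # per-call key computation no longer needs the function passed in.
            fn_key_str = str(fn_key(func))

            def wrapper(*args, **kwargs):
                cache_key = fn_key_str + str(key(*args, **kwargs))
                # ...look up cache_key in the backend here, fall back to calling func
                return func(*args, **kwargs)

            return wrapper
        return _cached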
Re: setting up memcached locally and getting the tests to run, this is all I had to do:
brew install libmemcached
brew services start memcached
Adds the ability to use only certain fields as the cache key.