RedisCache - Data Persistence #947
Conversation
This is almost hilariously easy since we can just use the official image for it.
This is the module we will be using to interface with Redis.
We're using __init_subclass__ to initialize our RedisDict with the subclass name as a namespace. This will be prefixed to all data that we store, so that there won't be collisions between different subclasses.
It was brought to my attention that we may need several caches per Cog
for some of our Cogs. This means that the original approach of having
this be a mixin is a little bit problematic.
Instead, RedisDict will be instantiated directly inside the class you
want it in. By leveraging __set_name__, we can create a namespace
containing both the class name and the variable name without the user
having to provide anything.
For example, if you create an attribute MyClass.cache = RedisDict(),
this will be using the redis namespace 'MyClass.cache.' before anything
you store in it.
With this approach, it is also possible to instantiate a RedisDict with
a custom namespace by simply passing it into the constructor.
- RedisDict("firedog") will create items with the 'firedog.your_item'
prefix.
- If there are multiple RedisDicts using the same namespace, an
underscore will be appended to the namespace, such that the second
RedisDict("firedog") will actually create items in the
'firedog_.your_item' namespace.
This is also possible to use outside of classes, so long as you provide
a custom namespace when you instantiate it.
Custom namespaces will always take precedence over automatic
'Class.attribute_name' ones.
The rest of the features should be provided by the MutableMapping abc we're interfacing. Specifically, MutableMapping provides these: .pop, .popitem, .clear, .update, .setdefault, __contains__, .keys, .items, .values, .get, __eq__, and __ne__.
This would've been implemented by MutableMapping, but that implementation is O(n) instead of O(1) since it just iterates the entire hash and does HDEL. Feels wasteful.
This is supposed to be provided by our MutableMapping mixin, but unit tests are demonstrating that these don't really work as intended.
Turns out the MutableMapping class doesn't give us servicable implementations of these, so we need to implement them ourselves. Also, let's not have keys returned as bytestrings.
Turns out that bumping the flake8 version up to 3.8 introduces a long list of new linting errors. Since this PR is the one that bumps the version, I suppose we will also fix all the linting errors in this branch.
kwzrd
left a comment
There was a problem hiding this comment.
I haven't tested yet, but I wanted to leave a couple of initial comments and questions. I have no experience with Redis so this is a learning opportunity for me.
It has taken me some time to figure out how the namespace is inferred when not provided explicitly. I appreciate the effort to have it done automatically, but I found the underlying machinery a little bit confusing. I'm not entirely convinced that it is necessary, either. Couldn't each cog simply provide a namespace explicitly? Would it be too difficult to detect potential collisions?
The PR description shows 3 ways in which the RedisDict can be instantiated. Could we also have this in the class docstring? My instinct would be to require that a namespace is provided everytime we instantiate RedisDict - seems like that would make the initialisation process be a lot simpler and transparent. It would also give us the fabled "one, obvious way to do it".
This just tests that the various RuntimeErrors are reachable - that includes the error about not having a bot instance, the one about not being a class attribute, and the one about not having instantiated the class. This test addresses a concern raised by @MarkKoz in a review. I've decided not to test that actual contents of these RuntimeErrors, because I believe that sort of testing is a bit too brittle. It shouldn't break a test just to change the content of an error string.
|
During my testing yesterday I also found out that the test cases aren't entirely isolated in the sense that values in the cache persist between them. My approach would have been to make sure the cache is cleared before each test (probably in the |
I agree that it would make sense to clear the cache in between tests, otherwise it ends up being somewhat subject to side effects from previous tests and can lead to inconsistent results depending on the order the tests are executed in (which is typically something to avoid). |
Addressed in 35a1de3 |
The way we were doing the asyncio.Lock() stuff for increment was slightly problematic. @aeros has adviced us that it's better to just initialize the lock as None in __init__, and then initialize it inside the first coroutine that uses it instead. This ensures that the correct loop gets attached to the lock, so we don't end up getting errors like this one: RuntimeError: got Future <Future pending> attached to a different loop This happens because the lock and the actual calling coroutines aren't on the same loop. When creating a new test, test_increment_lock, we discovered that we needed a small refactor here and also in the test class to make this new test pass. So, now we're creating a DummyCog for every test method, and this will ensure the loop streams never cross. Cause we all know we must never cross the streams.
ks129
left a comment
There was a problem hiding this comment.
This looks very good for me. I see so much use cases for this.
Also added a test for this. This is the DRYest approach I could find. It's a little ugly, but I think it's probably good enough.
We're using functools.partialmethod to make the code a little cleaner and more readable here. Read more about them here: https://docs.python.org/3/library/functools.html#functools.partial https://docs.python.org/3/library/functools.html#functools.partialmethod
I should be shot.
- Refactor error messages in _to_typestring and _from_typestring to just print the prefix tuples instead of that custom error string. - Create a RedisKeyOrValue type to simplify some annotations. - Simplify partialmethod calls. - Make the signatures for _to_typestring and _from_typestring one-liners - Fix a typo in the errors.
kwzrd
left a comment
There was a problem hiding this comment.
Looks good. I did a function review yesterday & a smaller one today after the changes and everything seems to work fine. The real test will happen once a cog actually begins using the cache. All my comments were addressed so I'm happy to approve. Solid work @lemonsaurus!
Before merge, let's wait for @SebastiaanZ's approval.
The bot can get into trouble in three distinct ways: - It has no Bot instance - It has no namespace - It has no parent instance. These happen only if you're using it wrong. To make the test more precise, and to add a little bit more readability (RuntimeError could be anything!), we'll introduce some custom exceptions for these three states. This addresses a review comment by @aeros.
This addresses a review comment by @aeros.
This pull requests adds a new feature that the bot can use to add persistent storage.
This works by utilizing a Redis server as a cache, while providing a user-friendly dict-like interface.
Let's have a look at how it works and exactly what this pull request contains.
Creating a new RedisCache
The star of the show is the
RedisCache. This must be created as a class attribute.SomeClass.my_cache, in this case.Using a RedisCache
Once you have created this object, it's a lot like an async-friendly dictionary interface. Redis only supports strings and numbers, so all keys and values must be ints, floats, or strings. If you need to store something more complex inside of it, consider writing a
to_jsonandfrom_jsonmethod for your complex object, so you can easily jsonify the object going in and out of the cache.The following features are available, and mostly behaves like their dictionary counterparts:
.get(key, default=None)- Get a value. If thekeydoesn't exist, returnsdefaultinstead..set(key, value)- Set a value in the cache..delete(key)- Delete a specified key value pair..clear()- Clear the entire cache..contains(key)- Return True if key exists in cache, otherwise False..pop(key, default=None)- Get a value and remove it from the cache. If it doesn't exist, returndefaultinstead..items()- This returns an AsyncIterator, so you need toasync forto iterate it..update(dictionary)- Update the cache with values in the provided dict. This will overwrite the value for any key that's already in the cache..to_dict()- Returns a dictionary with all key value pairs in the cache..increment(key, amount)- increments the value atkeybyamount.decrement(key, amount)- decrements the value atkeybyamountValid keys are integers and strings.
Valid values are integers, strings, and floats.
What else is in this PR?
Well, there's a full suite of tests, and we've added Redis to the docker-compose file so it will be possible to use this when running the bot locally.
We also add a few new dependencies, all Redis related.
How do I test this?
Well, you'll need to add a few new variables to your
config.yml. Unless you plan to set up Redis (there's a docker-compose service if you want to), you can use this config:This will allow the bot to run with fakeredis instead of trying to connect to a Redis server, so you will not need to set up any additional dependencies.
Closes #884