Skip to content
This repository has been archived by the owner on Feb 21, 2023. It is now read-only.

Port redis-py impl and tests to aioredis. #891

Merged
merged 12 commits into from Mar 18, 2021

Conversation

seandstewart
Copy link
Collaborator

@seandstewart seandstewart commented Jan 25, 2021

What do these changes do?

This is a full port of redis-py's client implementation into an asyncio-friendly implementation.

The approach is essentially a line-by-line re-implementation. The major caveats to this are:

  1. Some connection code is simply incompatible and had to be implemented differently.
  2. We use black on our code-base, which will create line-count discrepancies between the two sources.
  3. I've added a first-pass at type annotations (a lot of work is needed to tighten it up if we want to add mypy to our CI, but it gets us much further along).

Beyond that, all functional logic remains as close to the original, synchronous source as possible.

The motivation behind this is to ease the workload for future contributors and maintainers. By replicating redis-py's implementation, we greatly reduce our own workload, and are able to quickly port new features and bugfixes as they are added to redis-py, including tests.

Are there changes in behavior for the user?

Yes.

Related issue number

#822

Checklist

  • I think the code is well written
  • Unit tests for the changes exist
  • Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES/ folder
    • name it <issue_id>.<type> (e.g. 588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the PR
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example:
      Fix issue with non-ascii contents in doctest text files.

@seandstewart seandstewart force-pushed the sean/aioredis-redis-py-compliance branch from 364e04b to f8393c7 Compare January 25, 2021 21:00
@seandstewart
Copy link
Collaborator Author

@Andrew-Chen-Wang as promised... this is the port from redis-py.

I still need to investigate some intermittent test failures but this is mostly complete.

@lgtm-com
Copy link

lgtm-com bot commented Jan 25, 2021

This pull request introduces 5 alerts when merging f8393c7 into da75ada - view on LGTM.com

new alerts:

  • 2 for Missing call to __init__ during object initialization
  • 1 for Missing call to __del__ during object destruction
  • 1 for Unused import
  • 1 for __init__ method calls overridden method

@lgtm-com
Copy link

lgtm-com bot commented Jan 26, 2021

This pull request introduces 5 alerts when merging e344695 into da75ada - view on LGTM.com

new alerts:

  • 2 for Missing call to __init__ during object initialization
  • 1 for Missing call to __del__ during object destruction
  • 1 for Unused import
  • 1 for __init__ method calls overridden method

@codecov
Copy link

codecov bot commented Jan 26, 2021

Codecov Report

Merging #891 (d4487d2) into master (53d8103) will decrease coverage by 4.78%.
The diff coverage is 93.12%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #891      +/-   ##
==========================================
- Coverage   96.43%   91.65%   -4.79%     
==========================================
  Files          57       17      -40     
  Lines        8580     5809    -2771     
  Branches      554      489      -65     
==========================================
- Hits         8274     5324    -2950     
- Misses        230      372     +142     
- Partials       76      113      +37     
Flag Coverage Δ
unit 91.63% <92.78%> (-4.77%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aioredis/client.py 81.04% <ø> (ø)
tests/test_commands.py 99.21% <ø> (ø)
tests/test_connection.py 50.00% <ø> (ø)
tests/test_connection_pool.py 99.29% <ø> (ø)
tests/test_encoding.py 98.82% <ø> (ø)
tests/test_lock.py 100.00% <ø> (ø)
tests/test_monitor.py 100.00% <ø> (ø)
tests/test_multiprocessing.py 56.88% <ø> (ø)
tests/test_pipeline.py 100.00% <ø> (ø)
tests/test_pubsub.py 96.22% <ø> (ø)
... and 22 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 53d8103...d4487d2. Read the comment docs.

@Andrew-Chen-Wang
Copy link
Collaborator

woah this is super cool. I skimmed through the code today and I'll review it later this weekend. I'm a little afraid though that there'll be a lot of breaking changes...? But this is awesome!!!

Additionally, we probably would want to run a benchmark of the port vs. v1.3.1 if there's any significant change (I assume not much since the structure is really pretty much the same). Also note: there are a couple remnants of redis-py in some places like L3827 aioredis/client.py and I'm assuming tested Redis version is v5 for now (i.e. the auth still only allows password)?

@Andrew-Chen-Wang
Copy link
Collaborator

My biggest concern is this is completely untested in production, and pretty different from the existing aioredis lib. I'm wondering if we should push out a beta version of the current status of the repo, or we should instead just implement this PR and push it out. It'll have a lot of breaking changes, and a lot of people aren't fans of migrations.

@seandstewart
Copy link
Collaborator Author

Yeah - I think it’s a valid concern, however we also get the benefit that it’s now much easier for current users of redis-py to migrate their existing code to aioredis, which may help drive adoption and in turn drive community support.

As long as we have a decent alpha/beta/rc period and good migration docs, I think this could be the way to go.

As for breaking changes or the outstanding changes we currently have in master… many of the changes or bug fixes are moot with this PR, and many outstanding features (such as full support for redis 6) are now done. Additionally, master currently contains breaking changes of its own, so I don’t see value in pushing out a beta there if we go this route.

All in all, my main goal in maintaining this library is to make continued maintenance as easy as possible. The loss of the core maintainer of this library exposed a very real problem that happens to codebases with a large amount of custom code and only one author - there is little understanding of the original implementation or the motivations behind it. Such code is always at a high risk of “aging” to a state where no one can touch it or make effective contributions.

I don’t want to see that happen again, and I think the best way to do that is ensure that the code itself can be understood by as many people as possible.

Does that make sense?

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@lgtm-com

This comment has been minimized.

@Andrew-Chen-Wang
Copy link
Collaborator

Was there a reason you needed to add redis-server --daemonize yes in d9cbb9b?

@seandstewart
Copy link
Collaborator Author

seandstewart commented Jan 28, 2021

Was there a reason you needed to add redis-server --daemonize yes in d9cbb9b?

Yeah - aioredis has traditionally started all of its own redis-servers internally within the test run. I've been trying to move away from that and instead run the redis as external processes and break out each redis version we test as its own test run.

It makes debugging test failures much simpler, since it's not one 10K line output per test run

@lgtm-com

This comment has been minimized.

@seandstewart
Copy link
Collaborator Author

So Windows tests are still borked, but I'm wondering if they're of any value considering we're running the tests on a pretty old version of Redis to begin will.

I may mark these as allowed to fail and look at putting Windows on Azure pipelines in the near future.

@Andrew-Chen-Wang
Copy link
Collaborator

Hm I'm still of the opinion most devs aren't even using Redis at this point if they're on Windows; if they were, they would already be using WSL, which would mean they're using the latest version of Redis. At least to me, it feels like unnecessary maintenance and not really worth the effort to test for the lower versions anyways (e.g. redis-py, aredis, and asyncio-redis don't test them).

@seandstewart
Copy link
Collaborator Author

Aaaand our tests are green! ✅

I'm going to dig into making sure we're exporting our codecov correctly, then update documentation to reflect this major change.

@lgtm-com
Copy link

lgtm-com bot commented Jan 28, 2021

This pull request introduces 6 alerts when merging b08b7b1 into 8c6cb22 - view on LGTM.com

new alerts:

  • 2 for Unused import
  • 2 for Missing call to __init__ during object initialization
  • 1 for Missing call to __del__ during object destruction
  • 1 for __init__ method calls overridden method

@popravich
Copy link
Contributor

OMG, what it has become((((

@popravich
Copy link
Contributor

popravich commented Jan 29, 2021

If and when this is merged feel free to rename the library into redis-py-asyncio, its just not aioredis any more... (((

@popravich
Copy link
Contributor

Now, without a sarcasm.
This is not the way to go!
You literally deleted all aioredis code and cloned redis-py. This library is not a port of something else!
This is not proper place to make redis-py asynchronous, you should have had open this PR in that repository.
You basically killing work and effort other people put in this project.

Not buffering responses from redis into a byte-stream (as redis-py does). Instead we wrap our futures with deserialization futures, which is hard to debug and leads to ambiguous low-level asyncio errors. Buffering reads into a byte-stream means we can write more discernible and understandable code for our parsing, which now doesn't need to deal with futures directly.

Why not refactoring connection logic to use asyncio protocol not streams?

Making the ConnectionPool and Connection objects interchangeable on a single attribute. For simple cases, this makes sense, but for many cases the ambiguity results in potentially bad or unexpected behavior.

I disagree. First of all, there always was a way to explicitly acquire a connection to use it, there should be documentation and examples how to do it. And AFAIR the case is opposite: this works for most cases.

The codebase is littered with TODOs acknowledging areas where we need to go back and "pin" a connection to prevent unexpected behavior. A much better pattern here would be to have an explicit connection attribute and always explicitly aquire a connection if necessary.

Partially agree. This how it worked in the very beginning. Agree that connections handling need refactoring but I believe that bringing new ideas can be better than copy-pasting someone else's code.

Porting from redis-py also lowers the barrier to implementing new features as Redis advances, as we're now able to source potential solutions from another library.

"implementing" here is just a wrong word, "copying" is the one.


All the motivation above looks to me like a reluctance to improve and evolve current aioredis library, this makes me really sad.
Under the hood aioredis is not simple and a person who wants to develop/enhance must deeply understand how it works.

@seandstewart
Copy link
Collaborator Author

seandstewart commented Jan 30, 2021

@popravich -

Welcome back! I'm glad you've returned.

I want to be clear from the start: if you wish to resume maintenance of this library, then I will of course retract this PR and step away. I made it clear in #822 that I was taking this position on in a temporary manner in order to get contributions flowing again.

This PR is an attempt to make this library maintainable by more than just a single author. Porting redis-py achieves that goal. It is a pragmatic choice. I don't particularly like all of the decisions in that library, but I do think that using redis-py as an original source means there doesn't have to be a single BDFL maintaining this library, which has been my goal since taking on temporary maintainership. When I made this decision, the original author hadn't been heard from in over a year.

Faced with that knowledge, I made the choice to do what you see here today. With this change, continuing contributions to this code-base are much simpler. The redis-py implementation is not only well understood, it is also officially endorsed by Redis. The aioredis implementation is not well understood by anyone but you. I would say I'm quite familiar with it as well at this point, but that is not an argument for keeping said implementation. That should be a warning sign. When the barrier to maintainership is so high, and the pool of individuals with enough understanding to provide meaningful contributions to its core is so low, that is a signal that the library is doomed to die. And we have proof of that right here.

I don't wish to "kill" others' contributions, but your argument is a sunk cost fallacy. The fact that prior effort and time has been put into a project should not prevent one from changing course when situations change. And they did, drastically.

I do not think the original code was "bad". I actually quite like a lot of it. But that's not why I went this direction and I hope you understand that.

Finally, I'd like to add that this isn't an all-or-nothing thing. I think there's plenty we can learn from redis-py if we want, and we can take pieces of this implementation that we think are good. For example, I like that deserialization code is not mixed into a future, but I do prefer your provider pattern for implementing groups of commands. If you wish to resume maintainership, then let's work together to improve the library, but I also think we should be sure to make contributing as accessible as possible and ensure the library's core can be well understood. I want to see this library succeed and my only goal here is to do just that.

@aviramha
Copy link
Contributor

aviramha commented Feb 1, 2021

This is awesome work @seandstewart!
Edit:
I see now the other comments. I think the statement

Under the hood aioredis is not simple and a person who wants to develop/enhance must deeply understand how it works.

extremely illustrates @seandstewart 's point. An open source library should strive to be simple and maintainable.
Whatever happens, whether this is retracted or not, I'd like to see it as a fork of redis-py if not a newer version of aioredis.

@aviramha
Copy link
Contributor

aviramha commented Feb 1, 2021

Was there a reason you needed to add redis-server --daemonize yes in d9cbb9b?

Yeah - aioredis has traditionally started all of its own redis-servers internally within the test run. I've been trying to move away from that and instead run the redis as external processes and break out each redis version we test as its own test run.

It makes debugging test failures much simpler, since it's not one 10K line output per test run

I think you should consider using https://github.com/biocatchltd/yellowbox . It provides a simple wrapper for setting up and tearing containers between tests, and we already have a RedisService class that is easily extensible. This way the tests can run on any system with Docker.

@Andrew-Chen-Wang
Copy link
Collaborator

Andrew-Chen-Wang commented Feb 1, 2021

I think the biggest problem with this fork is just the huge migration. I feel like taking advantage of the popularity of using aioredis was the point of keeping this as one package rather than converting it into a new package. There are still good portions of aioredis that can inspire some changes in this fork, but I have to agree that the maintainability is quite difficult.

I've used aioredis before for my own package and projects, and I try to help maintain this package, but, to be fair, I also have quite a bit of trouble understanding the code base.

There are a couple options:

  1. Make aioredis a little easier to understand to maintain and forget about this fork. @popravich This may be biased, but can you point out how you think people could better understand this package besides documentation? For example, for some of those TODOs, I can't figure out how to fix them. Lots of unknown errors pop up, and I don't understand why sometimes.
  2. Make a new package based on this fork. Erases quite a bit of the popularity, and such adoption is limited and aioredis will probably never release with updates again if maintainer ship does not keep it moving.
  3. Use the name of this package in favor of this fork. Adoption is quite clearly high, migration will be a slight pain, but easier maintainability and hopefully more useful PRs to fix any nagging portions.

In my opinion, I actually like option 1 if we can identify the pain points of this package (honestly, it's quite a bit like the acquire/release mechanism, magic methods of aenter etc.). But if we can't identify the pain points that basically causing us to not understand how we could fix those TODOs, then the next best option would be option 3.

Thoughts?

@seandstewart
Copy link
Collaborator Author

@Andrew-Chen-Wang -

Thanks for the thoughtful response here and sorry for the 5-day delay in responding back.

I'm obviously biased in my choice, I'd rather rip the band-aid off now (move forward with migrating this codebase) but I do see value in minimizing the changeset and erring on the side of caution. In most any other situation I'd agree to a more conservative approach.

I won't lie though, I really have limited time and won't be around with this frequency forever. My schedule and mental health won't sustain it. My concern with a smaller port is that there is still a barrier to entry for any new maintainer and a piecemeal port means a long-term investment that I'm not sure anyone currently contributing has in them. Maybe that's unfounded, but I'd like to err on the side of caution w.r.t. active maintainership. I know that if all we need to do for new features is watch redis-py for changes and port them over, there's a much much greater likelihood of keeping this library moving. The same can go for the parser logic and the implemented clients.

For connection-related bugs we'll still need to do triage, but thankfully the connections implementation in this PR is easier to grok than the original version so we've still got a win there.

I will defer (a bit) to the will of the community here, with a grain of salt. At the end of the day I want to make we do right by our community and I do think the best way to do that is to keep the codebase maintainable and up-to-date, so whatever we decide to do, let's just be sure we're achieving that goal.

@gjcarneiro
Copy link

Just, please, don't publish this aioredis-py as aioredis package. This is essentially a new project, with different API, it should have a different package name.

@seandstewart
Copy link
Collaborator Author

Hey @gjcarneiro -

I understand how you feel. To say this is a large change is an understatement.

I would like to stress, however, that the client-facing API is largely the same.

Any differences in the signatures of commands are changes that were either needed to add support for changes in Redis or the result of having a very unique implementation of the command.

Changes to the connection management to align with the redis-py implementation should not dramatically alter how the client is used or initiated either, as asynchronous contexts are still supported, as will as checking out single connections.

Top-level transactions have not changed, and in fact should have much better performance.

I don’t initially see the need to publish a new package given the fact that the major version will be bumped, indicating the fact that the API has changed. I also want to allow time before an official release, I won’t rush out a major change like this. There will be pre-releases, testing, and outreach to downstream dependencies.

Does this help assuage your concerns?

@gjcarneiro
Copy link

It doesn't have the MPSC module (aioredis.pubsub.Receiver), which I use in a lot of places. client.setex() used to accept a float as well as int. Just two examples of API changes off the top of my head. I'm sure there are plenty more.

I don't have time or motivation to port old code to a new API. You're just going to force me to pin aioredis<2 requirement I guess. Even that is problematic, I have to remember all the modules where I use aioredis, to add this pin. Others will get more "surprised".

OTOH, kudos for giving it type annotations. I wanted type hints for a long time in aioredis. 👍

It's a cool project, but I wish it was a separate project.

@seandstewart
Copy link
Collaborator Author

It doesn't have the MPSC module (aioredis.pubsub.Receiver), which I use in a lot of places.

I can see now how removing those abstractions could be a pretty large blocker for some. Could you create an issue describing how you use the Channel/Reciever abstraction? I can look at adding it back in, perhaps as a contrib or ext module to denote that it is an extension over the default implementation.

client.setex() used to accept a float as well as int.

Regarding setex, accepting float for that value is actually disingenuous, as the Redis command expects an integer value in seconds: https://redis.io/commands/setex. If you need millisecond precision, psetex should be used, also with a whole integer value: https://redis.io/commands/psetex. The original implementation hid this detail, but that is an implicit action which could lead to surprising behavior.

@Andrew-Chen-Wang
Copy link
Collaborator

@seandstewart Do you mind triggering the ReadTheDocs for the alpha release? I'm not sure if publishing the alpha release to PyPi will force everyone to update (e.g. on PyJWT, they published alpha and beta releases in the same format as you did, but dependabot and pip didn't ask users to update) ref: #919

@seandstewart
Copy link
Collaborator Author

Yeah, I don't appear to have administrative access to ReadTheDocs and LGTM. I need administrator access to trigger builds and adjust configurations on those services. We just haven't needn't that access till now which is why it went overlooked. @asvetlov should be able to grant access. Additionally, as he's the owner of the project, he can work with @abrookins to get him set up with the correct access. Let's track all this in #822

@yurymann
Copy link

yurymann commented Mar 20, 2021

@seandstewart, @Andrew-Chen-Wang, huge thank you for your efforts! I know I'm a bit late to answer Sean's requests for community verdict, but I have to say that I use aioredis heavily in a mission-critical production project (because it was an asyncio port with the most stars as said above and at the time also seemed relatively maintained) and your way of resurrecting it has my full support.

I'll feel much more confident about addressing breaking changes with a switch to the maintained port of redis-py than I would about relying on a stale codebase with known bugs and performance issues. As previous commenters, I'm grateful to @popravich and other maintainers of the original codebase for the working version which we currently use. But I was seriously considering moving away from redis to a different message broker/in-memory cache product just because a major part of our code is asyncio-based and with growing user base and system load we would not be able to keep ignoring issues like performance, pipeline transactions and cluster support. I also saw issues with reconnection logic which I didn't get around to understanding better and reporting.

A note on Windows support: redis 5 and later can't run on Windows, but we still need to use the client lib on Windows as this is where we do most coding and debugging.

@kevr
Copy link

kevr commented Aug 9, 2021

What happened to the CHANGES/ additions?



@pytest.mark.asyncio
async def test_ssl_connection(create_connection, server, ssl_proxy):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seandstewart if you're ever reimplementing the TLS tests, look into integrating the trustme library. aiohttp has integration examples in the tests.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants