Fixing cancelled async futures #2666
Conversation
Co-authored-by: James R T <jamestiotio@gmail.com>
Codecov Report
Patch coverage:
Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master    #2666      +/-   ##
==========================================
+ Coverage   92.25%   92.29%   +0.04%
==========================================
  Files         115      116       +1
  Lines       29788    29893     +105
==========================================
+ Hits        27481    27590     +109
+ Misses       2307     2303       -4
```

... and 3 files with indirect coverage changes
Even if, in theory, this change should fix the issue, asyncio.CancelledError may never reach this redis-py code due to this bug: aio-libs/async-timeout#229.
@fps7806 @sileht I saw that while doing some research on this a couple of days back; that's what led me down some of these paths. IMHO the right thing to do on the security side is this series of changes. Yes, this can have the side effect of masking a true connection issue if the connection has been cancelled. Looking at that, I think we can safely say that the user of the client should handle that case, as opposed to the client itself. We're going to have a look at some try/catch work here. If the try/catch within the shield triggers properly, I think we're good for release. Targeting tomorrow.
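To make the behavior under discussion concrete, here is a minimal, self-contained sketch (not redis-py code; all names are invented for illustration). The shield keeps the in-flight round trip running even when the calling task is cancelled, so the connection's request/response stream is not left half-consumed:

```python
import asyncio


async def fake_round_trip():
    # stands in for writing a command and reading its reply
    await asyncio.sleep(0.1)
    return "OK"


async def guarded_call():
    # the caller's await can be cancelled, but the shielded coroutine keeps running
    return await asyncio.shield(fake_round_trip())


async def main():
    task = asyncio.create_task(guarded_call())
    await asyncio.sleep(0.01)
    task.cancel()  # cancel the caller mid-command
    try:
        await task
    except asyncio.CancelledError:
        print("caller saw CancelledError")
    await asyncio.sleep(0.2)  # the shielded round trip still runs to completion


asyncio.run(main())
```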
Relevant: python/cpython#28149
Taking a final look at things, it appears the best way to solve this is to properly break the pipeline when this timing issue occurs. When the pipeline is broken, and only in this case, a RuntimeError is raised. This exception is ultimately raised from below, at least in the specific asyncio/streams code: /usr/lib/python3.10/asyncio/streams.py:616. As a result, the shield needs to catch the RuntimeError exception, and we can trigger the pipeline reset. This is necessary because the CancelledError, as alluded to previously, doesn't trigger, though it should. Our choices are either to:
This final change has that in it.
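A rough sketch of the recovery flow described above, assuming hypothetical helpers send_and_read and reset_pipeline (these are placeholders, not redis-py APIs):

```python
import asyncio


async def send_and_read(conn, *args):
    # placeholder: write the command to `conn` and read its reply
    ...


async def reset_pipeline(conn):
    # placeholder: disconnect / clear buffered state so an out-of-sync
    # connection is never reused
    ...


async def execute(conn, *args):
    try:
        return await asyncio.shield(send_and_read(conn, *args))
    except RuntimeError:
        # the CancelledError never surfaces here (see the async-timeout issue
        # linked above), but asyncio.streams raises RuntimeError once the
        # pipeline is broken -- catch it, reset, and re-raise
        await reset_pipeline(conn)
        raise
```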
Failures are the result of #2670 and are unrelated.
Co-authored-by: James R T <jamestiotio@gmail.com> Co-authored-by: dvora-h <dvora.heller@redis.com>
* Fixing cancelled async futures (#2666)
  Co-authored-by: James R T <jamestiotio@gmail.com>
  Co-authored-by: dvora-h <dvora.heller@redis.com>
* Version 4.4.4
* fix tests
* linters
* fixing the test for the 4.4 state
* remove superfluous try-except
---------
Co-authored-by: Chayim <chayim@users.noreply.github.com>
Co-authored-by: James R T <jamestiotio@gmail.com>
Co-authored-by: Chayim I. Kirshen <c@kirshen.com>
(Oh, I see 4.3.7 has been tagged but not released yet)
This change makes it impossible to cancel a command and retry, because the cancellation never happens and the single connection lock is held indefinitely. So, the unit tests I devised for PR #2506 no longer work. Is this really intentional, to not cancel the operation at all? Is there a tl;dr somewhere on why this is desirable?
Anyone reading the code will be clueless. "Wait, it is creating a new |
This testing code now breaks (it runs inside an async test; `r` is an async Redis client fixture):

```python
ready = asyncio.Event()

async def helper():
    with pytest.raises(asyncio.CancelledError):
        # blocking pop
        ready.set()
        await r.brpop(["nonexist"])
    # if all is well, we can continue. The following should not hang.
    # DEADLOCK, here:
    await r.set("status", "down")

task = asyncio.create_task(helper())
await ready.wait()
await asyncio.sleep(0.01)
# the task is now sleeping, let's send it an exception
task.cancel()
```

The task was interrupted with a CancelledError, but the shielded inner task was not interrupted and continues to wait, hanging on to the connection lock. Basically, task interruption has been nerfed... Is there a workaround?
For what reason does this shield the sub-task running the command? In other words: always manage your tasks! Having unmanaged tasks lying around is bad practice.
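As a side note on the "manage your tasks" point, a small illustration (not taken from this PR, with an invented Worker class) of keeping an explicit handle to a background task and tearing it down deterministically:

```python
import asyncio


class Worker:
    """Keeps an explicit handle to its background task."""

    def __init__(self):
        self._task = None

    def start(self):
        # keep a reference so the task is never orphaned
        self._task = asyncio.create_task(self._run())

    async def _run(self):
        while True:
            await asyncio.sleep(1)

    async def aclose(self):
        # deterministic teardown: cancel and await the task we own
        if self._task is not None:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass
            self._task = None
```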
@agronholm I believe this PR was a mistake. It is an attempt to fix an issue caused by a regression that I caused last year, but without understanding the problem. I have had a fix for that regression lying around for four months without being accepted. Please see PR #2695, which undoes this change and applies the correct fix to the original regression.
This pull request contains changes to async cancellation handling. Specifically, all cases where a command is sent to Redis via execute_command (and friends) are now wrapped with a shield. This builds on the helpful proxy provided in #2665.
This change covers the following usage patterns.
Reproducible tests currently exist for async, async pipeline, and async cluster. Note, however, that the async cluster pipeline seems at best incorrect.
Feedback and changes are very much welcomed. Bonus points for assistance with testing.
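For orientation, a hedged usage sketch of the call patterns mentioned above (plain async client and async pipeline). Connection details and key names are invented, and the exact set of covered call sites is defined by the PR itself, not by this snippet:

```python
import asyncio

import redis.asyncio as redis


async def main():
    r = redis.Redis()  # assumes a local Redis server, for illustration only

    # single command path (goes through execute_command under the hood)
    await r.set("key", "value")
    print(await r.get("key"))

    # async pipeline path
    async with r.pipeline(transaction=True) as pipe:
        pipe.set("a", 1)
        pipe.get("a")
        print(await pipe.execute())

    await r.close()


asyncio.run(main())
```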