improve error handling in open_tcp_stream #809
improve handling of
I've changed it to track everything in
I started writing this along with Nathaniel's PyCon 2018 talk but ended up structuring the retry behaviour differently as I found his confusing to follow. Not sure if my code will catch all the cases appropriately or is actually better, hence the WIP.
@@            Coverage Diff             @@
##           master     #809      +/-   ##
==========================================
- Coverage   99.01%   99.01%   -0.01%
==========================================
  Files          96       94       -2
  Lines       11682    11631      -51
  Branches      828      832       +4
==========================================
- Hits        11567    11516      -51
  Misses         94       94
  Partials       21       21
Whoa, this is awesome.
Yeah, the old code could leak file descriptors in rare cases. It required an unlucky alignment of the stars, and only really matters on PyPy, but it's been bugging me for months. So thank you. It looks like I never actually opened an issue for it, but there's a brief mention here.
I never thought of moving the loop-and-wait logic into the parent task like that. That's really elegant! (I guess this might be a side-effect of my originally writing this back in prehistoric Trio, where you couldn't put blocking logic into the parent task? But it's kind of mind-blowing to realize I spent weeks tweaking the logic for the talk and still missed this.)
One quibble with the socket tracking: I think I'd factor it out into a helper, like:
    @contextmanager
    def close_all():
        sockets_to_close = set()
        try:
            yield sockets_to_close
        finally:
            for sock in sockets_to_close:
                sock.close()

    async def open_tcp_stream(...):
        with close_all() as sockets_to_close:
            ...
            # Opening a socket
            sock = socket.socket(...)
            sockets_to_close.add(sock)
            ...
            # Finishing up
            if winning_socket is None:
                ...
            else:
                sockets_to_close.remove(winning_socket)
                return winning_socket
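For anyone who wants to poke at the helper in isolation, here is a minimal synchronous sketch using plain stdlib sockets. `pick_winner` is a hypothetical stand-in for the connection race, not anything in Trio; it only shows that the winner survives and everything else gets closed on the way out.

```python
import socket
from contextlib import contextmanager


@contextmanager
def close_all():
    """Close every socket still in the set when the block exits."""
    sockets_to_close = set()
    try:
        yield sockets_to_close
    finally:
        for sock in sockets_to_close:
            sock.close()


def pick_winner(count=3):
    # Hypothetical stand-in for the real connection race: open `count`
    # unconnected sockets and pretend the first one won.
    with close_all() as sockets_to_close:
        socks = []
        for _ in range(count):
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sockets_to_close.add(sock)
            socks.append(sock)
        winner = socks[0]
        # Removing the winner from the set hands ownership to the caller.
        sockets_to_close.remove(winner)
        return winner, socks[1:]
```

Because the cleanup sits in a `finally`, the same thing happens if the body raises or is cancelled: whatever is still in the set gets closed, and only a socket that was explicitly removed escapes.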
This makes the main function's flow clearer, and lets you drop the awkward
It would be great if we could add tests to confirm that the new code cleans up sockets correctly, in cases where the old code didn't. If you look in
We'll also want a news entry, noting that we fixed some edge cases where
thanks for the positive feedback! I'll do some cleaning up and push again. didn't want to go too far without some confirmation that this was a useful change
the "simplified nurseries" is an interesting feature, feels like an intuitive change
I'm not sure how I'd provoke any resource leakage, it was more of an academic concern… I made sure tests succeeded before committing, but that's as far as I went into looking at your test framework — I'm generally afraid of concurrency-related heisenbugs
also, thanks for publishing Trio! it's a very cool experiment, be interesting to see where it goes!
I've done the easy stuff… just need to figure out a test that would leak in the old version and make sure my version catches it
I put dd2be12 into its own commit as it feels somewhat gratuitous and might lead to unnecessary history/merge awkwardness down the line.
Looks fine to me.
I looked at it again, and I think one case where it could leak would be if
I'm not sure if this would work or not though... when two things are supposed to happen at "the same time" then in fact one of them ends up happening first, and it might not be the one we want :-).
Another approach, that's a little more complicated but that would definitely work, would be to set up a cancel scope around the whole call, and then arrange for the
@njsmith your existing code was pretty water-tight! thinking up valid/reliable cases that would actually fail was difficult. I thought getting your thread cancellation idea to reliably provoke any misbehaviour would be kind of awkward, it could also be fixed easily by moving the code around to do:
the test I left in (that also leaks when I pull
I've also made the
feel free to squash the last two fixup commits, I left them as is per nodejs suggestions
I don't think that would fix it... suppose that the cancellation happens after the
But with the new code it's obvious that it handles all these cases correctly without needing any complicated analysis, so I'm not too worried.
In general we're pretty relaxed about squashing-or-not. A clean history is nice, but I don't want people to feel like they need to spend a bunch of time fiddling with git to contribute.