Possible bug in Session._checked_out_sockets #55
Hi dd-dent! I think we need a bit more context here. What does …?
Hi. Can you post a full traceback? Junk included :) Also, what version of asks?
Forgot to mention: using PyCharm on Win 10 with Python 3.6.4. Here's a sample query (note I omitted the pid key, which is my API key, for obvious reasons): …

@theelous3 I'm using asks 1.3.9 and trio 0.3.0. I appreciate the attention; let me know if anything else is needed and I'll try to reply ASAP (a bit busy for the next few hours).
Thanks! I was in work earlier so couldn't take a proper look. Will poke around it this evening and/or tomorrow evening.
By the way, are you using any stream sockets? Like … Side note: I'm working on retries and such in a local branch at the moment, so you won't need to implement it yourself soon :)
Nope, just the regular response objects from a shared session instance. I'll note that this particular exception was thrown under rather extreme circumstances: …

Needless to say, those aren't normal circumstances, and I did refactor a bit since then, with connections now limited to 20 and a … If you want, I'll try writing a script that reproduces similar conditions when I have the time. I understand that it might be difficult to solve/pinpoint this issue, and that it might've been caused by simple misuse on my part (a bit new to Python aio, after all...), but this lib really helped me, so I'd like to help back if I can :)
Your first paragraph made me feel dirty in a very good way. I would love a similar environment. I'm trying to recreate one at the moment, but running a few workers locally with httpbin isn't throwing any issues. Not dirty enough! If you can whip something up that throws the same error, that'd be great. I'll keep trying in the meantime.
I'm at work for the next few hours, but free for the weekend. I'm planning on writing a script to more or less recreate the same load conditions and flow. Until I do, I can only guess at the cause, as there are many possible culprits at play here. To summarize, the possible causes appear to be platform (Win vs Linux), bad connectivity, and/or misuse on my part, which could hopefully be identified once I get down to writing that script... Will update in a few hours/tomorrow.
I have a sneaking suspicion in the back of my mind that if a connection sits in the pool for a while unused and is disconnected by the server, it is not handled correctly. I also have a sneaking suspicion that I fixed that a while ago. Maybe I imagined it. Time to dig! At the very least it makes sense given that I can make 100k requests locally with no issues, where nothing sits still, but IRL it's not as smooth.
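For illustration, one common heuristic for catching exactly that case (not necessarily what asks does or should do): before reusing an idle pooled socket, peek at it and treat a pending EOF as a dead connection.

```python
import select
import socket

def probably_alive(sock: socket.socket) -> bool:
    """Best-effort check that an idle pooled socket wasn't closed by the peer."""
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        # Nothing waiting on an idle connection: most likely still usable.
        return True
    # Peek without consuming: b'' means the peer already sent EOF / closed on us.
    return sock.recv(1, socket.MSG_PEEK) != b''
```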
So...

```python
job_count = 20         # total jobs to run
job_limit = 10         # trio.CapacityLimiter for jobs
connection_limit = 20  # asks.Session connections kwarg
```

If anything is unclear, or if I did something horribly wrong, please tell me.
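For readers skimming the thread, here is a minimal sketch of the kind of harness those knobs describe (names and the target URL are placeholders; the actual reproduction script was posted separately):

```python
import trio
import asks
import multio

multio.init('trio')

job_count = 20                          # total jobs to run
job_limit = trio.CapacityLimiter(10)    # at most 10 jobs in flight at once
session = asks.Session(connections=20)  # asks connection pool size

async def job(n):
    # Limit concurrent jobs, as in the setup described above.
    async with job_limit:
        try:
            r = await session.get('http://localhost:8080/', timeout=5)  # placeholder URL
            print(n, r.status_code)
        except Exception as e:
            print(n, 'failed:', type(e).__name__)

async def main():
    async with trio.open_nursery() as nursery:
        for n in range(job_count):
            nursery.start_soon(job, n)

trio.run(main)
```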
Hey, I was away for a few days somewhere sunny. Nice, I'll take a look over this now :)
It's been raining like crazy here the last few days, so at least someone, somewhere had fun... Haven't had much time to mess with it myself... Also forgot to mention that the server can be run with the -d option to add delay in ms...
Check it peeps: https://github.com/theelous3/asks/tree/miracle2k-timeout-leak

Specifically: … (ignore the pants-on-head addition of the pytest-cache). I seem to have fixed the leaky sockets by removing the retry logic from the silly place it was (internal to the request) and moving it to the session level, and handling errors there too, à la the thinking of @miracle2k. Working well for random test cases on my end. Thoughts?
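To illustrate the idea of "retry at the session level" (a hypothetical user-side helper, not the code in that branch): retry the whole request on a fresh connection instead of retrying deep inside the request machinery.

```python
# Hypothetical sketch: retry at the session/call level rather than inside the
# request internals, so a suspect connection is simply abandoned and a new
# request is made on a fresh socket.
async def get_with_retry(session, url, attempts=3, **kwargs):
    last_exc = None
    for _ in range(attempts):
        try:
            return await session.get(url, **kwargs)
        except OSError as exc:  # connection reset, refused, etc.
            last_exc = exc
    raise last_exc
```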
My app is still exhibiting this behaviour. I think it might be related to timeouts:

```python
import trio
import asks
import multio

multio.init('trio')

async def main():
    session = asks.Session()
    for x in range(1000):
        try:
            await session.get('http://slowwly.robertomurray.co.uk/delay/3000/url/http://www.google.co.uk', timeout=1)
        except Exception as e:
            print("failed, now sockets: %s, %s" % (session._conn_pool, session._checked_out_sockets))

trio.run(main)
```

Output:

…
I do not think it's enough to just handle the …
What's the exception there? The …? Away from my system atm. Can't test. Will though. As an aside, that delay endpoint is fun.
Wait, I just read what you said properly. Forget what was written here a moment ago. Am digging.
Sorry for not being clear. It is about the sockets leaking, so this might not actually be the right ticket! But it's related to the …
Yep, no, I gotcha @miracle2k. Working on it at the moment. You'll be pleased to know there is probably a `finally` :)
@miracle2k can you post an example of your …? God, there should be an IM feature here.
Thanks for fixing it! Here is the still-occurring problem with …

I know that a `finally` is tough to add (I forget the details, but I ran into an issue), but maybe a catch-all …
Note though that trio's …
So what we can't do is anything at all involving actual io-ish async once a …

I refactored a little bit, and added a … If something has been killed, we clean it up before we try to do anything later on. To be honest, I feel a little weird putting that in the …

10e0d38 812c03a (whoops, got ahead of myself: a43f846)

Shortly, once we add nice timeouts around the connection process, there should be little to no reason to use anything that may trigger this flag anyway (at least solely for the request), which will be nice. Seems fine to me overall. I'm open to refactors on this, and also open to leaving this open a little while; considering how good the bug-hunting by @miracle2k has been, I don't want to hastily close this again.
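Roughly, the shape of that flag-and-sweep idea (hypothetical names, not the code in those commits):

```python
# Hypothetical sketch of "mark killed sockets, sweep before reuse".
class PooledSock:
    def __init__(self, raw_sock):
        self.raw_sock = raw_sock
        self.killed = False  # set when a request on this socket was cancelled mid-flight

class ConnPool:
    def __init__(self):
        self.free = []

    def checkout(self):
        # Drop anything that was killed before handing a socket out again.
        self.free = [s for s in self.free if not s.killed]
        return self.free.pop() if self.free else None
```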
I'll give it a try. Interesting about not being able to catch and re-raise the Cancelled exception. I would have assumed that works, because I think it works in asyncio. Maybe trio is different here (maybe @njsmith can tell us).
I'm not quite following the discussion here, but I can say: …
Hi hi. Yes, we're using …

Catching all BaseException in the cancelled task and trying to clean up after them, we use the following:

```python
except BaseException as e:
    print(type(e))
    print(type(sock))
    print('Closing socket')
    await sock.close()
    print('Socket closed')
    sock._active = False
    print('Replacing connection')
    await self._replace_connection(sock)
    print('Connection replaced')
    raise e
```

Run using the example posted a few comments ago, employing …
All other exceptions that trigger tidying up the socket use the same code and work as expected. For example, here's the output of some timeouts: …
Here's a branch that just catches …
Ah, I see. So what's happening in the example is that the … Now … One approach would be to put the …

Can you elaborate a little on what this bookkeeping is trying to do? I would have expected that once a socket was closed you would just throw it away. What does …?
Ok, to summarize, here is what the problem is:

```python
with trio.move_on_after(1):
    # Inside the asks library:
    socket_stream = await open_trio_socket_stream()
    try:
        await long_running_socket_operation(socket_stream)
    except trio.Cancelled:
        # we want to close the socket now and re-raise the exception to continue cancellation
        await socket_stream.aclose()
        some_other_cleanup_steps()
        raise
```

What happens here is that because the task is already cancelled, …

And again, if I understand what @njsmith said correctly, this is ok and to be expected, and we should deal with it? So, either we do:

```python
# Catch Cancelled directly
except trio.Cancelled:
    # Do all the other cleanup steps first
    some_other_cleanup_steps()
    # Let aclose re-raise Cancelled
    await socket_stream.aclose()
    # Just to be safe, re-raise ourselves in case aclose did not
    raise
```

Or we do:

```python
# Catch all base exceptions
except BaseException:
    # Call aclose and catch any cancelled exception it may raise
    try:
        await socket_stream.aclose()
    except trio.Cancelled:
        pass
    # Do anything else for cleanup
    some_other_cleanup_steps()
    # Re-raise
    raise
```

The problem of course is that asks should not have a dependency on trio directly, so maybe be even more aggressive:

```python
except BaseException:
    try:
        await socket_stream.aclose()
    # Instead of only catching trio.Cancelled, catch all.
    except BaseException:
        pass
    some_other_cleanup_steps()
    raise
```

While none of them seem particularly pretty, I do think the solution we have now, where in the case of outside cancellation such as here the socket gets added to a temporary …
@njsmith I see. I guessed this was roughly what was taking place. Of note, however, when I tried handling it like:

```python
except BaseException as e:
    await self._handle_exception(e, sock)

# where _handle_exception looked like

async def _handle_exception(e, sock):
    print('this will print')
    await sock.aclose()
    print('this will not print')
    raise e
```

Note: I just made a branch removing the …
That's discussed some here: https://trio.readthedocs.io/en/latest/reference-core.html#checkpoints

Basically, it's not literally the … Also btw, note that writing …

Code like this is fine though:

```python
except:
    some_synchronous_cleanup()
    await obj.aclose()
    raise
```

(And look how short and simple it is :-).)
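A small self-contained sketch of why that ordering works (a memory channel's aclose() standing in for a socket stream): inside a cancelled scope, the aclose() checkpoint raises Cancelled again, so anything placed after it in the handler never runs, while the synchronous cleanup placed before it does.

```python
import trio

async def demo():
    send, recv = trio.open_memory_channel(0)
    with trio.move_on_after(0.1):
        try:
            await trio.sleep(10)               # cancelled when the deadline expires
        except BaseException:
            print('synchronous cleanup runs')  # fine: no checkpoint executed yet
            await send.aclose()                # checkpoint: raises Cancelled again
            print('never reached')             # skipped
            raise
    print('scope exited; cleanup before aclose() still happened')

trio.run(demo)
```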
Oh, neato. d82c328 (not merged atm.) Now we just:

```python
except BaseException as e:
    await sock.close()
    raise e
```

As …
Nice!
Working on an async API downloader for work, my current crack at it uses asks and trio.

Randomly, the following statement:

…

could inspire this:

…

I only dug far enough to see which deque was problematic: the field mentioned in the title, which in turn is an instance of SocketQ(deque).

So... am I looking at a bug or misuse here?