The two lines below seem to conflict with each other:
return not (self.free or self.busy or self.waiting_queue)
If something in waiting_queue needs to be aborted, then all_dead is False (the queue is non-empty), so the condition in if self.conns.all_dead: is not met. Thus self.conns.abort_waiting_queue never gets executed.
A possible solution:
Change return not (self.free or self.busy or self.waiting_queue) to return not (self.free or self.busy or self.pending)
return not (self.free or self.busy or self.pending)
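To illustrate the conflict, here is a minimal sketch. It assumes a simplified stand-in for the connection container: `ContainerSketch`, `abort_if_all_dead`, `demo` and the `check_pending` flag are hypothetical names made up for this example, not the library's actual code. It only shows that a non-empty waiting_queue keeps the old `all_dead` from ever being true, so the abort path is skipped.

```python
from collections import deque


class ContainerSketch:
    """Hypothetical, simplified stand-in for the pool's connection container."""

    def __init__(self, check_pending):
        self.free = set()
        self.busy = set()
        self.pending = set()
        self.waiting_queue = deque()
        # Assumption for the demo: switch between the old and proposed check.
        self.check_pending = check_pending

    @property
    def all_dead(self):
        if self.check_pending:
            # Proposed: only connection states decide whether all are dead.
            return not (self.free or self.busy or self.pending)
        # Original: a queued waiter makes this False.
        return not (self.free or self.busy or self.waiting_queue)

    def abort_waiting_queue(self):
        # Drop every queued waiter and report how many were aborted.
        aborted = len(self.waiting_queue)
        self.waiting_queue.clear()
        return aborted


def abort_if_all_dead(conns):
    # Mirrors the shape of the guarded call discussed above.
    if conns.all_dead:
        return conns.abort_waiting_queue()
    return 0


def demo(check_pending):
    conns = ContainerSketch(check_pending)
    conns.waiting_queue.append("waiter")  # no connections, one queued request
    return abort_if_all_dead(conns)
```

With the original check, `demo(False)` aborts nothing even though no connection exists; with the proposed check, `demo(True)` aborts the queued waiter.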
Looks like you are correct. How did you spot it? Do you have minimal reproduction code that could be turned into a unit test?
I just looked into the unittest code. I guess we could catch this issue by adding a line after L816 in test_abort_waiting_queue():
Not tested yet. Would you expect the same?
But with the execution order of L478 and L479 in Pool._operate:
if not keep:
I guess the second db query could get the only connection and issue the query command successfully. The test case then disconnects the db by sending SIGTERM to tcproxy. I consider it highly likely that the second db query would finish before the connection is dropped, which would make the altered version of test_abort_waiting_queue() fail.
Maybe we need a better solution in the unit test.
You are right on both counts. I've changed the code to set the result first, so that a done callback in the unit test can kill the connections. Then the assertion works (after fixing the original bug, of course).
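A rough sketch of why the order matters, under stated assumptions: it uses `concurrent.futures.Future` from the standard library, whose done callbacks run synchronously inside `set_result()`. The futures used by the pool may schedule callbacks differently, and `run`, `release` and the `pool` list are hypothetical names for this illustration only.

```python
from concurrent.futures import Future


def run(set_result_first):
    events = []
    pool = []  # hypothetical stand-in for the pool's free connections
    future = Future()

    # The callback records what the pool looked like when it fired.
    future.add_done_callback(
        lambda f: events.append(("callback", list(pool)))
    )

    def release():
        # Stand-in for returning the connection to the pool.
        pool.append("conn")

    if set_result_first:
        future.set_result("ok")  # callback fires before the release
        release()
    else:
        release()
        future.set_result("ok")  # callback fires after the release
    return events
```

When the result is set first, the callback observes the pool before the connection is handed back, which is the window a test needs in order to kill connections ahead of the next queued request.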
For the test with the proxy, however, the test will not raise DatabaseNotAvailable, because restarting the proxy does not immediately mark the connection object as closed. I guess we'll have to live with that for now.
See #142. What do you think?
Fixes 139 - `all_dead` should check for pending
`all_dead` - should check for pending connections
and not waiting_queue.
To be able to test it properly, the future's result
needs to be set before the connection is returned to
the pool in order to allow our kill callbacks to run.