Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

requests hang if pool is shutdown before receiving response #1245

Closed
brharrington opened this issue Jun 27, 2017 · 10 comments
Closed

requests hang if pool is shutdown before receiving response #1245

brharrington opened this issue Jun 27, 2017 · 10 comments
Labels
1 - triaged Tickets that are safe to pick up for contributing in terms of likeliness of being accepted bug t:client Issues related to the HTTP Client t:core Issues related to the akka-http-core module
Milestone

Comments

@brharrington
Copy link
Contributor

I'm seeing requests hang after a while when using a connection pool. In this case the code is using Http().singleRequest(request) and onComplete is never called for the returned future. This might be the same issue as reported in #908, but if I understand correctly they have a high request rate all the time. For the case I'm seeing, I think the problem is a race condition between shutting down an idle connection pool and a new request being made using that pool. For the connections I have seen get in this state where I have detailed logging, I see it push a request to the connection just before the pool is shutdown.

The idle-timeout is set to 30s. Normally I see a pattern of pushing, Received, Finished for a given connection. In the case where it hangs I see pushing and then "Slot was stopped". Snippet of log shown below:

2017-06-24T18:29:08.204 DEBUG [default-dispatcher-578] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> pushing request to connection: GET /PATH Empty
2017-06-24T18:29:08.235 DEBUG [default-dispatcher-250] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> Received response: GET /PATH Empty -> 200 OK Chunked
2017-06-24T18:29:08.319 DEBUG [default-dispatcher-399] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> Finished reading response entity for GET /PATH Empty -> 200 OK Chunked
2017-06-24T18:29:08.403 DEBUG [default-dispatcher-506] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> pushing request to connection: GET /PATH Empty
2017-06-24T18:29:08.442 DEBUG [default-dispatcher-321] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> Received response: GET /PATH Empty -> 200 OK Chunked
2017-06-24T18:29:08.504 DEBUG [default-dispatcher-45] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> Finished reading response entity for GET /PATH Empty -> 200 OK Chunked

... A little over 30s later ...

2017-06-24T18:29:38.691 DEBUG [default-dispatcher-397] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> pushing request to connection: GET /PATH Empty
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-212] akka.http.impl.engine.client.PoolGateway: Shutting down host connection pool
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-123] akka.http.impl.engine.client.PoolInterfaceActor: stopped
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-123] akka.io.SelectionHandler: now supervising Actor[akka://atlas/system/IO-TCP/selectors/$a/45404#1332330586]
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-123] akka.io.TcpOutgoingConnection: started (akka.io.TcpOutgoingConnection@400b4f1b)
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-123] akka.http.impl.engine.client.PoolGateway: [2] Idle -> Loaded(1)
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-123] akka.io.TcpOutgoingConnection: now watched by Actor[akka://atlas/system/IO-TCP/selectors/$a#-920956002]
2017-06-24T18:29:38.700 DEBUG [default-dispatcher-578] akka.http.impl.engine.client.PoolGateway: [0] </IP:35922->HOST:7001> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-578] akka.http.impl.engine.client.PoolGateway: [1] </IP:36068->HOST:7001> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-578] akka.http.impl.engine.client.PoolGateway: [2] </IP:36071->HOST:7001> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-578] akka.http.impl.engine.client.PoolGateway: [3] <unconnected> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-127] akka.http.impl.engine.client.PoolGateway: [4] <unconnected> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-127] akka.http.impl.engine.client.PoolGateway: [5] <unconnected> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-127] akka.http.impl.engine.client.PoolGateway: [6] <unconnected> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-127] akka.http.impl.engine.client.PoolGateway: [7] <unconnected> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-127] akka.http.impl.engine.client.PoolGateway: [8] <unconnected> Slot was stopped
2017-06-24T18:29:38.701 DEBUG [default-dispatcher-391] akka.http.impl.engine.client.PoolGateway: [9] <unconnected> Slot was stopped

Thus far I haven't been successful at reproducing in a controlled minimal environment.

@brharrington
Copy link
Contributor Author

I managed to create a small sample program that reproduces the issue pretty reliably for me: https://github.com/brharrington/akka-http-issue1245

@brharrington brharrington changed the title requests using connection pool can hang requests hang if pool is shutdown before receiving response Jul 4, 2017
@jrudolph
Copy link
Member

jrudolph commented Jul 5, 2017

Thanks for the report and the reproducer, @brharrington. We'll try to have a look next week.

@jrudolph jrudolph added 0 - new Ticket is unclear on it's purpose or if it is valid or not t:client Issues related to the HTTP Client t:core Issues related to the akka-http-core module labels Jul 5, 2017
@jrudolph jrudolph added 1 - triaged Tickets that are safe to pick up for contributing in terms of likeliness of being accepted bug and removed 0 - new Ticket is unclear on it's purpose or if it is valid or not labels Jul 26, 2017
@jrudolph
Copy link
Member

I finally found time to look into this. I can reproduce the behavior with your project by playing a bit around with the timeout value. I think your explanation is correct that there's a race condition between sending in new requests and the idle-timeout.

@jrudolph
Copy link
Member

I understand what's going on, the pool idle-timeout is implemented by sending out gateway.shutdown without checking if new requests have been been dispatched in the meantime. Afterwards there's is nothing that would dispatch the responses to the original caller.

Until this is fixed I'd suggest to the disable the connection-pool idle-timeout (host-connection-pool.idle-timeout) for now. It's only useful if you connect to a huge number of different hosts over time anyway.

@jrudolph
Copy link
Member

I guess one solution could be keep the PoolInterfaceActor running a bit longer as long as remainingRequested > 0.

jrudolph added a commit to jrudolph/akka-http that referenced this issue Jul 26, 2017
Previously, a shutdown request didn't check if there are outstanding
requests so that outstanding responses were just lost.
jrudolph added a commit to jrudolph/akka-http that referenced this issue Jul 27, 2017
Previously, a shutdown request didn't check if there are outstanding
requests so that outstanding responses were just lost.
jrudolph added a commit to jrudolph/akka-http that referenced this issue Jul 27, 2017
Previously, a shutdown request didn't check if there are outstanding
requests so that outstanding responses were just lost.
jrudolph added a commit that referenced this issue Aug 15, 2017
…t-race

=htc #1245 fix pool idle-timeout race condition
@jrudolph jrudolph added this to the 10.0.10 milestone Aug 15, 2017
@jrudolph
Copy link
Member

Fixed by #1311.

brharrington added a commit to brharrington/atlas that referenced this issue Aug 31, 2017
Fixes race condition with the idle timeout on the
host connection pool (akka/akka-http#1245).
brharrington added a commit to Netflix/atlas that referenced this issue Aug 31, 2017
Fixes race condition with the idle timeout on the
host connection pool (akka/akka-http#1245).
@e-orz
Copy link

e-orz commented Sep 3, 2017

Hi,
Is it related to #303 ?

@jrudolph
Copy link
Member

jrudolph commented Sep 4, 2017

Hi @e-orz, I added a comment to #303. I guess it could be related, but it's not completely clear to me how it would get stuck there. The effect of the bug here was that responses for some requests were never delivered.

@e-orz
Copy link

e-orz commented Sep 5, 2017

Same issue there. The request never got sent. What interests me is that is was solved already in version 10.0.5.

@jrudolph
Copy link
Member

jrudolph commented Sep 5, 2017

No, that was only fixed recently in 10.0.10

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 - triaged Tickets that are safe to pick up for contributing in terms of likeliness of being accepted bug t:client Issues related to the HTTP Client t:core Issues related to the akka-http-core module
Projects
None yet
Development

No branches or pull requests

3 participants