Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConnectionClosed exception #86

Closed
enumag opened this issue Mar 26, 2019 · 7 comments
Closed

ConnectionClosed exception #86

enumag opened this issue Mar 26, 2019 · 7 comments
Labels
bug Something isn't working

Comments

@enumag
Copy link
Member

enumag commented Mar 26, 2019

Our command that connects to persistent subscription to project events into a SQL database sometimes fails with Prooph\EventStore\Exception\ConnectionClosed.

Here is the relevant part of the stack trace:

In AbstractSubscriptionOperation.php line 250:
                                                  
  [Prooph\EventStore\Exception\ConnectionClosed]  
  Connection was closed                           
                                                  

Exception trace:
 () at .../vendor/prooph/event-store-client/src/ClientOperations/AbstractSubscriptionOperation.php:250
 Prooph\EventStoreClient\ClientOperations\AbstractSubscriptionOperation->connectionClosed() at .../vendor/prooph/event-store-client/src/Internal/SubscriptionsManager.php:92
 Prooph\EventStoreClient\Internal\SubscriptionsManager->purgeSubscribedAndDroppedSubscriptions() at .../vendor/prooph/event-store-client/src/Internal/EventStoreConnectionLogicHandler.php:405
 Prooph\EventStoreClient\Internal\EventStoreConnectionLogicHandler->tcpConnectionClosed() at .../vendor/prooph/event-store-client/src/Internal/EventStoreConnectionLogicHandler.php:170
 Prooph\EventStoreClient\Internal\EventStoreConnectionLogicHandler->Prooph\EventStoreClient\Internal\{closure}() at .../vendor/prooph/event-store-client/src/Internal/MessageHandler.php:38
 Prooph\EventStoreClient\Internal\MessageHandler->handle() at .../vendor/prooph/event-store-client/src/Internal/EventStoreConnectionLogicHandler.php:190
 Prooph\EventStoreClient\Internal\EventStoreConnectionLogicHandler->enqueueMessage() at .../vendor/prooph/event-store-client/src/Internal/EventStoreConnectionLogicHandler.php:338
 Prooph\EventStoreClient\Internal\EventStoreConnectionLogicHandler->Prooph\EventStoreClient\Internal\{closure}() at .../vendor/prooph/event-store-client/src/Transport/Tcp/TcpPackageConnection.php:172
 Prooph\EventStoreClient\Transport\Tcp\TcpPackageConnection->Prooph\EventStoreClient\Transport\Tcp\{closure}() at n/a:n/a
 Generator->throw() at .../vendor/amphp/amp/lib/Coroutine.php:73
 Amp\Coroutine->Amp\{closure}() at .../vendor/amphp/amp/lib/Failure.php:29
 Amp\Failure->onResolve() at .../vendor/amphp/amp/lib/Coroutine.php:103
 Amp\Coroutine->__construct() at .../vendor/amphp/amp/lib/Loop/Driver.php:123
 Amp\Loop\Driver->tick() at .../vendor/amphp/amp/lib/Loop/Driver.php:72
 Amp\Loop\Driver->run() at .../vendor/amphp/amp/lib/Loop.php:84
 Amp\Loop::run() at .../src/Module/Core/Console/ConnectToPersistentSubscriptionCommand.php:147

I checked some other similar issues and as far as I can tell it seems I need to implement a reconnecting logic here. Can you give me some advice how to do it? The async code confuses me and I'm not even sure if I can reuse the same EventStoreConnection instance or if I need a new one. Once I understand it I'll gladly send a PR to improve the example.

@prolic
Copy link
Member

prolic commented Mar 26, 2019

@enumag this is related: #85

About the connection getting closed: there seems to be a problem with the tcp connection logic, I receive this error as well quite often which is a problem to me. Will need to look further into this, but it's hard to debug, as it happens not regular.

@prolic prolic added the bug Something isn't working label Mar 26, 2019
@enumag
Copy link
Member Author

enumag commented Mar 26, 2019

Maybe we could try debug it with something like phptrace.

Is there a way to work around this for now and reconnect manually? I tried to implement it today but since I can't reproduce the issue reliably I have no way to test if it works.

@prolic
Copy link
Member

prolic commented Mar 26, 2019 via email

@enumag
Copy link
Member Author

enumag commented Apr 2, 2019

It seems that reconnecting doesn't really help. Or actually it does but it only projects a few events before the connection is closed and then it takes several seconds to reconnect. This is completely useless because I have tens of thousands of events waiting in the queue so with a pause after every few events it will never finish.

Here is something from docker-compose logs. The reason is always the same HEARTBEAT TIMEOUT at msgNum 2:

eventstore_1  | [00001,13,08:37:35.081] Closing connection '"external-normal"":1c6f372d33600741adaf5030299bc0ef"' [172.19.0.5:39810, L172.19.0.2:1113, {831b6de3-e5af-4b05-9a59-cf937a73ed12}] cleanly." Reason: HEARTBEAT TIMEOUT at msgNum 2"
eventstore_1  | [00001,13,08:37:35.081] ES "TcpConnection" closed [08:37:35.082: N172.19.0.5:39810, L172.19.0.2:1113, {831b6de3-e5af-4b05-9a59-cf937a73ed12}]:Received bytes: 121, Sent bytes: 239094
eventstore_1  | [00001,13,08:37:35.081] ES "TcpConnection" closed [08:37:35.083: N172.19.0.5:39810, L172.19.0.2:1113, {831b6de3-e5af-4b05-9a59-cf937a73ed12}]:Send calls: 5, callbacks: 5
eventstore_1  | [00001,13,08:37:35.084] ES "TcpConnection" closed [08:37:35.084: N172.19.0.5:39810, L172.19.0.2:1113, {831b6de3-e5af-4b05-9a59-cf937a73ed12}]:Receive calls: 3, callbacks: 2
eventstore_1  | [00001,13,08:37:35.084] ES "TcpConnection" closed [08:37:35.084: N172.19.0.5:39810, L172.19.0.2:1113, {831b6de3-e5af-4b05-9a59-cf937a73ed12}]:Close reason: [Success] "HEARTBEAT TIMEOUT at msgNum 2"
eventstore_1  | [00001,13,08:37:35.084] Connection '"external-normal"":1c6f372d33600741adaf5030299bc0ef"' [172.19.0.5:39810, {831b6de3-e5af-4b05-9a59-cf937a73ed12}] closed: Success.
eventstore_1  | [00001,46,08:37:35.084] Persistent subscription lost connection from 172.19.0.5:39810
eventstore_1  | [00001,46,08:37:35.084] Retrying message "everything::projection" "everything"/56926
eventstore_1  | [00001,46,08:37:35.084] Retrying message "everything::projection" "everything"/58503
eventstore_1  | [00001,46,08:37:35.084] Retrying message "everything::projection" "everything"/58504

@enumag
Copy link
Member Author

enumag commented Apr 2, 2019

Do you know what is the heartbeat about? Could some changes to ConnectionSettings::heartbeatInterval() and ConnectionSettings::heartbeatTimeout() fix it?

I found the default values here but I'm not sure what settings I should try. The issue is still not easy to reproduce for me to experiment with it properly.

Btw. the failure seems to always occur right after an event that is somewhat bigger than usual (holds more data inside).

@enumag
Copy link
Member Author

enumag commented Apr 2, 2019

Hypothesis: Maybe the heartbeat does not occur while I'm processing an event so if the processing of one event takes longer than heartbeatTimeout then the connection is closed.

@prolic
Copy link
Member

prolic commented May 6, 2019

Upon further investigation: This is not a bug in the library!

The problem occurs when the server is sending too many data for the client to handle and thus the server decides, the connection is broken, because too much data is piling up.

There are two possible options:

  1. ConnectionPendingSendBytesThreshold is a setting on the server, the default is 10485760. You can increase this number.
  2. Use a smaller batchSize - if you set batchSize to 1000 and have 50KB event size, you reach that pending bytes treshold really fast.

Most of the times, the server is much faster sending data than any php script could handle.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants