-
Notifications
You must be signed in to change notification settings - Fork 10
heartbeat timeout cause crash #70
Comments
nothing in rabbitmq logs, so assuming its a client error that's just not handled as of yet. |
Ooh looks fun! That'd usually indicate that the client's connection dropped in which case the crash (and therefore process exit) is expected. For anything where the connection dies, the immediate action is to crash. It could throw a handleable error, but then you'd have a running container that wasn't actually working as expected. Is that correct or have I found the wrong end of the stick? |
No you've got the gist. Just so happens that it's running on... nodemon... so if we were not it would have restarted and been fine; instead it waits for changes. But maybe it'd be better to handle reconnection without crashing? |
We could look in to handling reconnect attempts, yeah, but would be a pretty large change. That said, if I remember correctly there are a few libraries floating around that wrap Possible, but needs some looking in to. For now, switch that mofo to |
yeah that works, definitely shouldn't have been on nodemon, just an observation i noticed since it had been using it. i guess though if you can imagine a split brain situation where an api server can no longer reach an remote rabbitmq server. If the heartbeat causes a crash everything would be reconnecting at the same time and potentially bring down rabbitmq? I guess even if we did not crash and tried to reconnect the same would apply, unless the library which wraps amqplib has an option for some randomness. For example, SocketCluster has the following config:
That's not a possibility in our use as we have relatively few services, but yeah maybe migrating to a wrapper could be a good shout especially if it's completely compatible with our current one :) essentially would be reconnection logic for free |
Sure that makes sense. I'll close this - as technically it's expected behaviour - but open up another issue (#71) for a feature request for reconnection logic. I'll try have a look at the listed packages tonight and see what's viable. |
This would seem to have happened on remit 2 but not certain because it's happened on a now dead docker container.
but this error was not caught and caused a crash
The text was updated successfully, but these errors were encountered: