Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There is a race between RMQ shutting itself down after heartbeat failure and other operations. #104

Closed
fmstephe opened this issue May 20, 2021 · 1 comment
Assignees
Labels

Comments

@fmstephe
Copy link
Contributor

Currently when RMQ has established a connection it will send a heartbeat, which sets a value with a 1 minute TTL, every second. This tells the queue cleaner that this connection is still live and should not be removed and cleaned up.

https://github.com/adjust/rmq/blob/master/connection.go#L110

If the heartbeat operation fails 45 times in a row, i.e. the redis is unreachable for 45 seconds, then the connection stops itself and shuts down all existing consumers.

https://github.com/adjust/rmq/blob/master/connection.go#L136

If the connection has shut itself down due to heartbeat failure calls to either *redisConnection.OpenQueue() or *redisQueue.StartConsuming() will attempt to write data into redis related to the connection which has already shut itself down.

The queue data that is written to redis in this case will contain key values for a connection which no longer exists

This race is unlikely under most usage scenarios, because it requires redis to be

  1. available to establish the connection
  2. unavailable for more than 45 seconds
  3. available again when we try to create a new queue or start consuming on one

However, if the client system expects redises to be unavailable but does not want their service to shutdown during this period, any persistent retry mechanism they implement makes this scenario more likely to occur.

@fmstephe fmstephe self-assigned this May 20, 2021
@wellle
Copy link
Member

wellle commented Jan 17, 2024

This is all very true! I don't think we will be addressing this anytime soon though, so I'm closing this as won't fix ✌️

@wellle wellle closed this as completed Jan 17, 2024
@wellle wellle added the wontfix label Jan 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants