Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX: Only attempt to disconnect idle client instances #6

Closed
wants to merge 1 commit into from

Conversation

davidtaylorhq
Copy link
Member

@davidtaylorhq davidtaylorhq commented Oct 28, 2020

Waiting to acquire the mutex on the ::Redis objects is likely to cause a deadlock in multi-threaded environments. For example:

  • Thread 1 and thread 2 have in-progress redis requests during a failover
  • Thread 1 is the first to raise an exception, triggers RailsFailover::Redis::Handler#disconnect_clients, acquiring the RailsFailover::Redis::Handler mutex
  • Thread 1 loops through all ::Redis instances, acquire their locks, and disconnect them
  • When it reaches the ::Redis instance used by Thread 2, Thread 1 waits for the mutex to become available

  • Meanwhile, Thread 2 raises an exception and triggers RailsFailover::Redis::Handler#disconnect_clients
  • Thread 2 waits for the RailsFailover::Redis::Handler mutex to become available

This situation is never resolved, and the threads remained deadlocked.

This commit will skip disconnecting ::Redis clients which are currently locked by another thread. If the lock is held by another thread, it is fair to assume that a request is in process, it will fail, and the client will automatically disconnect very soon.

Waiting to acquire the mutex on the `::Redis` objects is likely to cause a deadlock in multi-threaded environments. For example:

- Thread 1 and thread 2 have in-progress redis requests during a failover
- Thread 1 is the first to raise an exception, triggers RailsFailover::Redis::Handler#disconnect_clients, acquiring the RailsFailover::Redis::Handler mutex
- Thread 1 loops through all ::Redis instances, acquire their locks, and disconnect them
- When it reaches the ::Redis instance used by Thread 2, Thread 1 waits for the mutex to become available

- Meanwhile, Thread 2 raises an exception and triggers RailsFailover::Redis::Handler#disconnect_clients
- Thread 2 waits for the RailsFailover::Redis::Handler mutex to become available

This situation is never resolved, and the threads remained deadlocked.

This commit will skip disconnecting `::Redis` clients which are currently locked by another thread. If the lock is held by another thread, it is fair to assume that a request is in process, it will fail, and the client will automatically disconnect very soon.
Copy link

@eviltrout eviltrout left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice one.

@tgxworld
Copy link
Contributor

@davidtaylorhq ahh this only happens in Sidekiq right? When I added the feature I was considering Unicorn workers only which only have one instance of Redis each.

@davidtaylorhq
Copy link
Member Author

Yep, this is only causing problems in sidekiq 👍

@davidtaylorhq
Copy link
Member Author

I realised that this assertion I made

If the lock is held by another thread, it is fair to assume that a request is in process, it will fail, and the client will automatically disconnect very soon.

Is not always correct. When we are failing over due to a connection error, it holds. But if we are failing over manually, or failing back to the master, we cannot rely on a connection error to resolve things.

Going to close this PR and come back with another solution.

@CvX CvX deleted the redis-deadlock branch February 15, 2022 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants