High volume / low latency locking is unstable #4

Closed
kurtome opened this issue Aug 11, 2015 · 4 comments

Comments

@kurtome
Owner

kurtome commented Aug 11, 2015

Let me preface this issue by saying it could be completely dependent on our particular installation and usage pattern. We had 3 Consul masters and a few dozen Consul agents, all running Consul v0.5.2, and were using this library to acquire dozens of locks a minute, each held for under 2 seconds.

The main issue we were seeing was leadership handoff between the master nodes about every 2 hours, sometimes more frequently. During each handoff there was no leader for about 1 second, so any attempt to acquire a lock in that window failed immediately, causing a short loss of functionality for our application.
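As an illustration of how a caller might ride out a leaderless window of roughly a second, here is a minimal retry-with-backoff sketch. It is not part of this library or of Consul: `acquire_with_retry` and the `try_acquire` callable are hypothetical names, and `try_acquire` stands in for whatever lock call is in use, assumed to return `True` on success and to return `False` or raise when no leader is available.

```python
import time


def acquire_with_retry(try_acquire, attempts=5, initial_delay=0.2,
                       backoff=2.0, sleep=time.sleep):
    """Call try_acquire() until it returns True or attempts run out.

    try_acquire is a hypothetical zero-argument callable wrapping the
    real lock call. Returns True if the lock was acquired, else False.
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            if try_acquire():
                return True
        except Exception:
            # Assumption: any exception is a transient failure such as
            # "no cluster leader". Real code should catch a narrower type.
            pass
        if attempt < attempts:
            sleep(delay)      # wait before the next attempt
            delay *= backoff  # exponential backoff
    return False
```

With the default values the waits are 0.2, 0.4, 0.8 and 1.6 seconds, about 3 seconds in total, which would cover a one-second election but also adds latency to locks that are only meant to be held for under 2 seconds.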

After reading more in the Raft paper and the Chubby lock paper, it seems to me that short-lived locks are not the intended use case of Consul's locking API. Instead, it seems much more useful for leader-election-style locks, where one master controls the resource that is being locked.

@kurtome
Owner Author

kurtome commented Aug 11, 2015

For simplicity we wound up using our existing Zookeeper cluster and the Kazoo library for locking, since this seemed to be more stable and required no additional maintenance on our part.
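For readers unfamiliar with Kazoo, a short-lived lock with its lock recipe looks roughly like the sketch below. This is not the code referred to above: `run_locked`, the lock path and the identifier are hypothetical, and `zk` is assumed to be an already started `kazoo.client.KazooClient`.

```python
def run_locked(zk, path, identifier, work):
    """Run work() while holding the ZooKeeper lock at `path`.

    zk is assumed to be a started kazoo.client.KazooClient, e.g.:
        from kazoo.client import KazooClient
        zk = KazooClient(hosts="127.0.0.1:2181")
        zk.start()
    """
    lock = zk.Lock(path, identifier)
    # The context manager blocks until the lock is acquired and
    # releases it on exit, even if work() raises.
    with lock:
        return work()
```

A call such as `run_locked(zk, "/locks/report", "worker-1", build_report)` would hold the lock only for the duration of `build_report`, a hypothetical function standing in for the protected work.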

@armon

armon commented Aug 11, 2015

@kurtome Consul locking really isn't intended for short-lived locks like this. That said, you shouldn't be experiencing such a high level of leadership churn either, and even a few dozen requests per second shouldn't have been an issue. It would be great if you could provide more details on the deployment and issues as a ticket against Consul so that we can investigate.

@kurtome
Owner Author

kurtome commented Aug 11, 2015

Also worth noting: other Consul users have had issues with Consul leadership elections happening frequently, and it seems many performance improvements are on the roadmap for 0.6.0: https://groups.google.com/forum/#!topic/consul-tool/Yp-j7bZYkmI

@kurtome
Owner Author

kurtome commented Aug 11, 2015

@armon thanks for the confirmation!

We're not planning to dig into our Consul issues any further right now, because it's working well enough for our other needs (it's also possible that the increased load from our usage of this library contributed to the problem), but if we need more I'll open a ticket in the Consul project.

@kurtome kurtome changed the title from "High volume / low latency locking is unstable in consul 0.5.2" to "High volume / low latency locking is unstable" Aug 11, 2015
@kurtome kurtome closed this as completed Oct 23, 2018

2 participants