Distributed lock could be acquired more than once. #127
The problem is, redbeat uses
For example, suppose we have two instances running and A currently owns the lock. A has a network issue and can't connect to redis, so B acquires the lock. When A is reconnected, it uses
To reproduce this scenario, you can start one
And then use
Wait until the lock in redis times out:
Then start another redbeat instance:
So we have two instances running now. Each job will be put in the queue twice and will be run twice.
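The failure mode can be modelled without a real redis server. The sketch below is only an illustration: `FakeRedis`, `refresh_unchecked` and `refresh_checked` are hypothetical names, and it assumes the refresh is a plain expire on the lock key that never looks at the stored owner.

```python
class FakeRedis:
    """Tiny in-memory stand-in for one key with a TTL (hypothetical, not redis-py).

    Time is advanced by hand through `now` so the scenario is deterministic.
    """

    def __init__(self):
        self.now = 0
        self.value = None
        self.expires_at = 0

    def _live(self):
        return self.value is not None and self.now < self.expires_at

    def acquire(self, owner, ttl):
        # Models SET key owner NX EX ttl: only succeeds if no live key exists.
        if self._live():
            return False
        self.value, self.expires_at = owner, self.now + ttl
        return True

    def refresh_unchecked(self, ttl):
        # Models EXPIRE key ttl: succeeds for ANY caller while the key exists.
        if not self._live():
            return False
        self.expires_at = self.now + ttl
        return True

    def refresh_checked(self, owner, ttl):
        # Models "check the owner first, then extend".
        if not self._live() or self.value != owner:
            return False
        self.expires_at = self.now + ttl
        return True


def simulate():
    r = FakeRedis()
    assert r.acquire("A", ttl=10)   # A owns the lock
    r.now = 11                      # A is offline until the lock times out
    assert r.acquire("B", ttl=10)   # B takes over
    # A reconnects and refreshes "its" lock.
    unchecked = r.refresh_unchecked(ttl=10)    # A is not told it lost the lock
    checked = r.refresh_checked("A", ttl=10)   # A is rejected
    return unchecked, checked


print(simulate())
```

With the unchecked refresh A gets a success result even though the key now belongs to B, so both instances keep scheduling. With the owner check A's refresh fails and it can stop.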
I have tried using redis-py's lock.extend ( https://github.com/andymccurdy/redis-py/blob/b51bfd818ce36cc3ae8591b54c988fbb16eb336d/redis/lock.py#L235 ) to replace
I guess the best way to do this is to register a lua script, just like the redis-py lock does: first check the lock's owner, and only if it matches, extend the lock.
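As a rough sketch of that idea (not the script posted in this thread), something along these lines could work. `EXTEND_IF_OWNER_LUA` and `make_extender` are hypothetical names; `client` is assumed to be a redis-py client, whose `register_script` returns a callable that takes `keys` and `args`. The script sets a fresh TTL with `pexpire` rather than adding to the remaining one.

```python
# Hypothetical Lua script: reset the TTL only if the caller still owns the lock.
# KEYS[1] = lock key, ARGV[1] = owner token, ARGV[2] = new TTL in milliseconds.
EXTEND_IF_OWNER_LUA = """
local token = redis.call('get', KEYS[1])
if not token or token ~= ARGV[1] then
    return 0
end
redis.call('pexpire', KEYS[1], ARGV[2])
return 1
"""


def make_extender(client):
    """Register the script once and return an extend(key, token, ttl_ms) helper."""
    script = client.register_script(EXTEND_IF_OWNER_LUA)

    def extend(key, token, ttl_ms):
        # True if the TTL was reset, False if the lock is gone or owned by someone else.
        return bool(script(keys=[key], args=[token, ttl_ms]))

    return extend
```

A caller that gets `False` back has lost the lock and should stop scheduling instead of carrying on.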
@sibson what do you think?
The extend lua script:
Thanks for debugging this and providing a great summary. My first instinct would be to use redis-py's implementation, to avoid having more code to maintain but I'm not sure if that opens us to a situation where the lock extends far far into the future. @laixintao you have a good understanding of this. What do you think is the best option?
I think the best solution would be for redis-py's lock.extend to take a new timeout and set it directly, instead of adding it to the time remaining. But an issue asking for this has been open for 4 years, and it seems they won't support the feature: andymccurdy/redis-py#629 .
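To make the concern concrete, here is a small arithmetic sketch comparing the two behaviours. The numbers (30 second timeout, refresh every 5 seconds) and the function name `ttl_after_refreshes` are made up for illustration and are not redbeat defaults.

```python
def ttl_after_refreshes(timeout, interval, refreshes, additive):
    """Return the lock's remaining TTL right after the last refresh.

    additive=True  models an extend that adds `timeout` to the time remaining.
    additive=False models an extend that resets the TTL to `timeout`.
    """
    ttl = timeout
    for _ in range(refreshes):
        ttl -= interval  # time passes between refreshes
        ttl = ttl + timeout if additive else timeout
    return ttl


print(ttl_after_refreshes(30, 5, 10, additive=True))   # 280
print(ttl_after_refreshes(30, 5, 10, additive=False))  # 30
```

With the additive form the TTL grows by `timeout - interval` on every refresh, so if the owner dies the other instance has to wait far longer than the configured timeout before it can take over.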
So the choices left to us would be:
I will try
I think we can try