
Lock response when lease has expired #9921

Closed
naddison36 opened this issue Jul 12, 2018 · 3 comments

Comments

@naddison36

I have a scenario where a lock is granted even though the lease associated with the lock request has already expired.

Two processes want to acquire a lock on the same lock name. The first process successfully acquires the lock using a LockRequest with a lease time-to-live of 20 seconds. Immediately afterwards, a second process tries to acquire the same lock name, but with a lease time-to-live of only 5 seconds.

After 20 seconds the first lease expires and the lock is released. This in turn sends a LockResponse from the Lock service back to the second process. The second process now thinks it holds the lock even though it doesn't, as its own lease has already expired.

[diagram: etcdlocking]

Is this the expected behaviour?
I would have expected the LockResponse to return an error if the lock could not be acquired before the request's lease expired.

If the above is expected behaviour, how are clients meant to handle this scenario? I can see two solutions:

  1. the client makes another call after receiving a LockResponse to check that it actually holds the lock (see the sketch at the end of this comment)
  2. the client maintains its own timer and raises an error when the lock is not acquired in time

Or have I completely misunderstood how distributed locks work with etcd?
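
For reference, here is a minimal sketch of option 1 in Go, assuming the clientv3 package (the endpoint, TTL and import path are placeholders, and the lock call itself is elided). Instead of re-reading the key, it checks that the lease backing the request is still alive, which amounts to the same thing since the lock ownership key is attached to that lease.

```go
// Sketch only: verify the lease is still alive after the lock call returns.
package main

import (
	"context"
	"fmt"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3" // older releases: github.com/coreos/etcd/clientv3
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// Lease with a 5 second TTL, as for the second process above.
	lease, err := cli.Grant(ctx, 5)
	if err != nil {
		log.Fatal(err)
	}

	// ... acquire the lock here with lease.ID (e.g. via the v3lock service);
	// the call may block for longer than the lease TTL ...

	// Option 1: once the lock call returns, confirm the lease is still alive.
	ttl, err := cli.TimeToLive(ctx, lease.ID)
	if err != nil {
		log.Fatal(err)
	}
	if ttl.TTL <= 0 {
		// A TTL of -1 means the lease has already expired, so the returned
		// lock must not be trusted.
		log.Fatal("lease expired before the lock was acquired")
	}
	fmt.Printf("lock is backed by a live lease (%ds remaining)\n", ttl.TTL)
}
```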

@vimalk78
Contributor

clientv3/concurrency/Mutex.Lock() works by creating a key (with a lease, default TTL of 60 seconds, kept alive every 20 seconds) named "lock-prefix" plus the leaseID.
If the create revision of this key is the lowest, the lock is acquired; otherwise it waits for all keys with the "lock-prefix" and a lower create revision to be deleted. Since the keys are kept alive (unless the user cancels the ctx), a session waiting for other keys to be deleted doesn't have to worry about its own key being deleted.
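
For reference, a minimal sketch of that flow using the concurrency package (endpoint, TTL and lock prefix are placeholders; import paths differ between older and newer etcd releases):

```go
// Sketch: the session keeps the lease alive in the background, so a waiting
// Mutex.Lock() never loses its own key while it is queued behind older keys.
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// The session grants a lease (60 second TTL by default) and keeps it alive.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(60))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	// Lock() writes a key under the prefix with the session's lease, then waits
	// until its create revision is the lowest among keys with that prefix.
	mu := concurrency.NewMutex(sess, "/my-lock/")
	if err := mu.Lock(context.TODO()); err != nil {
		log.Fatal(err)
	}
	defer mu.Unlock(context.TODO())
	// ... critical section ...
}
```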

In the case of etcdserver/api/v3lock.LockServer.Lock() the keys are not kept alive. This is done by making the session an Orphan(). Each key can have a different TTL and can expire independently. So if the lock is already held by key1 with a TTL of 20 seconds, and another key2 with a TTL of 5 seconds is waiting for the lock, key2 can wait for key1 to expire, but it never finds out that it has itself expired.
The Session.Done() channel doesn't help either, since it is closed as soon as Orphan() is invoked, which cancels the context and closes the Done() channel long before the key expires.

Perhaps in this case an additional watch is required on the self key (in addition to the watch handling in concurrency.waitDeletes()), which can stop the watch for the earlier keys once the TTL expires the key.
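
A rough illustration of that idea, assuming the clientv3 Watch API (the function name and key are placeholders): watch the waiter's own key and abandon the wait as soon as lease expiry deletes it.

```go
// Sketch: while waiting for keys with lower create revisions, also watch our own
// key; if our lease expires, etcd deletes the key and the wait can be abandoned.
package lockwatch

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// waitOrExpire returns an error as soon as ownKey (the key this waiter wrote
// under the lock prefix) is deleted, i.e. its lease expired while still queued.
func waitOrExpire(ctx context.Context, cli *clientv3.Client, ownKey string) error {
	for wresp := range cli.Watch(ctx, ownKey) {
		for _, ev := range wresp.Events {
			if ev.Type == clientv3.EventTypeDelete {
				return fmt.Errorf("lease expired before lock %q was acquired", ownKey)
			}
		}
	}
	return ctx.Err()
}
```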

@gyuho, what is your opinion?

@naddison36
Author

Thanks for commenting on this. That helps me better understand what's going on.

BTW, I went with the second option listed above. That is, I have a timer in my client that fires if the lock has not been acquired in time, and I ignore a returned lock whose lease has already timed out.
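
For anyone landing on this later, a rough sketch of that timer approach in Go, assuming the generated v3lockpb LockClient (the function name, lock name and import path are placeholders and depend on the etcd version): bound the lock wait by the lease TTL and treat a response that would arrive after it as "not acquired".

```go
// Sketch of option 2: give the lock RPC a deadline no longer than the lease TTL,
// so a LockResponse that arrives after the lease has expired is never trusted.
package lockclient

import (
	"context"
	"errors"
	"time"

	"github.com/coreos/etcd/etcdserver/api/v3lock/v3lockpb" // path varies by etcd version
)

func lockWithinLease(lc v3lockpb.LockClient, leaseID int64, ttl time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), ttl)
	defer cancel()

	resp, err := lc.Lock(ctx, &v3lockpb.LockRequest{
		Name:  []byte("my-lock"), // placeholder lock name
		Lease: leaseID,
	})
	if err != nil {
		if ctx.Err() == context.DeadlineExceeded {
			// The lease would have expired before the lock was granted.
			return nil, errors.New("lock not acquired before the lease expired")
		}
		return nil, err
	}
	return resp.Key, nil // resp.Key identifies the lock ownership key
}
```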

@stale

stale bot commented Apr 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 7, 2020
@stale stale bot closed this as completed Apr 28, 2020