
Lock response when lease has expired #9921

Closed
naddison36 opened this issue Jul 12, 2018 · 3 comments

Comments

@naddison36

I have a scenario where a lock is granted even though the lease associated with the lock request has already expired.

Two processes want to acquire a lock on the same lock name. The first process successfully acquires the lock using a LockRequest with a lease time-to-live of 20 seconds. Immediately afterwards, a second process tries to acquire the same lock name, but with a lease time-to-live of only 5 seconds.

After 20 seconds the first lease expires and the lock is released. This in turn sends a LockResponse from the Lock service back to the second process. The second process now thinks it holds the lock even though it doesn't, as its own lease has already expired.

[diagram: etcdlocking]

Is this the expected behaviour?
I would have expected the LockResponse to return an error if the lock could not be acquired before the request's lease expired.

If the above is expected behaviour, how are clients meant to handle this scenario? I can see two solutions:

  1. the client makes another call after receiving a LockResponse to check that it actually holds the lock (see the sketch at the end of this comment)
  2. the client maintains its own timer and raises an error when the lock is not acquired in time

Or have I completely misunderstood how distributed locks work with etcd?
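
For reference, here is a minimal sketch of option 1 in Go, assuming the clientv3 package (the endpoint, TTL and import path are placeholders, and the lock call itself is elided). Instead of re-reading the key, it checks that the lease backing the request is still alive, which amounts to the same thing since the lock ownership key is attached to that lease.

```go
// Sketch only: verify the lease is still alive after the lock call returns.
package main

import (
	"context"
	"fmt"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3" // older releases: github.com/coreos/etcd/clientv3
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx := context.Background()

	// Lease with a 5 second TTL, as for the second process above.
	lease, err := cli.Grant(ctx, 5)
	if err != nil {
		log.Fatal(err)
	}

	// ... acquire the lock here with lease.ID (e.g. via the v3lock service);
	// the call may block for longer than the lease TTL ...

	// Option 1: once the lock call returns, confirm the lease is still alive.
	ttl, err := cli.TimeToLive(ctx, lease.ID)
	if err != nil {
		log.Fatal(err)
	}
	if ttl.TTL <= 0 {
		// A TTL of -1 means the lease has already expired, so the returned
		// lock must not be trusted.
		log.Fatal("lease expired before the lock was acquired")
	}
	fmt.Printf("lock is backed by a live lease (%ds remaining)\n", ttl.TTL)
}
```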

@vimalk78
Contributor

clientv3/concurrency/Mutex.Lock() works by creating a key (with a lease, default TTL of 60 seconds, kept alive every 20 seconds) named "lock-prefix" plus the leaseID.
If the create revision of this key is the lowest, the lock is acquired; otherwise it waits for all keys with the "lock-prefix" and a lower create revision to be deleted. Since the keys are kept alive (unless the user cancels the ctx), a session waiting for other keys to be deleted doesn't have to worry about its own key being deleted.
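
For reference, a minimal sketch of that flow using the concurrency package (endpoint, TTL and lock prefix are placeholders; import paths differ between older and newer etcd releases):

```go
// Sketch: the session keeps the lease alive in the background, so a waiting
// Mutex.Lock() never loses its own key while it is queued behind older keys.
package main

import (
	"context"
	"log"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/concurrency"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"localhost:2379"}})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// The session grants a lease (60 second TTL by default) and keeps it alive.
	sess, err := concurrency.NewSession(cli, concurrency.WithTTL(60))
	if err != nil {
		log.Fatal(err)
	}
	defer sess.Close()

	// Lock() writes a key under the prefix with the session's lease, then waits
	// until its create revision is the lowest among keys with that prefix.
	mu := concurrency.NewMutex(sess, "/my-lock/")
	if err := mu.Lock(context.TODO()); err != nil {
		log.Fatal(err)
	}
	defer mu.Unlock(context.TODO())
	// ... critical section ...
}
```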

In the case of etcdserver/api/v3lock.LockServer.Lock() the keys are not kept alive. This is done by making the session an Orphan(). Each key can have a different TTL and can expire independently. So if the lock is already held by key1 with a TTL of 20 seconds, and another key2 with a TTL of 5 seconds is waiting for the lock, key2 can wait for key1 to expire, but it never finds out that it has itself expired.
The Session.Done() channel doesn't help either, since it is closed as soon as Orphan() is invoked, which cancels the context and closes the Done() channel long before the key expires.

Perhaps in this case an additional watch is required on the self key (in addition to the watch handling in concurrency.waitDeletes()), which can stop the watch for the earlier keys once the TTL expires the key.
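
A rough illustration of that idea, assuming the clientv3 Watch API (the function name and key are placeholders): watch the waiter's own key and abandon the wait as soon as lease expiry deletes it.

```go
// Sketch: while waiting for keys with lower create revisions, also watch our own
// key; if our lease expires, etcd deletes the key and the wait can be abandoned.
package lockwatch

import (
	"context"
	"fmt"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// waitOrExpire returns an error as soon as ownKey (the key this waiter wrote
// under the lock prefix) is deleted, i.e. its lease expired while still queued.
func waitOrExpire(ctx context.Context, cli *clientv3.Client, ownKey string) error {
	for wresp := range cli.Watch(ctx, ownKey) {
		for _, ev := range wresp.Events {
			if ev.Type == clientv3.EventTypeDelete {
				return fmt.Errorf("lease expired before lock %q was acquired", ownKey)
			}
		}
	}
	return ctx.Err()
}
```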

@gyuho, what is your opinion?

@naddison36
Author

Thanks for commenting on this. That helps me better understand what's going on.

BTW, I went with the second option listed above. That is, I have a timer in my client that fires if the lock has not been acquired in time, and I ignore a returned lock whose lease has already timed out.
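
For anyone landing on this later, a rough sketch of that timer approach in Go, assuming the generated v3lockpb LockClient (the function name, lock name and import path are placeholders and depend on the etcd version): bound the lock wait by the lease TTL and treat a response that would arrive after it as "not acquired".

```go
// Sketch of option 2: give the lock RPC a deadline no longer than the lease TTL,
// so a LockResponse that arrives after the lease has expired is never trusted.
package lockclient

import (
	"context"
	"errors"
	"time"

	"github.com/coreos/etcd/etcdserver/api/v3lock/v3lockpb" // path varies by etcd version
)

func lockWithinLease(lc v3lockpb.LockClient, leaseID int64, ttl time.Duration) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), ttl)
	defer cancel()

	resp, err := lc.Lock(ctx, &v3lockpb.LockRequest{
		Name:  []byte("my-lock"), // placeholder lock name
		Lease: leaseID,
	})
	if err != nil {
		if ctx.Err() == context.DeadlineExceeded {
			// The lease would have expired before the lock was granted.
			return nil, errors.New("lock not acquired before the lease expired")
		}
		return nil, err
	}
	return resp.Key, nil // resp.Key identifies the lock ownership key
}
```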

@stale

stale bot commented Apr 7, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 7, 2020
@stale stale bot closed this as completed Apr 28, 2020