Locks return without checking whether the lease is still held #11456

aphyr · 2019-12-16T21:54:30Z

In Jepsen testing, we found that updating shared state protected by a lock could result in lost updates or other anomalies. This was due in part to the fact that distributed locks are intrinsically unsafe, but was exacerbated by an issue where a server failed to check whether a lease was valid after blocking for a lock to be released, and simply told the client they held the lock instead.

As @xiang90 wrote:

It seems the root cause is process 0's failure to maintain its lease via keepalive (not sure why this happened though...). We know this because the explicit release of the lease failed after the acquire operation.

There is also a "bug" on the etcd server side. While waiting for the previous lock holder to release the lock, the client does not check the current lease status. So when the lock operation returns, the lease might already have expired, and the etcd client will falsely think it still holds the lock.

We can mitigate the issue by checking the lease status before the lock operation returns. However, it won't completely solve the problem. If the response RPC takes long enough, the lease might still expire in between...

Basically, as we discussed, the lock holder has to check the sequence number (in etcd, the revision of the lock key) and the remaining lease timeout before doing anything...

More concretely, we should check whether the owner key still exists here before returning:

etcd/clientv3/concurrency/key.go

Line 57 in 34bd797

if len(resp.Kvs) == 0 {
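To illustrate the check described in the quote, here is a minimal sketch that uses only the Go standard library. It is not the etcd client code and not the change made upstream: `kvReader`, `memKV`, `confirmOwnership`, and `errOwnerKeyGone` are hypothetical names invented for this example, and an in-memory map stands in for the key-value store. The idea is that once the wait for the previous holder finishes, the caller re-reads its own owner key and refuses to report the lock as held if the key is gone.

```go
package main

import (
	"errors"
	"fmt"
)

// errOwnerKeyGone is a hypothetical error for this sketch only.
var errOwnerKeyGone = errors.New("owner key no longer exists; lease has likely expired")

// kvReader is a hypothetical stand-in for a key-value read. It is not etcd API.
type kvReader interface {
	Get(key string) (values []string, err error)
}

// memKV is an in-memory store used to simulate a lease expiring.
type memKV map[string]string

// Get returns the value stored under key, or an empty result if the key is absent.
func (m memKV) Get(key string) ([]string, error) {
	if v, ok := m[key]; ok {
		return []string{v}, nil
	}
	return nil, nil
}

// confirmOwnership is meant to run after the wait for the previous holder
// has finished and before the caller is told that it holds the lock.
func confirmOwnership(kv kvReader, ownerKey string) error {
	vals, err := kv.Get(ownerKey)
	if err != nil {
		return err
	}
	if len(vals) == 0 {
		// The owner key vanished while we were waiting.
		return errOwnerKeyGone
	}
	return nil
}

func main() {
	kv := memKV{"lock/a": "holder-1"}
	fmt.Println(confirmOwnership(kv, "lock/a"))

	delete(kv, "lock/a") // simulate lease expiry removing the owner key
	fmt.Println(confirmOwnership(kv, "lock/a"))
}
```

As the quote points out, this only narrows the window: the lease can still expire between the check and the moment the response reaches the client.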

This was addressed by #11408; I'm creating this issue as a reference point for later readers.

aphyr closed this as completed Dec 16, 2019

aphyr mentioned this issue Dec 16, 2019

Document that locks aren't really locks #11457

Closed

agargi mentioned this issue Dec 14, 2020

concurrency: Improve distributed locking by adding support for cancel… #12547

Closed
