dynamodb-lease design

dynamodb-lease uses DynamoDB condition expressions and TTL to provide concurrency-safe distributed leases.

Table

The lease table has the following fields:

  • key (S, hash key)
  • lease_expiry (N, ttl enabled)
  • lease_version (S)
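
As a rough sketch, the table described above could be expressed as the keyword arguments that boto3's low-level DynamoDB client accepts for `create_table` and `update_time_to_live`. The table name `leases` and the on-demand billing mode are assumptions made for illustration; only the attribute names and types come from the field list.

```python
# Hypothetical table name; the attribute names come from the field list above.
TABLE_NAME = "leases"


def create_table_params(table_name: str = TABLE_NAME) -> dict:
    """Keyword arguments for a boto3 DynamoDB client's create_table(**params)."""
    return {
        "TableName": table_name,
        # Only key attributes are declared; lease_expiry and lease_version
        # are ordinary attributes and need no schema entry.
        "AttributeDefinitions": [{"AttributeName": "key", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "key", "KeyType": "HASH"}],
        # Assumption: on-demand billing, chosen only to keep the sketch short.
        "BillingMode": "PAY_PER_REQUEST",
    }


def enable_ttl_params(table_name: str = TABLE_NAME) -> dict:
    """Keyword arguments for the client's update_time_to_live(**params)."""
    return {
        "TableName": table_name,
        "TimeToLiveSpecification": {
            "Enabled": True,
            "AttributeName": "lease_expiry",
        },
    }
```

TTL is enabled in a separate call because it is a table setting rather than part of the key schema.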

Acquire, extend, drop algorithm

To acquire a lease for key foo (using default config values):

  • PutItem with key: foo with:
    • lease_version a unique id.
    • lease_expiry unix timestamp set to 60s from now.
    • Condition that the item does not exist yet.
  • In the background periodically UpdateItem key: foo with:
    • lease_version a new unique id.
    • lease_expiry 60s from now.
    • Condition that the lease_version is the previous value.
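
The acquire and extend steps above might translate into request parameters along these lines. This is a minimal sketch, not the library's implementation: it builds the dictionaries a boto3 low-level client would take for `put_item` and `update_item`, and the function names, the `table_name` argument, and the use of UUIDs as unique ids are assumptions.

```python
import time
import uuid

LEASE_TTL_SECONDS = 60  # default lease duration from the text above


def acquire_params(table_name, key, now=None):
    """PutItem arguments: succeed only if no item exists for this key."""
    now = time.time() if now is None else now
    return {
        "TableName": table_name,
        "Item": {
            "key": {"S": key},
            "lease_version": {"S": str(uuid.uuid4())},
            "lease_expiry": {"N": str(int(now) + LEASE_TTL_SECONDS)},
        },
        # "#k" is a placeholder for the attribute name "key", which avoids
        # any clash with DynamoDB's reserved words.
        "ConditionExpression": "attribute_not_exists(#k)",
        "ExpressionAttributeNames": {"#k": "key"},
    }


def extend_params(table_name, key, previous_version, now=None):
    """UpdateItem arguments: succeed only if lease_version is unchanged."""
    now = time.time() if now is None else now
    return {
        "TableName": table_name,
        "Key": {"key": {"S": key}},
        "UpdateExpression": "SET lease_version = :new, lease_expiry = :exp",
        "ConditionExpression": "lease_version = :prev",
        "ExpressionAttributeValues": {
            ":new": {"S": str(uuid.uuid4())},
            ":exp": {"N": str(int(now) + LEASE_TTL_SECONDS)},
            ":prev": {"S": previous_version},
        },
    }
```

If either condition fails, DynamoDB rejects the write with a conditional check failure, which is what prevents a second holder from taking or extending the same lease.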

The lease is now alive and cannot be acquired elsewhere.

When finished, the lease is dropped.

  • On drop, DeleteItem key: foo with:
    • Condition that the lease_version is the current value.
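
The drop step could be sketched the same way, as arguments for a boto3 low-level client's `delete_item`. The function name and `table_name` argument are hypothetical.

```python
def drop_params(table_name: str, key: str, current_version: str) -> dict:
    """DeleteItem arguments: delete only if we still hold the current version."""
    return {
        "TableName": table_name,
        "Key": {"key": {"S": key}},
        # Guards against deleting a lease that another holder has since acquired.
        "ConditionExpression": "lease_version = :cur",
        "ExpressionAttributeValues": {":cur": {"S": current_version}},
    }
```

The version condition matters when the original lease has already expired and been re-acquired elsewhere: the stale holder's delete is then rejected instead of removing someone else's lease.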

A new lease can now be acquired.

Edge cases, issues & error scenarios

DynamoDB leases provide decent exclusivity for the initial lease_expiry and make a "best effort" to extend for as long as needed. Because of this, leases alone may not provide a strong enough guarantee for processes that must never lose exclusivity.

Lost access to db after acquiring lease

If access to the db is lost after acquiring a lease, the background task will be unable to UpdateItem to extend the lease. The lease holder will also be unable to DeleteItem on drop.

  • The lease remains exclusive until the original lease_expiry passes. To provide a decent guarantee of exclusivity, it makes sense to set the ttl longer than the expected maximum duration of the work.
  • Because DeleteItem fails, other tasks remain blocked, but only until the lease_expiry ttl triggers DynamoDB to remove the item. This is not a deadlock, but it does mean the ttl shouldn't be too long.
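
The lost-access scenario can be illustrated with a small in-memory model. Everything here is a stand-in written for this sketch: `FakeLeaseTable` is not a DynamoDB client, and it removes items exactly at expiry, whereas real TTL deletion is asynchronous and can lag behind the expiry time.

```python
class ConditionFailed(Exception):
    """Stands in for DynamoDB's conditional check failure."""


class FakeLeaseTable:
    """In-memory stand-in for the lease table (hypothetical, for illustration)."""

    def __init__(self):
        self.items = {}  # key -> (lease_version, lease_expiry)

    def put_if_absent(self, key, version, expiry):
        # Mirrors PutItem with the "item does not exist yet" condition.
        if key in self.items:
            raise ConditionFailed(key)
        self.items[key] = (version, expiry)

    def expire(self, now):
        # Models TTL removal: drop every item whose lease_expiry has passed.
        self.items = {k: v for k, v in self.items.items() if v[1] > now}


def try_acquire(table, key, version, now, ttl=60):
    """Return True if the lease was acquired, False if another holder has it."""
    try:
        table.put_if_absent(key, version, now + ttl)
        return True
    except ConditionFailed:
        return False


table = FakeLeaseTable()
held = try_acquire(table, "foo", "v1", now=0)       # holder acquires, then loses db access
blocked = try_acquire(table, "foo", "v2", now=30)   # still inside the original 60s
table.expire(now=61)                                # the never-extended item is removed
recovered = try_acquire(table, "foo", "v2", now=61)
```

In this model the second task is blocked at 30s and succeeds at 61s, which shows both sides of the trade-off: a longer ttl protects the original holder for longer, and it also delays recovery by the same amount.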

Clock skew

The client uses the local clock to generate lease_expiry timestamps. To mitigate client clock skew, consider lengthening the lease_expiry ttl.
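
A small worked calculation shows why a longer ttl helps. It assumes a simple model in which the item is judged expired against a reference clock, and the client's clock differs from it by a fixed offset. The function name is hypothetical.

```python
def effective_lease_seconds(ttl_seconds: int, client_offset_seconds: int) -> int:
    """Lease length as seen by the reference clock.

    client_offset_seconds is (client clock - reference clock): negative when
    the client's clock runs behind, positive when it runs ahead.
    """
    # The client writes lease_expiry = client_now + ttl, which the reference
    # clock sees as reference_now + offset + ttl.
    return ttl_seconds + client_offset_seconds


behind = effective_lease_seconds(60, -5)   # client 5s behind: lease is shorter
padded = effective_lease_seconds(90, -5)   # a longer ttl absorbs the same skew
```

Under this model a client running 5s behind gets 55s of a nominal 60s lease, while a 90s ttl still leaves 85s.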