Riak KV vnodes can block in certain scenarios when using Bitcask #423

jtuple · 2012-11-02T22:00:16Z

Background. Prior to Riak 0.14.2, all fold operations would block the relevant vnode and prevent the vnode from servicing requests. This was changed in Riak 1.0, with the introduction of asynchronous folds that used an async worker pool, as well as additions to the various backends to support async folds.

To support async folds, Bitcask freezes its in-memory keydir and has async folds iterate over the frozen keydir, with new concurrent writes going to a pending keydir. Since the keydir is in-memory, Bitcask only allows a single frozen keydir. Multiple folds can reuse the same frozen keydir, but only if there have been no writes since the keydir was frozen. If a fold is started, a write occurs, and then a new fold is started, the second fold will block until the first fold finishes, and then re-freeze the keydir.
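To make the reuse rule concrete, here is a toy Python model of the freeze/pending behavior described above. It is only a sketch of the rule as stated in this issue, not Bitcask's actual implementation (which lives in Erlang and C); the class and method names (`Keydir`, `try_start_fold`, `end_fold`) are hypothetical.

```python
class Keydir:
    """Toy model: one frozen snapshot at a time, writes go to a pending map."""

    def __init__(self):
        self.live = {}        # entries visible to folds
        self.pending = None   # writes buffered while frozen (None = not frozen)
        self.folders = 0      # folds currently iterating the frozen snapshot
        self.dirty = False    # True once a write has landed since the freeze

    def put(self, key, value):
        if self.pending is not None:
            # Keydir is frozen: buffer the write and mark the snapshot stale.
            self.pending[key] = value
            self.dirty = True
        else:
            self.live[key] = value

    def try_start_fold(self):
        """Return True if a fold can start now, False if it would have to wait."""
        if self.pending is None:
            # Not frozen yet: freeze and start.
            self.pending = {}
            self.dirty = False
            self.folders = 1
            return True
        if not self.dirty:
            # Frozen, but no writes since the freeze: reuse the snapshot.
            self.folders += 1
            return True
        # Frozen and stale: the new fold must wait for the running folds.
        return False

    def end_fold(self):
        self.folders -= 1
        if self.folders == 0:
            # Last fold finished: merge pending writes and unfreeze.
            self.live.update(self.pending)
            self.pending = None
            self.dirty = False
```

In this model, the sequence fold, write, fold makes the second `try_start_fold` return `False`, which corresponds to the blocking case in the paragraph above.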

The Problem. Blocking async folders is expected and not a big deal. However, when determining if a vnode should hand off data, Riak will end up calling riak_kv_vnode:is_empty which will call, for a Bitcask vnode, bitcask:is_empty. In Bitcask, the is_empty check is implemented as a fold (start a fold and exit as soon as any key is found) to deal with tombstones, expired keys, etc. This fold is executed directly in the vnode pid, not in an async worker, and will block the vnode in scenarios such as the one above.
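For readers unfamiliar with why emptiness needs a fold at all, a minimal Python sketch of an early-exit emptiness check follows. The entry layout `(key, value, expires_at)` and the `TOMBSTONE` marker are assumptions made for illustration; they are not Bitcask's data format.

```python
import time

TOMBSTONE = object()  # hypothetical marker for a deleted key


def is_empty(entries, now=None):
    """Return True if no live key exists.

    entries: iterable of (key, value, expires_at), expires_at may be None.
    Tombstones and expired keys are skipped; the scan stops at the first
    live key, so the cost depends on how much dead data precedes it.
    """
    now = time.time() if now is None else now
    for _key, value, expires_at in entries:
        if value is TOMBSTONE:
            continue  # deleted key, does not count as data
        if expires_at is not None and expires_at <= now:
            continue  # expired key, does not count as data
        return False  # found live data, exit early
    return True
```

The point of the sketch is that the check has to iterate, and in Bitcask that iteration is subject to the same frozen-keydir rule as any other fold.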

There are two scenarios:

1. An existing fold is running (list keys, one of the folds used in mDC replication, etc.) and handoff is triggered. The vnode will then block until the first fold finishes, servicing no requests and leading to an ever-growing message queue.
2. Handoff is triggered (which starts a fold), and then handoff is re-triggered in the future. The vnode manager retriggers handoff periodically as a fault-tolerance mechanism. The handoff manager ensures that a handoff won't be started if one is already running. However, the is_empty check occurs before calling the handoff manager. So, handoff A to B, write, handoff A to B will cause the vnode to block on the second handoff request, again servicing no requests and leading to a growing message queue.
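The second scenario comes down to the order of two checks. The Python sketch below shows the reordered version, where the "already running" check comes before the potentially blocking emptiness check. It is a hypothetical model (the `Vnode` class, its fields, and the return values are invented for illustration) and is not taken from the riak_core code.

```python
class Vnode:
    """Toy vnode that consults handoff state before calling is_empty."""

    def __init__(self, is_empty_fn):
        self.is_empty_fn = is_empty_fn   # may block in the real system
        self.handoff_running = False
        self.is_empty_calls = 0          # counts potentially blocking calls

    def trigger_handoff(self):
        # Cheap check first: a re-trigger while handoff is running
        # returns immediately and never reaches is_empty.
        if self.handoff_running:
            return "already_running"
        self.is_empty_calls += 1
        if self.is_empty_fn():
            return "nothing_to_send"
        self.handoff_running = True
        return "started"
```

With the checks in the opposite order, every periodic re-trigger would pay for `is_empty_fn`, which is the behavior described in scenario 2.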


Add bitcask:is_empty_estimate to quickly determine if a bitcask contains no data. Currently, determining if a bitcask has data requires folding over the keydir to ensure tombstones and expired keys are skipped. However, this is a potentially blocking operation and nowhere in Riak do we actually need perfect knowledge. The estimate is determined from the bitcask stats, which may overcount data, but will not undercount. Therefore, the estimated result may return false when the bitcask is actually empty, but it will never return true when there is data. See issue: basho/riak_kv#423
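The one-sided error of the estimate can be illustrated with a small Python model. This is a sketch under assumptions, not the code from basho/bitcask#67: the `CountingStore` class and its `key_count` stat are hypothetical stand-ins for "bitcask stats" that count keys without subtracting expired ones.

```python
class CountingStore:
    """Toy store whose key counter overcounts but never undercounts."""

    def __init__(self):
        self.entries = {}    # key -> expires_at (None means never expires)
        self.key_count = 0   # stat: keys in the keydir, expired ones included

    def put(self, key, expires_at=None):
        if key not in self.entries:
            self.key_count += 1
        self.entries[key] = expires_at

    def is_empty_exact(self, now):
        # Exact answer: requires visiting every entry (a fold).
        return all(exp is not None and exp <= now
                   for exp in self.entries.values())

    def is_empty_estimate(self):
        # Constant-time answer from the stat alone.
        return self.key_count == 0
```

Because every live key is included in the count, the estimate can only say "empty" when there truly is nothing stored; expired keys can make it say "not empty" when the exact check would say "empty", which costs at most some unnecessary but safe work.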

Previously, determining if a bitcask was empty or not was accomplished through a fold over the keydir, which is a potentially blocking operation. This commit changes riak_kv_bitcask_backend:is_empty to use the new function bitcask:is_empty_estimate. The estimate is determined from the bitcask stats, which may overcount data, but will not undercount. Therefore, the estimated result may return false when the bitcask is actually empty, but it will never return true when there is data. In all cases where is_empty is currently used in Riak, an estimate is acceptable. In the worst case, additional work may be triggered that is unnecessary but safe (e.g., folding over an empty bitcask). See issue: #423

ghost assigned jtuple Nov 2, 2012

jtuple mentioned this issue Nov 2, 2012

Add bitcask:is_empty_estimate basho/bitcask#67

Merged

jtuple mentioned this issue Nov 2, 2012

Change riak_kv_bitcask_backend to use bitcask:is_empty_estimate #424

Merged

ghost assigned reiddraper Nov 3, 2012

jtuple mentioned this issue Nov 5, 2012

Make vnode check for existing handoff before starting another basho/riak_core#250

Merged

evanmcc closed this as completed Aug 9, 2013
