Purpose
----------

Visibility into handoff is really poor. The typical way to discover handoff information is `riak-admin transfers`, but that gives hardly any useful information, as shown below.

    'firstname.lastname@example.org' waiting to handoff 7 partitions
    'email@example.com' waiting to handoff 4 partitions
    'firstname.lastname@example.org' waiting to handoff 5 partitions

This PR adds visibility into transfers/handoff by tracking various stats on active transfers and displaying this information in a human friendly way, as shown below.

    ./dev/dev1/bin/riak-admin transfers
    'email@example.com' waiting to handoff 6 partitions
    'firstname.lastname@example.org' waiting to handoff 4 partitions
    'email@example.com' waiting to handoff 6 partitions

    Active Transfers:

    transfer type: ownership_handoff
    vnode type: riak_kv_vnode
    partition: 365375409332725729550921208179070754913983135744
    started: 2012-04-24 18:43:44 [5.96 s ago]
    last update: 2012-04-24 18:43:48 [1.91 s ago]
    objects transferred: 8651

                            2135 Objs/s
    firstname.lastname@example.org =======================> email@example.com
                            17.62 MB/s

This PR also gets rid of the annoying side effect of resetting the inactivity timeout when calling `riak-admin transfers`. This would often cause users to wonder why handoffs were never occurring.

Implementation Details
----------

One issue with handoff is that it uses vnode folds to do all its work. This has the one nice benefit that it avoids a local copy of data (1), but it has the bad side effect of using an uninterruptible fold. That is, the vnode fold does the work as fast as it can and doesn't stop until it's done (2). In order to get status updates about the handoff, the accumulator keeps some local stats and _approximately_ every 2 seconds sends those stats to the handoff manager via the `status_update/2` API. I say the timing is approximate because expiration of the interval is only checked during a sender/receiver sync phase (determined by `ACK_COUNT`).
If the receiver can't keep up or the sender fold is slow, then the status updates could take longer. Essentially, this code assumes that `ACK_COUNT` objects can be transferred in less than 2s. **N.B.** The duration of the status update interval will not invalidate the stats since they are based on start time and time of last sync (see `riak_core_handoff_manager:update_stats/2`).

The reason the sender only sends a status update every 2s, and only checks whether this interval has expired on sender/receiver sync, is that the vnode fold is a tight loop. Sending an update for every object would be too chatty, and checking the interval for every object could potentially slow the fold because of the overhead of getting the time and doing the math.

There are two types of transfer currently, _ownership handoff_ and _hinted handoff_. Soon there will be another type, _repair_. In order to disambiguate the two types of handoff I have to determine if the source vnode is primary or not. In the case of ownership handoff it is a `primary -> secondary` handoff (where the secondary becomes primary after handoff completes) and for hinted handoff it's `secondary -> primary`.

In order to make the stats a little easier to read I added a little human friendly formatting. I decided to put the code to support this in Core rather than KV. I stole and modified the code from @seancribbs.

One aspect of this PR I'm not wild about is that, in order to get the status, a msg must be sent to each handoff manager on each node every time `riak-admin transfers` is called (3). I'd rather see a push system where all active status data is collated at a particular node, like the claimant node in ownership, and the status call simply reads that. The pull system is probably fine for now but could cause trouble on larger clusters, especially if some script accidentally calls it in a tight loop.

I'm wondering if I should have made use of the stats API for the collection of data in the handoff manager rather than a dict?
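
To make the status update timing described above concrete, here is a minimal sketch. It is written in Python purely for illustration and is not the Erlang code in this PR. The names `HandoffStats`, `fold_objects`, `STATUS_INTERVAL` and the default `ACK_COUNT` value are all hypothetical, and whether the real `update_stats/2` accumulates deltas (as this sketch does) is an assumption. It shows the two points made above: the interval is only checked at a sender/receiver sync, and the rates come from the start time and the time of the last sync, so a late update does not skew them.

```python
import time

ACK_COUNT = 100        # assumed value: objects sent between sender/receiver syncs
STATUS_INTERVAL = 2.0  # seconds; the "approximately every 2 seconds" interval


class HandoffStats:
    """Hypothetical stand-in for the per-transfer stats held by the handoff manager."""

    def __init__(self, start_time):
        self.start_time = start_time
        self.last_update = start_time
        self.objs = 0
        self.bytes = 0

    def update(self, objs, nbytes, sync_time):
        # Assumption: each status update carries the deltas since the last one.
        self.objs += objs
        self.bytes += nbytes
        self.last_update = sync_time

    def rates(self):
        # Rates depend only on start time and time of last sync, so an
        # update that arrives late still yields a correct average.
        elapsed = self.last_update - self.start_time
        if elapsed <= 0:
            return (0.0, 0.0)
        return (self.objs / elapsed, self.bytes / elapsed)


def fold_objects(objects, stats, ack_count=ACK_COUNT,
                 clock=time.monotonic, sync=lambda: None):
    """Simulate the sender fold; returns the number of status updates sent."""
    pending_objs = 0
    pending_bytes = 0
    last_sent = stats.start_time
    updates = 0
    for count, obj in enumerate(objects, start=1):
        # Tight loop: no clock reads and no messages per object.
        pending_objs += 1
        pending_bytes += len(obj)
        if count % ack_count == 0:
            sync()         # stand-in for waiting on the receiver's ack
            now = clock()  # the interval is only checked at a sync point
            if now - last_sent >= STATUS_INTERVAL:
                stats.update(pending_objs, pending_bytes, now)
                pending_objs = 0
                pending_bytes = 0
                last_sent = now
                updates += 1
    return updates
```

As a rough check against the sample output in the Purpose section: 8651 objects over about 5.96 - 1.91 = 4.05 s works out to roughly 2136 objects/s, close to the 2135 Objs/s displayed (the printed ages are rounded).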
Footnotes
----------

1: That is, if the handoff sender process itself were running the handoff, then the vnode data would have to be copied from the vnode heap to the sender heap.

2: In the future I think an iterator/cursor based approach to handoff that is async, interruptible, and rate limited would be good.

3: Which calls `riak_core_status:all_active_transfers`, where the RPC is done.