
Background Manager Integration with Handoff #484

Merged
merged 7 commits into develop on Dec 20, 2013

Conversation

buddhisthead
Contributor

NOTE: DO NOT MERGE YET (this PR contains code from #475)

The riak_core background manager PR is here: #475
The riak_repl PR is here: basho/riak_repl#488

The background manager allows subsystems to coordinate the use of shared resources with locks (and tokens) through a non-blocking API. In our case, we've seen several customer support issues where a production cluster was overloaded by the combined weight of MDC fullsync replication together with handoff or AAE tree rebuilds (or all three at once).

Each of these subsystems is being taught, for 2.0, to "take" a lock on the kv vnode/partition before it does a fold over that partition. Locks are released when the taking process dies or exits normally. Each subsystem (at this point in time) is coded to keep working on other partitions if it is denied a lock on one, signaled by the return value max_concurrency, so progress is made even when one partition is busy.

Use of this feature can be bypassed in production. Besides a global kill switch on the background manager itself, the riak_core configuration supports two new parameters, aae_skip_bg_manager and handoff_skip_bg_manager, both of which default to false. Setting one of them to true bypasses the background manager for that subsystem, which then proceeds as it did before (a hedged config sketch follows below).

This PR is a rebase and re-implementation. It replaces #481
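
For illustration, a hedged sketch of how those riak_core settings might look in app.config; only the two parameter names and their defaults come from the description above, while the placement and surrounding structure are assumptions:

%% app.config (sketch only, not a complete file)
[
 {riak_core, [
   %% Bypass the background manager for handoff folds (defaults to false).
   {handoff_skip_bg_manager, true},
   %% Bypass the background manager for AAE tree rebuilds (defaults to false).
   {aae_skip_bg_manager, false}
 ]}
].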

%%
%% @doc VNODE_LOCK(Mod, Idx) is a per-vnode KV lock, potentially used by
%% AAE tree rebuilds, fullsync, and handoff.
-define(VNODE_LOCK(Mod, Idx), {vnode_lock, Mod, Idx}).
Contributor

there isn't any reason for this to be in riak_core any more. It should be moved to riak_kv and it no longer needs to be parameterized on the Mod.

Contributor Author

ok. Can we just add it to an existing kv include file? Or just rename and move the file? I liked it in core because it would be one place to see all the lock types we define. But I see your point and it's been bugging me a little too.

Contributor Author

addressed by b645b94

jrwest (Contributor) commented Dec 18, 2013

The actual diff is very small, so I've done a quick review of the code already. A couple of comments I think should be addressed, and then I will run this along with basho/riak_kv#769.

@@ -92,7 +92,28 @@ start_fold(TargetNode, Module, {Type, Opts}, ParentPid, SslOpts) ->
SrcNode = node(),
SrcPartition = get_src_partition(Opts),
TargetPartition = get_target_partition(Opts),
try

Contributor

it would also be nice to pull this all out into a function (or at least the part about performing the callback, maybe leave the case statement). start_fold is already a beast.
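
For illustration only, a minimal sketch of that kind of extraction; the helper name, its arguments, and the return shape are assumptions, not the actual change made in b03f95b:

%% Hypothetical helper pulled out of start_fold/5: perform the optional
%% handoff_started/2 callback and map a max_concurrency result to an abort.
maybe_call_handoff_started(Module, SrcPartition, WorkerPid) ->
    case erlang:function_exported(Module, handoff_started, 2) of
        true ->
            case Module:handoff_started(SrcPartition, WorkerPid) of
                max_concurrency -> {error, max_concurrency};
                ok -> ok
            end;
        false ->
            %% The callback is optional; proceed as before when absent.
            ok
    end.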

Contributor Author

Addressed by b03f95b

@@ -0,0 +1,875 @@
%%% @author Jordan West <>
Contributor

this still needs a proper header before merging

Contributor Author

addressed by b645b94

jrwest (Contributor) commented Dec 20, 2013

+1 to merge after addressing the license header and the commented-out ifdefs in eqc. Addressing the other comments would be great, but I don't see them as necessary to merge. Opening issues for the ones where you agree and it makes sense would be good.

This includes the background manager work from #475, whose review moved here because I lost track of PRs.

Final Comments:

  • In the end, I believe the use of the table manager added more complexity than was necessary compared with the approach of a public ETS table owned by a parent supervisor (a minimal sketch of that pattern follows this list). The former, used in this PR, adds a third state callers must handle beyond process alive/process dead: the process is alive but does not own the table. That window is admittedly small, but it means we need functions like try_set_concurrency_limit in each subsystem that wants to register a lock. I'd like to see us move away from the table manager and use the approach taken by the entropy manager and the new riak_kv_hooks before we start integrating into other subsystems.
  • We should document the many ways the background manager can be disabled somewhere
  • cuttlefish mappings are needed for any new config stuff added
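
For context, a minimal sketch of the supervisor-owned public ETS table pattern referred to above (module and table names are hypothetical, not the entropy manager's actual code). Because the long-lived supervisor owns the table, a crash of the worker that uses it neither loses the data nor exposes callers to an "alive but table-less" state:

%% Hypothetical parent supervisor that owns a public, named ETS table so the
%% table survives crashes of the worker process that reads and writes it.
-module(bg_demo_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Created in the supervisor's init, so the supervisor process owns it.
    _Tid = ets:new(bg_demo_table, [named_table, public, set,
                                   {read_concurrency, true}]),
    Worker = {bg_demo_worker, {bg_demo_worker, start_link, []},
              permanent, 5000, worker, [bg_demo_worker]},
    {ok, {{one_for_one, 10, 10}, [Worker]}}.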

* The goal is to allow riak sub-systems to coordinate use of shared resources,
* e.g. protect from concurrent vnode folds on the same partition.
* Locks and tokens have a settable maximum concurrency limit.
* "Taken" locks and tokens are tracked in an ETS table.
* max_concurrency is returned when the set limits are reached.
* Processes that take locks are monitored.
* Locks are released when the taking processes terminate.
* Tokens are refreshed at a specified periodic rate.
* Token processes are not monitored because tokens never "release".
* A table manager is introduced to add persistence across process crashes,
* and to allow proper table transfer to occur without losing the table.
* An EQC test exercises the majority of the API. see test/bg_manager_eqc.erl
* See the original PR for background manager here: #364
…le is unavailable
* Allows callers to try again later
* Remove history API, use unregistered to signal unavailability of bg-mgr, fix set_token_rate bug.
* Prune the query API down to a consistent set of commands and adjust tests.
* Change query API to use named ETS table directly; is not threaded through the gen_server.
* rebase on latest background manager.
* Add an optional callback entry point to the kv vnode: handoff_started/2
* core_vnode will check for the optional function and call it if present
* returning max_concurrency will cause the handoff to be aborted and retried.
* Remove lock defs file; it was moved to riak_kv_vnode.hrl since all these locks are KV-specific
* The optional call to handoff_started/2 gives vnodes one last chance to abort the handoff process. The function is passed the source vnode's partition number and node name (this node) because the callback does not have access to the full vnode state at this time. In addition, the worker pid is passed so the vnode may use that information in deciding whether to cancel the handoff (e.g. to take a lock on behalf of that process). A hedged sketch follows this list.
* Move the code that calls handoff_started/2 into a separate function and put the call inside the try/catch.
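
Putting those pieces together, a hedged sketch of what a KV vnode's optional handoff_started/2 callback could look like; the background manager call, its return values, and the use of the VNODE_LOCK macro from the diff above are assumptions based on this PR's description, not a copy of the riak_kv change:

%% Hypothetical optional callback in riak_kv_vnode. It tries to take the
%% per-vnode lock on behalf of the handoff worker; if the background manager
%% answers max_concurrency, the handoff is aborted and retried later.
handoff_started(SrcPartition, WorkerPid) ->
    %% The lock id macro shown earlier (now living in riak_kv_vnode.hrl).
    Lock = ?VNODE_LOCK(riak_kv_vnode, SrcPartition),
    case riak_core_bg_manager:get_lock(Lock, WorkerPid) of
        max_concurrency ->
            %% Deny the fold for now; core_vnode aborts and retries the handoff.
            max_concurrency;
        _Acquired ->
            %% Lock is held on behalf of WorkerPid and released when it exits.
            ok
    end.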
buddhisthead (Contributor Author)

Rebased off develop and force pushed before final merge.

buddhisthead added a commit that referenced this pull request Dec 20, 2013
Background Manager Integration with Handoff
buddhisthead merged commit 30557a8 into develop on Dec 20, 2013
seancribbs deleted the cet-bg-mgr-vnode-lock-rebased branch on April 1, 2015