Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read repair - configure to repair primary only #1844

Merged
merged 10 commits into from
Jan 19, 2023

Conversation

martinsumner
Copy link
Contributor

By default, the behaviour should be unchanged. however it is now configurable to read repair primary vnodes only - fallback vnodes will not be repaired on failing GETs, they will only receive new PUTs.

See schema change for more details.

basho/riak_test#1369

By default, the behaviour should be unchanged.  however it is now configurable to read repair primary vnodes only - fallback vnodes will not be repaired on failing GETs, they will only receive new PUTs.

See schema change for more details.
Initially to troubleshoot in test - but perhaps of generic use.
Rather than replicate piecemeal the standard PUT code within do_diffobj_put/3, instead use do_handoff_put/3 and use standard prepare_put/2 and perform_put/3 functions used in normal PUTs.

The effect of this is that any optimisations in the normal PUT workflow, will now automatically be used for handoffs.  Of particular relevance at this point is the HEAD (not GET) before PUT optimisation available with leveled backend.  If there are large objects, and objects which already exist in the receiving vnode are to be handed off (such as in hinted handoff), then this increases efficiency.

Some spec improvements to help with some editors that do not like fun() type.  Some indent reductions to improve readability.
@martinsumner
Copy link
Contributor Author

martinsumner commented Jan 16, 2023

The change from do_diffobj_put/3 to do_handoff_put/3 breaks the test https://github.com/basho/riak_test/blob/develop-3.0/tests/overlap_with_crdt_tag.erl

This is (as per usual) a bucket type issue. The original assumption of bucket types is that they should be created and activated, only when the cluster is fully formed. However, we may need to expand (join nodes) at any time.

As a node joins, the bucket types might not be propagated to the node before the handoffs start. In the case of the new do_handoff_put/3 function this means that {error, no_type} may be returned when the bucket properties are fetched for a typed bucket - and hence an attempt to treat the properties as a list will crash and the item will not handoff.

The previous do_diffobj_put also had potential issues with this - https://github.com/basho/riak_kv/blob/riak_kv-3.0.12/src/riak_kv_vnode.erl#L3649. However, the if there is already an object present for that type, it can be assumed that the node is re-joining and so has the metadata.

It is not clear that if PUTs immediately after joins could also have similar issues. This may not be exclusive to handoff.

@martinsumner
Copy link
Contributor Author

Need to consider UpdateHooks as well - https://github.com/basho/riak_kv/blob/riak_kv-3.0.12/src/riak_kv_vnode.erl#L3665-L3667 - if we use the general PUT code should the reason by a PutArg, and passed through? Only impacts the deprecated Yokozuna.

This allows the same code to be used for both handoff and put.
As riak_core updated to ensure bucket types are exchanged prior to join committing
To make recovery of nodes easier, adding some helper functions to riak_client.
@martinsumner martinsumner merged commit fbb5363 into develop-3.0 Jan 19, 2023
martinsumner added a commit that referenced this pull request Feb 1, 2023
* Read repair - configure to repair primary only

By default, the behaviour should be unchanged.  however it is now configurable to read repair primary vnodes only - fallback vnodes will not be repaired on failing GETs, they will only receive new PUTs.

See schema change for more details.

* get_option returns value not {K, V}

* Add ability to suspend AAE

* Add logging of read repairs

Initially to troubleshoot in test - but perhaps of generic use.

* Handle handoff put through standard put code

Rather than replicate piecemeal the standard PUT code within do_diffobj_put/3, instead use do_handoff_put/3 and use standard prepare_put/2 and perform_put/3 functions used in normal PUTs.

The effect of this is that any optimisations in the normal PUT workflow, will now automatically be used for handoffs.  Of particular relevance at this point is the HEAD (not GET) before PUT optimisation available with leveled backend.  If there are large objects, and objects which already exist in the receiving vnode are to be handed off (such as in hinted handoff), then this increases efficiency.

Some spec improvements to help with some editors that do not like fun() type.  Some indent reductions to improve readability.

* Make HookReason part of PutArgs

This allows the same code to be used for both handoff and put.

* Revert defaulting of properties

As riak_core updated to ensure bucket types are exchanged prior to join committing

* Add helpful operator functions to riak_client

To make recovery of nodes easier, adding some helper functions to riak_client.

* Update branches

Remove legacy thumbs

* Update rebar.config
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants