New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use SC bucket types and buckets to know ensembles #1008
Use SC bucket types and buckets to know ensembles #1008
Conversation
If we use a default bucket property with a different n_val from the strongly consistent bucket type, ensembles for that bucket type will never be created. The problem seems to stem from the fact that riak_kv_ensembles:required_ensembles/1 checks all current bucket n_vals for ensembles to create, but doesn't check bucket types that don't yet have buckets created. Since all buckets are created on demand with a put, and it's not possible to put if the ensemble doesn't exist, the buckets and ensembles never get created. To fix this we now iterate through all bucket types and collect all n_vals from them, as well as the n_vals from the existing buckets in order to figure out the set of all n_vals required to create ensembles. Fixes #1007
The riak_kv_ensembles gen_server determines which ensembles need to be created based on the current ring and n_vals corresponding to both existing buckets and active bucket types. It only runs this check on ring changes though. We need to alert it to check again in case a strongly consistent bucket type was activated. Part of a fix for basho/riak_kv#1007 Requires basho/riak_kv#1008
This is a somewhat heavy hammer, but it is expected that bucket types are created somewhat infrequently. It is required to notify other applications on top of riak core that may need this changing information. In particular it's required by riak_kv to bootstrap riak ensembles. Supercedes #617 Part of a fix for basho/riak_kv#1007 Requires basho/riak_kv#1008
Ensemble_ring_changes tests writing a value, expanding the cluster, then updating and reading that value after ring expansion has completed. It also creates a bucket using a bucket type with a different n_val from the default bucket type. The latter tests basho/riak_kv#1008 and it's corresponding riak_core PR.
Since the ring does not change when bucket types are activated we remove the check for a changing ring id that was supposed to prevent unnecessary work. Leaving the optimization in would require changing riak_core to trigger on cluster metadata changes in addition to ring changes. This change thus prevents us from having to change riak_core at the expense of some extra work to check if the ensembles have changed when they probably haven't.
+1 71016f2 Design matches what we discussed and all code looks good. Tested against the new riak_test and all is well. We should benchmark if there's any performance impact of rechecking all N every tick just to know how worse things are (or aren't), but this is the best approach for 2.0 so we're going to ship this way regardless. Future work is to consider restating the smarter approach that also checks bucket type hashes. |
+1 59883e4 Bah, +1 the wrong commit. This should do it. |
…t-type-activate Use SC bucket types and buckets to know ensembles Reviewed-by: jtuple
@borshop merge |
Thanks @jtuple ! |
It looks like this was merged to develop but not to the 2.0 branch, and thus isn't part of RC2. Without this change, users are unable to read/write data for consistent keys with a bucket N-val different than the default bucket N-val. This is a pretty major issue. The changes shouldn't invalidate any RC2 performance testing, nor any correctness qualification outside of tests related to strong consistency. In fact, without this change merged, the related test that was added to test this issue is going to fail. So, merging this branch actually improves test results. Is it possible to merge this into 2.0 for 2.0.0 final? |
Looks like we're punting on this until 2.0.1 now since we've already started 2.0.0 final qualification tests and don't want to rerun. So, going to clearly document things here in case someone hits this and people come looking for a solution. If a user creates a consistent bucket type with an N value that is different then the default bucket N value, ensembles for that different N value may not be created. This will manifest as reads/writes to data under that bucket type always timing out. If this occurs, a user should be able to fix this by running |
Added 2.0 Known Issue so it doesn't get lost and we make a note of it in the release notes. |
…t-type-activate Use SC bucket types and buckets to know ensembles Reviewed-by: jtuple Backport merge of 3489f5b
If we use a default bucket property with a different n_val from the
strongly consistent bucket type, ensembles for that bucket type will
never be created. The problem seems to stem from the fact that
riak_kv_ensembles:required_ensembles/1 checks all current bucket n_vals
for ensembles to create, but doesn't check bucket types that don't yet
have buckets created. Since all buckets are created on demand with a
put, and it's not possible to put if the ensemble doesn't exist, the
buckets and ensembles never get created.
To fix this we now iterate through all bucket types and collect all
n_vals from them, as well as the n_vals from the existing buckets in
order to figure out the set of all n_vals required to create ensembles.
Requires a corresponding riak_core PR and riak_test to show the issue. Those are coming shortly.
Fixes #1007