Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make ensemble trust/paranoia configurable #980

Merged
merged 4 commits into from Jun 24, 2014

Conversation

jtuple
Copy link
Contributor

@jtuple jtuple commented Jun 13, 2014

This commit makes K/V ensemble trust user configurable, allowing for two settings: low and medium.

The low setting assumes that vnodes are durable -- that once data has been committed to a vnode, that data will never be silently lost. In this mode, K/V ensembles are not required to perform an AAE exchange before being trusted. This mode is similar to how most consistent databases operate, and is suitable when users trust that their disks will not lose data undetected. Using a filesystem design to detect bit rot (such as ZFS) on a redundant RAID configuration is essential when using low paranoia.

The medium setting assumes that vnodes are not durable, and that they may silently lose data without detection. This mode requires all K/V peers that restart/crash to first perform an AAE exchange with a majority of other peers before being trusted. Since this exchange must occur with a majority of peers (not counting the untrusted peer), this mode requires more than a usual quorum majority to guarantee availability.

The medium mode assumes that the AAE trees on remote peers are not corrupted such that they are missing the exact same keys which the corrupted node is missing from its K/V data. This should be true in practice, since AAE trees are rebuilt from scratch once a week across the entire cluster.

Note: the paranoia setting is designed to deal with undetected data loss -- an issue that many systems have no protection against. Normal data corruption that results in a detected CRC error can be handled without AAE and works in either paranoia mode.

Addresses issues #909

This pull-request does not yet expose Cuttlefish versions of this new setting. Input on if this should be configurable in riak.conf as well as suggestions for the user-facing setting is would be appreciated.

@jtuple jtuple added this to the 2.0-RC milestone Jun 13, 2014
@engelsanchez
Copy link
Contributor

I'm taking this review

@engelsanchez engelsanchez self-assigned this Jun 13, 2014
@engelsanchez
Copy link
Contributor

I'm curious about why we have "low" and "medium" but no "high" :)

I don't see a reason not to have a straight cuttlefish mapping for this: strong_consistency.paranoia = low | medium. It sounds very user friendly to me. I'll attach a commit with that to this PR and we can keep it if nobody has objections. /cc @seancribbs @michellep

@seancribbs
Copy link
Contributor

Yes, please add this to the configuration schema if it is configurable before startup.

@jtuple
Copy link
Contributor Author

jtuple commented Jun 13, 2014

"high" was going to be a paranoia setting where we didn't trust AAE trees either, and required AAE trees to be rebuilt from scratch before syncing. I never got around to implementing "high", and I think it would be terrible given how both AAE and AAE-based ensemble syncing are currently implemented. Ensembles could end up being unavailable for days while you check integrity/etc. So, punted on "high" for 2.0, but plan to investigate "high" in a future release.

@engelsanchez
Copy link
Contributor

While reviewing this, I ran across a nasty problem that prevented ensembles from starting up at all. As far as I can tell, it's not directly related to this simple config (although it may have made a race easier to hit). So I'm going to open a separate issue to track it. Unfortunately I have not been able to reproduce it.

@engelsanchez
Copy link
Contributor

+1 a7ad6b9
Added cuttlefish support for strong_consistency.paranoia and did all my testing with it. The code looks good and is very simple. Ensembles spend a while syncing if paranoia is medium when a node restarts, but become trusted immediately when set to low.

Unless @jtuple @andrewjstone or somebody else has objections on the cuttlefish name or related, this can be merged.

borshop added a commit that referenced this pull request Jun 17, 2014
Make ensemble trust/paranoia configurable

Reviewed-by: engelsanchez
@seancribbs
Copy link
Contributor

Cuttlefish schemas have tests. Add to them!

Also, config changes should get PM/CSE +1's.

@engelsanchez
Copy link
Contributor

@seancribbs Added a schema test for strong_consistency.paranoia. Please illuminate us on how to go about getting those +1's. That's why I'm asking.

@evanmcc
Copy link
Contributor

evanmcc commented Jun 17, 2014

Sorry, super late to the party with my two cents here, but for wishy washy
reasons, integrity or something world be my preference over paranoia.

@seancribbs
Copy link
Contributor

Config change +1's:

  • Eng
  • CSE
  • PM

/cc @angrycub @michellep

@engelsanchez
Copy link
Contributor

Ping @seancribbs @angrycub @michellep . What is the process to keep these things from stalling? If we can't get the required approval or feedback soon I might have to just drop my attempt to add cuttlefish support and leave it as an advanced thing. But we need to keep this moving.

@bsparrow435
Copy link
Contributor

looks good, CSE approved.

@seancribbs
Copy link
Contributor

@engelsanchez Looks like this will need rebase to merge cleanly.

@engelsanchez
Copy link
Contributor

Yes @seancribbs I'm doing a round of rebases/merges across my SC issues right now. Got ship this shit!

@michellep-basho
Copy link

Sorry for the delay. +1 from PM on the concept. I agree with Evan that we should call it something other than paranoia. Discussed with Engel and recommending that we use trust, reversing the order of the setting levels, as there are additional components to data integrity. So the current low would become high, medium would be unchanged.

jtuple and others added 3 commits June 23, 2014 16:27
This commit makes K/V ensemble trust user configurable, allowing for
two settings: low and medium.

The 'low' setting assumes that vnodes are durable -- that once data
has been committed to a vnode, that data will never be silently
lost. In this mode, K/V ensembles are not required to perform an AAE
exchange before being trusted. This mode is similar to how most
consistent databases operate, and is suitable when users trust that
their disks will not lose data undetected. Using a filesystem design
to detect bit rot (such as ZFS) on a redundant RAID configuration is
essential when using 'low' paranoia.

The 'medium' setting assumes that vnodes are not durable, and that
they may silently lose data without detection. This mode requires all
K/V peers that restart/crash to first perform an AAE exchange with a
majority of other peers before being trusted. Since this exchange must
occur with a majority of peers (not counting the untrusted peer), this
mode requires more than a usual quorum majority to guarantee
availability.

The 'medium' mode assumes that the AAE trees on remote peers are not
corrupted such that they are missing the exact same keys which the
corrupted node is missing from its K/V data. This should be true in
practice, since AAE trees are rebuilt from scratch once a week across
the entire cluster.

Note: the paranoia setting is designed to deal with undetected data
loss -- an issue that many systems have no protection against. Normal
data corruption that results in a detected CRC error can be handled
without AAE and works in either paranoia mode.
@engelsanchez
Copy link
Contributor

Thanks @michellep ! @bsparrow435 I'm currently implementing the change as proposed by Michelle. Since you were the CSE who approved the previous version I'm checking with you if you are OK with the change from paranoia -> trust here.

After discussing with product management, we are rename the paranoia
setting to trust and reversing the levels.  paranoia = medium maps to
trust = medium. paranoia = low maps to trust = high.
@andrewjstone
Copy link
Contributor

Tests run fine as well. Should be 👍 for at least the last commit.

@engelsanchez
Copy link
Contributor

+1 d9fc05e

borshop added a commit that referenced this pull request Jun 24, 2014
Make ensemble trust/paranoia configurable

Reviewed-by: engelsanchez
@engelsanchez
Copy link
Contributor

@borshop merge

@borshop borshop merged commit d9fc05e into develop Jun 24, 2014
@seancribbs seancribbs deleted the feature/configurable-ensemble-trust branch April 1, 2015 23:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants