New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Zone config updates #1027
Zone config updates #1027
Conversation
92e046c
to
5b7341f
Compare
@bdarnell and @petermattis, in case you want to review as well. |
19b90f2
to
767af37
Compare
Reviewed 4 of 4 files at r1. configure-replication-zones.md, line 47 at r1 (raw file):
Unique node is still correct here. If localities are provided, there will be a preference for more diversity. configure-replication-zones.md, line 67 at r1 (raw file):
configure-replication-zones.md, line 74 at r1 (raw file):
There is no implied order to positive constraints, so in this case, the preference would be:
If locality info is provided:
Also, you mentioned added. This would also apply to rebalancing, so I don't know if added is technically correct. This is true for all sections here. But then again, when rebalancing, we are adding a replica and removing one. Ugh. configure-replication-zones.md, line 75 at r1 (raw file):
I don't think you want racks here. Use nodes or machines. If and only if locality information is provided, then it will try to spread the the nodes to the next available locality level. If you set datacenter to be a required constraint and there is locality information for sectors, than it will diversify at the sector level. If you only have 3 sectors, but 5 replicas, it would actually try to diversity the next level (floor) as well, when placing a 2nd replica in a sector. Also, this section can be simplified to state that only nodes with the required attributes or locality are considered. configure-replication-zones.md, line 76 at r1 (raw file):
Again, no racks here. That's very specific. I think the language in this needs to be updated. If you have a prohibited constraint, when looking for where to place a replica, it will simple ignore all nodes that are prohibited and proceed accordingly. configure-replication-zones.md, line 173 at r1 (raw file):
Please add a comment that adjusting the default config is dangerous and should be avoided. Maybe make this the final example? configure-replication-zones.md, line 199 at r1 (raw file):
Please mention how values not mentioned remain the same, so you can change the num_replicas without affecting the other values. configure-replication-zones.md, line 256 at r1 (raw file):
How about a section on how to write a yaml file first, then the specific commands to update a table, a db, or the default. recommended-production-settings.md, line 14 at r1 (raw file):
Perhaps elaborate on recommended-production-settings.md, line 20 at r1 (raw file):
I think there's also an advantage for rocksdb here. Something about shared memory. I don't remember if we actually use it or not. Suffice to say, it is significantly more efficient as well. start-a-node.md, line 45 at r1 (raw file):
And I think every machine must be stopped and restarted with the new value. Comments from Reviewable |
|
||
- If a machine has multiple disks or SSDs, it's better to run one node with multiple `--store` flags instead of one node per disk, because this informs CockroachDB about the relationship between the stores and ensures that data will be replicated across different machines instead of being assigned to different disks of the same machine. For more details about stores, see [Start a Node](start-a-node.html). | ||
|
||
- Configurations with odd numbers of replicas are more robust than those with even numbers. Clusters of three and four nodes can each tolerate one node failure and still reach a quorum (2/3 and 3/4 respectively), so the fourth replica doesn't add any extra fault-tolerance. To survive two simultaneous failures, you must have five replicas. | ||
|
||
- When replicating across datacenters, you should use datacenters on a single continent to ensure peformance (cross-continent scenarios will be better supported in the future). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we recommend adding locality labels as a best practice in these cases?
Review status: all files reviewed at latest revision, 12 unresolved discussions. configure-replication-zones.md, line 47 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Done. configure-replication-zones.md, line 67 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Yeah, when two hyphenated words share a suffix, that's actually a way to save space, instead of configure-replication-zones.md, line 74 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
I've made a few updates, in the overall intro and the intro to Replica Constraints, to clarify that zone configs apply both when replicas are first added and when they are rebalanced. Sometime soon, I'll want to add another page explaining rebalancing and link to that. But I still need to find a way to simplify these definitions while conveying the complexity of various scenarios. Not sure how to do that. Will keep working and push additional commits, but it'd be great to get input from @bdarnell as well. configure-replication-zones.md, line 173 at r1 (raw file):
But even on that point, we don't know that it's worth adding a warning unless we know that this is true. Ben didn't know if we've tested this yet. configure-replication-zones.md, line 199 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
I've worked that into all of the basic examples, i.e. , configure-replication-zones.md, line 256 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
We define the YAML format further up the page, and provide plenty of examples. I don't think we need to go too much further into explaining how to write one of these, unless we get questions. recommended-production-settings.md, line 14 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Done. recommended-production-settings.md, line 20 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Made a minor language change here. Let me know if you'd suggest more. start-a-node.md, line 45 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Done. Comments from Reviewable |
LGTM Review status: 1 of 4 files reviewed at latest revision, 15 unresolved discussions, some commit checks failed. configure-replication-zones.md, line 61 at r2 (raw file):
"cloud" probably shouldn't be in the locality list (or if it is, it belongs after "region"). Two different cloud providers' "east" regions are closer together than other regions of the same provider. I think "cloud" probably makes more sense as a non-locality attribute. I'd also simplify this a bit - it's going to be rare for anyone to use more than 1 or 2 levels here (until we get to much larger deployments). I'd trim the example to "region" and "datacenter". configure-replication-zones.md, line 62 at r2 (raw file):
We don't currently support GPUs, so this is an odd example. How about recommended-production-settings.md, line 20 at r2 (raw file):
I think "better" is more appropriate than "more efficient" here. The problem with running one node per disk isn't that it's less efficient (is it? we haven't measured), it's the one from the previous bullet - an increased risk of data loss due to the failure of a node that held multiple replicas. It might make more sense to combine these bullets. Comments from Reviewable |
f46a42d
to
e8b7903
Compare
Review status: 1 of 4 files reviewed at latest revision, 15 unresolved discussions. configure-replication-zones.md, line 74 at r1 (raw file): Previously, jseldess wrote…
I've decided to revise and simply the descriptions of constraint types for now. I want to take more time to work with @BramGruneir to get additional details right. configure-replication-zones.md, line 75 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Same as above: Simplifying for now. Will add additional details later. configure-replication-zones.md, line 76 at r1 (raw file): Previously, BramGruneir (Bram Gruneir) wrote…
Same as above. configure-replication-zones.md, line 61 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. configure-replication-zones.md, line 62 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. recommended-production-settings.md, line 24 at r1 (raw file): Previously, a-robinson (Alex Robinson) wrote…
Added that suggestion with a link to the new "Even Replication Across Datacenters" example. recommended-production-settings.md, line 20 at r2 (raw file): Previously, bdarnell (Ben Darnell) wrote…
Done. Comments from Reviewable |
e8b7903
to
740b36a
Compare
This PR updates the zone config docs to explain locality and constraints. It also adds examples and moves replica recommendations to our recommended production settings doc.
This change is