jseldess
Contributor

@jseldess jseldess commented Apr 11, 2018

Clarify replication zone levels and add a warning about increasing the default replication factor without also increasing the replication factor of system ranges.

Fixes #2858

@cockroach-teamcity
Member

This change is Reviewable

Contributor

@a-robinson a-robinson left a comment

We didn't fully explain the general problem. The issue isn't specific to just the system ranges -- zone configs form a hierarchy of sorts, and the problem is in how that hierarchy is handled.

The system ranges' replication settings are controlled by their specific zone configs (.liveness, .system, etc) if they exist. If they don't exist, they're controlled by .default.
A table partition's replication is controlled by the partition's zone config if it exists. If it doesn't, it's controlled by its table's zone config. If that doesn't exist, it's controlled by the database's zone config. If that doesn't exist, it's controlled by .default.

The issue is that if you haven't defined any of the lower-level configs, then everything falls back to .default. Or analogously, if you haven't defined a table-level zone config, the table will fall back to using the database-level config.

This means that in the common case, particularly in the past when we didn't pre-create any zone configs, setting .default was sufficient for controlling the replication of all data in the cluster.

However, setting .default does not affect any data that already has a more specific zone config in place, regardless of whether it's a system range or not. If you've set a zone config for table foo.bar, changing the .default zone config will not affect how foo.bar is replicated. The same goes for setting a zone config on database foo -- it will have no effect on the foo.bar table if you've already set up a zone config for foo.bar.

The getting-started situation is made much worse by the fact that the cluster now starts with a few zone configs in it for certain system ranges, but I think the documentation needs a little more subtlety than to just say that the .default zone doesn't affect system ranges.
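
To make the fallback order above concrete, here is a minimal Python sketch. It only illustrates the lookup described in this comment and is not CockroachDB's implementation: the `fallback_chain` and `effective_zone` helpers, the dotted `db.table.partition` naming, and the use of a bare replication factor as the entire zone config are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical stand-in for the cluster's zone configs: zone name -> replication factor.
ZoneConfigs = Dict[str, int]


def fallback_chain(target: str) -> List[str]:
    """Return the zone names to check for `target`, most specific first.

    Naming assumed for this sketch: system ranges start with "." (".liveness"),
    user data is "db", "db.table", or "db.table.partition".
    """
    if target.startswith("."):
        chain = [target]
    else:
        parts = target.split(".")
        # "foo.bar.p1" -> ["foo.bar.p1", "foo.bar", "foo"]
        chain = [".".join(parts[:i]) for i in range(len(parts), 0, -1)]
    if chain[-1] != ".default":
        chain.append(".default")
    return chain


def effective_zone(zones: ZoneConfigs, target: str) -> str:
    """Return the name of the first zone in the chain that actually exists."""
    for name in fallback_chain(target):
        if name in zones:
            return name
    raise KeyError(".default zone must always exist")


# A cluster with a pre-created .liveness zone and a user-created foo.bar zone.
zones = {".default": 3, ".liveness": 3, "foo.bar": 3}
zones[".default"] = 5  # raise the default replication factor

for target in ("foo.bar", "foo.baz", ".liveness", ".system"):
    zone = effective_zone(zones, target)
    print(f"{target}: governed by {zone}, replication factor {zones[zone]}")
```

In this sketch, `foo.bar` and `.liveness` keep a replication factor of 3 after `.default` is raised to 5, because each already has its own zone, while `foo.baz` and `.system` have no zone of their own and pick up the new default.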

Contributor

@a-robinson a-robinson left a comment

Let me know if you want to chat about this to understand it better. I'm not sure how useful that explanation is.

@jseldess
Contributor Author

Thanks, @a-robinson. I've tried to add more clarity around pre-configured and user-created replication zones. Not sure if these changes fully address your concerns, though. PTAL.

@@ -5,7 +5,7 @@ keywords: ttl, time to live, availability zone
toc: false
---

In CockroachDB, you use **replication zones** to control the number and location of replicas for specific sets of data, both when replicas are first added and when they are rebalanced to maintain cluster equilibrium. Initially, there is are a few special pre-configured replication zones for internal system data along with a default replication zone that applies to the rest of the cluster. You can adjust these pre-configured zones as well as add zones for individual databases, tables, and rows ([enterprise-only](enterprise-licensing.html)) as needed. For example, you might use the default zone to replicate most data in a cluster normally within a single datacenter, while creating a specific zone to more highly replicate a certain database or table across multiple datacenters and geographies.
In CockroachDB, you use **replication zones** to control the number and location of replicas for specific sets of data, both when replicas are first added and when they are rebalanced to maintain cluster equilibrium. Initially, there is are some special pre-configured replication zones for internal system data along with a default replication zone that applies to the rest of the cluster. You can adjust these pre-configured zones as well as add zones for individual databases, tables, and rows ([enterprise-only](enterprise-licensing.html)) as needed. For example, you might use the default zone to replicate most data in a cluster normally within a single datacenter, while creating a specific zone to more highly replicate a certain database or table across multiple datacenters and geographies.
Contributor

It's preexisting (and probably my fault), but s/there is are/there are/

Contributor Author

Done.


There are four replication zone levels:
CockroachDB comes with pre-configured replication zones for all user data in the cluster and for some internal system data. These replication zones can be adjusted but not deleted.
Contributor

These replication zones can be adjusted but not deleted.

That's not really true, since you can run cockroach zone rm on any of them except for .default, and if you do so then the corresponding ranges will just be governed by the .default zone.

Contributor Author

Reworked.

- **Row:** ([For enterprise users](enterprise-licensing.html)) You can add replication zones for specific rows in a table by [defining table partitions](partitioning.html). See [Create a Replication Zone for a Table Partition](#create-a-replication-zone-for-a-table-partition-new-in-v2-0) for more details.
Zone Name | Description
----------|------------
`.default` | This replication zone applies to all user data in the cluster not constrained by more specific, [user-created replication zones](#user-created-replication-zones).
Contributor

It also doesn't apply to more specific system-created replication zones (the other three zones listed below that show up by default in a freshly created v2.0 cluster)

Contributor Author

Reworked.

`.meta` | The "meta" ranges contain the authoritative information about the location of all data in the cluster. If these ranges are unavailable, the entire cluster will be unavailable, so this replication zone is pre-configured to have a lower-than-default `ttlseconds`. If your cluster is running in multiple datacenters, it's a best practice to configure the meta ranges to have a copy in each datacenter.
`.liveness` | <span class="version-tag">New in v2.0:</span> The "liveness" range contains the authoritative information about which nodes are live at any given time. If this range is unavailable, the entire cluster will be unavailable, so this replication zone is pre-configured to have a lower-than-default `ttlseconds`. Giving it a high replication factor is also strongly recommended.
Contributor

If this range is unavailable, the entire cluster will be unavailable, so this replication zone is pre-configured to have a lower-than-default ttlseconds.

That doesn't logically follow. The lower-than-default ttlseconds is because historical queries are never run on the liveness table and it's advantageous to keep the liveness range smaller for reliable performance.

The "If this range is unavailable, the entire cluster will be unavailable" bit is a reason to recommend running it with a higher-than-default replication factor.

Contributor Author

Thanks for the clarification. Reworked.

`system.jobs` | This replication zone controls the replication of a variety of important internal data, including information needed to allocate new table IDs and track the health of a cluster's nodes. It is configured with a lower-than-default `ttlseconds` and applies to the `system.jobs` SQL table, which stores metadata about long-running jobs such as schema changes and backups.
Contributor

This replication zone controls the replication of a variety of important internal data, including information needed to allocate new table IDs and track the health of a cluster's nodes.

That's not true. Looks like you copied from the description of the .system zone?

And speaking of the .system zone, it looks like you've removed all mentions of it. Was that intentional?

Contributor Author

Reworked.


To control replication for one of the above sets of system ranges, create a YAML file defining only the values you want to change (other values will not be affected), and use the `cockroach zone set <zone-name> -f <file.yaml>` command with appropriate flags:
To control replication for a system range, create a YAML file defining only the values you want to change (other values will not be affected), and use the `cockroach zone set <zone-name> -f <file.yaml>` command with appropriate flags:
Contributor

Optional, but we may want to change "other values will not be affected" to "other values will be copied from the .default zone".
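
As a rough illustration of why that wording matters, here is a small Python sketch of "set only the values you change, copy the rest from .default". It assumes the copy happens once, at the time the zone is created, which is how this comment reads; the `set_zone` helper is hypothetical, the zone config is flattened to two fields (`num_replicas` is an assumed field name, and `ttlseconds` is shown without any nesting), and the numbers are illustrative only.

```python
from typing import Dict

ZoneConfig = Dict[str, int]


def set_zone(zones: Dict[str, ZoneConfig], name: str, changes: ZoneConfig) -> None:
    """Create or update zone `name`, changing only the fields in `changes`.

    Assumption for this sketch: a new zone starts as a copy of .default's
    current values; an existing zone keeps its own values.
    """
    base = zones.get(name, zones[".default"])
    zones[name] = {**base, **changes}


# Illustrative values only.
zones = {".default": {"num_replicas": 3, "ttlseconds": 90000}}

# Create a .liveness zone that overrides just ttlseconds.
set_zone(zones, ".liveness", {"ttlseconds": 600})

# Later, raise the default replication factor.
set_zone(zones, ".default", {"num_replicas": 5})

print(zones[".liveness"])  # still has the num_replicas value copied earlier
print(zones[".default"])
```

Under these assumptions, `.liveness` keeps the replication factor it copied when it was created, so a later change to `.default` does not reach it. That is the same gotcha discussed at the top of this review.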

Contributor Author

Done.

@@ -10,6 +10,10 @@ This page describes newly identified limitations in the CockroachDB v2.0 release

## New Limitations

### Replication factor for system ranges

Changes to the <code>.default</code> replication zone are not automatically applied to other <a href="configure-replication-zones.html#edit-the-replication-zone-for-a-system-range">pre-configured replication zones</a> for internal system data or to any <a href="configure-replication-zones.html#user-created-replication-zones">user-created replication zones</a>. If you increase the replication factor for <code>.default</code>, you may also want to increase the replication factor for <code>.meta</code>, <code>.liveness</code>, and <code>system.jobs</code> to ensure that important internal data is as resilient as your user data.
Contributor

Maybe we don't care to make the distinction, but this behavior (changes to .default not affecting pre-existing zones) has always been the case -- the new thing in v2.0 is that there are more pre-configured zones to be affected by the existing problem.
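
For what it's worth, the practical follow-up for an operator can be sketched in a few lines of Python: after raising the replication factor for `.default`, find the zones that are now below it and build the corresponding `cockroach zone set <zone-name> -f <file.yaml>` commands (the command form quoted earlier in this PR). The helper names, the sample replication factors, and the `replicas.yaml` file name are hypothetical, and connection or security flags are deliberately left out.

```python
from typing import Dict, Iterable, List


def zones_below_default(replicas: Dict[str, int]) -> List[str]:
    """Return the zones whose replication factor is lower than .default's."""
    default = replicas[".default"]
    return sorted(
        name for name, n in replicas.items()
        if name != ".default" and n < default
    )


def zone_set_commands(zone_names: Iterable[str], yaml_path: str) -> List[str]:
    """Build one `cockroach zone set` command per zone (flags omitted)."""
    return [f"cockroach zone set {name} -f {yaml_path}" for name in zone_names]


# Illustrative state after raising .default from 3 to 5.
replicas = {".default": 5, ".meta": 3, ".liveness": 3, "system.jobs": 3, "foo.bar": 7}

for cmd in zone_set_commands(zones_below_default(replicas), "replicas.yaml"):
    print(cmd)
```

With the sample data, the sketch flags `.liveness`, `.meta`, and `system.jobs` and leaves `foo.bar` alone, since that zone is already replicated more highly than the default.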

Contributor Author

When put like that, I don't think we need to call this out as a "limitation" per se. It's more of a ux-gotcha, so hopefully covering it in the replication zone docs is enough.

@a-robinson
Contributor

Reviewed 3 of 3 files at r2.
Review status: all files reviewed at latest revision, 7 unresolved discussions, all commit checks successful.


Comments from Reviewable

@jseldess
Contributor Author

@a-robinson, please take another look. Hopefully I'm getting closer. Thanks for all the help here.

@a-robinson
Contributor

This looks great, Jesse! Thanks for sticking with it. :lgtm:


Reviewed 5 of 5 files at r3.
Review status: all files reviewed at latest revision, 2 unresolved discussions, some commit checks failed.


v2.0/configure-replication-zones.md, line 26 at r3 (raw file):

Cluster | CockroachDB comes with a pre-configured `.default` replication zone that applies to all table data in the cluster not constrained by a database, table, or row-specific replication zone. This zone can be adjusted but not removed. See [View the Default Replication Zone](#view-the-default-replication-zone) and [Edit the Default Replication Zone](#edit-the-default-replication-zone) for more details.
Database | You can add replication zones for specific databases. See [Create a Replication Zone for a Database](#create-a-replication-zone-for-a-database) for more details.
Table | You can add replication zones for specific tables. See [Create a Replication Zone for a Table](#create-a-replication-zone-for-a-table) for more details.<br><br>CockroachDB comes with a pre-configured replication zone for one internal table, `system.jobs`, which stores metadata about long-running jobs such as schema changes and backups. Historical queries are never run against this table, so the pre-configured zone gives this table a lower-than-default `ttlseconds`.

Minor, but I'd phrase the last sentence as:

"Historical queries are never run against this table and the rows in it are updated frequently, so the pre-configured zone gives this table a lower-than-default ttlseconds."

See cockroachdb/cockroach#21575 for the full original reasoning if you're curious


v2.0/configure-replication-zones.md, line 356 at r3 (raw file):

Zone Name | Description
----------|-----------------|------------
`.meta` | The "meta" ranges contain the authoritative information about the location of all data in the cluster.<br><br>Because historical queries are never run on meta ranges, its advantageous to keep these ranges smaller for reliable performance, so CockroachDB comes with a **pre-configured** `.meta` replication zone giving these ranges a lower-than-default `ttlseconds`.<br><br>If your cluster is running in multiple datacenters, it's a best practice to configure the meta ranges to have a copy in each datacenter.

The second sentence's grammar is messed up. I'd say:

Because historical queries are never run on meta ranges and it is advantageous to keep these ranges smaller for reliable performance, CockroachDB comes with a pre-configured .meta replication zone giving these ranges a lower-than-default ttlseconds.


v2.0/configure-replication-zones.md, line 359 at r3 (raw file):

`.liveness` | <span class="version-tag">New in v2.0:</span> The "liveness" range contains the authoritative information about which nodes are live at any given time.<br><br>Just as for "meta" ranges, historical queries are never run on the liveness range, so CockroachDB comes with a **pre-configured** `.liveness` replication zone giving this range a lower-than-default `ttlseconds`.<br><br>If this range is unavailable, the entire cluster will be unavailable, so giving it a high replication factor is strongly recommended.
`.timeseries` | The "timeseries" ranges contain monitoring data about the cluster that powers the graphs in CockroachDB's admin UI. If necessary, you can add a `.timeseries` replication zone to control the replication of this data.
`.system` | There are system ranges for a variety of other important internal data, including information needed to allocate new table IDs and track the health of a cluster's nodes. If necessary, you can add a `.system` replication zone to control the replication of this data.

While I'm throwing nits at you, I know I wrote this but in retrospect I'd replace the word "health" here with "status".


v2.0/known-limitations.md, line 15 at r2 (raw file):

Previously, jseldess (Jesse Seldess) wrote…

When put like that, I don't think we need to call this out as a "limitation" per se. It's more of a ux-gotcha, so hopefully covering it in the replication zone docs is enough.

👍


@jseldess
Contributor Author

Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


v2.0/configure-replication-zones.md, line 26 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Minor, but I'd phrase the last sentence as:

"Historical queries are never run against this table and the rows in it are updated frequently, so the pre-configured zone gives this table a lower-than-default ttlseconds."

See cockroachdb/cockroach#21575 for the full original reasoning if you're curious

Done.


v2.0/configure-replication-zones.md, line 356 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

The second sentence's grammar is messed up. I'd say:

Because historical queries are never run on meta ranges and it is advantageous to keep these ranges smaller for reliable performance, CockroachDB comes with a pre-configured .meta replication zone giving these ranges a lower-than-default ttlseconds.

Done.


v2.0/configure-replication-zones.md, line 359 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

While I'm throwing nits at you, I know I wrote this but in retrospect I'd replace the word "health" here with "status".

Done.


- Add known limitation
- Add warning to "edit default replication zone" example

Fixes #2858

@jseldess jseldess merged commit cd7772f into master Apr 28, 2018
@jseldess jseldess deleted the system-ranges branch April 28, 2018 01:36