
rfc: Cascading Zone Configs #30426

Merged
merged 1 commit into cockroachdb:master from ridwanmsharif:rfc-cascading-zone-configs on Oct 4, 2018

Conversation


@ridwanmsharif ridwanmsharif commented Sep 19, 2018

Adds an RFC to discuss proposals and design for cascading zone configs.
Discusses changes to the `ZoneConfig` structure, why it is needed, the
different ways we could change it, and what the change should look like.

Issue: #28901

Release note: None



@petermattis petermattis left a comment


I've only skimmed this so far, but it looks directionally good.



docs/RFCS/20180919_cascading_zone_configs.md, line 91 at r1 (raw file):

by the Zone hierarchy above until they find a config with the field explicitly set.

After [adding caching], we must ensure the Zone configs are fully set before cached.

Didn't Spencer's caching PR just land? You're thinking of caching the fully resolved zone config, so it's probably worth calling out that the worst-case chain of inheritance won't be a performance concern.
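For context on the chain of inheritance being discussed, here is a minimal Go sketch of the resolution the RFC proposes (hypothetical simplified types, not the actual cockroach code); caching the fully resolved config means this walk happens once per zone rather than on every lookup:

```go
package main

import "fmt"

// Hypothetical simplified type: a nil field means "not explicitly set,
// inherit from the parent zone".
type ZoneConfig struct {
	Parent      *ZoneConfig
	NumReplicas *int32
}

// resolvedNumReplicas walks up the zone hierarchy until it finds a zone that
// explicitly sets NumReplicas. The root (.default) always has every field set.
func resolvedNumReplicas(z *ZoneConfig) int32 {
	for ; z != nil; z = z.Parent {
		if z.NumReplicas != nil {
			return *z.NumReplicas
		}
	}
	panic(".default must have all fields set")
}

func main() {
	three, five := int32(3), int32(5)
	def := &ZoneConfig{NumReplicas: &three} // .default
	db := &ZoneConfig{Parent: def}          // database: inherits
	tbl := &ZoneConfig{Parent: db}          // table: inherits
	fmt.Println(resolvedNumReplicas(tbl))   // 3
	db.NumReplicas = &five
	fmt.Println(resolvedNumReplicas(tbl))   // 5: the change cascades down
}
```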


@benesch benesch left a comment


Looking really good!



docs/RFCS/20180919_cascading_zone_configs.md, line 91 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Didn't Spencer's caching PR just land? You're thinking of caching the fully resolved zone config, so it's probably worth calling out that the worst-case chain of inheritance won't be a performance concern.

I believe Ridwan's plan, to start, is to cache fully resolved database and table zone configs, but only partially resolved index/partition zone configs. So there's still a bit of a perf hit for subzones.


docs/RFCS/20180919_cascading_zone_configs.md, line 22 at r2 (raw file):

zone config by default and so also has a replication factor of `3`. Now the
user updates the replication factor of the database to `5` but this change
does not apply to the table as the change doesn't cascade down.

I think you're missing a step here. The user needs to install a zone config for the table that sets some field other than `num_replicas`. If there's no zone config for `t`, then the update to the default `num_replicas` will propagate.


docs/RFCS/20180919_cascading_zone_configs.md, line 81 at r2 (raw file):

Unless explicitly set, the fields in the zone config for each zone/subzone would inherit
the value from its parent recursively. The `.default` fields are always explicitly set.

It's worth being aware that there are actually two defaults. The first is installed into system.zones when the cluster is bootstrapped—i.e., the moment the cluster comes into existence. The second is in-memory in the binary itself. If the default entry in system.zones is deleted, then the in-memory default is used instead. Ending up in this state is difficult—you have to manually run a DELETE FROM system.zones—but it's been seen in the wild. The new cascading should continue to do something sensible if the default in system.zones is missing or incomplete.
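A sketch of the sensible fallback being asked for here, under the assumption (mine, for illustration) that resolution always bottoms out at a root that is forced to be fully populated; the names and types are hypothetical, not the real cockroach code paths:

```go
package main

import "fmt"

// Hypothetical simplified type: nil fields mean "unset".
type ZoneConfig struct {
	NumReplicas   *int32
	RangeMaxBytes *int64
}

var (
	three      = int32(3)
	defaultMax = int64(64 << 20)
	// inMemoryDefault stands in for the default compiled into the binary;
	// it is always fully populated.
	inMemoryDefault = ZoneConfig{NumReplicas: &three, RangeMaxBytes: &defaultMax}
)

// rootZoneConfig returns a fully populated root zone even if the .default
// entry read from system.zones is missing (nil) or incomplete.
func rootZoneConfig(fromSystemZones *ZoneConfig) ZoneConfig {
	if fromSystemZones == nil {
		return inMemoryDefault
	}
	root := *fromSystemZones
	if root.NumReplicas == nil {
		root.NumReplicas = inMemoryDefault.NumReplicas
	}
	if root.RangeMaxBytes == nil {
		root.RangeMaxBytes = inMemoryDefault.RangeMaxBytes
	}
	return root
}

func main() {
	five := int32(5)
	partial := &ZoneConfig{NumReplicas: &five} // RangeMaxBytes missing from the row
	root := rootZoneConfig(partial)
	fmt.Println(*root.NumReplicas, *root.RangeMaxBytes) // 5 67108864
}
```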


docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

This change is acceptable (maybe shouldn't be?) as the inherited replication factor is 5.
However, now it must become illegal for the replication factor of the database to be set to
less than 5 (As that change would propagate to the table making its zone config invalid)

I think this is one of the biggest questions in this RFC. Do we want to validate a parent zone against all of its children, or do we want to live with the fact that it's possible to install an invalid zone config? I'm leaning towards the latter, but I'm curious what other folks think.


docs/RFCS/20180919_cascading_zone_configs.md, line 145 at r2 (raw file):

## Additional Considerations

If a table is a `SubzonePlaceholder`, all of it's fields must be unset except for `NumReplicas`

nit: "its"


docs/RFCS/20180919_cascading_zone_configs.md, line 158 at r2 (raw file):

What is the performance penalty of making the `ZoneConfig` fields nullable?
The fields become pointers, but how much do the extra heap allocations affect performance?
This needs to be measured.

Now that we have caching, I doubt it'll be noticeable.
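If someone does want to measure it, here is a minimal benchmark sketch (hypothetical simplified structs, not the real `ZoneConfig`) contrasting value fields with nullable pointer fields; run with `go test -bench . -benchmem`:

```go
package zoneconfig

import "testing"

// Hypothetical structs for comparing value fields against the proposed
// nullable (pointer) fields.
type valueZone struct {
	NumReplicas   int32
	RangeMaxBytes int64
}

type pointerZone struct {
	NumReplicas   *int32
	RangeMaxBytes *int64
}

// Package-level sinks keep the compiler from optimizing the work away.
var (
	sinkValue   valueZone
	sinkPointer *pointerZone
)

func BenchmarkValueFields(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		sinkValue = valueZone{NumReplicas: 3, RangeMaxBytes: 64 << 20}
	}
}

func BenchmarkPointerFields(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		n, m := int32(3), int64(64<<20)
		sinkPointer = &pointerZone{NumReplicas: &n, RangeMaxBytes: &m}
	}
}
```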

@a-robinson a-robinson left a comment


Nice writeup, @ridwanmsharif!



docs/RFCS/20180919_cascading_zone_configs.md, line 22 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

I think you're missing a step here. The user needs to install a zone config for the table that sets some field other than `num_replicas`. If there's no zone config for `t`, then the update to the default `num_replicas` will propagate.

Right, the copying of zone config from parent to child happens when the child's zone config is modified, not when the child itself (i.e. the table) is created.


docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

I think this is one of the biggest questions in this RFC. Do we want to validate a parent zone against all of its children, or do we want to live with the fact that it's possible to install an invalid zone config? I'm leaning towards the latter, but I'm curious what other folks think.

I disagree; I don't think we should live with that fact. Zone config changes are infrequent and important enough that we should have guard rails of some sort to prevent potentially catastrophic mistakes. Validating zone config changes against all children sounds pretty annoying to implement, but I think it would be worth it.

Or alternatively, as @ridwanmsharif suggests for this case, it'd be possible to prevent such broken configs by not allowing per-replica constraints without also explicitly setting the number of replicas in the same zone. If it's possible to avoid all such mistakes in that way, using localized requirements, I think that would be acceptable to most users. And as far as I can tell, the only real dependencies between different fields in the config are (1) this and (2) `RangeMinBytes` must be less than `RangeMaxBytes`. Forcing users to set both of those in tandem seems more annoying than for `NumReplicas` and per-replica constraints, but it seems better than the brokenness that accidentally having `RangeMinBytes >= RangeMaxBytes` could cause.


docs/RFCS/20180919_cascading_zone_configs.md, line 155 at r2 (raw file):

# Drawbacks

# Unresolved Questions

I think this RFC is also missing a story for how the user can unset a field in a zone such that it starts being inherited from its parent again.


docs/RFCS/20180919_cascading_zone_configs.md, line 158 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

Now that we have caching, I doubt it'll be noticeable.

Yeah, with caching in place I'd be pretty curious about this as well. Zone configs are almost never changed in real clusters. That said, being more explicit about it with booleans indicating which fields were set by the user has a nice clarity that could prevent future coding mistakes.


@ridwanmsharif ridwanmsharif left a comment




docs/RFCS/20180919_cascading_zone_configs.md, line 91 at r1 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

I believe Ridwan's plan, to start, is to cache fully resolved database and table zone configs, but only partially resolved index/partition zone configs. So there's still a bit of a perf hit for subzones.

Yes, currently there's still going to be a lookup for subzones inside a cached table zone. We could also cache the subzones as mentioned later in the RFC.
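For readers following along, a sketch of that extra per-subzone step (hypothetical types; in this sketch the lookup is just a field-by-field fill from the cached, fully resolved table config):

```go
package main

import "fmt"

// Hypothetical simplified type: nil fields mean "unset".
type ZoneConfig struct {
	NumReplicas   *int32
	RangeMaxBytes *int64
}

// resolveSubzone fills the unset fields of a partially stored index/partition
// config from the cached, fully resolved table config.
func resolveSubzone(sub, table ZoneConfig) ZoneConfig {
	if sub.NumReplicas == nil {
		sub.NumReplicas = table.NumReplicas
	}
	if sub.RangeMaxBytes == nil {
		sub.RangeMaxBytes = table.RangeMaxBytes
	}
	return sub
}

func main() {
	three, max := int32(3), int64(64<<20)
	table := ZoneConfig{NumReplicas: &three, RangeMaxBytes: &max} // cached, fully resolved
	five := int32(5)
	partition := ZoneConfig{NumReplicas: &five} // stored partially
	resolved := resolveSubzone(partition, table)
	fmt.Println(*resolved.NumReplicas, *resolved.RangeMaxBytes) // 5 67108864
}
```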


docs/RFCS/20180919_cascading_zone_configs.md, line 22 at r2 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

Right, the copying of zone config from parent to child happens when the child's zone config is modified, not when the child itself (i.e. the table) is created.

Done.


docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I disagree; I don't think we should live with that fact. Zone config changes are infrequent and important enough that we should have guard rails of some sort to prevent potentially catastrophic mistakes. Validating zone config changes against all children sounds pretty annoying to implement, but I think it would be worth it.

Or alternatively, as @ridwanmsharif suggests for this case, it'd be possible to prevent such broken configs by not allowing per-replica constraints without also explicitly setting the number of replicas in the same zone. If it's possible to avoid all such mistakes in that way, using localized requirements, I think that would be acceptable to most users. And as far as I can tell, the only real dependencies between different fields in the config are (1) this and (2) `RangeMinBytes` must be less than `RangeMaxBytes`. Forcing users to set both of those in tandem seems more annoying than for `NumReplicas` and per-replica constraints, but it seems better than the brokenness that accidentally having `RangeMinBytes >= RangeMaxBytes` could cause.

For something as infrequent as zone config changes, I don't think letting invalid configs be installed is the best approach either. Validation is something we should do if we're not too worried about the time penalty of validating every child recursively. The hacky solution of requiring `NumReplicas` when a constraint is set, and similarly pairing `RangeMinBytes` with `RangeMaxBytes`, is one I'm fine with for now, if the validating-children option is one we're not going to try.


docs/RFCS/20180919_cascading_zone_configs.md, line 155 at r2 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

I think this RFC is also missing a story for how the user can unset a field in a zone such that it starts being inherited from its parent again.

Done. What do you think the syntax on the user's end for unsetting a value should look like, @a-robinson?

ridwanmsharif pushed a commit to ridwanmsharif/cockroach that referenced this pull request Sep 28, 2018
Very much a WIP. Modifies the `ZoneConfig` struct to
make fields optional. If fields are not set, they
are looked up in the hierarchy. Refer to the RFC for
more details.

Issue: cockroachdb#28901
Implements most of RFC: cockroachdb#30426

Release note: None

@benesch benesch left a comment




docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

If you want to try implementing the validation of children, I won't stop you :) It's strictly better behavior from a user's perspective assuming its error messages are understandable. Just be aware it'll take significantly more time and effort to get right.

If you have 1k tables, that's potentially tens of thousands of zone configs. That's going to make changing the default zone config quite painful. I know we don't have many clusters of that scale today, but I think it'd be a shame to introduce that bottleneck. @bdarnell do you have thoughts?


docs/RFCS/20180919_cascading_zone_configs.md, line 155 at r2 (raw file):

Previously, knz (kena) wrote…

I am ok with using YAML nulls for this purpose, but I'd like to point out that SQL has another idiomatic way to do this, already exemplified in other ALTER statements (e.g. to remove constraints on columns): DROP

`ALTER ... CONFIGURE ZONE DROP range_min_bytes`

Yeah, the `cockroach zone` command is deprecated in v2.1 so I'm hopeful we can remove it entirely in v2.2. Then we don't need to worry about YAML at all. I like your suggested SQL a lot, Raphael.
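A sketch of what such a DROP could mean with nullable fields (hypothetical types and translation, not a settled design): clearing the field so the parent's value is inherited again on the next resolution.

```go
package main

import "fmt"

// Hypothetical simplified type: nil means "unset, inherit from the parent".
type ZoneConfig struct {
	Parent        *ZoneConfig
	RangeMinBytes *int64
}

func resolvedRangeMinBytes(z *ZoneConfig) int64 {
	for ; z != nil; z = z.Parent {
		if z.RangeMinBytes != nil {
			return *z.RangeMinBytes
		}
	}
	panic(".default must have all fields set")
}

func main() {
	dbMin, tblMin := int64(1<<20), int64(8<<20)
	db := &ZoneConfig{RangeMinBytes: &dbMin}
	tbl := &ZoneConfig{Parent: db, RangeMinBytes: &tblMin}
	fmt.Println(resolvedRangeMinBytes(tbl)) // 8388608: set explicitly on the table

	// Something like `ALTER TABLE ... CONFIGURE ZONE DROP range_min_bytes`
	// would translate to clearing the field:
	tbl.RangeMinBytes = nil
	fmt.Println(resolvedRangeMinBytes(tbl)) // 1048576: inherited from the database again
}
```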


@bdarnell bdarnell left a comment




docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

If you want to try implementing the validation of children, I won't stop you :) It's strictly better behavior from a user's perspective assuming its error messages are understandable. Just be aware it'll take significantly more time and effort to get right.

If you have 1k tables, that's potentially tens of thousands of zone configs. That's going to make changing the default zone config quite painful. I know we don't have many clusters of that scale today, but I think it'd be a shame to introduce that bottleneck. @bdarnell do you have thoughts?

It doesn't bother me too much if changing the default zone config has costs proportional to the number of zone configs in the system, although it would be best if that could be avoided. I agree that the best thing to do here is to treat the number of replicas and the constraints as a unit. (This is how zone configs originally worked: there wasn't an independent `num_replicas` setting; you would ask for three replicas by setting something like `constraints: [[], [], []]`. We added the `num_replicas` setting because it's much easier for the user to understand, although maybe we should decouple the user-facing config format from the underlying representation.)

ridwanmsharif pushed a commit to ridwanmsharif/cockroach that referenced this pull request Oct 1, 2018
Very much a WIP. Modifies the `ZoneConfig` struct to
make fields optional. If fields are not set, they
are looked up in the hierarchy. Refer to the RFC for
more details.

Issue: cockroachdb#28901
Implements most of RFC: cockroachdb#30426

Release note: None

@ridwanmsharif ridwanmsharif left a comment




docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

It doesn't bother me too much if changing the default zone config has costs proportional to the number of zone configs in the system, although it would be best if that could be avoided. I agree that the best thing to do here is to treat the number of replicas and the constraints as a unit. (This is how zone configs originally worked: there wasn't an independent `num_replicas` setting; you would ask for three replicas by setting something like `constraints: [[], [], []]`. We added the `num_replicas` setting because it's much easier for the user to understand, although maybe we should decouple the user-facing config format from the underlying representation.)

We're in agreement then. Treating constraints and replication factor as a unit, and range min and max bytes as another, is an acceptable way to deal with this problem.
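To make the agreed-upon outcome concrete, a minimal sketch of purely per-zone validation under these pairings (hypothetical types and error messages, not the actual implementation): constraints require an explicit replication factor, and range min/max bytes are set together with min < max, so no recursive walk over child zones is needed.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical simplified zone config: nil means "unset, inherited".
type ZoneConfig struct {
	NumReplicas   *int32
	Constraints   []string // per-replica constraints, simplified
	RangeMinBytes *int64
	RangeMaxBytes *int64
}

// validate enforces the field pairings discussed above on a single zone.
func validate(z ZoneConfig) error {
	if len(z.Constraints) > 0 && z.NumReplicas == nil {
		return errors.New("constraints may only be set together with num_replicas")
	}
	if (z.RangeMinBytes == nil) != (z.RangeMaxBytes == nil) {
		return errors.New("range_min_bytes and range_max_bytes must be set together")
	}
	if z.RangeMinBytes != nil && *z.RangeMinBytes >= *z.RangeMaxBytes {
		return errors.New("range_min_bytes must be less than range_max_bytes")
	}
	return nil
}

func main() {
	bad := ZoneConfig{Constraints: []string{"+region=us-east1"}}
	fmt.Println(validate(bad)) // constraints may only be set together with num_replicas
}
```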


docs/RFCS/20180919_cascading_zone_configs.md, line 5 at r3 (raw file):

Previously, a-robinson (Alex Robinson) wrote…

This is usually meant to point to the PR in which the RFC was discussed, i.e. this one (#30426), in case anyone wants to look up the discussion that happened around the design decisions.

Done.

ridwanmsharif pushed a commit to ridwanmsharif/cockroach that referenced this pull request Oct 1, 2018
Very much a WIP. Modifies the `ZoneConfig` struct to
make fields optional. If fields are not set, they
are looked up in the hierarchy. Refer to the RFC for
more details.

Issue: cockroachdb#28901
Implements most of RFC: cockroachdb#30426

Release note: None

@a-robinson a-robinson left a comment


:lgtm:



docs/RFCS/20180919_cascading_zone_configs.md, line 155 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

Yeah, the `cockroach zone` command is deprecated in v2.1 so I'm hopeful we can remove it entirely in v2.2. Then we don't need to worry about YAML at all. I like your suggested SQL a lot, Raphael.

SGTM. The SQL command still has a version that accepts YAML, but that version doesn't have to support unsetting a field.


@knz knz left a comment




docs/RFCS/20180919_cascading_zone_configs.md, line 81 at r2 (raw file):

Previously, benesch (Nikhil Benesch) wrote…

It's worth being aware that there are actually two defaults. The first is installed into system.zones when the cluster is bootstrapped—i.e., the moment the cluster comes into existence. The second is in-memory in the binary itself. If the default entry in system.zones is deleted, then the in-memory default is used instead. Ending up in this state is difficult—you have to manually run a DELETE FROM system.zones—but it's been seen in the wild. The new cascading should continue to do something sensible if the default in system.zones is missing or incomplete.

I concur, this needs to be addressed. It's not even "difficult": it happens in tests! I had inadvertently changed this (to ignore the root case) and CI was not happy.


docs/RFCS/20180919_cascading_zone_configs.md, line 132 at r2 (raw file):

Previously, ridwanmsharif (Ridwan Sharif) wrote…

We're in agreement then. Treating constraints and replication factor as a unit, and range min and max bytes as another, is an acceptable way to deal with this problem.

Maybe add a note in the RFC with a summary of the discussion and the outcome.

Adds an RFC to discuss proposals and design for cascading zone configs.
Discusses changes to the `ZoneConfig` structure, why it is needed, the
different ways we could change it, and what the change should look like.

Release note: None
@ridwanmsharif

bors r+

craig bot pushed a commit that referenced this pull request Oct 4, 2018
30426: rfc: Cascading Zone Configs r=ridwanmsharif a=ridwanmsharif

Adds an RFC to discuss proposals and design for cascading zone configs.
Discusses changes to the `ZoneConfig` structure, why it is needed, the
different ways we could change it, and what the change should look like.

Issue: #28901 

Release note: None

Co-authored-by: Ridwan Sharif <ridwan@cockroachlabs.com>

craig bot commented Oct 4, 2018

Build succeeded

@craig craig bot merged commit f59e534 into cockroachdb:master Oct 4, 2018
@ridwanmsharif ridwanmsharif deleted the rfc-cascading-zone-configs branch October 4, 2018 04:33
ridwanmsharif pushed a commit to ridwanmsharif/cockroach that referenced this pull request Oct 11, 2018
Very much a WIP. Modifies the `ZoneConfig` struct to
make fields optional. If fields are not set, they
are looked up in the hierarchy. Refer to the RFC for
more details.

Issue: cockroachdb#28901
Implements most of RFC: cockroachdb#30426

Release note: None
craig bot pushed a commit that referenced this pull request Oct 12, 2018
30611: sql, config: make zone configs cascade r=ridwanmsharif a=ridwanmsharif

Modifies the `ZoneConfig` struct to
make fields optional. If fields are not set, they
are looked up in the hierarchy. Refer to the RFC for
more details.

Issue: #28901
Implements most of RFC: #30426

TODO:
- [x] Update `set_zone_config` to use the new structure and validate them
- [x] Exhaustively test cascading zone configs

Release note: Changes behavior of Zone Configs.

Co-authored-by: Ridwan Sharif <ridwan@cockroachlabs.com>