system.replication_constraint_stats reporting incorrect violations #70024

Open
smcvey opened this issue Sep 10, 2021 · 1 comment
Labels
A-kv-observability, A-kv-replication-constraints, C-bug (Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.), T-observability

Comments

@smcvey
Contributor

smcvey commented Sep 10, 2021

After setting constraints on the RANGE default zone, system.replication_constraint_stats incorrectly reports system ranges as being in violation of the zone configuration.

To Reproduce

Create 9 nodes, with the following localities:

n1: --locality=region=region1,DC=dc1
n2: --locality=region=region1,DC=dc1
n3: --locality=region=region1,DC=dc1
n4: --locality=region=region1,DC=dc2
n5: --locality=region=region1,DC=dc2
n6: --locality=region=region1,DC=dc2
n7: --locality=region=region2,DC=dc3
n8: --locality=region=region2,DC=dc3
n9: --locality=region=region2,DC=dc3

Then run:

ALTER RANGE default CONFIGURE ZONE USING
    num_replicas = 5,
    constraints = '{+DC=dc1: 2, +DC=dc2: 2, +region=region2: 1}';
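As a sanity check on the setup, the following is a minimal Python sketch (a toy model, not CockroachDB code) that confirms the zone configuration above is satisfiable on the 9-node topology: the per-replica constraint counts fit within num_replicas, and each constraint has enough matching nodes. The names `NODES`, `CONSTRAINTS`, `matching_nodes`, and `feasibility` are made up for this illustration.

```python
# Toy feasibility check; the data is copied from the reproduction steps,
# the helper names are hypothetical.
NODES = {
    "n1": {"region": "region1", "DC": "dc1"},
    "n2": {"region": "region1", "DC": "dc1"},
    "n3": {"region": "region1", "DC": "dc1"},
    "n4": {"region": "region1", "DC": "dc2"},
    "n5": {"region": "region1", "DC": "dc2"},
    "n6": {"region": "region1", "DC": "dc2"},
    "n7": {"region": "region2", "DC": "dc3"},
    "n8": {"region": "region2", "DC": "dc3"},
    "n9": {"region": "region2", "DC": "dc3"},
}

NUM_REPLICAS = 5

# (locality key, locality value) -> number of replicas required there.
CONSTRAINTS = {
    ("DC", "dc1"): 2,
    ("DC", "dc2"): 2,
    ("region", "region2"): 1,
}


def matching_nodes(key, value):
    """Return the sorted node names whose locality has key=value."""
    return sorted(n for n, loc in NODES.items() if loc.get(key) == value)


def feasibility():
    """Return (counts fit in num_replicas, per-constraint 'enough nodes' map)."""
    fits = sum(CONSTRAINTS.values()) <= NUM_REPLICAS
    enough = {
        f"+{key}={value}": len(matching_nodes(key, value)) >= need
        for (key, value), need in CONSTRAINTS.items()
    }
    return fits, enough


if __name__ == "__main__":
    print(feasibility())
```

Since every constraint can be met, any violation reported for this configuration has to come from ranges that are placed under a different policy, not from a shortage of nodes.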

Data in any user-created database is replicated correctly according to these constraints, so it produces no entries in the system.replication_constraint_stats table.

Ranges that belong to the system database, however, do not conform to these constraints because they follow the RANGE system zone configuration instead. Even so, system ranges can appear in system.replication_constraint_stats as violations of the default zone's constraints, for example:

root@:26257/defaultdb> select * from system.replication_constraint_stats;
  zone_id | subzone_id |    type    |  config   | report_id |        violation_start        | violating_ranges
----------+------------+------------+-----------+-----------+-------------------------------+-------------------
        0 |          0 | constraint | +DC=dc1:2 |         1 | 2021-09-10 14:20:13.682032+00 |               20
        0 |          0 | constraint | +DC=dc2:2 |         1 | 2021-09-10 14:20:13.682032+00 |               15
(2 rows)

The system.replication_constraint_stats table should not validate system ranges against the RANGE default zone configuration. The zone configuration above should not populate this table.
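To make the mismatch concrete, here is a small Python sketch of the check as a toy model. It is not the actual report implementation, and the replica placements are hypothetical: one set mimics a system range spread without regard to the default constraints, the other a user range placed according to them. The names `violated`, `SYSTEM_RANGE`, and `USER_RANGE` are assumptions for illustration only.

```python
# Toy model of a per-replica constraint check; not CockroachDB's report code.

# Constraints from the RANGE default zone: (key, value) -> required replicas.
DEFAULT_CONSTRAINTS = {
    ("DC", "dc1"): 2,
    ("DC", "dc2"): 2,
    ("region", "region2"): 1,
}

# Hypothetical placement of a 5-replica system range that ignores the
# default constraints (one replica in dc1, one in dc2, three in dc3).
SYSTEM_RANGE = [
    {"region": "region1", "DC": "dc1"},
    {"region": "region1", "DC": "dc2"},
    {"region": "region2", "DC": "dc3"},
    {"region": "region2", "DC": "dc3"},
    {"region": "region2", "DC": "dc3"},
]

# Hypothetical placement of a user range that follows the constraints.
USER_RANGE = [
    {"region": "region1", "DC": "dc1"},
    {"region": "region1", "DC": "dc1"},
    {"region": "region1", "DC": "dc2"},
    {"region": "region1", "DC": "dc2"},
    {"region": "region2", "DC": "dc3"},
]


def violated(replica_localities, constraints):
    """Return the constraints a replica set fails, formatted like `config`."""
    failures = []
    for (key, value), need in constraints.items():
        have = sum(1 for loc in replica_localities if loc.get(key) == value)
        if have < need:
            failures.append(f"+{key}={value}:{need}")
    return failures


if __name__ == "__main__":
    print(violated(SYSTEM_RANGE, DEFAULT_CONSTRAINTS))
    print(violated(USER_RANGE, DEFAULT_CONSTRAINTS))
```

Under these assumed placements, the system range fails the two DC constraints but satisfies the region2 one, which is the same shape as the two rows in the query output. Checking that range against the zone it actually belongs to would be the expected behavior.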

Verified on CRDB 21.1.6 and 21.1.8

Jira issue: CRDB-9905

@smcvey smcvey added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Sep 10, 2021
@andreimatei andreimatei added this to Incoming in KV via automation Sep 10, 2021
@blathers-crl blathers-crl bot added the T-kv KV Team label Sep 10, 2021
@lunevalex lunevalex moved this from Incoming to On Hold in KV Oct 20, 2021
@Lukens4242
Contributor

I now have a client who is experiencing this scenario, and it is causing their internally written health/sanity checks to fail before various maintenance activities can move forward.
