migration_manager: schema version correctness depends on order of feature enabling #16004
Comments
Interesting to note that the two nodes have the same schema versions, yet they have different versions for
Another variant of this can be seen at: https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/4504/testReport/junit/cleanup_test/TestCleanup/Tests___Sanity_Tests___test_cluster_cleanup/.
Another can be seen here: https://jenkins.scylladb.com/job/scylla-master/job/gating-dtest-release/5740/artifact/logs-full.release.003/1699011052696_cql_tracing_test.py%3A%3ATestCqlTracing%3A%3Atest_tracing_shutdown/ Here the mismatch happens on a system table (system_traces.events).
@kbr-scylla is this related to the recent consistent schema version changes? Maybe there is another corner case where it fails and actually generates different versions on different nodes.
@denesb please upload test and relevant node logs so they don't get lost after jenkins gets garbage-collected
these changes got reverted
Are these CI runs on the PR where you add the new cluster feature? Schema versions can mismatch because of different subsets of enabled features on the two nodes, so it might be related
Yes, all these are from #15890, which for some reason reproduces this issue very consistently. It must be related to adding the new feature, but I don't yet see how.
What I don't understand is: from the POV of a new cluster, all features are new. So how does adding a new feature matter here? These are not upgrade tests.
What if... the string that lists all of the features became too long, and gets truncated somewhere? 😱
I even considered that the problem is that I added the new feature member into the middle of the feature member list. But I see no sign of feature member declaration order mattering at all in the code. Maybe I missed something? Another thing I considered is that maybe with the new feature we go over some threshold on the number of features. These are silly theories but I'm getting desperate.
Logs for the simplest example -- the test mentioned in the opening message:
Or what if there are so many features now that enabling them takes too long, and now we're hitting some race that we weren't hitting before?
cc @tgrabiec maybe you have some idea
Just to be clear: all the test runs mentioned here so far contain the whole fix from #15890, not just the line that adds the feature? Do those runs have topology-on-raft enabled or not?
The dtest ones don't
Ahh, good point. @denesb could it be that e.g. we read empty pages when reading schema tables, and there's some kind of bug there that makes us skip parts of mutations, and that leads to schema mismatch? How can you be sure that your change is not the cause here?
Note that all cluster features start as disabled for a joining node. The node verifies that it supports all of the features enabled by the cluster and only then locally marks them as enabled.
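The handshake described here can be sketched in a few lines. This is a minimal illustrative model; the `negotiate` function and the feature names are made up for the sketch, not Scylla's actual API:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical sketch of the join-time handshake: a joining node starts
// with every cluster feature disabled, verifies it supports everything
// the cluster has already enabled, and only then marks that subset as
// enabled locally. All names here are illustrative.
std::set<std::string> negotiate(const std::set<std::string>& locally_supported,
                                const std::set<std::string>& cluster_enabled) {
    for (const auto& f : cluster_enabled) {
        // A node that lacks a cluster-enabled feature cannot join at all.
        assert(locally_supported.count(f) > 0);
    }
    // Only the features the cluster has enabled become enabled locally;
    // everything else the node supports stays disabled for now.
    return cluster_enabled;
}
```

The point relevant to this issue is the second half: features the node supports but the cluster has not yet enabled stay disabled, and get turned on later, one by one, as the cluster enables them.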
@denesb if we get this failure from ONLY adding the feature flag, then it's plausible there's some underlying problem with features, and it would make sense to spend time digging into the features implementation, or schema, or whatever. BUT if, in order to trigger this, you actually have to use the feature to protect your new code -- then I think we should first place most suspicions on your new code.
Ah, perhaps this is racing with queries being executed, and some of them set this "allow empty page" flag, some of them don't? Or even, there could be a single query (a scan perhaps?) that is being executed, and part of it is executed without the flag, but the next part with? @denesb (edit: and the query returns multiple pages)
@kbr-scylla the tests failed even when I commented out all the code, except for the feature addition. Also, the feature guards a bit used in read-repair, and all schema table reads are local, no?
Ok, here: https://jenkins.scylladb.com/job/scylla-master/job/scylla-ci/4503/ we have a run with commit 0375aa0, that has only the feature added, without the
I took a short look. I can confirm that this issue reproduces relatively often when I run it locally (~50% of the time). I noticed that, curiously, changing the name of the feature makes the problem go away. I suspect this has something to do with the order in which features are enabled. When a node wants to enable features on join, after checking them with the cluster, it goes through the registered features (kept in
For some features (including
As a side note, maybe we could randomize the order of enabling features, at least in debug mode? I don't think the code should rely on any particular order.
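The randomization idea could look something like the following sketch. It assumes registered features live in an ordered container; `enable_order` is a hypothetical name, not an existing function:

```cpp
#include <algorithm>
#include <cassert>
#include <random>
#include <set>
#include <string>
#include <vector>

// Sketch of the suggestion above: in debug builds, shuffle the order in
// which registered features are enabled, so that any hidden dependency
// on enabling order surfaces as a test failure instead of surviving by
// accident. Returns the order that would be used.
std::vector<std::string> enable_order(std::vector<std::string> registered) {
#ifndef NDEBUG
    // Debug builds only: randomize to flush out order dependencies.
    std::mt19937 rng{std::random_device{}()};
    std::shuffle(registered.begin(), registered.end(), rng);
#endif
    return registered;  // features would be enabled one by one in this order
}
```

Since correctness must not depend on the order, the observable end state (which features end up enabled) should be identical for every permutation.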
I managed to reproduce locally too. Comparing the logs for a failed and a successful run: in both cases, the table on the schema node is created while features are still being negotiated. In both cases, node2 ends up with a different schema version initially. As the features are enabled,
Note that
I'm testing a fix...
…to_expiry Currently, when said feature is enabled, we recalculate the schema digest. But this feature also influences how table versions are calculated, so it has to trigger a recalculation of all table versions, so that we can guarantee correct schema versions. Before, this used to happen by happy accident. Another feature -- table_digest_insensitive_to_expiry -- used to take care of this, by triggering a table version recalculation. However this feature only takes effect if digest_insensitive_to_expiry is also enabled. This used to be the case incidentally: by the time the reload triggered by table_digest_insensitive_to_expiry ran, digest_insensitive_to_expiry was already enabled. But this was not guaranteed whatsoever and, as we've recently seen, any change to the feature list which changes the order in which features are enabled can cause this intricate balance to break. This patch makes digest_insensitive_to_expiry also kick off a schema reload, to eliminate our dependence on the (unguaranteed) feature order, and to guarantee that table schemas have a correct version after all features are enabled. Fixes: scylladb#16004
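The ordering dependency this commit message describes can be modeled in a few lines. This is a toy model: only the two feature names come from the issue, everything else (the `node` struct, the version strings) is made up for illustration:

```cpp
#include <cassert>
#include <string>

// Toy model of the bug: reload_schema() recomputes table versions using
// whatever digest algorithm is in effect at that moment, but the new
// algorithm only takes effect once DIGEST_INSENSITIVE_TO_EXPIRY is
// enabled, while the reload is only triggered by
// TABLE_DIGEST_INSENSITIVE_TO_EXPIRY.
struct node {
    bool patched;                     // true = reload on both features (the fix)
    bool digest_insensitive = false;  // DIGEST_INSENSITIVE_TO_EXPIRY enabled?
    std::string table_version = "stale";

    void reload_schema() {
        // Recalculate versions with whatever algorithm is in effect NOW.
        table_version = digest_insensitive ? "correct" : "stale";
    }

    void enable(const std::string& feature) {
        if (feature == "DIGEST_INSENSITIVE_TO_EXPIRY") {
            digest_insensitive = true;
            if (patched) {
                reload_schema();      // the fix: this feature reloads too
            }
        } else if (feature == "TABLE_DIGEST_INSENSITIVE_TO_EXPIRY") {
            reload_schema();
        }
    }
};
```

Without the patch, enabling TABLE_DIGEST_INSENSITIVE_TO_EXPIRY before DIGEST_INSENSITIVE_TO_EXPIRY leaves the table version stale; with it, the final version no longer depends on which feature happens to be enabled first.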
Fix here: #16013
With the above fix
Good idea, I created an issue for that: #16014 |
…to_expiry Currently, when said feature is enabled, we recalculate the schema digest. But this feature also influences how table versions are calculated, so it has to trigger a recalculation of all table versions, so that we can guarantee correct versions. Before, this used to happen by happy accident. Another feature -- table_digest_insensitive_to_expiry -- used to take care of this, by triggering a table version recalculation. However this feature only takes effect if digest_insensitive_to_expiry is also enabled. This used to be the case incidentally: by the time the reload triggered by table_digest_insensitive_to_expiry ran, digest_insensitive_to_expiry was already enabled. But this was not guaranteed whatsoever and, as we've recently seen, any change to the feature list which changes the order in which features are enabled can cause this intricate balance to break. This patch makes digest_insensitive_to_expiry also kick off a schema reload, to eliminate our dependence on the (unguaranteed) feature order, and to guarantee that table schemas have a correct version after all features are enabled. In fact, all schema feature notification handlers now kick off a full schema reload, to ensure bugs like this don't creep in in the future. Fixes: scylladb#16004
Can this be backported promptly, to unblock 5.4?
…to_expiry Fixes: #16004 Closes #16013 (cherry picked from commit 2238144)
Backport queued to 5.4.
Backported to 5.2.
…to_expiry Fixes: #16004 Closes #16013 (cherry picked from commit 2238144) (cherry picked from commit e31f222)
Backported to 5.1.
Summary

`migration_manager::init_messaging_service()` installs listeners on interesting schema-related features. There are two kinds of action taken on enabling of different schema features: `TABLE_DIGEST_INSENSITIVE_TO_EXPIRY` triggers a `reload_schema()` call, the rest trigger `update_schema()`. However `TABLE_DIGEST_INSENSITIVE_TO_EXPIRY` doesn't take effect on its own: it needs `DIGEST_INSENSITIVE_TO_EXPIRY` to also be enabled for it to take effect. This pretty much always happens with the current order of feature processing. BUT: `reload_schema()` runs in the background and doesn't wait for the enabling of `DIGEST_INSENSITIVE_TO_EXPIRY`.

The end result: any change in the order of features, or in timing, will result in `reload_schema()` not taking effect and tables having wrong schema versions in memory. This can be fixed with a reboot.

Details (also see discussion)
Seen in https://jenkins.scylladb.com//job/scylla-master/job/scylla-ci/4504/testReport/junit/bootstrap_test/TestBootstrap/Tests___Sanity_Tests___test_off_strategy_during_bootstrap
The test fails with:
Checking the logs of the nodes, `fe31106e-176c-3aa2-9212-0878916a48f0` is the schema version that the schema gets on node1, where the table is originally created.

Later, a new node, node2, is added. However, on node2, the schema gets a different schema version:
After this, node2 proceeds to successfully bootstrap. However, when `ks.cf` is later queried, the query fails because on node2 said table has a different version.