Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

tablets: alter keyspace #16723

Merged
merged 18 commits into from
May 29, 2024
Merged

Conversation

ptrsmrn
Copy link
Contributor

@ptrsmrn ptrsmrn commented Jan 10, 2024

This change supports changing replication factor in tablets-enabled keyspaces.
This covers both increasing and decreasing the number of tablets replicas through
first building topology mutations (alter_keyspace_statement.cc) and then
tablets/topology/schema mutations (topology_coordinator.cc).
For the limitations of the current solution, please see the docs changes attached to this PR.

Fixes: #16129

@ptrsmrn ptrsmrn marked this pull request as draft January 10, 2024 20:28
@scylladb-promoter
Copy link
Contributor

🔴 CI State: FAILURE

❌ - Build

Build Details:

  • Duration: 15 min
  • Builder: spider6.cloudius-systems.com

service/storage_service.cc Outdated Show resolved Hide resolved
locator/network_topology_strategy.cc Outdated Show resolved Hide resolved
@ptrsmrn ptrsmrn force-pushed the tablets_alter_keyspace branch 4 times, most recently from 37055ac to abc7a01 Compare January 11, 2024 20:54
configure.py Outdated Show resolved Hide resolved
@ptrsmrn ptrsmrn force-pushed the tablets_alter_keyspace branch 2 times, most recently from ecf18cd to a7b8d74 Compare January 17, 2024 19:59
cql3/statements/alter_keyspace_statement.cc Outdated Show resolved Hide resolved
service/storage_service.cc Outdated Show resolved Hide resolved
cql3/statements/alter_keyspace_statement.cc Outdated Show resolved Hide resolved
@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented Jan 19, 2024

Version 3:

  • made the change compile'able and I'll try to keep it that way
  • had to rebase on top of master, which was painful due to topology builder(s) refactor, but @piodul says he's going to resume his topology refactor PR
  • I'll reply to the comments as for the other changes in this version, basically I am checking if topology state machine processes any other global req from alter_keyspace_statement.

@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented Jan 28, 2024

Version 4:

  1. Rebased on top of master, which eliminated my commit refactoring topology_builder to a separate fail thanks to https://github.com/scylladb/scylladb/pull/16609/files + I got tablets.transition thanks to https://github.com/scylladb/scylladb/pull/16894/files
  2. Added tablet_transition_kind.rf_change, though it's not properly handled everywhere yet
  3. Using wait_for_topology_request_completion() when altering tablets ks in storage_service::alter_tablets_keyspace
  4. Moved altering tablets ks code from alter_keyspace_statement.cc to storage_service.cc to eliminate bringing too many dependencies into cql3 module (where alter_keyspace_statement.cc is).

CC @tgrabiec

locator/tablets.cc Outdated Show resolved Hide resolved
service/storage_service.cc Outdated Show resolved Hide resolved
service/storage_service.cc Outdated Show resolved Hide resolved
cql3/query_processor.hh Outdated Show resolved Hide resolved
@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented Feb 7, 2024

@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented Feb 13, 2024

Version 6:

@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented Feb 19, 2024

CC @tgrabiec

Version 7:

  1. (there is some extra debug code that prints additional stuff, I'll remove it later).
  2. the code compiles and runs, migrations seem to be constructed and run properly (with the exception that I am not really changing anything atm - the code that is going to add/remove tablets replicas will be included soon). What was fixed was mainly:
  • deleting the global topology request once alter rf req is created
  • I moved creating request id out of raft guard down in hopes to fix one of the bugs listed below but it didn't help
  • removed a deadlock caused by creating a raft guard when a raft guard operation has already been started
  • handled loading new topology table columns
  1. There's a couple of bugs that still need to be fixed:
ERROR 2024-02-19 22:21:16,278 [shard 0:stmt] system_keyspace - no entry for request id d0d6d24a-cf6c-11ee-05a6-9c5353029f4f, at: 0xd4f530a /home/piotrs/src/scylla2/build/debug/seastar/libseastar.so+0x55f96ec
...
   seastar::internal::coroutine_traits_base<service::topology_request_state>::promise_type
   --------
   seastar::internal::coroutine_traits_base<seastar::basic_sstring<char, unsigned int, 15u, true> >::promise_type
   --------
   seastar::internal::coroutine_traits_base<void>::promise_type
   --------
   seastar::internal::coroutine_traits_base<seastar::shared_ptr<cql_transport::messages::result_message> >::promise_type
   --------
   seastar::internal::coroutine_traits_base<seastar::shared_ptr<cql_transport::messages::result_message> >::promise_type
   --------
   seastar::internal::coroutine_traits_base<seastar::shared_ptr<cql_transport::messages::result_message> >::promise_type
   --------
   seastar::internal::coroutine_traits_base<seastar::shared_ptr<cql_transport::messages::result_message> >::promise_type
...
  • the other issue is that once I enter:
INFO  2024-02-19 22:21:16,283 [shard 0:strm] raft_topology - entered `tablet migration` transition state

I repeatedly (approx. each ~45-60 seconds) get the following (which should happen only once?):

INFO  2024-02-19 22:21:16,283 [shard 0:strm] load_balancer - Node 1dee4885-c35e-4499-aa89-e06c2823f736: rack=rack1 avg_load=0, tablets=0, shards=3, state=normal
INFO  2024-02-19 22:21:16,283 [shard 0:strm] load_balancer - Prepared 0 migrations in DC datacenter1
INFO  2024-02-19 22:21:16,283 [shard 0:strm] load_balancer - Prepared 0 migration plans, out of which there were 0 tablet migration(s) and 0 resize decision(s)
INFO  2024-02-19 22:21:16,283 [shard 0:strm] raft_topology - updating topology state: Finished tablet migration

@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented Feb 20, 2024

My guess is that waiting for topology request completion doesn't work, because this function doesn't handle global requests.

cql3/query_processor.cc Outdated Show resolved Hide resolved
ptrsmrn and others added 10 commits May 28, 2024 13:55
…ifferent raft commands

Since ALTER KS requires creating topology_change raft command, some
functions need to be extended to handle it. RAFT commands are recognized
by types, so some functions are just going to be parameterized by type,
i.e. made into templates.
These templates are instantiated already, so that only 1 instances of
each template exists across the whole code base, to avoid compiling it
in each translation unit.
This commit adds support for executing ALTER KS for keyspaces with
tablets and utilizes all the previous commits.
The ALTER KS is handled in alter_keyspace_statement, where a global
topology request in generated with data attached to system.topology
table. Then, once topology state machine is ready, it starts to handle
this global topology event, which results in producing mutations
required to change the schema of the keyspace, delete the
system.topology's global req, produce tablets mutations and additional
mutations for a table tracking the lifetime of the whole req. Tracking
the lifetime is necessary to not return the control to the user too
early, so the query processor only returns the response while the
mutations are sent.
This patch removes the support for the "wildcard" replication_factor
option for ALTER KEYSPACE when the keyspace supports tablets.
It will still be supported for CREATE KEYSPACE so that a user doesn't
have to know all datacenter names when creating the keyspace,
but ALTER KEYSPACE will require that and the user will have to
specify the exact change in replication factors they wish to make by
explicitly specifying the datacenter names.
Expanding the replication_factor option in the ALTER case is
unintuitive and it's a trap many users fell into.

See scylladb#8881, scylladb#15391, scylladb#16115
…than 1

We want to ensure that when the replication factor
of a keyspace changes, it changes by at most 1 per DC
if it uses tablets. The rationale for that is to make
sure that the old and new quorums overlap by at least
one node.

After these changes, attempts to change the RF of
a keyspace in any DC by more than 1 will fail.
This commit adds a test verifying that we can only
change the RF of a keyspace for any DC by at most 1
when using tablets.

Fixes scylladb#18029
Up until now we waited until mutations are in place and then returned
directly to the caller of the ALTER statement, but that doesn't imply
that tablets were deleted/created, so we must wait until the whole
processing is done and return only then.
Now the scailing works and test must check it does

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When the test changes RF from 2 to 3, the extra node executes "rebuild"
transition which means that it streams tablets replicas from two other
peers. When doing it, the node receives two sets of sstables with
mutations from the given tablet. The test part that checks if the extra
node received the mutations notices two mutation fragments on the new
replica and errorneously fails by seeing, that RF=3 is not equal to the
number of mutations found, which is 4.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The check is performed by selecting from mutation_fragments(table), but
it's known that this query crashes Scylla when there's no tablet replica
on that node.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
@xemul
Copy link
Contributor

xemul commented May 28, 2024

@ptrsmrn , regarding #18772 -- once the latter is merged, test cases checking how ALTER works from this PR would stop passing, won't they?

@ptrsmrn
Copy link
Contributor Author

ptrsmrn commented May 28, 2024

@ptrsmrn , regarding #18772 -- once the latter is merged, test cases checking how ALTER works from this PR would stop passing, won't they?

Yes. There are many ways to keep things working, I guess, but the way I imagined it working is that #18772 should be merged to 6.0 only, and this PR should be merged to master (which will become 6.1 eventually) and - maybe - backported to 6.0. Of course backporting this PR to 6.0 means #18772 would need to be reverted from 6.0.

The easiest would be to drop #18772, merge this PR to master and then backport to 6.0, but this is against plans of @mykaul.

@xemul
Copy link
Contributor

xemul commented May 28, 2024

If @mykaul 's plan is to ban ALTER for 6.0, then #18772 should be considered for 6.0 branch only

@mykaul
Copy link
Contributor

mykaul commented May 28, 2024

If @mykaul 's plan is to ban ALTER for 6.0, then #18772 should be considered for 6.0 branch only

Nit: it's not my plan, it's our plan.

@scylladb-promoter
Copy link
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Unit Tests Custom
The following new/updated tests ran 100 times for each mode:
🔹 boost/network_topology_strategy_test
🔹 boost/tablets_test
🔹 boost/transport_test
🔹 cql-pytest/test_keyspace
🔹 cql-pytest/test_tablets
🔹 topology_custom/test_tablets
✅ - Container Test
✅ - dtest with topology changes
✅ - dtest
✅ - Unit Tests

Build Details:

  • Duration: 7 hr 43 min
  • Builder: i-0d7e7786c918fc829 (m5ad.12xlarge)

@xemul
Copy link
Contributor

xemul commented May 29, 2024

If @mykaul 's plan is to ban ALTER for 6.0, then #18772 should be considered for 6.0 branch only

Nit: it's not my plan, it's our plan.

Absolutely. However (quoting)

image

so do we merge ALTER rejection to 6.0 only, or does it go via master as well, (or is it up to maintainers to decide on their own)?

@xemul
Copy link
Contributor

xemul commented May 29, 2024

For the record -- this is to be backported to 6.0 once it hits the master

@mykaul
Copy link
Contributor

mykaul commented May 29, 2024

If @mykaul 's plan is to ban ALTER for 6.0, then #18772 should be considered for 6.0 branch only

Nit: it's not my plan, it's our plan.

Absolutely. However (quoting)

image

so do we merge ALTER rejection to 6.0 only, or does it go via master as well, (or is it up to maintainers to decide on their own)?

We've discussed it on the daily call. We'll take the change RF to 6.0. Reject ALTER is not needed at this point.

@scylladb-promoter scylladb-promoter merged commit e74a4b0 into scylladb:master May 29, 2024
12 checks passed
xemul added a commit that referenced this pull request May 30, 2024
This change supports changing replication factor in tablets-enabled keyspaces.
This covers both increasing and decreasing the number of tablets replicas through
first building topology mutations (`alter_keyspace_statement.cc`) and then
tablets/topology/schema mutations (`topology_coordinator.cc`).
For the limitations of the current solution, please see the docs changes attached to this PR.

refs: #16723

* br-backport-alter-ks-tablets:
  test: Do not check tablets mutations on nodes that don't have them
  test: Fix the way tablets RF-change test parses mutation_fragments
  test/tablets: Unmark RF-changing test with xfail
  docs: document ALTER KEYSPACE with tablets
  Return response only when tablets are reallocated
  cql-pytest: Verify RF is changes by at most 1 when tablets on
  cql3/alter_keyspace_statement: Do not allow for change of RF by more than 1
  Reject ALTER with 'replication_factor' tag
  Implement ALTER tablets KEYSPACE statement support
  Parameterize migration_manager::announce by type to allow executing different raft commands
  Introduce TABLET_KEYSPACE event to differentiate processing path of a vnode vs tablets ks
  Extend system.topology with 3 new columns to store data required to process alter ks global topo req
  Allow query_processor to check if global topo queue is empty
  Introduce new global topo `keyspace_rf_change` req
  New raft cmd for both schema & topo changes
  Add storage service to query processor
  tablets: tests for adding/removing replicas
  tablet_allocator: make load_balancer_stats_manager configurable by name
@xemul
Copy link
Contributor

xemul commented May 30, 2024

Backported into 6.0
Had to pick #18644 as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[tablets] Support RF changes using ALTER KEYSPACE