Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ALTER KEYSPACE can break tables with UDT columns #14139

Closed
cvybhu opened this issue Jun 5, 2023 · 19 comments
Closed

ALTER KEYSPACE can break tables with UDT columns #14139

cvybhu opened this issue Jun 5, 2023 · 19 comments
Assignees
Labels
Milestone

Comments

@cvybhu
Copy link
Contributor

cvybhu commented Jun 5, 2023

I've created a 2 node cluster (2 DCs, 1 node in each) of the latest Scylla (5.4.0~dev-0.20230605.d2e089777bf5).

ccm create c2x1-release -n 1:1 --scylla --vnodes --install-dir /home/jancio/code/scylla/build/release

Running the following queries using cqlsh fails:

CREATE KEYSPACE ks WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 1};
CREATE TYPE ks.my_type (a int, b int);
CREATE TABLE ks.tab (p int PRIMARY KEY, m ks.my_type);
ALTER KEYSPACE ks WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 1, 'dc2': 1};
ALTER TABLE ks.tab WITH compression = { 'sstable_compression': 'org.apache.cassandra.io.compress.ZstdCompressor'};

NoHostAvailable: ('Unable to complete the operation against any hosts', {<Host: 127.0.0.1:9042 dc1>: <Error from server: code=0000 [Server error] message="Raft instance is stopped, reason: "background error, std::_Nested_exception<raft::state_machine_error> (State machine error at raft/server.cc:1175): exceptions::invalid_request_exception (Unknown type ks.my_type)"">})

The main problems appears to be:

Unknown type ks.my_type

It looks like the ALTER KEYSPACE query changes the UDT version, but other nodes are not aware of it and still use the old UDTs, and then fail on schema operations like ALTER TABLE or DROP TABLE.

Installation details
Scylla version (or git commit hash): 5.4.0~dev-0.20230605.d2e089777bf5
Cluster size: 2 (2 datacenters, 1 node in each DC)
OS (RHEL/CentOS/Ubuntu/AWS AMI): Fedora 37

(removed references to enterprise repository)

@cvybhu cvybhu added the type/bug label Jun 5, 2023
@mykaul
Copy link
Contributor

mykaul commented Jun 5, 2023

Is that a regression?

@cvybhu
Copy link
Contributor Author

cvybhu commented Jun 5, 2023

Is that a regression?

It is. I ran the same queries on release:2020.1.0 and they completed successfully.

@mykaul
Copy link
Contributor

mykaul commented Jun 5, 2023

Is that a regression?

It is. I ran the same queries on release:2020.1.0 and they completed successfully.

That's an old (enterprise) release. Let's see if we can narrow this down (with OSS versions that'd be great, btw)

@avikivity
Copy link
Member

Please attach the backtrace so we see who can fix it.

@tgrabiec
Copy link
Contributor

tgrabiec commented Jun 5, 2023

query changes the UDT version

What do you mean by "version"? Aren't UDTs identified by name?

@tgrabiec
Copy link
Contributor

tgrabiec commented Jun 5, 2023

Please attach the backtrace so we see who can fix it.

The error is thrown when trying to deserialize schema definition which references the type.

@avikivity
Copy link
Member

The issue is likely that ALTER KEYSPACE somehow forgot the type.

@avikivity
Copy link
Member

migration_manager::prepare_keyspace_update_announcement() calls db::schema_tables::make_create_keyspace_mutations(), with the parameter with_tables_and_types_and_functions defaulting to true. Not only it should have worked with false, but if it's true, then we are passing the types.

@cvybhu
Copy link
Contributor Author

cvybhu commented Jun 5, 2023

Narrowed it down to between 5.0 and 5.1

Works on scylla-5.0.13.0.20230423.a0ca8abe4
Fails on: scylla5.1.11-0.20230522.88eeab78380c

@cvybhu
Copy link
Contributor Author

cvybhu commented Jun 5, 2023

What do you mean by "version"? Aren't UDTs identified by name?

I don't really mean anything, I don't know how this part of the code works x.x

@avikivity
Copy link
Member

A bisect can be the easiest way to identify the problem.

@cvybhu
Copy link
Contributor Author

cvybhu commented Jun 5, 2023

Bisecting...

@michoecho
Copy link
Contributor

Culprit: 5852959

Fix:

diff --git a/data_dictionary/data_dictionary.cc b/data_dictionary/data_dictionary.cc
index a91e9b8769..55411bdf97 100644
--- a/data_dictionary/data_dictionary.cc
+++ b/data_dictionary/data_dictionary.cc
@@ -216,7 +216,7 @@ keyspace_metadata::keyspace_metadata(std::string_view name,
                         std::move(strategy_options),
                         durable_writes,
                         std::move(cf_defs),
-                        user_types_metadata{},
+                        std::move(user_types),
                         storage_options{}) { }

 keyspace_metadata::keyspace_metadata(std::string_view name,

@cvybhu
Copy link
Contributor Author

cvybhu commented Jun 5, 2023

@michoecho could you make a PR with your fix?

@avikivity
Copy link
Member

Wow that is so sad.

michoecho added a commit to michoecho/scylla that referenced this issue Jun 5, 2023
Due to a simple programming oversight, one of keyspace_metadata
constructors is using empty user_types_metadata instead of the
passed one. Fix that.

Fixes scylladb#14139
michoecho added a commit to michoecho/scylla that referenced this issue Jun 5, 2023
@michoecho
Copy link
Contributor

@michoecho could you make a PR with your fix?

Done: #14143

Also wrote a pytest which reproduces the issue (not sure if it's worth adding, but why not): #14144

@mykaul
Copy link
Contributor

mykaul commented Jun 5, 2023

@michoecho could you make a PR with your fix?

Done: #14143

Also wrote a pytest which reproduces the issue (not sure if it's worth adding, but why not): #14144

It's worth adding, IMHO.

@raphaelsc
Copy link
Member

Good catch.

@DoronArazii DoronArazii added this to the 5.4 milestone Jun 5, 2023
avikivity pushed a commit that referenced this issue Jun 6, 2023
Due to a simple programming oversight, one of keyspace_metadata
constructors is using empty user_types_metadata instead of the
passed one. Fix that.

Fixes #14139

Closes #14143

(cherry picked from commit 1a52117)
avikivity pushed a commit that referenced this issue Jun 6, 2023
Due to a simple programming oversight, one of keyspace_metadata
constructors is using empty user_types_metadata instead of the
passed one. Fix that.

Fixes #14139

Closes #14143

(cherry picked from commit 1a52117)
avikivity pushed a commit that referenced this issue Jun 6, 2023
Due to a simple programming oversight, one of keyspace_metadata
constructors is using empty user_types_metadata instead of the
passed one. Fix that.

Fixes #14139

Closes #14143

(cherry picked from commit 1a52117)
@avikivity
Copy link
Member

Backported to 5.1, 5.2, 5.3.

michoecho added a commit to michoecho/scylla that referenced this issue Jun 7, 2023
michoecho added a commit to michoecho/scylla that referenced this issue Jun 12, 2023
nyh pushed a commit that referenced this issue Jun 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants