Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix tests in test/cql-pytest/ that fail on Cassandra #16033

Merged
merged 9 commits into from
Nov 15, 2023

Conversation

nyh
Copy link
Contributor

@nyh nyh commented Nov 13, 2023

As a general rule, tests in test/cql-pytest shouldn't just pass on Scylla - they also should not fail on Cassandra; A test that fails on Cassandra may indicate that the test is wrong, or that Scylla's behavior is wrong and the test just enshrines that wrong behavior. Each time we see a test fail on Cassandra we need to check if this is not the case. We also have special markers scylla_only and cassandra_bug to put on tests that we know should fail on Cassandra because it is missing some Scylla-only feature or there is a bug in Cassandra, respectively. Such tests will be xfailed/skipped when running on Cassandra, and not report failures.

Unfortunately, over time more several tests got into our suite in that did not pass on Cassandra. In this series I went over all of them, and fixed each to pass - or be skipped - on Cassandra, in a way that each patch explains.

Fixes #16027

@ptrsmrn ptrsmrn removed the request for review from havaker November 13, 2023 11:49
@nyh
Copy link
Contributor Author

nyh commented Nov 13, 2023

I invited several people who wrote some of the tests involved in this PR to review it.

@github-actions github-actions bot deleted a comment from aws-amplify-us-east-2 bot Nov 13, 2023
# Cassandra 4.1 and above, a missing replication_factor *is* allowed,
# because there is a default_keyspace_rf configuration. See issue #16028.
def test_alter_keyspace_missing_rf(cql, this_dc, scylla_only):
with new_test_keyspace(cql, "WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', '" + this_dc + "' : 1 }") as keyspace:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest to assign RF=3, for the sake of "good usage examples", i.e. having NetworkTopology RF=3 by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work on a single-node cluster, like the one used in cql-pytest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, all tests in test/cql-pytest are single node, so the usual "best practices" don't necessarily apply to them.

Copy link
Contributor

@Deexie Deexie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (compaction strategy validation part)

@@ -6,38 +6,48 @@
# Tests for compaction strategy validation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I addressed the points from the commit message under the issue (#16027 (comment)). I would leave it as is. Changing of scylla's bucket_low message is worth considering, but that's out of scope for this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I'm leaving my partial patch for this test as I did it. It means that even after my patch, this test will fail on Cassandra. It's not a disaster - nobody runs it on Cassandra anyway (it's not part of our CI), and hopefully another day you'll look into these issues. None of this is urgent at all.

@scylladb-promoter
Copy link
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Sanity Tests
✅ - Unit Tests

Build Details:

@@ -591,6 +619,7 @@ def test_type_quoting(cql):
assert f"CREATE TYPE \"{ks_name}\".\"{name}\"" in desc
assert f"CREATE TYPE \"{ks_name}\".{name}" not in desc
assert f"CREATE TYPE {ks_name}.\"{name}\"" not in desc
# This was buggy in Cassandra - see CASSANDRA-19019
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this was fixed in a later version.

Copy link
Contributor Author

@nyh nyh Nov 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I see now - https://issues.apache.org/jira/browse/CASSANDRA-19019, thanks. I'll fix the comment / marker. Although it's a bit frustrating the test will fail on older Cassandra, we shouldn't be in the business of tracking Cassandra bugs in our test suite, we can just assume it's run on a recent version of Cassandra.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I upgraded my Cassandra to the latest version of Cassandra 4 (4.1.3), and indeed, it works there. I'm changing the test accordingly.

Copy link
Contributor

@denesb denesb Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a note to test/cql-pytest/README.md (in a follow-up PR), about the fact that the tests are expected to pass against the latest stable C* release. Also, in general, I'm missing a "Running the tests against Cassandra" section in said file anyway.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do have a few sentences about Cassandra spread around the README.md, and even in the "philosophy" section a whole part (5) about "Run your test against Cassandra". But maybe I should split that part into a separate section, higher up the file?

I could also explain how to compile Cassandra on modern Linux distros, which isn't trivial because of the Java 17 mess (although you can also take a pre-compiled version of Java and not deal with the compilation).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked again at the text of test/cql-pytest/README.md and saw that almost everything I could possibly say about Cassandra is already written there. Except how to actually compile Cassandra. Could you please tell me a bit more what you were missing there? Or maybe the existing text was just hard for you to find among all the other explanations? Or maybe you were missing the part I admit is missing - how to compile Cassandra?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I had a quick look earlier today, I saw a big block of text, and no header standing out, mentioning "Cassandra". Indeed as I read into the text, I do see you talk about it already, so I guess it is fine.

how to compile Cassandra?

I don't think this is important to describe there, people can find out about that on the C* documentation I suppose.

Yet another test file in cql-pytest which failed when run on Cassandra
(via test/cql-pytest/run-cassandra).

It turns out that when the token() function is used with incorrect
parameters (it needs to be passed all partition-key columns), the
error message is different in ScyllaDB and Cassandra. Both are
reasonable error messages, so if we insist on checking the error
message - we should allow both.

Also the same test called its second partition-key column "ck". This
is confusing, because we usually use the name "ck" to refer to a clustering
key. So just for clarity, we change this name to "pk2". This is not a
functional change in the test.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
…py on Cassandra

Yet another test file in cql-pytest which failed when run on Cassandra
(via test/cql-pytest/run-cassandra).

This patch is only a partial fix - it fixes trivial differences in error
messages, but some potentially-real differences remain so three of the
tests still fail:

1. Trying to set tombstone_threshold to 5.5 is an error in ScyllaDB
   ("must be between 0.0 and 1.0") but allowed in Cassandra.

2. Trying to set bucket_low to 0.0 is an error in ScyllaDB, giving the
   wrong-looking error message "must be between 0.0 and 1.0" (so 0.0 should
   have been fine?!) but allowed in Cassandra.

3. Trying to set timestamp_resolution to SECONDS is an error in ScyllaDB
   ("invalid timestamp resolution SECONDS") but allowed in Cassandra.
   I don't think anybody wants to actually use "SECONDS", but it seems
   legal in Cassandra, so do we need to support it?

The patch also simplifies the test to use cql-pytest's util.py, instead
of cassandra_tests/porting.py. The latter was meant to make porting
existing Cassandra tests easier - not for writing new ones - and made
using a regular expression for testing error messages harder so I
switched to using pytest.raises() whose "match=" accepts a regular
expression.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The tests in test/cql-pytest/test_guardrail_replication_strategy.py
are for a Scylla-only feature that doesn't exist in Cassandra, so
obviously they all fail on Cassandra. Let's mark them all as
scylla_only.

We use an autouse fixture to automatically mark all tests in this file
as scylla-only, instead of marking each one separately.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Yet another test file in cql-pytest which failed when run on Cassandra
(via test/cql-pytest/run-cassandra).

When testing some invalid cases of ALTER TABLE, the test required
that you cannot choose SimpleStrategy without specifying a
replication_factor. As explained in Refs scylladb#16028, this isn't true
in Cassandra 4.1 and up - it now has a default value for
replication_factor and it's no longer required.

So in this patch we split that part of the test to a separate test
function and mark it scylla_only.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The test function test_mv_synchronous_updates checks the
synchronous_updates feature, which is a ScyllaDB extension and
doesn't exist in Cassandra. So it should be marked with "scylla_only"
so that it doesn't fail when running the tests on Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Fixed two tests thich failed when running on Cassandra:

One test waited for a secondary index to appear, but in Cassandra, the
index can be broken (cause a read failure) for a short while and we
need to wait through this failure as well and not fail the entire test.

Another test was for local secondary index, which is a Scylla-only
feature, but we forgot the "scylla_only" tag.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Some error-message checks in this test file (which was translated in
the past from Cassandra) try operations which actually has two errors,
and expected to see one error message - but recent Cassandra prints
the other one. This caused several tests to fail when running on
Cassandra 4.1. Both messages are fine, so let's accept both.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
…l on Cassandra

In commit 52bbc10, we started to allow "IN NULL" - it started to
match nothing instead of being an error as it is in Cassandra. The
commit *incorrectly* "fixed" the existing translated Cassandra unit test
to match the new behavior - but after this "fix" the test started to
fail on Cassandra.

The appropriate fix is just to comment out this part of the test and
not do it. It's a small point where we deliberately decided to deviate
from Cassandra's behavior, so the test it had for this behavior is
irrelevant.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Yet another test file in cql-pytest which failed when run on Cassandra
(via test/cql-pytest/run-cassandra).

Some of the tests checked on Cassandra things that don't exist there
(namely local secondary indexes) and could skip that part. Other tests
need to be skipped completely ("scylla_only") because they rely on a
Scylla-only feature. We have a bit too many of those in this file, but
I don't want to fix this now.

Yet another test found a real bug in Cassandra 4.1.1 (CASSANDRA-17918)
but passes in Cassandra 4.1.2 and up, so there's nothing to fix except
a comment about the situation.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
@nyh
Copy link
Contributor Author

nyh commented Nov 14, 2023

Rebased and re-pushed. For one test which discovered a Cassandra bug so was marked cassandra_bug, the bug was actually fixed recently (it was fixed in Cassandra 4.1.2, I was running Cassandra 4.1.1...), so I removed the cassandra_bug tag.

@github-actions github-actions bot deleted a comment from aws-amplify-us-east-2 bot Nov 14, 2023
@scylladb-promoter
Copy link
Contributor

🟢 CI State: SUCCESS

✅ - Build
✅ - Sanity Tests
✅ - Unit Tests

Build Details:

@scylladb-promoter scylladb-promoter merged commit ba17ae2 into scylladb:master Nov 15, 2023
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Some tests in test/cql-pytest/ fail on Cassandra
5 participants