-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
If ALLOW FILTERING excludes a long string of rows, the scan can stop prematurely #8203
Comments
Previously, we had two tests demonstrating issue #7966. But since then, our understanding of this issue has improved which resulted in issue #8203, so this patch improves those tests and makes them reproduce the new issue. Importantly, we now know that this problem is not specific to a full-table scan, and also happens in a single-partition scan, so we fix the test to demonstrate this (instead of the old test, which missed the problem so the test passed). Both tests pass on Cassandra, and fail on Scylla. Refs #8203. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210302224020.1498868-1-nyh@scylladb.com>
The two cql-pytest tests we have for this issue,
do not fail any more (XPASS). |
@nyh our dear friend Ultrabug fixed the problem recently: datastax/python-driver@1d9077d , but it looks like it's not part of any official release yet... |
Wow. I don't know how this happened, but I have in In any case, it's good to know that this was in fact a Python driver bug, not a Scylla bug. I should send a patch to un-XFAIL these two tests and close this issue - but doing this properly will require some thought on how to have the test PASS on a recent driver versions and SKIP (not fail) on old versions. Or, at the minimum, we need our Jenkins runners to have the recent driver (perhaps they already do?) so at least they will not fail these tests. Thanks @ultrabug for fixing this! |
Even more frustratingly, the official cassandra-driver version I have in /usr/local installed by pip is 3.25.0 (the file is dated March 18th), which does not have this fix, while the one I have in ~/.local is newer (September 19th), is indeed the Scylla fork of the driver, but lists its version as the older 3.24.5. So my checks in the test will need to be very ugly :-( But I'll do it. |
Indeed, we have this patch in the Scylla driver: scylladb/python-driver@6ed53d9 |
@nyh , the fix has also been contributed and merged to cassandra-driver but it's not part of a release yet AFAIK The fix should be present in the latest release of scylla-driver tho, tag |
Thanks. I sent yesterday a patch titled "[PATCH] cql-pytest: XFAILing test was fixed by a Python driver fix" to fix the tests that used to fail because of the driver bug to pass on new-enough drivers and be skipped on older driver versions - checking for either Scylla or Datastax variants of the driver. |
Issue scylladb#8203 describes a bug in a long scan which returns a lot of empty pages (e.g., because most of the results are filtered out). We have two cql-pytest test cases that reproduced this bug - one for a whole-table scan and one for a single-partition scan. It turned out that the bug was not in the Scylla server, but actually in the Python driver which incorrectly stopped the iteration after an empty page even though this page did contain the "more pages" flag. This driver bug was already fixed in the Datastax driver (see scylladb/python-driver@6ed53d9, and in the Scylla fork of the driver: datastax/python-driver@1d9077d So in this patch we drop the XFAIL, and if the driver is not new enough to contain this fix - the test is skipped. Since our Jenkins machines have the latest Scylla fork of the driver and it already contains this fix, these tests will not be skipped - and will run and should pass. Developers who run these tests on their development machine will see these tests either passing or skipped - depending on which version of the driver they have installed. Closes scylladb#8203 Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This issue replaces #7966, because we have a better understanding now of what caused the problem we saw there.
Consider a
SELECT
request with paging, which scans an entire table or just one partition, and also a filter (withALLOW FILTERING
) that excludes all the rows except the last one. In this case, Scylla needs to return a bunch of empty pages (each page probably corresponds to 1 MB of pre-filtering content), finally followed by a page with one match.However, in the cql-pytests:
test_filtering.py
::test_filtering_contiguous_nonmatching_partition_range
test_filtering.py
::test_filtering_contiguous_nonmatching_single_partition
We see that if we ask the Python driver to do the paging and perform the entire scan, the result is wrong: empty, instead of the expected one row.
@denesb investigated (in #7966) and discovered that the Python driver receives an empty page with
has_more_pages=true
(as expected), then asks for another page and gets another empty page withhas_more_pages=true
(again, as expected), but now - the driver does not ask for any more pages. This is wrong. It looks like it may be a bug in the Python driver but we're not yet sure (we didn't yet try a similar test using a different driver).The text was updated successfully, but these errors were encountered: