New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filtering may ignore clustering key restrictions if they form a prefix without a partition key #4541
Comments
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. Fixes scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. Fixes scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. Fixes scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. Fixes scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. This commit also fixes tests in cql_query_test suite, because they now expect more values - columns fetched for filtering will be present as well (only internally, the clients receive only data they asked for). Fixes scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
Hi @psarna I'm in the middle of reviewing your patch (the github UI is less convenient than I thought before we started to experiment with it...), but have a general question about the entire approach or even the problem raised here: If the full ck prefix is being restricted, we can do a quick read - and don't need "filtering" - on each partition. Yes, if the pk isn't restricted, we still need to go through all the partitions (so this is filtering), but we don't strictly need to read the entire partition and filter through it. But maybe you're going here for correctness, not absolute optimal performance? |
@nyh yes, this issue is strictly for correctness - currently clustering restrictions can be ignored in certain corner cases, which yields incorrect results. I didn't take any optimizations involving using ck prefix for each partition into consideration (although we could create a separate issue about that). |
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref #4541 Message-Id: <9656396ee831e29c2b8d3ad4ef90c4a16ab71f4b.1560410018.git.sarna@scylladb.com>
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref #4541 Message-Id: <3612dc1c6c22c59ac9184220a2e7f24e8d18407c.1560410018.git.sarna@scylladb.com>
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. This commit also fixes tests in cql_query_test suite, because they now expect more values - columns fetched for filtering will be present as well (only internally, the clients receive only data they asked for). Fixes #4541 Message-Id: <f08ebae5562d570ece2bb7ee6c84e647345dfe48.1560410018.git.sarna@scylladb.com> (cherry picked from commit adeea0a)
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref #4541 Message-Id: <3612dc1c6c22c59ac9184220a2e7f24e8d18407c.1560410018.git.sarna@scylladb.com> (cherry picked from commit 2c2122e)
@psarna While I was able to backport to 3.1, 3.0 is much harder. Please assist. |
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. This commit also fixes tests in cql_query_test suite, because they now expect more values - columns fetched for filtering will be present as well (only internally, the clients receive only data they asked for). Fixes scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
I pushed a version rebased on 3.0 here: https://github.com/psarna/scylla/commits/fix_ignoring_ck_restrictions_in_filtering_for_3.0 , but I still need to compile and test it on 3.0, which I'll do soon, after dealing with #4540. After I manage to test on 3.0 properly, I'll push it to the list. |
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
" Tests: unit(dev) Refs #4541 " * 'fix_ignoring_ck_restrictions_in_filtering_for_3.0_2' of https://github.com/psarna/scylla: tests: add a test case for filtering clustering key cql3: fix qualifying clustering key restrictions for filtering cql3: fix fetching clustering key columns for filtering
When a column is not present in the select clause, but used for filtering, it usually needs to be fetched from replicas. Sometimes it can be avoided, e.g. if primary key columns form a valid prefix - then, they will be optimized out before filtering itself. However, clustering key prefix can only be qualified for this optimization if the whole partition key is restricted - otherwise the clustering columns still need to be present for filtering. This commit also fixes tests in cql_query_test suite, because they now expect more values - columns fetched for filtering will be present as well (only internally, the clients receive only data they asked for). Fixes scylladb#4541
Clustering key restrictions can sometimes avoid filtering if they form a prefix, but that can happen only if the whole partition key is restricted as well. Ref scylladb#4541
The test cases makes sure that clustering key restriction columns are fetched for filtering if they form a clustering key prefix, but not a primary key prefix (partition key columns are missing). Ref scylladb#4541
" Tests: unit(dev) Refs #4541 " * 'fix_ignoring_ck_restrictions_in_filtering_for_3.0_2' of https://github.com/psarna/scylla: tests: add a test case for filtering clustering key cql3: fix qualifying clustering key restrictions for filtering cql3: fix fetching clustering key columns for filtering
This was released as part of 3.0.8 |
Ref: #3803
The solution for issue #3803 provided a way of fetching columns that are not included in the select clause if they are needed for filtering. However, one corner case is still unsolved: if clustering columns form a prefix, we do not need to fetch them for filtering, but only if the whole partition key is present as well. Without the partition key, clustering key prefix cannot be optimized out.
I have the solution ready, I'll post it soon.
The text was updated successfully, but these errors were encountered: