Filtering may ignore clustering key restrictions if they form a prefix without a partition key #4541

Closed

psarna opened this issue Jun 12, 2019 · 5 comments

Labels: area/cql, type/bug

Contributor

psarna commented Jun 12, 2019

Ref: #3803

The solution for issue #3803 provided a way of fetching columns that are not included in the select clause if they are needed for filtering. However, one corner case is still unsolved: if the restricted clustering columns form a prefix, we do not need to fetch them for filtering, but only if the whole partition key is restricted as well. Without the full partition key, the clustering key prefix cannot be optimized out.
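The rule described above can be modelled with a small sketch. This is not Scylla's implementation (which is C++); `columns_to_fetch` and its parameters are hypothetical names, and the model ignores restriction operators and only looks at which columns are restricted.

```python
from typing import List, Sequence, Set


def columns_to_fetch(selected: Sequence[str], restricted: Set[str],
                     partition_key: Sequence[str],
                     clustering_key: Sequence[str]) -> List[str]:
    """Toy model: selected columns plus restricted columns still needed for filtering."""
    # The clustering prefix optimization is only valid with a fully restricted partition key.
    pk_fully_restricted = set(partition_key) <= restricted
    # Restricted clustering columns, in schema order.
    ck_restricted = [c for c in clustering_key if c in restricted]
    ck_forms_prefix = ck_restricted == list(clustering_key[:len(ck_restricted)])

    fetch = list(selected)
    for col in sorted(restricted):
        if col in fetch:
            continue
        if col in partition_key and pk_fully_restricted:
            continue  # handled by partition lookup, no filtering needed
        if col in clustering_key and pk_fully_restricted and ck_forms_prefix:
            continue  # handled by clustering range, no filtering needed
        fetch.append(col)  # must be fetched so the filter can see it
    return fetch
```

With a hypothetical schema `PRIMARY KEY (p, c1, c2)` and `SELECT v`, restricting `p` and `c1` needs no extra columns, while restricting only `c1` (the corner case in this issue) requires fetching `c1` as well.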

I have the solution ready; I'll post it soon.

psarna added a commit to psarna/scylla that referenced this issue


          cql3: fix fetching clustering key columns for filtering

When a column is not present in the select clause, but used for
filtering, it usually needs to be fetched from replicas.
Sometimes it can be avoided, e.g. if primary key columns form a valid
prefix - then, they will be optimized out before filtering itself.
However, clustering key prefix can only be qualified for this
optimization if the whole partition key is restricted - otherwise
the clustering columns still need to be present for filtering.

Fixes scylladb#4541

psarna added a commit to psarna/scylla that referenced this issue


          tests: add a test case for filtering clustering key

ad4e6f6

The test case makes sure that clustering key restriction
columns are fetched for filtering if they form a clustering key prefix,
but not a primary key prefix (partition key columns are missing).

Ref scylladb#4541
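To illustrate what such a test guards against, here is a toy model of the symptom (not the actual test from the commit, and not Scylla code). The table contents, `select_v`, and the `fetch_c_for_filtering` flag are all made up for illustration; the query being modelled is roughly `SELECT v FROM t WHERE c = ? ALLOW FILTERING` on a table with partition key `p` and clustering key `c`.

```python
# Hypothetical table contents: partition key p, clustering key c, regular column v.
rows = [
    {"p": 1, "c": 1, "v": 10},
    {"p": 1, "c": 2, "v": 20},
    {"p": 2, "c": 1, "v": 30},
    {"p": 2, "c": 2, "v": 40},
]


def select_v(rows, c_value, fetch_c_for_filtering):
    """Toy model of a filtered SELECT v with a restriction on c only."""
    fetched_cols = ["v", "c"] if fetch_c_for_filtering else ["v"]
    fetched = [{k: r[k] for k in fetched_cols} for r in rows]
    # The filter can only check columns that were fetched. If c is missing,
    # the restriction is silently skipped, which models the reported bug.
    kept = [r for r in fetched if "c" not in r or r["c"] == c_value]
    return [r["v"] for r in kept]


correct = select_v(rows, 1, fetch_c_for_filtering=True)   # [10, 30]
buggy = select_v(rows, 1, fetch_c_for_filtering=False)    # [10, 20, 30, 40]
```

In this model the restriction on `c` is a clustering key prefix but not a primary key prefix, because `p` is unrestricted, so `c` has to be fetched for the filter to return the correct rows.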

psarna added a commit to psarna/scylla that referenced this issue


          cql3: fix qualifying clustering key restrictions for filtering

1c88b4c

Clustering key restrictions can sometimes avoid filtering if they form
a prefix, but that can happen only if the whole partition key is
restricted as well.

Ref scylladb#4541

psarna added a commit to psarna/scylla that referenced this issue


          cql3: fix fetching clustering key columns for filtering

f08ebae

When a column is not present in the select clause, but used for
filtering, it usually needs to be fetched from replicas.
Sometimes it can be avoided, e.g. if primary key columns form a valid
prefix - then, they will be optimized out before filtering itself.
However, clustering key prefix can only be qualified for this
optimization if the whole partition key is restricted - otherwise
the clustering columns still need to be present for filtering.

This commit also fixes tests in cql_query_test suite, because they now
expect more values - columns fetched for filtering will be present as
well (only internally, the clients receive only data they asked for).

Fixes scylladb#4541
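The distinction between internally fetched columns and the columns returned to the client can be sketched as follows. This is an illustrative model only; `run_query` and its arguments are hypothetical and do not correspond to functions in the `cql_query_test` suite.

```python
def run_query(rows, selected, filter_cols, predicate):
    """Toy model: fetch selected + filtering columns, filter, then project."""
    # Internally, columns needed only for filtering are fetched too.
    fetch_cols = list(selected) + [c for c in filter_cols if c not in selected]
    internal = [{c: r[c] for c in fetch_cols} for r in rows]
    matching = [r for r in internal if predicate(r)]
    # The client only receives the columns it asked for.
    client = [tuple(r[c] for c in selected) for r in matching]
    return matching, client


table = [{"p": 1, "c": 1, "v": 10}, {"p": 2, "c": 2, "v": 20}]
internal_rows, client_rows = run_query(table, ["v"], ["c"], lambda r: r["c"] == 1)
# internal_rows carries both v and c; client_rows carries only v.
```

A test that inspects the internal result set would therefore see one more value per row than the client does, which is why the expectations in such tests change.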

This was referenced Jun 12, 2019

Fix ignoring ck restrictions in filtering #4543

Closed

A partition key index may cause a regular query to fail with "No such index" #4539

Closed

slivne added area/cql type/bug Backport 3.0 labels

slivne assigned psarna

Contributor

nyh commented Jun 12, 2019

Hi @psarna, I'm in the middle of reviewing your patch (the GitHub UI is less convenient than I thought before we started to experiment with it...), but I have a general question about the entire approach, or even the problem raised here:

If the full ck prefix is being restricted, we can do a quick read - and don't need "filtering" - on each partition. Yes, if the pk isn't restricted, we still need to go through all the partitions (so this is filtering), but we don't strictly need to read the entire partition and filter through it. But maybe you're going here for correctness, not absolute optimal performance?
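The two strategies contrasted in this comment can be sketched on toy data. This is only a model: the `partitions` layout and both function names are assumptions made for illustration, and a real storage engine would seek through an index rather than rebuild a key list.

```python
from bisect import bisect_left, bisect_right

# Hypothetical data: partition key -> rows sorted by clustering key.
partitions = {
    1: [(1, "a"), (2, "b"), (3, "c")],
    2: [(2, "d"), (4, "e")],
}


def scan_and_filter(partitions, c):
    """Read every row of every partition and filter on the clustering key."""
    return [(p, ck, v)
            for p, rows in sorted(partitions.items())
            for ck, v in rows if ck == c]


def slice_each_partition(partitions, c):
    """Visit every partition, but seek directly to the matching clustering range."""
    out = []
    for p, rows in sorted(partitions.items()):
        keys = [ck for ck, _ in rows]  # rebuilt here for brevity only
        lo, hi = bisect_left(keys, c), bisect_right(keys, c)
        out.extend((p, ck, v) for ck, v in rows[lo:hi])
    return out
```

Both functions return the same rows; they differ only in how much of each partition is read, which is the performance question raised here as opposed to the correctness problem the issue is about.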

Contributor Author

psarna commented Jun 12, 2019

@nyh yes, this issue is strictly for correctness - currently clustering restrictions can be ignored in certain corner cases, which yields incorrect results. I didn't take any optimizations involving using ck prefix for each partition into consideration (although we could create a separate issue about that).

nyh pushed a commit that referenced this issue


          cql3: fix qualifying clustering key restrictions for filtering

c4b9357

Clustering key restrictions can sometimes avoid filtering if they form
a prefix, but that can happen only if the whole partition key is
restricted as well.

Ref #4541
Message-Id: <9656396ee831e29c2b8d3ad4ef90c4a16ab71f4b.1560410018.git.sarna@scylladb.com>

nyh pushed a commit that referenced this issue


          tests: add a test case for filtering clustering key

2c2122e

The test case makes sure that clustering key restriction
columns are fetched for filtering if they form a clustering key prefix,
but not a primary key prefix (partition key columns are missing).

Ref #4541
Message-Id: <3612dc1c6c22c59ac9184220a2e7f24e8d18407c.1560410018.git.sarna@scylladb.com>

slivne added this to the 3.2 milestone

scylladb-promoter closed this as completed in adeea0a

avikivity pushed a commit that referenced this issue


          cql3: fix fetching clustering key columns for filtering

24ddb46

When a column is not present in the select clause, but used for
filtering, it usually needs to be fetched from replicas.
Sometimes it can be avoided, e.g. if primary key columns form a valid
prefix - then, they will be optimized out before filtering itself.
However, clustering key prefix can only be qualified for this
optimization if the whole partition key is restricted - otherwise
the clustering columns still need to be present for filtering.

This commit also fixes tests in cql_query_test suite, because they now
expect more values - columns fetched for filtering will be present as
well (only internally, the clients receive only data they asked for).

Fixes #4541
Message-Id: <f08ebae5562d570ece2bb7ee6c84e647345dfe48.1560410018.git.sarna@scylladb.com>

(cherry picked from commit adeea0a)

avikivity pushed a commit that referenced this issue


          cql3: fix qualifying clustering key restrictions for filtering

2c50a48

Clustering key restrictions can sometimes avoid filtering if they form
a prefix, but that can happen only if the whole partition key is
restricted as well.

Ref #4541
Message-Id: <9656396ee831e29c2b8d3ad4ef90c4a16ab71f4b.1560410018.git.sarna@scylladb.com>

(cherry picked from commit c4b9357)

avikivity pushed a commit that referenced this issue


          tests: add a test case for filtering clustering key

35f906f

The test case makes sure that clustering key restriction
columns are fetched for filtering if they form a clustering key prefix,
but not a primary key prefix (partition key columns are missing).

Ref #4541
Message-Id: <3612dc1c6c22c59ac9184220a2e7f24e8d18407c.1560410018.git.sarna@scylladb.com>

(cherry picked from commit 2c2122e)

Member

avikivity commented Jun 16, 2019

@psarna While I was able to backport to 3.1, 3.0 is much harder. Please assist.

Contributor Author

psarna commented Jun 17, 2019

I pushed a version rebased on 3.0 here: https://github.com/psarna/scylla/commits/fix_ignoring_ck_restrictions_in_filtering_for_3.0 , but I still need to compile and test it on 3.0, which I'll do soon, after dealing with #4540. After I manage to test on 3.0 properly, I'll push it to the list.

avikivity added a commit that referenced this issue


          Merge "Backport fixing ignoring ck restrictions in filtering" from Piotr

270d0a8

"
Tests: unit(dev)
Refs #4541
"

* 'fix_ignoring_ck_restrictions_in_filtering_for_3.0_2' of https://github.com/psarna/scylla:
  tests: add a test case for filtering clustering key
  cql3: fix qualifying clustering key restrictions for filtering
  cql3: fix fetching clustering key columns for filtering

avikivity added a commit that referenced this issue


          Merge "Backport fixing ignoring ck restrictions in filtering" from Piotr

e80cd9d

"
Tests: unit(dev)
Refs #4541
"

* 'fix_ignoring_ck_restrictions_in_filtering_for_3.0_2' of https://github.com/psarna/scylla:
  tests: add a test case for filtering clustering key
  cql3: fix qualifying clustering key restrictions for filtering
  cql3: fix fetching clustering key columns for filtering

Contributor

slivne commented Jul 29, 2019

This was released as part of 3.0.8
