Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cep 29 not operator #2556

Draft
wants to merge 36 commits into
base: trunk
Choose a base branch
from
Draft

Conversation

pkolaczk
Copy link
Contributor

@pkolaczk pkolaczk commented Aug 8, 2023

  • NOT IN
  • NOT CONTAINS / NOT CONTAINS KEY
  • Inequality operator (!=) for map entries
  • Inequality operator (!=) for clustering columns
  • Both ALLOW FILTERING and indexing (SAI) supported

Ability to use NOT operator in filters on lists, sets and maps.
Requires ALLOW FILTERING.

All those operators are implemented in a ContainsRestriction, therefore
they allow to be mixed: i.e. you can use
multiple CONTAINS, NOT CONTAINS, CONTAINS KEY, NOT CONTAINS KEY
in a single WHERE clause.
Ability to use NOT IN restrictions in CQL WHERE clause.
Can be used on any column with ALLOW FILTERING.

Can be also used on clustering columns w/o ALLOW FILTERING.
Multicolumn NOT IN restrictions are supported
(e.g. (c1, c2) NOT IN ((1, 2), (3, 4))).

NOT IN restrictions are implemented as slice restrictions, therefore
they can be merged with other slice restrictions using <, >, <=, >=.
Also multiple NOT IN restrictions can be merged together.

Overview of the code changes:
  - NOT IN operator has been added to the CQL WHERE clause parser
    definition.

  - SliceRestriction got a new field with a list of terms to skip,
    received from the terms passed in the NOT IN clause.

  - SliceRestriction methods have been adjusted to generate filters
    that filter out the items to skip, as well as generate proper
    clustering bounds, omitting the items to skip.

  - Because the list of items to skip can be given either as a marker
    or a list literal, MarkerOrList has been
    introduced to abstract out the concern of getting the items out of
    the SliceRestriction. This way both SliceRestriction and
    INRestriction could be reduced. INRestriction hierarchy has been
    flattened to a single class.

  - MultiCBuilder has been rewritten. It has been both simplified and
    added new functionality. The specialization for single column
    clusterings has been removed as it had almost doubled the
    complexity for a questionable performance gain. The code for
    computing cartesian products has been simplified and optimized:
    now it has proper O(n^2) complexity instead of O(n^3).
    MultiCBuilder now also tracks the inclusiveness of each clustering
    bound. This is required because when merging NOT IN with inequality
    like <= or >=, we need to mix bounds differing in inclusiveness.

  - MultiColumnRestriction.appendBoundTo has been modified to use
    the new MultiCBuilder and also simplified and documented, as
    this part of code turned out to be extremely hard to understand
    for me.
Now that we implemented NOT IN, both equals and non-equals
can be rewritten as slices.

column != value ---> column NOT IN (value)
column  = value ---> column >= value AND column <= value

And because slices can be freely merged, this opens room for
many interesting queries that were previously
impossible, e.g.:

SELECT ... WHERE c1 != ?
SELECT ... WHERE c1 != ? AND c1 != ? AND ...
SELECT ... WHERE c1 = ? AND (c1, c2) != (v1, v2)
SELECT ... WHERE c1 IN ? AND (c1, c2) != (v1, v2)
NOT CONTAINS, NOT CONTAINS KEY and map entry inequality
can be now used on SAI indexed columns without need for
ALLOW FILTERING.

The result sets are evaluated by taking an iterator
to all primary keys (in all active memtables and sstables)
and anti-joining them with the primary keys
matching the CONTAINS predicate, read from the index.
This way there was no need to change the index structure.
Although a full scan over all primary keys is needed,
the scan involves scanning of primary keys only, without deserializing the whole rows.
For sstables, the primary key trie component is used, which is likely much smaller than
the whole sstable. Therefore, this method is likely faster than filtering, as long
as rows are big enough.
src/antlr/Parser.g Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/cql3/MarkerOrList.java Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/cql3/Relation.java Outdated Show resolved Hide resolved
src/java/org/apache/cassandra/cql3/Relation.java Outdated Show resolved Hide resolved
pkolaczk and others added 14 commits October 24, 2023 14:49
Renamed:
  CBuilder -> ClusteringBuilder
  MultiCBuilder -> MultiClusteringBuilder

Added a longer explanation on how MultiClusteringBuilder works.
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
When computing non-equality searches (NEQ, NOT CONTAINS, NOT CONTAINS
KEY), perform anti-join operation on data of each index separately,
before computing the union. This may cause returning false positives
from the search, but those will be filtered out by post-filtering.

Unfortunately, each index search may return false positive matches,
due to presence of updates and deletes that happened after a flush.
Hence, we cannot compute a set difference of keys returned from one
index with the keys of another index, as subtracting a false positive
would lead to a false negative, i.e. missing keys in the
search results. We have no way to correct false negatives.
pkolaczk and others added 6 commits November 3, 2023 14:56
Supports queries of the following type:
SELECT * FROM t WHERE indexed_column != ?;
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Co-authored-by: Andrés de la Peña <adelapena@users.noreply.github.com>
Comment on lines 395 to 397
boolean supportsSlice = slice.isSupportedBy(columnDef, index);
boolean supportsNeq = index.supportsExpression(columnDef, Operator.NEQ);
return supportsSlice || !skippedValues.isEmpty() && supportsNeq;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementations of SingleColumnRestriction.SliceRestriction#isSupportedBy and MultiColumnRestriction.SliceRestriction#isSupportedBy are identical. Also, both implementations are the only callers of TermSlice#isSupportedBy(ColumnMetadata, Index).

Maybe we could pass the list of skipped values to TermSlice#isSupportedBy and delegate the NEQ check to it, putting similar things together and preventing some duplication:

public boolean isSupportedBy(ColumnMetadata column, Index index, List<MarkerOrTerms> skippedValues)
{
    boolean supported = false;

    if (hasBound(Bound.START))
        supported |= isInclusive(Bound.START) ? index.supportsExpression(column, Operator.GTE)
                : index.supportsExpression(column, Operator.GT);
    if (hasBound(Bound.END))
        supported |= isInclusive(Bound.END) ? index.supportsExpression(column, Operator.LTE)
                : index.supportsExpression(column, Operator.LT);

    supported |= !skippedValues.isEmpty() && index.supportsExpression(column, Operator.NEQ);

    return supported;
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't agree on this one. This is just coincidental duplication - accidentally both single and multi restrictions work the same way here, but there is no fundamental reason they have to. The amount of duplication is also very tiny. And introducing an unrelated parameter to TermSlice is mixing responsibilities which is IMHO worse.

I thought that maybe we could move the skipped values to the TermSlice itself. Then it would make more sense to have TermSlice handle the isSupportedBy logic as it would have all the information it needs. WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, moving the skipped values to the TermSlice sounds good to me.

@adelapena
Copy link
Contributor

@pkolaczk there is a bunch of unresolved nits above that only show up if you click on the 12 hidden conversations Load more... link on the GH web UI, maybe you have missed them?

@adelapena
Copy link
Contributor

adelapena commented Nov 6, 2023

I have noticed that this test returns more rows than expected:

@Test
public void notInTest()
{
    createTable("CREATE TABLE %s (k int PRIMARY KEY, v1 int, v2 int)");
    createIndex("CREATE INDEX ON %s(v1) USING 'sai'");
    createIndex("CREATE INDEX ON %s(v2) USING 'sai'");

    execute("INSERT INTO %s (k, v1, v2) VALUES (1, 1, 4)");
    execute("INSERT INTO %s (k, v1, v2) VALUES (2, 2, 3)");
    execute("INSERT INTO %s (k, v1, v2) VALUES (3, 3, 2)");
    execute("INSERT INTO %s (k, v1, v2) VALUES (4, 4, 1)");

    waitForTableIndexesQueryable(currentTable());

    assertRowsIgnoringOrder(execute("SELECT * FROM %s"),
            row(1, 1, 4),
            row(2, 2, 3),
            row(3, 3, 2),
            row(4, 4, 1));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (1)"),
            row(2, 2, 3),
            row(3, 3, 2),
            row(4, 4, 1));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (4)"),
            row(1, 1, 4),
            row(2, 2, 3),
            row(3, 3, 2));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (1, 4)"),
            row(2, 2, 3),
            row(3, 3, 2)); // Got 1 extra row(s) in result: (k=1, v1=1, v2=4)

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (1, 4) AND v2 NOT IN (3, 2)")); // 2 extra rows
}

However, the test passes if we don't create the indexes and use ALLOW FILTERING instead:

@Test
public void notInTest()
{
    createTable("CREATE TABLE %s (k int PRIMARY KEY, v1 int, v2 int)");

    execute("INSERT INTO %s (k, v1, v2) VALUES (1, 1, 4)");
    execute("INSERT INTO %s (k, v1, v2) VALUES (2, 2, 3)");
    execute("INSERT INTO %s (k, v1, v2) VALUES (3, 3, 2)");
    execute("INSERT INTO %s (k, v1, v2) VALUES (4, 4, 1)");

    assertRowsIgnoringOrder(execute("SELECT * FROM %s"),
            row(1, 1, 4),
            row(2, 2, 3),
            row(3, 3, 2),
            row(4, 4, 1));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (1) ALLOW FILTERING"),
            row(2, 2, 3),
            row(3, 3, 2),
            row(4, 4, 1));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (4) ALLOW FILTERING"),
            row(1, 1, 4),
            row(2, 2, 3),
            row(3, 3, 2));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (1, 4) ALLOW FILTERING"),
            row(2, 2, 3),
            row(3, 3, 2));

    assertRowsIgnoringOrder(execute("SELECT * FROM %s WHERE v1 NOT IN (1, 4) AND v2 NOT IN (3, 2) ALLOW FILTERING"));
}

@pkolaczk
Copy link
Contributor Author

pkolaczk commented Nov 6, 2023

there is a bunch of unresolved nits above that only show up if you click on the 12 hidden conversations Load more... link on the GH web UI, maybe you have missed them?

lol, indeed, I missed them.

@pkolaczk
Copy link
Contributor Author

pkolaczk commented Nov 7, 2023

I reverted the support for != and NOT IN on indexed columns for now as current index search does not allow multiple exclusions.

@pkolaczk pkolaczk marked this pull request as draft November 7, 2023 17:50
@adelapena
Copy link
Contributor

Albeit the reversal of the support for != and NOT IN on indexed columns has made it necessary to use ALLOW FILTERING again, it seems that the indexes are still involved in the query and producing wrong results:

@Test
public void notInTest()
{
    createTable("CREATE TABLE %s (k int PRIMARY KEY, v int)");

    execute("INSERT INTO %s (k, v) VALUES (1, 1)");
    execute("INSERT INTO %s (k, v) VALUES (2, 2)");
    execute("INSERT INTO %s (k, v) VALUES (3, 3)");
    execute("INSERT INTO %s (k, v) VALUES (4, 4)");

    String query = "SELECT * FROM %s WHERE v NOT IN (1, 4) ALLOW FILTERING";
    Object[][] rows = rows(row(2, 2), row(3, 3));

    assertRowsIgnoringOrder(execute(query), rows);

    createIndex("CREATE INDEX ON %s(v) USING 'sai'");
    waitForTableIndexesQueryable();

    assertRowsIgnoringOrder(execute(query), rows); // Got 1 extra row(s) in result: (k=1, v1=1)
}

I suggest we add some additional tests around the use of unsupported operators with SAI indexes and ALLOW FILTERING.

@@ -28,9 +28,9 @@
/**
* Allows to build ClusteringPrefixes, either Clustering or ClusteringBound.
*/
public abstract class CBuilder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, I prefer to make these sorts of changes in a dedicated issue, so it looks like we can significantly reduce the number of changed files if we revert it, and keep only necessary changes here.

See the discussion below, we can ignore code formatting changes by listing the relevant commit hashes in a special file (this is not used by the project right now). So if we do this change in another issue and put that commit hash in a special file, it won't pollute the git blame in our IDEs.
https://github.com/orgs/community/discussions/5033
https://git-scm.com/docs/git-blame#Documentation/git-blame.txt-blameignoreRevsFile

This is my humble argument for such not making such a change here :-)

* <p>
* The keys are returned in token-clustering order.
*/
public final class PrimaryKeyMapIterator extends KeyRangeIterator
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The iterator looks very similar to this one, do we still need to keep it?

    private SSTableRowIdKeyRangeIterator(PrimaryKey min,
                                         PrimaryKey max,
                                         long count,
                                         PrimaryKeyMap primaryKeyMap,
                                         QueryContext queryContext,
                                         PostingList postingList)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants