@mike-tr-adamson
Contributor

  • Adds the case_sensitive, normalize and ascii options to the index.

Contributor

Since CASSANDRA-18521, CQLTester#waitForIndexQueryable requires an argument specifying the index name. Its absence here produces a build error. Maybe it was missed during the rebase?

Contributor

Note that CQLTester#waitForTableIndexesQueryable can also be used to wait for all the indexes of the table at once.

Contributor Author

Since waitForTableIndexesQueryable was added to createIndex, I have removed these calls.

Contributor

Maybe it's simpler to get rid of ANALYZABLE_TYPES and TypeUtil#isIn and just use (type instanceof StringType) to check if a type is analyzable?
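
A minimal sketch of the difference between the two checks, using hypothetical stand-in classes rather than the real Cassandra types. The class names, the method names and the shape of the set-based check are assumptions for illustration only:

```java
import java.util.HashSet;
import java.util.Set;

final class AnalyzableTypeSketch
{
    // Hypothetical stand-ins for the type hierarchy; not the real Cassandra classes.
    static abstract class AbstractType {}
    static abstract class StringType extends AbstractType {}
    static final class UTF8Type extends StringType {}
    static final class AsciiType extends StringType {}
    static final class Int32Type extends AbstractType {}

    // Set-based check: every analyzable type has to be listed explicitly.
    static final Set<Class<?>> ANALYZABLE_TYPES = new HashSet<>();
    static
    {
        ANALYZABLE_TYPES.add(UTF8Type.class);
        ANALYZABLE_TYPES.add(AsciiType.class);
    }

    static boolean isAnalyzableBySet(AbstractType type)
    {
        return ANALYZABLE_TYPES.contains(type.getClass());
    }

    // Hierarchy-based check: any subclass of StringType qualifies.
    static boolean isAnalyzableByInstanceof(AbstractType type)
    {
        return type instanceof StringType;
    }

    public static void main(String[] args)
    {
        AbstractType[] types = { new UTF8Type(), new AsciiType(), new Int32Type() };
        for (AbstractType type : types)
            System.out.println(type.getClass().getSimpleName()
                               + " set=" + isAnalyzableBySet(type)
                               + " instanceof=" + isAnalyzableByInstanceof(type));
    }
}
```

Both checks agree on these three stand-in types. The instanceof form would also cover a new StringType subclass without a list to maintain. Whether the real TypeUtil#isIn does additional work is not shown in this conversation.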

Contributor

Nit: can be protected

Contributor

Nit: add @Override

Contributor

I think this @Override was missed in the last commit.

Contributor Author

Sorry, not sure how I missed that one.

Contributor

Nit: add @Override

Contributor

Nit: add @Override

Contributor

Nit: add @Override

Comment on lines 22 to 23
Contributor

Nit: no need to break the line

Contributor

It seems there is some code duplication across the tests in this class. Maybe we could use a utility method such as, for example:

private void test(String input, String expected, AbstractAnalyzer analyzer) throws Exception
{
    // Encode as UTF-8 explicitly, matching the UTF-8 decoding done by ByteBufferUtil.string below
    ByteBuffer toAnalyze = ByteBufferUtil.bytes(input);
    analyzer.reset(toAnalyze);
    ByteBuffer analyzed = null;
    // Keep only the last term produced by the analyzer
    while (analyzer.hasNext())
    {
        analyzed = analyzer.next();
    }
    String result = ByteBufferUtil.string(analyzed);
    assertEquals(expected, result);
}

Contributor Author

Agreed, I've done this and generally simplified / tidied this test.

Contributor

It looks nice now :)

@mike-tr-adamson
Contributor Author

@adelapena @bereng Please note that I have just made a number of changes for the Lucene 9.7 upgrade.

 - Adds the case_sensitive, normalize and ascii
   options to the index.
{
NonTokenizingOptions options = NonTokenizingOptions.getDefaultOptions();

assertNotEquals("nip it in the bud", getAnalyzedString("Nip it in the bud", options));
Contributor

Maybe assertEquals("Nip it in the bud", getAnalyzedString("Nip it in the bud", options)); achieves the same and is more strict by verifying what is exactly returned?

Contributor Author

Agreed, I'm not entirely sure why the original test did this, because the assertion would have passed for almost any returned string.
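
The difference in strictness between the two assertions can be seen without JUnit. The sketch below uses two hypothetical stand-in analyzers (neither is the real getAnalyzedString) and plain boolean checks that mirror the two assertions. It assumes, as the suggestion above does, that the default options return the input unchanged:

```java
final class AssertionStrictnessSketch
{
    static final String INPUT = "Nip it in the bud";

    // Hypothetical analyzer with default options: assumed to return the input unchanged.
    static String passThroughAnalyzer(String input)
    {
        return input;
    }

    // Hypothetical broken analyzer that drops everything after the first word.
    static String brokenAnalyzer(String input)
    {
        int space = input.indexOf(' ');
        return space < 0 ? input : input.substring(0, space);
    }

    // Mirrors assertNotEquals("nip it in the bud", analyzed).
    static boolean weakCheck(String analyzed)
    {
        return !"nip it in the bud".equals(analyzed);
    }

    // Mirrors assertEquals("Nip it in the bud", analyzed).
    static boolean strictCheck(String analyzed)
    {
        return INPUT.equals(analyzed);
    }

    public static void main(String[] args)
    {
        String good = passThroughAnalyzer(INPUT);
        String bad = brokenAnalyzer(INPUT);
        System.out.println("pass-through: weak=" + weakCheck(good) + " strict=" + strictCheck(good));
        System.out.println("broken:       weak=" + weakCheck(bad) + " strict=" + strictCheck(bad));
    }
}
```

The weak check accepts the broken output ("Nip"), while the strict check rejects it, which is why asserting the exact expected value is the safer test.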

@adelapena
Contributor

@mike-tr-adamson I think it would be good to add a dtest quickly verifying that tokenisation and analysis don't break RFP (classic CASSANDRA-8272). I think something like this would work:

public class ReplicaFilteringProtectionTest extends TestBaseImpl
{
    private static final int REPLICAS = 2;

    @Test
    public void testRFPWithIndexTransformations() throws IOException
    {
        try (Cluster cluster = init(Cluster.build()
                                           .withNodes(REPLICAS)
                                           .withConfig(config -> config.set("hinted_handoff_enabled", false)
                                                                       .set("commitlog_sync", "batch")).start()))
        {
            String tableName = "sai_rfp";
            String fullTableName = KEYSPACE + '.' + tableName;

            cluster.schemaChange("CREATE TABLE " + fullTableName + " (k int PRIMARY KEY, v text)");
            cluster.schemaChange("CREATE CUSTOM INDEX ON " + fullTableName + "(v) USING 'StorageAttachedIndex' " +
                                 "WITH OPTIONS = { 'case_sensitive' : false}");

            // both nodes have the old value
            cluster.coordinator(1).execute("INSERT INTO " + fullTableName + "(k, v) VALUES (0, 'OLD')", ALL);

            String select = "SELECT * FROM " + fullTableName + " WHERE v = 'old'";
            Object[][] initialRows = cluster.coordinator(1).execute(select, ALL);
            assertRows(initialRows, row(0, "OLD"));

            // only one node gets the new value
            cluster.get(1).executeInternal("UPDATE " + fullTableName + " SET v = 'new' WHERE k = 0");

            // querying by the old value shouldn't return the old surviving row
            SimpleQueryResult oldResult = cluster.coordinator(1).executeWithResult(select, ALL);
            assertRows(oldResult.toObjectArrays());
        }
    }
}

@mike-tr-adamson
Contributor Author

@adelapena I've added the ReplicaFilteringProtectionTest but, I have to admit, I'm not entirely sure how/why it's working. My main concern is that the test is passing for reasons that aren't the ones that we are testing for. I will have a bit of a dig to confirm this.

@adelapena
Contributor

@mike-tr-adamson I think RFP is working because StorageAttachedIndexSearcher#filterReplicaFilteringProtection takes care of applying the filters to the expressions that are used in the coordinator:

public PartitionIterator filterReplicaFilteringProtection(PartitionIterator fullResponse)
{
    for (RowFilter.Expression expression : queryController.filterOperation())
    {
        AbstractAnalyzer analyzer = queryController.getContext(expression).getAnalyzerFactory().create();
        try
        {
            if (analyzer.transformValue())
                return applyIndexFilter(fullResponse, Operation.buildFilter(queryController), queryContext);
        }
        finally
        {
            analyzer.end();
        }
    }
    // if no analyzer does transformation
    return Index.Searcher.super.filterReplicaFilteringProtection(fullResponse);
}

You can artificially see the test failing if you modify that method or if you make DataResolver#needsReplicaFilteringProtection return false.
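
To make the reasoning concrete, here is a simplified model of the scenario in the dtest. It is only a sketch under stated assumptions: Cell, reconcile and the two filters are hypothetical names, lowercasing stands in for the case_sensitive=false analyzer, and none of this is the actual Cassandra code path.

```java
import java.util.Locale;
import java.util.function.Predicate;

final class RfpFilterSketch
{
    // Hypothetical versioned value as held by one replica.
    static final class Cell
    {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp)
        {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    // Last-write-wins reconciliation of two replica responses for the same row.
    static Cell reconcile(Cell a, Cell b)
    {
        return a.timestamp >= b.timestamp ? a : b;
    }

    // Filter that compares raw values, ignoring the index analyzer.
    static Predicate<String> rawFilter(String queried)
    {
        return queried::equals;
    }

    // Filter that applies the same transformation as the (assumed) case-insensitive analyzer.
    static Predicate<String> analyzedFilter(String queried)
    {
        String normalized = queried.toLowerCase(Locale.ROOT);
        return value -> value.toLowerCase(Locale.ROOT).equals(normalized);
    }

    // The coordinator filters the reconciled row, not the individual replica responses.
    static boolean coordinatorReturnsRow(Cell replica1, Cell replica2, Predicate<String> filter)
    {
        return filter.test(reconcile(replica1, replica2).value);
    }

    public static void main(String[] args)
    {
        Cell old = new Cell("OLD", 1);
        Cell updated = new Cell("new", 2);

        // Both replicas hold 'OLD': only the analyzed filter matches the query for 'old'.
        System.out.println(coordinatorReturnsRow(old, old, rawFilter("old")));
        System.out.println(coordinatorReturnsRow(old, old, analyzedFilter("old")));

        // One replica holds 'new': the stale 'OLD' row must not be returned.
        System.out.println(coordinatorReturnsRow(updated, old, analyzedFilter("old")));
    }
}
```

In this model a raw equality filter at the coordinator would wrongly drop the matching 'OLD' row in the first step of the test, while the analyzed filter keeps it and still rejects the reconciled 'new' row in the last step.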

@mike-tr-adamson
Contributor Author

Thank you, that's what I was trying to find.

import org.apache.cassandra.utils.ByteBufferUtil;

import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNotEquals;
Contributor

Contributor Author

My bad, sorry. I'll get my commit process correct one of these days.

@maedhroz
Contributor

Committed as 05dd587

@maedhroz closed this Jul 12, 2023
3 participants