STAR-123 custom lucene analyzer and some language unit tests #192

Merged: jasonrutherglen merged 1 commit into ds-trunk from STAR-123 on Aug 24, 2021

@jasonrutherglen

added more language tests

added brazilian

cql test passes

added support for setting a lucene analyzer

cql json test passes

fixed up some things

@mike-tr-adamson mike-tr-adamson left a comment

I've done an initial review of the changes here.

My main concern about this is how we are going to document it. I know we've discussed this, but it seems to me that usage of these is a bit of a black art. The Lucene documentation I could find points to the Javadoc for the analyzer/tokenizer classes. I can't see new users getting to grips with that very easily.

Seems to me that we need to, at least, document common usages. Then we can add documentation as we go on.

This is just an observation; it doesn't stop these changes getting merged.

Comment thread build.xml Outdated
Comment thread build.xml Outdated
Comment thread build.xml Outdated

Definitely in favour of WITH JSON OPTIONS here. It can be done on a follow-on ticket.

Comment thread src/java/org/apache/cassandra/index/sai/StorageAttachedIndex.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/analyzer/LuceneAnalyzer.java Outdated

This has been left over from another ticket but we do need to figure out what should be done at this point rather than just throwing.

Comment thread src/java/org/apache/cassandra/index/sai/plan/Expression.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/plan/Expression.java Outdated
Comment thread test/unit/org/apache/cassandra/index/sai/virtual/AnalyzerViewTest.java Outdated
@jasonrutherglen
Author

@mike-tr-adamson I think most users can simply use the analyzer classes, which we can list. They're available at https://lucene.apache.org/core/7_7_3/analyzers-common/index.html

@mike-tr-adamson mike-tr-adamson left a comment

LGTM

@jasonrutherglen force-pushed the STAR-123 branch 2 times, most recently from b065f47 to fb3c70f, July 28, 2021 15:03
@jasonrutherglen force-pushed the STAR-123 branch 2 times, most recently from a477051 to 49ae00d, August 12, 2021 20:35
Comment thread src/java/org/apache/cassandra/index/sai/ColumnContext.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/StorageAttachedIndex.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/ColumnContext.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/analyzer/AbstractAnalyzer.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/analyzer/AbstractAnalyzer.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/analyzer/LuceneAnalyzer.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/plan/Expression.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/virtual/AnalyzerView.java
Comment thread src/java/org/apache/cassandra/index/sai/analyzer/LuceneAnalyzer.java Outdated
@pkolaczk

I looked at the code smells reported by Sonar; it complains a lot about overly generic exceptions, plus a few minor issues.
Can you fix them, please?

Comment thread src/java/org/apache/cassandra/index/sai/disk/SSTableIndexWriter.java Outdated
Comment thread src/java/org/apache/cassandra/index/sai/analyzer/AbstractAnalyzer.java Outdated
    {
        if (tokenStream == null)
        {
            throw new IllegalStateException("resetInternal(ByteBuffer term) must be called prior to hasNext()");

What is sasi code? I'm commenting on the code you've added / modified. org.apache.cassandra.index.sai.analyzer.LuceneAnalyzer looks like our class, same for AbstractAnalyzer.

This class implements an `Iterator` but breaks the `Iterator` contract in a few ways.
One way is fortunately documented on the `next()` call (`next()` doesn't work correctly before calling `hasNext()`).
Another is having to call `reset(...)` before calling `hasNext()`.

The top-level documentation for the class should say something like this:

This class wraps a Lucene `Analyzer` in order to <describe the purpose of this class; what value it adds to Lucene's Analyzer functionality>.

Caution!
Although this class implements `java.util.Iterator`, it does not adhere at all to the iterator contract:
1. Before the first use after construction, you have to call `reset(ByteBuffer)` in order to provide the input data to be analyzed. You are allowed to call `reset` multiple times to change the analyzed input.
2. To obtain the next analyzed element, you have to call `hasNext()` before calling `next()`. The call to `next()` does not advance the iterator, so calling it multiple times returns the same element (or throws `NoSuchElementException` if `hasNext` has never been called before).
3. A call to `hasNext` advances the iterator.

This is the bare minimum I'd expect. Better documentation would also tell me why we're really making iterators that do not work like proper iterators. I haven't written the code, so I really don't know why; you're the author and you should explain.
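
To make the three rules above concrete, here is a minimal sketch of an iterator with the same protocol. `ResettableTokenIterator` is a hypothetical stand-in written for illustration: it splits UTF-8 text on whitespace using only the JDK, and it is not the `LuceneAnalyzer` from this PR, whose internals are not shown in this conversation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.NoSuchElementException;

/**
 * Hypothetical illustration of the reset/hasNext/next protocol described above.
 * Splits UTF-8 input on whitespace; not the PR's actual analyzer.
 */
final class ResettableTokenIterator implements Iterator<ByteBuffer>
{
    private String[] tokens;     // null until reset(...) has been called
    private int position;        // index of the token the next hasNext() will load
    private ByteBuffer current;  // token loaded by the last successful hasNext()

    /** Rule 1: supplies the input; may be called again to analyze new input. */
    void reset(ByteBuffer input)
    {
        String text = StandardCharsets.UTF_8.decode(input.duplicate()).toString().trim();
        tokens = text.isEmpty() ? new String[0] : text.split("\\s+");
        position = 0;
        current = null;
    }

    /** Rule 3: unlike a conventional Iterator, hasNext() advances to the next token. */
    @Override
    public boolean hasNext()
    {
        if (tokens == null)
            throw new IllegalStateException("reset(ByteBuffer) must be called prior to hasNext()");
        if (position >= tokens.length)
        {
            current = null;
            return false;
        }
        current = ByteBuffer.wrap(tokens[position++].getBytes(StandardCharsets.UTF_8));
        return true;
    }

    /** Rule 2: next() returns the token loaded by hasNext() and does not advance. */
    @Override
    public ByteBuffer next()
    {
        if (current == null)
            throw new NoSuchElementException("hasNext() must return true before next()");
        return current.duplicate();
    }

    public static void main(String[] args)
    {
        ResettableTokenIterator it = new ResettableTokenIterator();
        it.reset(ByteBuffer.wrap("quick brown fox".getBytes(StandardCharsets.UTF_8)));
        while (it.hasNext())
            System.out.println(StandardCharsets.UTF_8.decode(it.next()));
    }
}
```

A standard `while (it.hasNext()) it.next()` loop still works with this protocol, which is probably why the deviation is easy to miss; code that calls `hasNext()` twice in a row, or `next()` without `hasNext()`, behaves differently from a conforming `Iterator`.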

@jasonrutherglen
Author

@pkolaczk

The reply in thread doesn't work from:

#192 (comment)

The iterator style is from sasi. https://github.com/datastax/cassandra/blob/ds-trunk/src/java/org/apache/cassandra/index/sasi/analyzer/AbstractAnalyzer.java

Again... comments directed at why it's this way may be addressed to the sasi devs on a mailing list.

added more language tests

added brazilian

cql test passes

added support for setting a lucene analyzer

cql json test passes

fixed up some things

cleanup

added query analyzer

cleanup; added constants

added exception handling in unit test

added bad options unit tests

added char filter

removed comments and extra code

added illegal arg ex to LuceneAnalyzer#hasNext

added stop word support; prior to removal

reworked, no more stop words

added lowercase filter test

added ngram filter test

added simplepattern test; snowball off

added czech and porter

fixed alloc

removed commented out code

removed extra code

fixed minor issues

maybe fixed setMinMax

cleanup

reverted

reverted to a new byte[] per tokenized term

cleanup

cleanup

fixed sasi test

fixed unit test bug

cleanup

refactored for npe

addressed review comments

fixed npe bug

fixed a couple of bugs

removed json_ from options names; applied sonar comments

fixed sonar comments

fixed unit test bug

changed exception thrown

get -> create

fixed minor issue
@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
1 Code Smell (rating A)

84.5% Coverage
0.0% Duplication

@jasonrutherglen merged commit 3227a57 into ds-trunk Aug 24, 2021
@jasonrutherglen deleted the STAR-123 branch August 24, 2021 18:45
jacek-lewandowski pushed a commit that referenced this pull request May 26, 2022
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
jacek-lewandowski pushed a commit that referenced this pull request May 27, 2022
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
jacek-lewandowski pushed a commit that referenced this pull request Oct 17, 2022
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
jacek-lewandowski pushed a commit that referenced this pull request Oct 18, 2022
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
mfleming pushed a commit that referenced this pull request Jul 10, 2023
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
djatnieks pushed a commit that referenced this pull request Jul 24, 2023
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
djatnieks pushed a commit that referenced this pull request Aug 22, 2023
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
djatnieks pushed a commit that referenced this pull request Sep 12, 2023
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
jacek-lewandowski pushed a commit that referenced this pull request Jan 28, 2024
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)
djatnieks pushed a commit that referenced this pull request Mar 29, 2024
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code
djatnieks pushed a commit that referenced this pull request Apr 1, 2024
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code
djatnieks pushed a commit that referenced this pull request Apr 16, 2024
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code
djatnieks pushed a commit that referenced this pull request Jan 30, 2025
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code
djatnieks pushed a commit that referenced this pull request May 18, 2025
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code
michaelsembwever pushed a commit that referenced this pull request Feb 6, 2026
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code
michaelsembwever pushed a commit that referenced this pull request Feb 10, 2026
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code

 (Rebase of commit cf85206)
michaelsembwever pushed a commit that referenced this pull request Feb 11, 2026
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code

 (Rebase of commit cf85206)
michaelsembwever pushed a commit that referenced this pull request Feb 12, 2026
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code

 (Rebase of commit cf85206)
michaelsembwever pushed a commit that referenced this pull request Feb 14, 2026
(cherry picked from commit 3227a57)
(cherry picked from commit add6b8d)
(cherry picked from commit 1c60b2d)
(cherry picked from commit 4d3912f)
(cherry picked from commit 5afe63d)
(cherry picked from commit d6c1172)
(cherry picked from commit 4b7b557)

STAR-123 Fix rebase compile errors and replace lucene-analysis-common with lucene-analyzers-common needed by CC code

 (Rebase of commit cf85206)
4 participants