Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TEST] NPE with seed ED695B6A3A249CA6 #6211

Closed
wants to merge 1 commit into from

Conversation

dadoonet
Copy link
Member

With -Dtests.seed=ED695B6A3A249CA6, in FreqTermsEnumTests term has been skipped so not available in reference array.

With -Dtests.seed=ED695B6A3A249CA6, in `FreqTermsEnumTests` term has been skipped so not available in reference array.
@dadoonet dadoonet self-assigned this May 16, 2014
@markharwood
Copy link
Contributor

I don't think this should go in as I think it is masking something fishy.

If I run the unmodified master test and remove the "if(rarely()" around the writer commit calls the test passes with this seed. If I leave the "rarely" bit in we have issues with deleted items and this seed.
Needs some further investigation into commits and visibility of deleted items

@s1monw
Copy link
Contributor

s1monw commented May 16, 2014

I think this is actually a bug in the test code FilterableTermsEnum claims it has found the term which is true but all docs that contain it are filtered out so we essentially return a docFreq = 0 now the question is should we return the term as found if no document is live? ie use the patch below:

diff --git a/src/main/java/org/elasticsearch/common/lucene/index/FilterableTermsEnum.java b/src/main/java/org/elasticsearch/common/lucene/index/FilterableTermsEnum.java
index 2591dc0..74834f7 100644
--- a/src/main/java/org/elasticsearch/common/lucene/index/FilterableTermsEnum.java
+++ b/src/main/java/org/elasticsearch/common/lucene/index/FilterableTermsEnum.java
@@ -165,6 +165,12 @@ public class FilterableTermsEnum extends TermsEnum {
             currentTotalTermFreq = totalTermFreq;
             current = text;
         }
+        if (found && currentDocFreq == 0) {
+            current = null;
+            currentDocFreq = -1;
+            currentTotalTermFreq = -1;
+            return false;
+        }
         return found;
     }

or fix the test to just ignore that in the filtered case:

diff --git a/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java b/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java
index e05ef1d..4c30e73 100644
--- a/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java
+++ b/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java
@@ -200,10 +200,10 @@ public class FreqTermsEnumTests extends ElasticsearchLuceneTestCase {
                 if (!termsEnum.seekExact(new BytesRef(term))) {
                     continue;
                 }
-                if (docFreq) {
+                if (docFreq && termsEnum.docFreq() > 0) {
                     assertThat("cycle " + i + ", term " + term + ", docFreq", termsEnum.docFreq(), equalTo(reference.get(term).docFreq));
                 }
-                if (totalTermFreq) {
+                if (totalTermFreq && termsEnum.docFreq() > 0) {
                     assertThat("cycle " + i + ", term " + term + ", totalTermFreq", termsEnum.totalTermFreq(), equalTo(reference.get(term).totalTermFreq));
                 }
             }

or alternatively

diff --git a/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java b/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java
index e05ef1d..aac36ba 100644
--- a/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java
+++ b/src/test/java/org/elasticsearch/common/lucene/index/FreqTermsEnumTests.java
@@ -138,6 +138,11 @@ public class FreqTermsEnumTests extends ElasticsearchLuceneTestCase {
                 }
             }
         }
+        for (String term : terms) {
+            if(!referenceFilter.containsKey(term)) {
+               referenceFilter.put(term, new FreqHolder());
+            }
+        }
         filter = new TermsFilter(filterTerms);
     }

@jpountz
Copy link
Contributor

jpountz commented May 17, 2014

+1 to not reporting a term as live if it has a freq of 0.

@s1monw
Copy link
Contributor

s1monw commented May 18, 2014

I agree with @jpountz we should not report the term as live. If we do this we should also fix FreqTermsEnum to only cache the currentDocFreq and currentTotalTermFreq if we actually found the term. if it's not found it should manually put -1 in for those values as seekExact etc. doesn't guarantee to reset those values to -1

given that this is new I guess we should fix it for 1.2

@s1monw
Copy link
Contributor

s1monw commented May 18, 2014

I opened #6221 to fix this...

@dadoonet
Copy link
Member Author

Cool. Closing this one.

@dadoonet dadoonet closed this May 19, 2014
s1monw added a commit to s1monw/elasticsearch that referenced this pull request May 19, 2014
FilterableTermsEnum allows to filter stats by supplying per segment
bits. Today if all docs are filtered out the term is still reported as
live but shouldn't.

Relates to elastic#6211
s1monw added a commit that referenced this pull request May 19, 2014
FilterableTermsEnum allows to filter stats by supplying per segment
bits. Today if all docs are filtered out the term is still reported as
live but shouldn't.

Relates to #6211
@dadoonet dadoonet deleted the test/NPE_FreqTermsEnul branch June 10, 2015 09:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants