Assertion failure when doing a significant terms aggregation. #7951

Closed
ovidiu opened this Issue Oct 1, 2014 · 2 comments

ovidiu commented Oct 1, 2014

Hello.

I'm running: Version: 1.3.0, Build: 1265b14/2014-07-23T13:46:36Z, JVM: 1.7.0_65

I'm trying to do a simple significant terms aggregation and I get an exception:

Command:

curl -s -XGET 'localhost:9200/test_index/job/_search?pretty' -d '{
    "query": {
        "filtered": {
            "filter": {
                "terms": {
                    "profession": [
                        "4980"
                    ]
                }
            }
        }
    },
    "size": 0,
    "aggs": {
        "term_cloud": {
            "significant_terms": {
                "field": "fulltext"
            }
        }
    }
}'

Response:

{
  "error" : "SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[w_0mi1Z-Tvm68G3dfr7rUg][test_index][2]: ElasticsearchIllegalArgumentException[supersetFreq > supersetSize, in JLHScore.score(..)]}{[w_0mi1Z-Tvm68G3dfr7rUg][test_index][3]: ElasticsearchIllegalArgumentException[supersetFreq > supersetSize, in JLHScore.score(..)]}{[w_0mi1Z-Tvm68G3dfr7rUg][test_index][4]: ElasticsearchIllegalArgumentException[supersetFreq > supersetSize, in JLHScore.score(..)]}{[w_0mi1Z-Tvm68G3dfr7rUg][test_index][0]: ElasticsearchIllegalArgumentException[supersetFreq > supersetSize, in JLHScore.score(..)]}{[w_0mi1Z-Tvm68G3dfr7rUg][test_index][1]: ElasticsearchIllegalArgumentException[supersetFreq > supersetSize, in JLHScore.score(..)]}]",
  "status" : 400
}

Console:

[2014-10-01 18:02:55,119][DEBUG][action.search.type       ] [Joystick] [test_index][2], node[w_0mi1Z-Tvm68G3dfr7rUg], [P], s[STARTED]: Failed to execute [org.elasticsearch.action.search.SearchRequest@4b58136a]
org.elasticsearch.ElasticsearchIllegalArgumentException: supersetFreq > supersetSize, in JLHScore.score(..)
        at org.elasticsearch.search.aggregations.bucket.significant.heuristics.JLHScore.getScore(JLHScore.java:79)
        at org.elasticsearch.search.aggregations.bucket.significant.InternalSignificantTerms$Bucket.updateScore(InternalSignificantTerms.java:80)
        at org.elasticsearch.search.aggregations.bucket.significant.GlobalOrdinalsSignificantTermsAggregator.buildAggregation(GlobalOrdinalsSignificantTermsAggregator.java:102)
        at org.elasticsearch.search.aggregations.bucket.significant.GlobalOrdinalsSignificantTermsAggregator.buildAggregation(GlobalOrdinalsSignificantTermsAggregator.java:41)
        at org.elasticsearch.search.aggregations.AggregationPhase.execute(AggregationPhase.java:133)
        at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:171)
        at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:261)
        at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:206)
        at org.elasticsearch.search.action.SearchServiceTransportAction$5.call(SearchServiceTransportAction.java:203)
        at org.elasticsearch.search.action.SearchServiceTransportAction$23.run(SearchServiceTransportAction.java:517)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

The mapping for the two fields involved:

"profession" : {"type" : "integer"},
"fulltext" : {"type" : "string"},
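
The exception text points at an input sanity check rather than at the scoring arithmetic itself. A minimal Python sketch of such a check follows. The function names `check_frequencies` and `is_valid` are hypothetical and this is not the Elasticsearch source, only an illustration of the condition named in the error message.

```python
def check_frequencies(subset_freq, subset_size, superset_freq, superset_size):
    """Reject count combinations that cannot describe real document sets.

    Hypothetical helper: mirrors the condition named in the error message,
    not the actual Elasticsearch implementation.
    """
    if min(subset_freq, subset_size, superset_freq, superset_size) < 0:
        raise ValueError("frequencies and sizes must be non-negative")
    if subset_freq > subset_size:
        raise ValueError("subsetFreq > subsetSize")
    if superset_freq > superset_size:
        # The condition reported on every shard in the response above.
        raise ValueError("supersetFreq > supersetSize")


def is_valid(*counts):
    """Return True when the counts pass the check, False otherwise."""
    try:
        check_frequencies(*counts)
    except ValueError:
        return False
    return True
```

A term that appears in more background docs than the background is said to contain, for example `is_valid(5, 10, 101, 100)`, fails the check.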

markharwood self-assigned this Oct 1, 2014

markharwood commented Oct 1, 2014

Thanks for reporting this. I have reproduced the failure and am working on the fix.

markharwood commented Oct 1, 2014

@brwe I wouldn't mind discussing this. One fix for the above is pretty simple: use IndexReader.maxDoc, which includes deleted docs, as the background superset doc count. However, it introduces a lot of test failures in your SignificantTermsSignificanceScoreTests class, which is very sensitive to score changes caused by the randomized testing framework's habit of deleting docs and then re-inserting them.
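
To see why counting deleted docs shifts the expected scores, here is a rough sketch of the JLH heuristic as described in the Elasticsearch documentation (absolute change in term probability multiplied by relative change). It is a simplification, not the actual `JLHScore` source. The function name and all counts are made up for illustration, and only the background size differs between the two calls.

```python
def jlh_score(subset_freq, subset_size, superset_freq, superset_size):
    """Simplified JLH: (absolute change) * (relative change) in term probability."""
    if subset_size == 0 or superset_size == 0 or superset_freq == 0:
        return 0.0
    subset_p = subset_freq / subset_size        # term rate in the query results
    superset_p = superset_freq / superset_size  # term rate in the background
    absolute_change = subset_p - superset_p
    if absolute_change <= 0:
        return 0.0  # term is not over-represented in the results
    return absolute_change * (subset_p / superset_p)


# Hypothetical shard: 100 live docs plus 20 deleted docs not yet merged away.
live_docs, deleted_docs = 100, 20

# Background size = live docs only.
score_live = jlh_score(8, 10, 40, live_docs)
# Background size = live docs plus deleted docs (maxDoc).
score_max_doc = jlh_score(8, 10, 40, live_docs + deleted_docs)
```

With these numbers the score moves from roughly 0.8 to roughly 1.12. Because the number of deleted docs depends on what the randomized framework happened to do, a test that expects one exact score would no longer be stable.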

markharwood added a commit that referenced this issue Oct 3, 2014

Aggs fix - background count for docs should include deleted docs, otherwise a term’s docFreq (which includes deleted docs) can exceed the number of docs reported in the index and cause an exception.

The randomisation that deletes documents is also removed from tests as this doc-accounting change would mean the specific scores being expected in tests would now be subject to random variability and so fail.

Closes #7951
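
The mismatch this commit message describes can be modeled in a few lines. The sketch below is a toy model, not Lucene code: the `ToyIndex` class and its methods are hypothetical, and it assumes (as the message states) that a term's docFreq still counts deleted docs while the live doc count does not.

```python
class ToyIndex:
    """Toy stand-in for a single index segment (hypothetical, not Lucene)."""

    def __init__(self):
        self.docs = []        # one set of terms per doc slot, deleted or not
        self.deleted = set()  # ids of slots marked as deleted

    def add(self, terms):
        self.docs.append(set(terms))
        return len(self.docs) - 1  # doc id of the new slot

    def delete(self, doc_id):
        # Deletion only marks the slot; the term postings stay in place.
        self.deleted.add(doc_id)

    def num_docs(self):
        return len(self.docs) - len(self.deleted)  # live docs only

    def max_doc(self):
        return len(self.docs)  # live docs plus deleted docs

    def doc_freq(self, term):
        # Counts every slot containing the term, including deleted ones.
        return sum(1 for terms in self.docs if term in terms)


index = ToyIndex()
ids = [index.add(["engineer"]) for _ in range(3)]
index.delete(ids[0])
index.add(["engineer"])  # re-insert the deleted doc as a new slot

# docFreq (4) exceeds the live count (3) but never exceeds maxDoc (4).
assert index.doc_freq("engineer") > index.num_docs()
assert index.doc_freq("engineer") <= index.max_doc()
```

Using the `max_doc` style count as the background size keeps the term frequency within bounds, at the cost of scores that vary with the number of deleted docs.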

mute pushed a commit to mute/elasticsearch that referenced this issue Jul 29, 2015

Aggs fix - background count for docs should include deleted docs, otherwise a term’s docFreq (which includes deleted docs) can exceed the number of docs reported in the index and cause an exception.

The randomisation that deletes documents is also removed from tests as this doc-accounting change would mean the specific scores being expected in tests would now be subject to random variability and so fail.

Closes elastic#7951
