IndexOutOfBoundsException after update from 6.2 to 6.4 #34555

atanasovdragan · 2018-10-17T13:35:44Z

Hi,

I asked the following question at your discussion forum and it turned out that this might be an issue, so I was advised to open an issue here.

Elasticsearch version (bin/elasticsearch --version):
6.4.2

JVM version (java -version):
1.8.0_171

OS version (uname -a if on a Unix-like system):
Ubuntu 18.04 LTS

Description of the problem:

I have a large multi-search query that contains fairly complicated aggregations. I recently upgraded Elasticsearch from 6.2 to 6.4, and when this query is executed through the official PHP package, an error is written to my log file. It looks like this:

full_stack_trace.txt

My multi-search request contains 4 queries, and only 2 of them fail (the two similar ones; one of them is pasted below). The error is logged every time the "option group" / "option" top hits aggregation is executed.

query.txt

It is important to mention that this does not happen when I execute the query via Kibana; there I receive correct results. On version 6.2 this was not an issue at all.

I noticed that the problem appears in the last two top_hits aggregations. If I remove them, I receive correct results.

I had 1 node and 2 shards on my local machine when the problem actually appeared. When I increased the number of shards to 3 or 5, the problem disappeared.

I checked the two indexes I am querying in the Kibana monitoring section and noticed something strange:

name: index1
status: green
document count: 550
data: 211.4 KB
index rate: 0 /s
search rate: 0.01 /s
unassigned shards: 0

This is strange because index1 really contains only 28 indexed documents, so I am confused about how these numbers are counted and what they actually represent.

I have a workaround for this problem: increasing the number of shards. But it is strange to allocate 5 shards just to make those aggregations work on two indexes that hold 28 and 112 documents.

Thanks


elasticmachine · 2018-10-17T13:40:12Z

Pinging @elastic/es-search-aggs

romseygeek · 2018-10-18T09:38:20Z

For reference, the forum discussion is here: https://discuss.elastic.co/t/indexoutofboundsexception-after-update-from-6-2-to-6-4/152670

This is the relevant bit of the stack trace:

Caused by: java.lang.IndexOutOfBoundsException
    at java.nio.Buffer.checkIndex(Buffer.java:540) ~[?:1.8.0_171]
    at java.nio.DirectByteBuffer.get(DirectByteBuffer.java:253) ~[?:1.8.0_171]
    at org.apache.lucene.store.ByteBufferGuard.getByte(ByteBufferGuard.java:118) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
    at org.apache.lucene.store.ByteBufferIndexInput$SingleBufferImpl.readByte(ByteBufferIndexInput.java:385) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
    at org.apache.lucene.codecs.lucene70.Lucene70NormsProducer$7.longValue(Lucene70NormsProducer.java:263) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]
    at org.apache.lucene.search.similarities.BM25Similarity$BM25DocScorer.score(BM25Similarity.java:257) ~[lucene-core-7.4.0.jar:7.4.0 9060ac689c270b02143f375de0348b7f626adebc - jpountz - 2018-06-18 16:51:45]

... which is a bit scary, because it looks like a bug decoding norms in Lucene.

Does this happen even after a reindex? And if so, are you able to share the mappings and documents for this index so that we can reproduce this?

jimczi · 2018-10-18T11:53:28Z

The bug is in Elasticsearch: the nested aggregator buffers documents, and this confuses the scorer that is used by the top_hits aggregator after it.
Can you try changing the sort of your top_hits aggregation to:

"top_hits": {
   "sort": "_doc",
   "size": 1
}

This should fix the issue you're seeing. By default, the top_hits aggregator uses the score of the query to sort the documents. However, in a nested context the score is always the score of the root document.
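To show where the suggested sort sits in a complete request body, here is a minimal Python sketch. The original query.txt is not reproduced in this thread, so the nested path `option_groups`, the `category` field, and the aggregation names are hypothetical placeholders; only the `top_hits` fragment with `"sort": "_doc"` and `"size": 1` comes from the comment above. The enclosing terms aggregation is also an assumption, added because the failure was reported for a nested aggregation placed under another aggregation.

```python
import json


def top_hits_by_doc_order(size=1):
    """Build a top_hits aggregation sorted by index order instead of _score."""
    return {"top_hits": {"sort": "_doc", "size": size}}


def nested_with_top_hits(path, agg_name="first_hit", size=1):
    """Wrap the top_hits aggregation in a nested aggregation on `path`."""
    return {
        "nested": {"path": path},
        "aggs": {agg_name: top_hits_by_doc_order(size)},
    }


# Hypothetical names: "by_category", "category", "option_groups", "first_hit".
body = {
    "size": 0,
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},
            "aggs": {"option_groups": nested_with_top_hits("option_groups")},
        }
    },
}

# Serialize the body as it would be sent in one search of the multi-search request.
print(json.dumps(body, indent=2))
```

Because the sort no longer depends on `_score`, the top_hits aggregator should not need to ask the scorer for a value while collecting nested documents.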

atanasovdragan · 2018-10-18T14:18:18Z

@jimczi Thanks, it works! However it is still a bug and I guess this is only a workaround. I wonder why this was not an issue on version 6.2?

Thank you very much! It was a real headache. 😃

jimczi · 2018-10-18T16:48:56Z

However it is still a bug and I guess this is only a workaround.

Yes, this is why I left the issue open. We need to fix the nested aggregator for the case where scores are required in a sub-aggregation.

The nested agg can defer the collection of children if it is nested under another aggregation. In such a case, accessing the score in the children aggregation throws an error because the scorer has already advanced to the next parent. This change fixes this error by caching the score of the parent in the nested aggregation. Children aggregations that work on nested documents will be able to access the _score. Also note that the _score in this case is always the parent's score; there is no way to retrieve the score of nested docs in aggregations. Closes elastic#35985 Closes elastic#34555
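The mechanism described in this commit message can be illustrated with a toy model in Python. This is not Elasticsearch or Lucene code: `ForwardOnlyScorer` and `collect_deferred` are hypothetical names invented for the sketch, and a plain `IndexError` stands in for the `IndexOutOfBoundsException` from the stack trace. The sketch only shows why reading a score after the scorer has moved on fails, and why caching the parent's score at buffering time avoids that.

```python
class ForwardOnlyScorer:
    """Toy scorer: it can only score the document it is currently positioned on."""

    def __init__(self, scores):
        self._scores = scores  # parent doc id -> score
        self._current = None

    def advance(self, doc_id):
        self._current = doc_id

    def score(self, doc_id):
        if doc_id != self._current:
            raise IndexError(f"scorer is on doc {self._current}, not {doc_id}")
        return self._scores[doc_id]


def collect_deferred(scorer, parents, cache_scores):
    """Buffer each parent and process it only after the scorer has moved on."""
    results = []
    pending = None  # (parent id, cached score or None)

    def flush():
        parent, cached = pending
        # Without a cached value we must ask the scorer, which may have advanced.
        score = cached if cached is not None else scorer.score(parent)
        results.append((parent, score))

    for parent in parents:
        scorer.advance(parent)
        if pending is not None:
            flush()  # the previous parent is processed late (deferred)
        pending = (parent, scorer.score(parent) if cache_scores else None)
    if pending is not None:
        flush()
    return results


scores = {0: 1.5, 1: 0.5}

# With caching, every buffered parent keeps the score it had when it was seen.
fixed = collect_deferred(ForwardOnlyScorer(scores), [0, 1], cache_scores=True)

# Without caching, the deferred read hits a scorer that is already on doc 1.
try:
    collect_deferred(ForwardOnlyScorer(scores), [0, 1], cache_scores=False)
    failed = False
except IndexError:
    failed = True
```

In this model the cached value is always the parent's score, which matches the note above that nested documents have no score of their own in aggregations.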


nik9000 added the :Search/Search Search-related issues that do not fall into other categories label Oct 17, 2018

romseygeek self-assigned this Oct 18, 2018

romseygeek added the >bug label Oct 18, 2018

jimczi mentioned this issue Nov 28, 2018

Cache the score of the parent document in the nested agg #36019

Merged

jimczi closed this as completed in #36019 Nov 29, 2018

nemphys mentioned this issue Jun 11, 2021

Simple combined_fields query fails with java.lang.IndexOutOfBoundsException #74037

Closed
