
ArrayIndexOutOfBoundsException #7926

Closed
lnxg33k opened this issue Sep 30, 2014 · 14 comments
Labels
>bug, :Search/Search (Search-related issues that do not fall into other categories)

Comments

@lnxg33k

lnxg33k commented Sep 30, 2014

I am using the latest version of Elasticsearch, and I get this error when I use scroll with a large size and scan as the search type:

{"error":"ArrayIndexOutOfBoundsException[-131072]","status":500}

though it works perfectly with small sizes.

For example:

[01:21:39] lnxg33k@ruined-sec ➜ ~: curl -XGET "http://localhost:9200/dns_logs/pico/_search?search_type=scan&scroll=1m" -d '{
                                   "query": { "match_all": {}},
                                   "size":  100000
                                   }'
{"_scroll_id":"c2Nhbjs1OzUxOjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NTM6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzs1Mjo4Zko2ODZBVVRueVpsbE90WXF4MmpnOzU0OjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NTU6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzsxO3RvdGFsX2hpdHM6NTIwNzY2ODg7","took":132,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":52076688,"max_score":0.0,"hits":[]}}
[01:21:50] lnxg33k@ruined-sec ➜ ~: curl -XGET "http://localhost:9200/_search/scroll?scroll=1m&scroll_id=c2Nhbjs1OzUxOjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NTM6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzs1Mjo4Zko2ODZBVVRueVpsbE90WXF4MmpnOzU0OjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NTU6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzsxO3RvdGFsX2hpdHM6NTIwNzY2ODg7" > xxx.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  262M  100  262M    0     0   129M      0  0:00:02  0:00:02 --:--:--  129M
[01:22:03] lnxg33k@ruined-sec ➜ ~: du -sh xxx.json 
263M    xxx.json
[01:22:07] lnxg33k@ruined-sec ➜ ~: curl -XGET "http://localhost:9200/dns_logs/pico/_search?search_type=scan&scroll=1m" -d '{
                                   "query": { "match_all": {}},
                                   "size":  1000000
                                   }'
{"_scroll_id":"c2Nhbjs1OzU2OjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NTc6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzs1ODo4Zko2ODZBVVRueVpsbE90WXF4MmpnOzU5OjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NjA6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzsxO3RvdGFsX2hpdHM6NTIwNzY2ODg7","took":128,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":52076688,"max_score":0.0,"hits":[]}}
[01:22:38] lnxg33k@ruined-sec ➜ ~: curl -XGET "http://localhost:9200/_search/scroll?scroll=1m&scroll_id=c2Nhbjs1OzU2OjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NTc6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzs1ODo4Zko2ODZBVVRueVpsbE90WXF4MmpnOzU5OjhmSjY4NkFVVG55WmxsT3RZcXgyamc7NjA6OGZKNjg2QVVUbnlabGxPdFlxeDJqZzsxO3RvdGFsX2hpdHM6NTIwNzY2ODg7"
{"error":"ArrayIndexOutOfBoundsException[null]","status":500}
@clintongormley

Hi @lnxg33k

Just to note: you shouldn't use such big sizes. The whole point of scrolling is that you can keep pulling smaller batches of results until you have enough.

That said, an unhandled exception like this is always a bug. I've tried replicating it with two shards and 300,000 documents, but it works fine for me.

Could you provide the stack trace from the logs so that we can investigate further?

thanks
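For reference, the batch-at-a-time pattern described above can be sketched as follows. This is a minimal illustration, not Elasticsearch code: `fetch_page` is a hypothetical stand-in for the HTTP round trip against `_search/scroll`, and the fake server below simply simulates pages of hits.

```python
# Sketch of the scroll pattern: pull small batches until the server
# returns an empty page, instead of asking for everything at once.
# `fetch_page` stands in for the HTTP round trip (hypothetical here).

def scroll_all(fetch_page, scroll_id):
    """Yield hits batch by batch until a page comes back empty."""
    while True:
        scroll_id, hits = fetch_page(scroll_id)
        if not hits:
            break
        yield from hits

# Fake server: 5 pages of 100 "documents" each, keyed by scroll id.
pages = {f"sid{i}": (f"sid{i+1}", [f"doc{i}-{j}" for j in range(100)])
         for i in range(5)}
pages["sid5"] = ("sid5", [])  # an empty page signals the end

def fake_fetch(scroll_id):
    return pages[scroll_id]

docs = list(scroll_all(fake_fetch, "sid0"))
print(len(docs))  # 500
```

The point is that the caller sees one continuous stream of documents regardless of the page size, so there is no benefit to a huge `size` parameter.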

@clintongormley

Hi @lnxg33k

Any chance of getting the stack trace please?

@lnxg33k
Author

lnxg33k commented Oct 28, 2014

@clintongormley I am sorry, but I couldn't reproduce it anymore and don't have the stack trace.

@clintongormley

OK, thanks @lnxg33k

I'll close this issue as we have been unable to replicate, but please feel free to reopen if you see it happen again.

@l15k4

l15k4 commented Jun 30, 2015

Here is a stack trace with an ArrayIndexOutOfBoundsException that occurs when scrolling; it started after upgrading to 1.6:

org.elasticsearch.transport.RemoteTransportException: [Book][inet[/172.31.13.26:9300]][indices:data/read/scroll]
Caused by: org.elasticsearch.action.search.ReduceSearchPhaseException: Failed to execute phase [fetch], [reduce] 
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction.finishHim(TransportSearchScrollScanAction.java:190) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction.access$800(TransportSearchScrollScanAction.java:71) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction$1.onResult(TransportSearchScrollScanAction.java:164) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction$1.onResult(TransportSearchScrollScanAction.java:159) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.search.action.SearchServiceTransportAction$22.handleResponse(SearchServiceTransportAction.java:533) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.search.action.SearchServiceTransportAction$22.handleResponse(SearchServiceTransportAction.java:524) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:163) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:132) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) ~[gwiq.jar:0.6-SNAPSHOT]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) ~[na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ~[na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_75]
Caused by: java.lang.ArrayIndexOutOfBoundsException: 70
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction.innerFinishHim(TransportSearchScrollScanAction.java:209) ~[gwiq.jar:0.6-SNAPSHOT]
    at org.elasticsearch.action.search.type.TransportSearchScrollScanAction$AsyncAction.finishHim(TransportSearchScrollScanAction.java:188) ~[gwiq.jar:0.6-SNAPSHOT]
    ... 29 common frames omitted

@l15k4

l15k4 commented Jul 1, 2015

I'm doing a lot of scrolls (with a sliding time constraint over the same indices) in a for loop. I'll try to clear the scroll context after each iteration to see if it helps...

@clintongormley

@l15k4 Also, are you using scan? And do you see any shard exceptions while scrolling?

@l15k4

l15k4 commented Jul 1, 2015

@clintongormley Yes, I'm doing scan with a range over 110 indices, page size 70, and have tried keepAlive values from 10s to 30s. When I didn't get an ArrayIndexOutOfBoundsException I got shardFailures instead, roughly 10-50 identical failures:

failure.index() == null
failure.shardId == -1
failure.reason == NodeDisconnectedException

@clintongormley

Hi @l15k4

Thanks for the info. I've asked @martijnvg to have a look at it when he has a moment. Any more info that you can provide to help us track it down would be useful. Also, why so many node disconnected exceptions? That seems weird. Do you see any exceptions on those nodes?

@l15k4

l15k4 commented Jul 1, 2015

IMHO it was all caused by leaving too many "15s" scroll contexts alive: I wasn't clearing them, and I was performing 8760 tiny scans sequentially in a for loop... After I deployed the application with the clear-scroll feature, it works like a charm...

Sorry but those logs were temporary, they are gone with the old docker container...
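The fix described above, clearing each scroll context as soon as its scan finishes rather than leaving thousands of short-lived contexts to expire on their own, can be sketched like this. The `FakeScrollClient` is a hypothetical stub standing in for an Elasticsearch client; only the try/finally cleanup pattern is the point.

```python
# Sketch of the fix: always clear the scroll context when a scan
# finishes, so thousands of short-lived contexts don't pile up on
# the server. `FakeScrollClient` is a stand-in for a real client.

class FakeScrollClient:
    def __init__(self):
        self.open_contexts = set()
        self._next = 0

    def start_scan(self):
        sid = f"ctx{self._next}"
        self._next += 1
        self.open_contexts.add(sid)
        return sid

    def clear_scroll(self, sid):
        self.open_contexts.discard(sid)

def run_scans(client, n):
    for _ in range(n):
        sid = client.start_scan()
        try:
            pass  # ... fetch pages with sid until exhausted ...
        finally:
            client.clear_scroll(sid)  # the crucial cleanup step

client = FakeScrollClient()
run_scans(client, 8760)         # one tiny scan per hour of the year
print(len(client.open_contexts))  # 0 — nothing left open
```

With the real HTTP API the cleanup step corresponds to deleting the scroll id via the clear-scroll endpoint instead of waiting for the keepAlive to expire.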

@martijnvg
Member

@l15k4 Did the errors occur while there were nodes of mixed versions in the cluster? Or were all nodes on the same version?

@l15k4

l15k4 commented Jul 1, 2015

@martijnvg At the time the error was thrown, all 4 nodes were on 1.6.0, but a week ago we ran the cluster as [1.6.0, 1.6.0, 1.6.0, 1.5.1] for 3 hours before we noticed it was stuck in yellow status... Could that affect the future well-being of the cluster?

@martijnvg
Member

@l15k4 No, but I don't recommend doing this for a long period of time. I'm not sure what the cause of the exception was here, but I think the code where the exception occurs can be rewritten so that an ArrayIndexOutOfBoundsException can never occur.
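A minimal sketch of that kind of defensive rewrite: validate the index before touching the array, so a stale counter at worst skips a result rather than throwing. The names here are purely illustrative, not the actual Elasticsearch code.

```python
# Defensive indexing: an out-of-range index returns a sentinel
# instead of raising ArrayIndexOutOfBoundsException-style errors.

def safe_get(items, index):
    """Return items[index], or None if the index is out of range."""
    if 0 <= index < len(items):
        return items[index]
    return None

shard_results = ["r0", "r1", "r2"]
print(safe_get(shard_results, 1))   # r1
print(safe_get(shard_results, 70))  # None (would have raised before)
```

The caller then has to decide how to handle the `None` case explicitly, which is exactly the design trade-off: the failure becomes visible in the logic instead of surfacing as a 500 error.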

@jpountz
Contributor

jpountz commented Aug 24, 2015

Fixed via #11978

@jpountz jpountz closed this as completed Aug 24, 2015
@clintongormley clintongormley added :Search/Search Search-related issues that do not fall into other categories and removed :Scroll labels Feb 14, 2018