RemoteTransportException when trying to access :9200/_nodes #5357

Closed
daledude opened this issue Mar 6, 2014 · 4 comments · Fixed by #6486

daledude commented Mar 6, 2014

I have a 3-node ES cluster. I just upgraded from 0.90.11 to 1.0.1 and started experiencing these exceptions. When I run curl 'http://server:9200/_nodes?pretty=true' against any of my nodes, I get this exception in the ES logs:

[2014-03-06 03:52:23,848][DEBUG][action.admin.cluster.node.info] [logserver3-la] failed to execute on node [iPvGOBIQTuOV_YhNAmLAUg]
org.elasticsearch.transport.RemoteTransportException: Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:148)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.IndexOutOfBoundsException: Readable byte limit exceeded: 7711
    at org.elasticsearch.common.netty.buffer.AbstractChannelBuffer.readByte(AbstractChannelBuffer.java:236)
    at org.elasticsearch.transport.netty.ChannelBufferStreamInput.readByte(ChannelBufferStreamInput.java:132)
    at org.elasticsearch.common.io.stream.StreamInput.readString(StreamInput.java:276)
    at org.elasticsearch.common.io.stream.HandlesStreamInput.readString(HandlesStreamInput.java:61)
    at org.elasticsearch.threadpool.ThreadPool$Info.readFrom(ThreadPool.java:597)
    at org.elasticsearch.threadpool.ThreadPoolInfo.readFrom(ThreadPoolInfo.java:65)
    at org.elasticsearch.threadpool.ThreadPoolInfo.readThreadPoolInfo(ThreadPoolInfo.java:55)
    at org.elasticsearch.action.admin.cluster.node.info.NodeInfo.readFrom(NodeInfo.java:224)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:146)
    ... 23 more

I created a gist with the _nodes output that I do get:
https://gist.github.com/daledude/c6c0fb018d06d1e45a62

The exception in the logs is the same on all nodes. I'm using ES 1.0.1 and Java HotSpot(TM) 64-Bit Server VM 1.7.0_25 on all nodes.

This is my config, which is the same on all nodes except for the hosts, rack, and zone:

cluster.name: mycluster
node.name: "logserver1-chi"
node.rack: chi1
node.zone: chi
node.master: true
node.data: true

index.number_of_replicas: 0

# cluster discovery
discovery.zen.fd.ping_interval: 15s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 5
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["logserver3-la.domain.com", "logserver2.domain.com"]
cluster.routing.allocation.awareness.attributes: zone

indices.memory.index_buffer_size: 20%
index.translog.flush_threshold_ops: 50000
indices.fielddata.cache.size: 30%
bootstrap.mlockall: true

threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: -1

threadpool.index.type: fixed
threadpool.index.size: 60
threadpool.index.queue_size: -1

action.disable_delete_all_indices: false
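
Note, with the benefit of the fix referenced at the end of this thread: the threadpool.*.queue_size: -1 settings above appear to be the trigger, since a negative ("unbounded") size is what fails to deserialize in the node info response. Assuming that, a possible workaround on affected 1.x versions is to drop those lines (falling back to the defaults) or to set explicit bounds, for example:

threadpool.search.queue_size: 1000
threadpool.index.queue_size: 200

(The values are only illustrative, not recommendations.)
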
spinscale self-assigned this Mar 7, 2014
spinscale (Contributor) commented:

hey,

just to understand what you tried above:

  • Getting the full nodes info does not work
  • Getting the nodes info per node (on the specific node itself) works, if I interpret your gist correctly

I would like to get more information about this issue. Can you try the following calls and tell me if any of these requests fail, and if so, which:

curl localhost:9200/_nodes/http
curl localhost:9200/_nodes/jvm
curl localhost:9200/_nodes/network
curl localhost:9200/_nodes/os
curl localhost:9200/_nodes/plugins
curl localhost:9200/_nodes/process
curl localhost:9200/_nodes/settings
curl localhost:9200/_nodes/thread_pool
curl localhost:9200/_nodes/transport

daledude (Author) commented:

Sorry for the delay. I have just been able to get back to this. None of the curl commands you gave fail.

The below does still fail with "Readable byte limit exceeded".

curl -XGET 'http://localhost:9200/_nodes'
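
Something that might help narrow it down further (assuming I'm remembering the 1.x nodes API correctly in that it accepts a comma-separated list of metrics): request all of the sections that worked individually in a single call, to see whether it is the combination rather than any one section that trips the serialization:

curl -XGET 'http://localhost:9200/_nodes/http,jvm,network,os,plugins,process,settings,thread_pool,transport'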

Using es-head and bigdesk, I get the error on the node I'm connected to if I try to select any other node.

What I've done since I opened this ticket:
*) Upgraded to JDK 1.7.0_60-b19 and still have the error.
*) Upgraded Elasticsearch to 1.2.0 and still had the error.
*) Saw the emergency update and upgraded to 1.2.1. Still had the error. Running "elasticsearch-fix-routing-1.0.jar" also produced the error:

java -jar elasticsearch-fix-routing-1.0.jar localhost 9300 myindex count

*) Took down all nodes and wiped out ES data directory. Started only 2 nodes completely fresh and still had the error.
*) Dumped all data using the community elasticdump tool.
*) Downgraded ES to 1.1.2 and re-imported all data using elasticdump. Still receive the error.

I'm using the same CentOS 6, JDK, and ES versions on all nodes. I have the latest es-head and bigdesk plugins installed on some nodes. I don't have any other transport agents on the network (I've taken ES down and sniffed the network for any 9200/9300 traffic; there was none). I've used lsof to make sure the processes are not loading any old, or other, files or versions.

I appreciate the assist. I must be missing something. Any further hints on how to troubleshoot this, however technical?

spinscale (Contributor) commented:

I think I found it - it is a duplicate of #6325. I will try to find a fix soon.

daledude commented Jul 9, 2014

Thanks spinscale. Will this make it in 1.3.0? Or maybe it's in 1.2.2?

spinscale added a commit to spinscale/elasticsearch that referenced this issue Jul 16, 2014
Because a SizeValue is used to serialize the thread pool size, a negative number
resulted in an exception when deserializing (with -ea enabled, an AssertionError
was thrown).

This fixes the check and changes the serialization logic so that negative numbers are read correctly, by adding an internal UNBOUNDED value.

Closes elastic#6325
Closes elastic#5357
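
For anyone following along, the change described above means an unbounded thread pool queue is no longer written to the wire as a raw negative size that the deserializer cannot handle. A minimal, self-contained sketch of that round-trip idea, using plain java.io streams and a hypothetical sentinel constant (this is not the actual Elasticsearch serialization code):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative sketch only: the wire format never carries a raw negative size.
// An "unbounded" queue (-1 in the settings) is mapped to a dedicated sentinel
// on write and mapped back to -1 on read.
public class QueueSizeWireFormat {

    // Hypothetical sentinel standing in for the internal UNBOUNDED value.
    private static final long UNBOUNDED_SENTINEL = Long.MAX_VALUE;

    static void writeQueueSize(DataOutputStream out, long queueSize) throws IOException {
        // Translate "unbounded" instead of writing a negative number to the stream.
        out.writeLong(queueSize < 0 ? UNBOUNDED_SENTINEL : queueSize);
    }

    static long readQueueSize(DataInputStream in) throws IOException {
        long value = in.readLong();
        return value == UNBOUNDED_SENTINEL ? -1L : value;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeQueueSize(new DataOutputStream(buffer), -1L); // an unbounded queue

        long roundTripped = readQueueSize(
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(roundTripped); // prints -1: round-trips without an exception
    }
}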