RemoteTransportException when trying to access :9200/_nodes #5357

Closed
daledude opened this issue Mar 6, 2014 · 4 comments · Fixed by #6486

daledude commented Mar 6, 2014

I have a 3-node ES cluster. I just upgraded from 0.90.11 to 1.0.1 and started experiencing these exceptions. When I run curl 'http://server:9200/_nodes?pretty=true' against any of my nodes, I get this exception in the ES logs:

[2014-03-06 03:52:23,848][DEBUG][action.admin.cluster.node.info] [logserver3-la] failed to execute on node [iPvGOBIQTuOV_YhNAmLAUg]
org.elasticsearch.transport.RemoteTransportException: Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]
Caused by: org.elasticsearch.transport.TransportSerializationException: Failed to deserialize response of type [org.elasticsearch.action.admin.cluster.node.info.NodeInfo]
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:148)
    at org.elasticsearch.transport.netty.MessageChannelHandler.messageReceived(MessageChannelHandler.java:125)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:296)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.unfoldAndFireMessageReceived(FrameDecoder.java:462)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:443)
    at org.elasticsearch.common.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
    at org.elasticsearch.common.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:70)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.elasticsearch.common.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:559)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:268)
    at org.elasticsearch.common.netty.channel.Channels.fireMessageReceived(Channels.java:255)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.elasticsearch.common.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.elasticsearch.common.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.elasticsearch.common.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.elasticsearch.common.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.lang.IndexOutOfBoundsException: Readable byte limit exceeded: 7711
    at org.elasticsearch.common.netty.buffer.AbstractChannelBuffer.readByte(AbstractChannelBuffer.java:236)
    at org.elasticsearch.transport.netty.ChannelBufferStreamInput.readByte(ChannelBufferStreamInput.java:132)
    at org.elasticsearch.common.io.stream.StreamInput.readString(StreamInput.java:276)
    at org.elasticsearch.common.io.stream.HandlesStreamInput.readString(HandlesStreamInput.java:61)
    at org.elasticsearch.threadpool.ThreadPool$Info.readFrom(ThreadPool.java:597)
    at org.elasticsearch.threadpool.ThreadPoolInfo.readFrom(ThreadPoolInfo.java:65)
    at org.elasticsearch.threadpool.ThreadPoolInfo.readThreadPoolInfo(ThreadPoolInfo.java:55)
    at org.elasticsearch.action.admin.cluster.node.info.NodeInfo.readFrom(NodeInfo.java:224)
    at org.elasticsearch.transport.netty.MessageChannelHandler.handleResponse(MessageChannelHandler.java:146)
    ... 23 more

I created a gist with the _nodes output that I do get:
https://gist.github.com/daledude/c6c0fb018d06d1e45a62

The exception in the logs is the same on all nodes. I'm using ES 1.0.1 and Java HotSpot(TM) 64-Bit Server VM 1.7.0_25 on all nodes.

This is my config, which is the same on all nodes except for the hosts, rack, and zone:

cluster.name: mycluster
node.name: "logserver1-chi"
node.rack: chi1
node.zone: chi
node.master: true
node.data: true

index.number_of_replicas: 0

# cluster discovery
discovery.zen.fd.ping_interval: 15s
discovery.zen.fd.ping_timeout: 60s
discovery.zen.fd.ping_retries: 5
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["logserver3-la.domain.com", "logserver2.domain.com"]
cluster.routing.allocation.awareness.attributes: zone

indices.memory.index_buffer_size: 20%
index.translog.flush_threshold_ops: 50000
indices.fielddata.cache.size: 30%
bootstrap.mlockall: true

threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: -1

threadpool.index.type: fixed
threadpool.index.size: 60
threadpool.index.queue_size: -1

action.disable_delete_all_indices: false
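
Note, with the benefit of the fix referenced at the end of this thread: the threadpool.*.queue_size: -1 settings above appear to be the trigger, since a negative ("unbounded") size is what fails to deserialize in the node info response. Assuming that, a possible workaround on affected 1.x versions is to drop those lines (falling back to the defaults) or to set explicit bounds, for example:

threadpool.search.queue_size: 1000
threadpool.index.queue_size: 200

(The values are only illustrative, not recommendations.)
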
spinscale self-assigned this Mar 7, 2014
spinscale (Contributor) commented:

hey,

just to understand what you tried above:

  • Getting the full nodes info does not work
  • Getting the nodes info per node (on the specific node itself) works, if I interpret your gist correctly

I would like to get more information about this issue. Can you try the following calls and tell me if any of these requests fail, and if so, which:

curl localhost:9200/_nodes/http
curl localhost:9200/_nodes/jvm
curl localhost:9200/_nodes/network
curl localhost:9200/_nodes/os
curl localhost:9200/_nodes/plugins
curl localhost:9200/_nodes/process
curl localhost:9200/_nodes/settings
curl localhost:9200/_nodes/thread_pool
curl localhost:9200/_nodes/transport

daledude (Author) commented:

Sorry for the delay. I have just been able to get back to this. None of the curl commands you gave fail.

The below does still fail with "Readable byte limit exceeded".

curl -XGET 'http://localhost:9200/_nodes'
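
Something that might help narrow it down further (assuming I'm remembering the 1.x nodes API correctly in that it accepts a comma-separated list of metrics): request all of the sections that worked individually in a single call, to see whether it is the combination rather than any one section that trips the serialization:

curl -XGET 'http://localhost:9200/_nodes/http,jvm,network,os,plugins,process,settings,thread_pool,transport'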

Using es-head and bigdesk, I get the error on the node I'm connected to if I try to select any other node.

What I've done since I opened this ticket:
*) Upgraded to JDK 1.7.0_60-b19 and still have the error.
*) Upgraded Elasticsearch to 1.2.0 and still had the error.
*) Saw the emergency update and upgraded to 1.2.1. Still had the error. Running "elasticsearch-fix-routing-1.0.jar" also produced the error:

java -jar elasticsearch-fix-routing-1.0.jar localhost 9300 myindex count

*) Took down all nodes and wiped out ES data directory. Started only 2 nodes completely fresh and still had the error.
*) Dumped all data using the community elasticdump tool.
*) Downgraded ES to 1.1.2 and re-imported all data using elasticdump. Still receive the error.

I'm using the same CentOS 6, JDK, and ES versions on all nodes. I have the latest es-head and bigdesk plugins installed on some nodes. I don't have any other transport agents on the network (I've taken ES down and sniffed the network for any 9200/9300 traffic; there was none). I've used lsof to make sure the processes are not loading any old, or other, files or versions.

I appreciate the assist. I must be missing something. Any further hints on how to troubleshoot this, however technical?

spinscale (Contributor) commented:

I think I found it - it is a duplicate of #6325. I will try to find a fix soon.

daledude commented Jul 9, 2014

Thanks spinscale. Will this make it in 1.3.0? Or maybe it's in 1.2.2?

spinscale added a commit to spinscale/elasticsearch that referenced this issue Jul 16, 2014
Because a SizeValue is used to serialize the thread pool size, a negative number
resulted in an exception when deserializing (with -ea enabled, an AssertionError
was thrown).

This fixes the check and changes the serialization logic so that negative numbers are read correctly, by adding an internal UNBOUNDED value.

Closes elastic#6325
Closes elastic#5357
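
For anyone following along, the change described above means an unbounded thread pool queue is no longer written to the wire as a raw negative size that the deserializer cannot handle. A minimal, self-contained sketch of that round-trip idea, using plain java.io streams and a hypothetical sentinel constant (this is not the actual Elasticsearch serialization code):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Illustrative sketch only: the wire format never carries a raw negative size.
// An "unbounded" queue (-1 in the settings) is mapped to a dedicated sentinel
// on write and mapped back to -1 on read.
public class QueueSizeWireFormat {

    // Hypothetical sentinel standing in for the internal UNBOUNDED value.
    private static final long UNBOUNDED_SENTINEL = Long.MAX_VALUE;

    static void writeQueueSize(DataOutputStream out, long queueSize) throws IOException {
        // Translate "unbounded" instead of writing a negative number to the stream.
        out.writeLong(queueSize < 0 ? UNBOUNDED_SENTINEL : queueSize);
    }

    static long readQueueSize(DataInputStream in) throws IOException {
        long value = in.readLong();
        return value == UNBOUNDED_SENTINEL ? -1L : value;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        writeQueueSize(new DataOutputStream(buffer), -1L); // an unbounded queue

        long roundTripped = readQueueSize(
                new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
        System.out.println(roundTripped); // prints -1: round-trips without an exception
    }
}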