failed to send join request to master #25860

Closed
PKFresher opened this issue Jul 24, 2017 · 7 comments

@PKFresher

ES version 5.3.0
jvm:1.8.0_77
os:2.6.32-431.el6.x86_64 #1 SMP Fri Nov 22 03:15:09 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

When starting the cluster with a rolling start of the nodes, two nodes can't join the cluster.
The two nodes appear to join the cluster and then leave again, repeatedly.

The log:
[2017-07-24T20:15:30,076][INFO ][o.e.c.s.ClusterService ] [s-xiasha-10-2-34-52.hx-2] detected_master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}, reason: zen-disco-receive(from master [master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302} committed version [347391]])
[2017-07-24T20:15:33,085][INFO ][o.e.d.z.ZenDiscovery ] [s-xiasha-10-2-34-52.hx-2] master_left [{s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2017-07-24T20:15:33,085][WARN ][o.e.d.z.ZenDiscovery ] [s-xiasha-10-2-34-52.hx-2] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout), current nodes: nodes:
{s10-2-178-3.hx-2}{p6KGC9CxT5eCOs2AFOci-g}{kpR_wRjYS36lK0YgML5qgA}{10.2.178.3}{10.2.178.3:9302}
{s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}, master
{s-xiasha-10-2-34-58.hx-2}{i1WtXAFJSOeYS98x9C7r9w}{EEp3eLMeTSSuoljVW8hhvQ}{10.2.34.58}{10.2.34.58:9302}
{s10-2-25-18.hx-2}{T8Bo_B8KRamjAIVBn23KOA}{bxhEEUDGTqKPWl2y6KgQGQ}{10.2.25.18}{10.2.25.18:9302}
{s10-2-24-18.hx-2}{FjKi7-tjSl6kXOY8a3kzKw}{kw1WLJIERiisTtnwid6hig}{10.2.24.18}{10.2.24.18:9302}
{s10-2-22-10.hx-2}{VVauA39OQyiV2lacypxPXQ}{_p1BuxRGREe0pFCx5KehoA}{10.2.22.10}{10.2.22.10:9302}
{s-xiasha-10-2-34-52.hx-2}{lx3lrJZIQWWTuFG-95lGyA}{jtcEEhiJR6Sftmu4vpPIww}{10.2.34.52}{10.2.34.52:9302}, local
{s10-2-22-11.hx-2}{Ak_NpY33S-eEMO3pzBYj-A}{sSzP7RY7RRmbwrWPfqQ3bw}{10.2.22.11}{10.2.22.11:9302}
{s10-2-22-9.hx-2}{QbNCfIRTSl6pXNmBS3KV9g}{0Z_KGBRzSPmE4rlukZinrg}{10.2.22.9}{10.2.22.9:9302}

[2017-07-24T20:15:36,301][INFO ][o.e.c.s.ClusterService ] [s-xiasha-10-2-34-52.hx-2] detected_master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}, reason: zen-disco-receive(from master [master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302} committed version [347393]])
[2017-07-24T20:15:45,877][INFO ][o.e.d.z.ZenDiscovery ] [s-xiasha-10-2-34-52.hx-2] master_left [{s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}], reason [failed to ping, tried [3] times, each with maximum [30s] timeout]
[2017-07-24T20:15:45,877][WARN ][o.e.d.z.ZenDiscovery ] [s-xiasha-10-2-34-52.hx-2] master left (reason = failed to ping, tried [3] times, each with maximum [30s] timeout), current nodes: nodes:
{s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}, master
{s10-2-178-3.hx-2}{p6KGC9CxT5eCOs2AFOci-g}{kpR_wRjYS36lK0YgML5qgA}{10.2.178.3}{10.2.178.3:9302}
{s10-2-24-18.hx-2}{FjKi7-tjSl6kXOY8a3kzKw}{kw1WLJIERiisTtnwid6hig}{10.2.24.18}{10.2.24.18:9302}
{s-xiasha-10-2-34-58.hx-2}{i1WtXAFJSOeYS98x9C7r9w}{EEp3eLMeTSSuoljVW8hhvQ}{10.2.34.58}{10.2.34.58:9302}
{s-xiasha-10-2-34-52.hx-2}{lx3lrJZIQWWTuFG-95lGyA}{jtcEEhiJR6Sftmu4vpPIww}{10.2.34.52}{10.2.34.52:9302}, local
{s10-2-22-10.hx-2}{VVauA39OQyiV2lacypxPXQ}{_p1BuxRGREe0pFCx5KehoA}{10.2.22.10}{10.2.22.10:9302}
{s10-2-22-9.hx-2}{QbNCfIRTSl6pXNmBS3KV9g}{0Z_KGBRzSPmE4rlukZinrg}{10.2.22.9}{10.2.22.9:9302}
{s10-2-25-18.hx-2}{T8Bo_B8KRamjAIVBn23KOA}{bxhEEUDGTqKPWl2y6KgQGQ}{10.2.25.18}{10.2.25.18:9302}
{s10-2-22-11.hx-2}{Ak_NpY33S-eEMO3pzBYj-A}{sSzP7RY7RRmbwrWPfqQ3bw}{10.2.22.11}{10.2.22.11:9302}

[2017-07-24T20:15:51,881][INFO ][o.e.d.z.ZenDiscovery ] [s-xiasha-10-2-34-52.hx-2] failed to send join request to master [{s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}], reason [RemoteTransportException[[s10-2-178-2.hx-2][10.2.178.2:9302][internal:discovery/zen/join]]; nested: ConnectTransportException[[s-xiasha-10-2-34-52.hx-2][10.2.34.52:9302] connect_timeout[30s]]; nested: IOException[Connection timed out: 10.2.34.52/10.2.34.52:9302]; ]
[2017-07-24T20:15:55,096][INFO ][o.e.c.s.ClusterService ] [s-xiasha-10-2-34-52.hx-2] detected_master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302}, reason: zen-disco-receive(from master [master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302} committed version [347399]])
[2017-07-24T20:16:05,100][INFO ][o.e.c.s.ClusterService ] [s-xiasha-10-2-34-52.hx-2] added {{s-xiasha-10-2-34-36.hx-2}{UXZNJejWTjKpDywrKatdmw}{ztfSVyiYQaWM9bwXhC5HTQ}{10.2.34.36}{10.2.34.36:9302},}, reason: zen-disco-receive(from master [master {s10-2-178-2.hx-2}{9Qq5QHRUSH6vwl0d7qxtwQ}{O_2d5pofRP-VA_Q6jA6D_w}{10.2.178.2}{10.2.178.2:9302} committed version [347402]])
[2017-07-24T20:16:08,101][WARN ][o.e.c.NodeConnectionsService] [s-xiasha-10-2-34-52.hx-2] failed to connect to node {s-xiasha-10-2-34-36.hx-2}{UXZNJejWTjKpDywrKatdmw}{ztfSVyiYQaWM9bwXhC5HTQ}{10.2.34.36}{10.2.34.36:9302} (tried [1] times)
org.elasticsearch.transport.ConnectTransportException: [s-xiasha-10-2-34-36.hx-2][10.2.34.36:9302] connect_timeout[30s]
at org.elasticsearch.transport.netty4.Netty4Transport.connectToChannels(Netty4Transport.java:370) ~[?:?]
at org.elasticsearch.transport.TcpTransport.openConnection(TcpTransport.java:495) ~[elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.transport.TcpTransport.connectToNode(TcpTransport.java:460) ~[elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:314) ~[elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.transport.TransportService.connectToNode(TransportService.java:301) ~[elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.cluster.NodeConnectionsService.validateNodeConnected(NodeConnectionsService.java:121) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.cluster.NodeConnectionsService.connectToNodes(NodeConnectionsService.java:87) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:780) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:633) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.cluster.service.ClusterService$UpdateTask.run(ClusterService.java:1117) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:544) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:238) [elasticsearch-5.3.0.jar:5.3.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:201) [elasticsearch-5.3.0.jar:5.3.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_77]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_77]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_77]
Caused by: io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection timed out: 10.2.34.36/10.2.34.36:9302
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.8.0_77]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) ~[?:1.8.0_77]
at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:346) ~[?:?]
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:340) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:630) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:527) ~[?:?]
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:481) ~[?:?]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:441) ~[?:?]
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) ~[?:?]
... 1 more
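
The log above shows `connect_timeout[30s]` on the transport port 9302, both from the master back to 10.2.34.52 and from this node to 10.2.34.36. One way to check raw TCP reachability independently of Elasticsearch is a small probe like the following. This is only a sketch: the helper name `can_connect` and the 5-second default timeout are made up for illustration, and a successful connection shows only that the port is reachable, not that the node is healthy.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper for illustration; it only tests TCP reachability.
    """
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_connect("10.2.34.52", 9302)` on the master host, and `can_connect("10.2.178.2", 9302)` on the node that fails to join, would show whether the timeout occurs in one direction only, which might point at a firewall or routing problem rather than at Elasticsearch itself.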

@ywelsch
Contributor

ywelsch commented Jul 24, 2017

Can you provide the logs from the master node (s10-2-178-2.hx-2) as well? It looks like you have quite a number of cluster state updates on the master (committed version [347402]). Can you also provide the _cluster/pending_tasks output from the master?
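
For readers unsure how to collect that output: `_cluster/pending_tasks` is a plain HTTP GET against the node's REST port. The sketch below assumes the master's HTTP endpoint is something like `http://10.2.178.2:9200`; the thread only shows the transport port 9302, so the HTTP port is a guess, and the function names are made up for illustration.

```python
import json
import urllib.request


def fetch_pending_tasks(base_url, timeout=10.0):
    """GET <base_url>/_cluster/pending_tasks and return the parsed JSON body."""
    url = base_url.rstrip("/") + "/_cluster/pending_tasks"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize_pending_tasks(response):
    """Turn the pending-tasks response into short, human-readable lines."""
    tasks = response.get("tasks", [])
    lines = [f"{len(tasks)} pending task(s)"]
    for task in tasks:
        # Each entry carries a priority, a time in queue, and a source description
        lines.append(
            f'{task.get("priority")} {task.get("time_in_queue")} {task.get("source")}'
        )
    return lines
```

Something like `print("\n".join(summarize_pending_tasks(fetch_pending_tasks("http://10.2.178.2:9200"))))` would then give output suitable for pasting into the issue, assuming that address is correct for the cluster.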

@talevy added the v5.3.0 label Jul 24, 2017
@PKFresher
Author

Today I found the same issue reported in:
#22189
#24696

I used these settings:
transport.type: netty3
http.type: netty3
That solved the problem, but I don't know the reason.

@PKFresher
Author

The reason is explained in #22452.

@onehorsetown

For those of you from the future who ended up here like I did from Googling "failed to send join request to master", take a look at #21405. If you were a dummy like me and cloned your VM or copied an existing Elasticsearch installation from one machine to another, you will get this error along with the message "found existing node".

Simply delete the nodes folder on the duplicated machine (e.g. /var/lib/elasticsearch/nodes), and restart the cluster.
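
A cautious way to script that step might look like the sketch below. The function name `remove_nodes_dir` and the dry-run default are made up for illustration. Note that the nodes folder also holds any index data stored on that node, so this should only be run on the duplicated machine, with Elasticsearch stopped.

```python
import shutil
from pathlib import Path


def remove_nodes_dir(data_path, dry_run=True):
    """Delete the `nodes` folder under an Elasticsearch data path.

    Returns the path that was (or would be) removed, or None if it does not exist.
    With dry_run=True (the default) nothing is deleted.
    """
    nodes_dir = Path(data_path) / "nodes"
    if not nodes_dir.is_dir():
        return None
    if not dry_run:
        # Irreversible: removes the node identity and all shard data in this folder
        shutil.rmtree(nodes_dir)
    return nodes_dir
```

Calling `remove_nodes_dir("/var/lib/elasticsearch")` first reports what would be removed; passing `dry_run=False` performs the deletion. The data path differs between installations, so the one shown here is just the example from the comment above.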

@iShiBin

iShiBin commented Oct 21, 2018

@onehorsetown It works! Thanks a lot.

@amishra-ic

Superb, and thanks, it worked for me. I was quite helpless and stuck on the "failed to send join request" error, and your suggestion to delete the nodes folder helped me. Thank you so much.

--Abhay Mishra (Architect)

@JosanSun

JosanSun commented Jun 29, 2019

This helps.
For Windows, the nodes folder location is [ES-HOME]/bin/data/nodes. Try deleting this folder.

ES version: 6.4.1
