Fails to shutdown elasticsearch on YARN #658

Closed
jzillmann opened this issue Jan 13, 2016 · 4 comments

@jzillmann

Hi there,

When stopping an Elasticsearch cluster on YARN (apache-2.6), the YARN app itself is killed but the Elasticsearch process is left running. It turns out that in ApplicationMaster#close() the rpc.finishAM(); call fails with:

6/01/13 11:59:18 WARN ipc.Client: Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1452523578157_0043_000001 not found in AMRMTokenSecretManager.
Exception in thread "main" org.elasticsearch.hadoop.yarn.am.EsYarnAmException: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1452523578157_0043_000001 not found in AMRMTokenSecretManager.
    at org.elasticsearch.hadoop.yarn.am.AppMasterRpc.unregisterAM(AppMasterRpc.java:73)
    at org.elasticsearch.hadoop.yarn.am.AppMasterRpc.finishAM(AppMasterRpc.java:66)
    at org.elasticsearch.hadoop.yarn.am.ApplicationMaster.close(ApplicationMaster.java:92)
    at org.elasticsearch.hadoop.yarn.am.ApplicationMaster.main(ApplicationMaster.java:106)
Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: appattempt_1452523578157_0043_000001 not found in AMRMTokenSecretManager.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
    at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:104)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:94)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy8.finishApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.unregisterApplicationMaster(AMRMClientImpl.java:378)
    at org.elasticsearch.hadoop.yarn.am.AppMasterRpc.unregisterAM(AppMasterRpc.java:71)
    ... 3 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): appattempt_1452523578157_0043_000001 not found in AMRMTokenSecretManager.
    at org.apache.hadoop.ipc.Client.call(Client.java:1468)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy7.finishApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.finishApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:91)
    ... 12 more

and thus cluster.close(); is never called.

@costin
Member

costin commented Jan 13, 2016

What version of ES-Hadoop/YARN are you using and what Hadoop distro?

@jzillmann
Author

It's elasticsearch-yarn-2.1.2 and apache-2.6.0.

costin added a commit that referenced this issue Jan 16, 2016
Pick up new Tokens used after registration
Document sys.prop option
Update requirements

relates #658
@costin
Member

costin commented Jan 16, 2016

Hi,

I've tested the YARN integration on a Hadoop 2.7.1 distro but couldn't reproduce the problem. However, it's a pseudo-configuration, so that might explain it. FWIW, note that there are several important bugs regarding tokens in 2.6, so upgrading to the latest 2.6 (2.6.2) or to 2.7.1 might fix some things.
I have also changed the code to make sure the cluster is stopped first, before updating the status to the AM. The token management has been improved as well, and hopefully that addresses your problem.
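
To make the ordering issue from the report concrete, here is a minimal, self-contained sketch. It is not the ES-Hadoop code: the `AmRpc` and `Cluster` interfaces, the method names `closeUnregisterFirst` / `closeClusterFirst`, and the simulated exception are all hypothetical stand-ins for `rpc.finishAM()` and `cluster.close()` as mentioned above. The use of `try`/`finally` is likewise an assumption about one reasonable way to keep the unregister attempt even if stopping the cluster fails.

```java
import java.util.ArrayList;
import java.util.List;

public class ShutdownOrderSketch {

    /** Hypothetical stand-in for the RPC that unregisters the AM from YARN. */
    interface AmRpc {
        void finishAM();
    }

    /** Hypothetical stand-in for the handle on the running Elasticsearch nodes. */
    interface Cluster {
        void close();
    }

    // Order described in the report: unregister first, then stop the cluster.
    static void closeUnregisterFirst(AmRpc rpc, Cluster cluster) {
        rpc.finishAM();   // if this throws, cluster.close() is never reached
        cluster.close();
    }

    // Order described in the fix: stop the cluster first, then report to YARN.
    static void closeClusterFirst(AmRpc rpc, Cluster cluster) {
        try {
            cluster.close();
        } finally {
            // Assumption: still try to unregister even if closing the cluster failed.
            rpc.finishAM();
        }
    }

    /** Runs one of the two orderings against an RPC stub that always fails. */
    public static List<String> run(boolean clusterFirst) {
        List<String> events = new ArrayList<>();
        AmRpc failingRpc = () -> {
            events.add("finishAM failed");
            // Simulates the InvalidToken failure from the stack trace.
            throw new IllegalStateException("InvalidToken (simulated)");
        };
        Cluster cluster = () -> events.add("cluster closed");
        try {
            if (clusterFirst) {
                closeClusterFirst(failingRpc, cluster);
            } else {
                closeUnregisterFirst(failingRpc, cluster);
            }
        } catch (IllegalStateException e) {
            events.add("exception propagated");
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println("unregister first: " + run(false));
        System.out.println("cluster first:    " + run(true));
    }
}
```

With the failing stub, the first ordering never records "cluster closed", while the second records it before the RPC failure propagates.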

The fix is in master and will be available in the next dev build. It will be ported to 2.1.3 over the weekend.

@costin costin closed this as completed Jan 16, 2016
@jzillmann
Author

That sounds awesome, thanks!
