Elasticsearch service hangs in stopping status with ReadonlyREST plugin #265

Closed
askids opened this issue Sep 4, 2017 · 29 comments

Comments

@askids

askids commented Sep 4, 2017

Hi,

I wasn't sure whether this issue would be looked at by Elastic or ROR, so I'm adding the issue link here as well. Details are available at the link below.

#26483

Thanks!

@askids
Author

askids commented Sep 4, 2017

Just as I expected, Elastic has asked to track it under ROR only and closed the issue. Steps to reproduce the issue are available at the original link.

@askids
Author

askids commented Sep 11, 2017

@sscarduzio is this expected behaviour for the free version of ROR? I know that security changes need a cluster restart, and I assumed that means restarting the ES service and NOT restarting the machine itself.

Can anyone else confirm whether they face a similar issue on Windows?

@sscarduzio
Owner

No, when we say "restart node" we always mean restart the process.

About your issue, I know very little about Windows, but I suppose the service restarter waits for the PID to disappear from the process list before launching the new process.
That means the Java process of ES is still there, hanging.

  • Does it respond to queries when it's in this state?
  • Can you take a thread dump of the java process while in this state?
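
To make the point about a lingering PID concrete: a JVM exits on its own only once every non-daemon thread has finished, so a single worker that is never stopped keeps the process in the process list. The sketch below is only an illustration, not ROR or Elasticsearch code; the class name `NonDaemonThreads` and the thread name `leaked-plugin-worker` are made up for the example. It lists the live non-daemon threads, which are the ones to look for in a thread dump of a process that refuses to exit.

```java
import java.util.ArrayList;
import java.util.List;

public class NonDaemonThreads {

    /** Returns the names of live threads that are not daemon threads. */
    public static List<String> liveNonDaemonThreadNames() {
        List<String> names = new ArrayList<>();
        // getAllStackTraces() gives a snapshot of all live threads in this JVM.
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && !t.isDaemon()) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical leaked worker, standing in for a thread that was never stopped.
        Thread leaked = new Thread(() -> {
            try {
                Thread.sleep(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "leaked-plugin-worker");
        leaked.start();

        System.out.println("Non-daemon threads: " + liveNonDaemonThreadNames());
        // main returns here, but the process only exits once the worker finishes.
    }
}
```

Marking such a worker as a daemon thread, or stopping it explicitly during shutdown, would let the process exit as soon as the remaining non-daemon threads are done.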

@askids
Author

askids commented Sep 13, 2017

Ok. Let me check and get back to you on both those items.

@askids askids closed this as completed Sep 13, 2017
@askids askids reopened this Sep 13, 2017
@askids
Author

askids commented Sep 13, 2017

dump.txt

The HTTP endpoint does not work once I get the error. Even the log shows the node as closed (as a normal shutdown would). I have also attached the jstack thread dump.

[2017-09-13T01:24:31,250][INFO ][o.e.n.Node               ] [ESURPPOC2-node1] stopping ...
[2017-09-13T01:24:31,581][INFO ][o.e.x.m.j.p.l.CppLogMessageHandler] [controller/6252] [Main.cc@168] Ml controller exiting
[2017-09-13T01:24:31,586][INFO ][o.e.x.m.j.p.NativeController] Native controller process has stopped - no new native processes can be started
[2017-09-13T01:24:31,860][INFO ][o.e.n.Node               ] [ESURPPOC2-node1] stopped
[2017-09-13T01:24:31,861][INFO ][o.e.n.Node               ] [ESURPPOC2-node1] closing ...
[2017-09-13T01:24:31,926][INFO ][o.e.n.Node               ] [ESURPPOC2-node1] closed

@viceice

viceice commented Oct 19, 2017

Same problem here:

  • Windows Server 2008 R2 SP1
  • Java 8u131
  • ES 5.4.3
  • ROR 1.6.11

@sscarduzio
Owner

Very difficult for me to test this. I hope the OSS community will help!

@ld57

ld57 commented Nov 14, 2017

Okay guys, let's try to get rid of this!

I will investigate using retroactive version.

Kr

fred

@ld57

ld57 commented Nov 20, 2017

Hi guys,

I tested a released version, 1.16.14_pre3, on my ES 2.4.5, and @sscarduzio fixed it. :)

@askids
Author

askids commented Nov 21, 2017

Can I get a download link for this new version? I can test it on my end on v5.5.1.

@sscarduzio
Owner

I have released a new official version, get it on the website!

@shlomi-toren-sp

shlomi-toren-sp commented Jan 16, 2018

I'm using the newest version (readonlyrest-1.16.14_es5.1.1) but still getting this problem myself. Is there a fix?
@sscarduzio

@sscarduzio
Owner

@shlomi-toren-sp will have a look, what versions are you guys using at SP? Just 5.1.1?

@shlomi-toren-sp

Yes. ROR is the only plugin we're using.

@sscarduzio
Owner

Yes, but are you only using ES 5.1.1? Or are you going to use newer versions of Elasticsearch as well?

@shlomi-toren-sp

shlomi-toren-sp commented Jan 17, 2018 via email

@sscarduzio
Owner

@shlomi-toren-sp I've been trying this with 5.1.x. Do you see this in the logs after issuing the shutdown?

[2018-01-17T19:26:56,403][INFO ][t.b.r.c.s.e.ESShutdownObservable] Shutting down ROR resources...

@shlomi-toren-sp

shlomi-toren-sp commented Jan 18, 2018 via email

@sscarduzio
Owner

@shlomi-toren-sp can you get a thread dump please? I'd like to see what threads remain hanging.

@shlomi-toren-sp

shlomi-toren-sp commented Jan 29, 2018 via email

@sscarduzio
Owner

sscarduzio commented Jan 30, 2018

@shlomi-toren-sp you passed me a file of 114 MB, are you sure you didn't send me the heap dump? I requested the thread dump:

$ kill -3 `jps |grep lasticsearch |awk '{print $1}'`

Oh wait, you are on Windows. I have no idea, but probably using jstack will work.
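
For reference, a thread dump is a short text listing of every thread's name, state and stack trace, so it is normally far smaller than a heap dump. The JDK's `jstack <pid>` tool produces one from outside the process. As a rough illustration of what such a dump contains, the sketch below builds a similar listing from inside a JVM using the standard `java.lang.management` API; the class name `ThreadDumpSketch` is made up for the example, and the output format is simplified compared to real jstack output.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    /** Builds a simplified, jstack-like text dump of all live threads in this JVM. */
    public static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: do not collect locked monitor / synchronizer details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

For a process that hangs on shutdown, the threads still shown as `RUNNABLE`, `WAITING` or `TIMED_WAITING` after the node logs `closed` are the ones worth inspecting.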

@shlomi-toren-sp

shlomi-toren-sp commented Jan 30, 2018 via email

@sscarduzio
Owner

@shlomi-toren-sp if you don't have jstack, you can use JVisualVM (super useful piece of software).

screen shot 2018-01-30 at 10 41 59

@shlomi-toren-sp

shlomi-toren-sp commented Feb 4, 2018 via email

@shlomi-toren-sp

shlomi-toren-sp commented Feb 4, 2018 via email

@shlomi-toren-sp

shlomi-toren-sp commented Feb 6, 2018 via email

@shlomi-toren-sp

shlomi-toren-sp commented Feb 6, 2018 via email

@sscarduzio
Owner

OK this is fixed, thanks Shlomi for reporting and testing!
