Raise node disconnected even if the transport is stopped #5918
During the stop process we raise network disconnects, so it is valid to raise them while we are in stop mode; in fact, we should not miss any events in that case. Typically this is not a problem, since it happens during the normal shutdown of the JVM, but when running a reused cluster within the JVM (like in our test infra with the shared cluster), we should properly raise those node disconnects. Closes elastic#5918
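To illustrate the behaviour described above, here is a minimal sketch, not the actual Elasticsearch transport code. `SketchTransport`, `DisconnectListener`, and every method name are hypothetical and exist only for illustration. The point is simply that the disconnect notification is not guarded by a "started" check, so listeners still hear about nodes that go away during `stop()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical listener interface (assumption, not the real API).
interface DisconnectListener {
    void onNodeDisconnected(String nodeId);
}

// Hypothetical, heavily simplified transport used only for this sketch.
class SketchTransport {
    private final Map<String, Boolean> connectedNodes = new ConcurrentHashMap<>();
    private final List<DisconnectListener> listeners = new CopyOnWriteArrayList<>();
    private volatile boolean started = true;

    void addListener(DisconnectListener listener) {
        listeners.add(listener);
    }

    void connect(String nodeId) {
        connectedNodes.put(nodeId, Boolean.TRUE);
    }

    boolean isStarted() {
        return started;
    }

    // Deliberately no "if (!started) return;" guard here: disconnects that
    // happen while stopping are still delivered to listeners.
    void raiseNodeDisconnected(String nodeId) {
        for (DisconnectListener listener : listeners) {
            listener.onNodeDisconnected(nodeId);
        }
    }

    void stop() {
        started = false;
        // Copy the keys so we can remove entries while iterating.
        for (String nodeId : new ArrayList<>(connectedNodes.keySet())) {
            if (connectedNodes.remove(nodeId) != null) {
                raiseNodeDisconnected(nodeId);
            }
        }
    }
}
```

With a guard on `started`, a cluster that is reused inside one JVM would never see these events, which is the situation the description refers to.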
So I worked on this a bit, and I have a test as well as a fix for the test here: s1monw@19af66b. I didn't work on it further since we spoke, but the fix you have didn't fix my test. Maybe you can investigate based on that? Feel free to take my test as well.
one symptom of this is a test failure like this:
just for the record...
I was testing on the ... One additional thing: I am wondering if we should wait for an "acceptable" amount of time to have all the callbacks raised before we exit the stop, or the node disconnect callback (when we are in stop mode)? This would ensure the service has hopefully stopped properly, and log if it didn't stop within a timeout. Btw, we do that anyhow in our ThreadPool class, so it might be enough there.
Yeah, I think it will be OK in the thread pool cases, IMO. But I wonder if we should have more tests trigger this stuff. I need to think about it, but I'd like to assert that the grace period is enough in the thread pools.
@s1monw aye! That would be great to assert on in our test infra: that the ThreadPool always exits within the timeout, and if not, it's something that needs fixing... So are we good with this change then, and with letting the thread pool handle the graceful shutdown period?
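For reference, the kind of check being discussed could look roughly like the sketch below. It uses only the standard `java.util.concurrent` API and is not the Elasticsearch ThreadPool class; `GracefulStop` and `stopAndAwait` are hypothetical names, and the timeout value is up to the caller.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for this sketch only.
class GracefulStop {
    // Returns true if the pool terminated within the grace period.
    static boolean stopAndAwait(ExecutorService pool, long timeout, TimeUnit unit)
            throws InterruptedException {
        // Stop accepting new tasks; already submitted tasks still run.
        pool.shutdown();
        boolean terminated = pool.awaitTermination(timeout, unit);
        if (!terminated) {
            // Log instead of failing, so production shutdown is not blocked.
            System.err.println("thread pool did not terminate within " + timeout + " " + unit);
        }
        return terminated;
    }
}
```

A test harness could assert that the returned value is `true`, while production code would only log when it is `false`.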
I will pull this in, and we can open a separate issue for the thread pool assertion.