Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZOOKEEPER-4203: Leader swallows the ZooKeeperServer.State.ERROR from Leader.LearnerCnxAcceptor in some concurrency condition #1596

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

functioner
Copy link

The fix of ZOOKEEPER-4203 is implemented. In my local machine, it is able to pass all test cases.

…Leader.LearnerCnxAcceptor in some concurrency condition
@functioner
Copy link
Author

I believe that in the CI server, multiple test cases fail due to the network connection binding issue, and the QuorumMainTest fails because the ZK snapshot is compromised by some test cases that have failed. Thus, I think this patch can be merge into master if the reviewers agree with my design.

@tisonkun
Copy link
Member

retest please

1 similar comment
@tisonkun
Copy link
Member

retest please

@eolivelli
Copy link
Contributor

@tisonkun I am sorry but the magic word does not work anymore.
Let me force a rerun manually

@eolivelli
Copy link
Contributor

I have restarted the github actions job, but the Jenkins job is too old.
I guess you have to rebase/merge with current master and push

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a part from CI,
I believe that this fix is not correct.

We have to deal with all of the accesses to the state field.
With this change you are only working on this method.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @functioner for sharing your work.

I hope we will get the fix soon and see it merged to master branch

synchronized (stateChangeMutex) {
if (this.state == State.ERROR) {
if (state == State.RUNNING || state == State.INITIAL) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should add a relevant log in order to see if this case is happening.

Creating a test that reproduces the problem and demonstrates that this fix actually resolves the problem would be better

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To add the test, do we need to consider using a fault injection tool? See ZOOKEEPER-3601 and #1135. I have provided Byteman injection script in ZOOKEEPER-4203 so it can be somehow translated into a test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with @eolivelli on adding a LOG.warn (if not LOG.error).

For the test, I would suggest mocking and/or overriding some methods to explicitly generate the fault, if possible. I haven't tried doing so, but perhaps LearnerTest.java (which subclasses LearnerZooKeeperServer) can serve as inspiration?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with @eolivelli on adding a LOG.warn (if not LOG.error).

Okay, I will add the log later.

For the test, I would suggest mocking and/or overriding some methods to explicitly generate the fault, if possible. I haven't tried doing so, but perhaps LearnerTest.java (which subclasses LearnerZooKeeperServer) can serve as inspiration?

Okay, I will take a try.
BTW, I think introducing a fault injection tool like #1135 may make it easier to write a test for it, and is helpful for the community. Another bug I propose (#1582) also require such tool to reproduce/test. I commented at #1135 but haven't got response. Do you have any suggestion or opinion? Thank you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @functioner, I am very interested in this problem, I have written a test case, I don't konw if it can be used to test this problem.Maybe you can use it as a reference.
TestCase
Byteman btm script

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want to use stateChangeMutex to handle concurrent access to state you have to use it at every read/write access.

Using AtomicReference is better and easier to understand.
I suggest to switch to AtomicReference and drop stateChangeMutex

@@ -189,9 +189,18 @@ public void dumpConf(PrintWriter pwriter) {
pwriter.print(self.getQuorumVerifier().toString());
}

private final Object stateChangeMutex = new Object();

@Override
protected void setState(State state) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a closer look I find that setState can be a single implementation in ZooKeeperServer. Just as is there. And you can apply this check as well as logging to verify as @eolivelli said.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @eolivelli that setState() has to be "careful," because ZooKeeperServerListenerImpl.notifyStopping calls it from "random" threads.

But @Tison's point still holds; it seems to me that this could be simplified by making setState() a synchronized method. Do you have a specific reason of using a separate object?

(The parent setState() does not need to be synchronized because the field is volatile.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But @Tison's point still holds; it seems to me that this could be simplified by making setState() a synchronized method. Do you have a specific reason of using a separate object?

(The parent setState() does not need to be synchronized because the field is volatile.)

@ztzg The rationale for not making setState() a synchronized method is that if a thread is running another synchronized method of this ZooKeeperServer object for some time, it may block the setState() invoked by another thread, and this invocation may be critical.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @eolivelli that setState() has to be "careful," because ZooKeeperServerListenerImpl.notifyStopping calls it from "random" threads.

Actually, the single-node ZooKeeperServer can handle this issue with the handler:

protected void setState(State state) {
this.state = state;
// Notify server state changes to the registered shutdown handler, if any.
if (zkShutdownHandler != null) {
zkShutdownHandler.handle(state);
} else {
LOG.debug(
"ZKShutdownHandler is not registered, so ZooKeeper server"
+ " won't take any action on ERROR or SHUTDOWN server state changes");
}
}

because it is registered at the very beginning:
zkServer.registerServerShutdownHandler(new ZooKeeperServerShutdownHandler(shutdownLatch));

This handling logic is:
public void handle(State state) {
if (state == State.ERROR || state == State.SHUTDOWN) {
shutdownLatch.countDown();
}
}

and then
// Watch status of ZooKeeper server. It will do a graceful shutdown
// if the server is not running or hits an internal error.
shutdownLatch.await();
shutdown();

However, in the case of QuorumZooKeeperServer, including Leader, Follower, etc., the aforementioned logic is not used.

Copy link
Contributor

@ztzg ztzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @functioner,

Thank you for the comprehensive report, and for the patch!

I could "manually" reproduce the problem, and can confirm that your proposed change prevents the system from getting stuck.

I have left a couple of comments on the current implementation.

One thing I am wondering (which was not introduced by your code) is why we catch and continue on SaslExceptions, but "crash" on other IOExceptions? Not a major point as the server will now recover, but perhaps we could just accept that accept can fail? @eolivelli, @symat?

@@ -189,9 +189,18 @@ public void dumpConf(PrintWriter pwriter) {
pwriter.print(self.getQuorumVerifier().toString());
}

private final Object stateChangeMutex = new Object();

@Override
protected void setState(State state) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @eolivelli that setState() has to be "careful," because ZooKeeperServerListenerImpl.notifyStopping calls it from "random" threads.

But @Tison's point still holds; it seems to me that this could be simplified by making setState() a synchronized method. Do you have a specific reason of using a separate object?

(The parent setState() does not need to be synchronized because the field is volatile.)

synchronized (stateChangeMutex) {
if (this.state == State.ERROR) {
if (state == State.RUNNING || state == State.INITIAL) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

100% agree with @eolivelli on adding a LOG.warn (if not LOG.error).

For the test, I would suggest mocking and/or overriding some methods to explicitly generate the fault, if possible. I haven't tried doing so, but perhaps LearnerTest.java (which subclasses LearnerZooKeeperServer) can serve as inspiration?

Copy link
Contributor

@ztzg ztzg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry; "Approved" by accident. See my comments.)

@functioner
Copy link
Author

functioner commented Mar 6, 2021

One thing I am wondering (which was not introduced by your code) is why we catch and continue on SaslExceptions, but "crash" on other IOExceptions? Not a major point as the server will now recover, but perhaps we could just accept that accept can fail? @eolivelli, @symat?

@ztzg I think it can be explained with:

} catch (SocketException e) {
error = true;
if (stop.get()) {
LOG.warn("Exception while shutting down acceptor.", e);
} else {
throw e;
}
} catch (SaslException e) {
LOG.error("Exception while connecting to quorum learner", e);
error = true;
} catch (Exception e) {
error = true;
throw e;
} finally {

In the case of SaslException, the exception is not thrown again, then it will not be caught by:
} catch (Exception e) {
LOG.warn("Exception while accepting follower", e);
if (fail.compareAndSet(false, true)) {
handleException(getName(), e);
halt();
}
} finally {

In this case, ZooKeeperServer.State.ERROR is not set, everything works well temporally, and the critical code

finishes, so, if there is any error later, it can be detect by:

Then, the quorum can handle it well.

However, if the IOException is not SaslException, it will be thrown again by (line 528 or 535):

} catch (SocketException e) {
error = true;
if (stop.get()) {
LOG.warn("Exception while shutting down acceptor.", e);
} else {
throw e;
}
} catch (SaslException e) {
LOG.error("Exception while connecting to quorum learner", e);
error = true;
} catch (Exception e) {
error = true;
throw e;
} finally {

In this case, the ZooKeeperServer.State.ERROR may be set before the critical code:

Then this error state is covered by the running state. The symptom described by this issue occurs.

@lanicc
Copy link
Contributor

lanicc commented Mar 21, 2021

I think the exception in Leader.LearnerCnxAcceptor should not be handled by zkServer, but should be handled internally by Leader.Because Acceptor and zkServer are not closely related, both Acceptor and zkServer are used as internal components of Leader, and Leader combines them to complete the function of leader.
When an exception occurs in Leader.LearnerCnxAcceptor,the error status should be fed back to Leader, that is, Leader.isRunning can find this error status.
My solution:

  • Leader.isRunning
    image
  • Make Leader.LearnerCnxAcceptor extends ZooKeeperThread
    image

Follower can do the same.

@functioner
Copy link
Author

Follower can do the same.

@lanicc, IMO, the root problem this issue exposes is that the exception caught by ZooKeeperCriticalThread is translated into the state change in zkServer. Even if you deal with Acceptor separately in Leader. There could be other similar issues in ZooKeeperCriticalThread, affecting Leader or Follower. Either way, the logic of state change needs improvement, either in zkServer, or in Leader/Follower as you proposed. Maybe the fix you provided here is reasonable, but I think it should be a separate PR, dealing with Acceptor. The current PR should focus on improving the state change logic.

What do you think? @eolivelli @ztzg

@functioner
Copy link
Author

I've added LOG.warn when this case is happening. You can take a look @eolivelli @ztzg
@lanicc Thanks for the test case you provide! It inspires me a lot.
I implemented a test case without using Byteman. It's basically overriding lots of methods; thanks @ztzg for the idea.
Anybody can approve the CI workflow?

@functioner
Copy link
Author

@maoling Sorry to bother you. But the other guys might be busy recently. Can you help review my patch & test case? And approve the CI workflow? Thank you!

synchronized (stateChangeMutex) {
if (this.state == State.ERROR) {
if (state == State.RUNNING || state == State.INITIAL) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you want to use stateChangeMutex to handle concurrent access to state you have to use it at every read/write access.

Using AtomicReference is better and easier to understand.
I suggest to switch to AtomicReference and drop stateChangeMutex

@functioner
Copy link
Author

functioner commented Dec 2, 2021

@eolivelli The state check & modification within setState method should be atomic as a whole, so AtomicReference is not enough (including the updateAndGet methods and so on). Now I use synchronized for the methods which read/write state.

@@ -457,7 +461,7 @@ public void run() {
CountDownLatch latch = new CountDownLatch(serverSockets.size());

serverSockets.forEach(serverSocket ->
executor.submit(new LearnerCnxAcceptorHandler(serverSocket, latch)));
executor.submit(createLearnerCnxAcceptorHandler(serverSocket, latch)));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FutureReturnValueIgnored: Return value of methods returning Future must be checked. Ignoring returned Futures suppresses exceptions thrown from the code that completes the Future. (details)
(at-me in a reply with help or ignore)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sonatype-lift ignore

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've recorded this as ignored for this pull request. If you change your mind, just comment @sonatype-lift unignore.

@functioner
Copy link
Author

@eolivelli @ztzg I'm following up on whether we will merge or improve this patch soon? Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants