Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RATIS-1446. Avoid leader election for invalid conf #560

Merged
merged 6 commits into from Dec 15, 2021

Conversation

Xushaohong
Copy link
Contributor

@Xushaohong Xushaohong commented Dec 9, 2021

What changes were proposed in this pull request?

Candidate shall not start leader election in these cases in case of possible NPE caused by conf.getPeer().getPriority().
With this patch, the candidate will become a follower again when receiving the future appendEntries request.

Phenomenon:
The bootstrapped follower becomes the leader after the election timeout somehow, with an empty raft conf. Then in the process of yieldLeaderToHigherPriorityPeer, NPE happens as (null).getPriority().
Setconfiguration action takes place at the end of appendEntriesAsync. If we make it possible to raise electiontimeout before this action, we can replay the NPE case.

1private void yieldLeaderToHigherPriorityPeer() {
2  if (!server.getInfo().isLeader()) {
3    return;
4  }
5  final RaftConfigurationImpl conf = server.getRaftConf();
6  int leaderPriority = conf.getPeer(server.getId()).getPriority(); 

Possible reason:
Network delay/loss? The leader did not send entries in time or the follower did not receive entries.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/RATIS-1446

How was this patch tested?

Manual simulation in UT.

@Xushaohong
Copy link
Contributor Author

@szetszwo Actually, I am not sure whether we should make a specific check for each getPriority() use. Right now, I just prevent the NPE from the source. Currently, conf.getPeer().getPriority() only exists in leader election and leader state.

Comment on lines 586 to 591
// Candidate shall not start leader election in these cases in case of
// possible NPE caused by conf.getPeer().getPriority()
if (!getRaftConf().containsInBothConfs(getId())) {
LOG.warn("{} find invalid configuration {}, skip start leader election", this, getRaftConf());
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The conf may be changed later on. Therefore, let's don't check it here and check null in LeaderElection.getHigherPriorityPeers(..) as below.

  private Set<RaftPeerId> getHigherPriorityPeers(RaftConfiguration conf) {
    final Optional<Integer> priority = Optional.ofNullable(conf.getPeer(server.getId())).map(RaftPeer::getPriority);
    return conf.getAllPeers().stream()
        .filter(peer -> priority.filter(p -> peer.getPriority() > p).isPresent())
        .map(RaftPeer::getId)
        .collect(Collectors.toSet());
  }

@szetszwo
Copy link
Contributor

szetszwo commented Dec 9, 2021

..., I am not sure whether we should make a specific check for each getPriority() use.

That's a good point. We should check. I just have walked through the code. There are two other cases needed to be fixed:

diff --git a/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderStateImpl.java b/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderStateImpl.java
index 03850114..d1413040 100644
--- a/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderStateImpl.java
+++ b/ratis-server/src/main/java/org/apache/ratis/server/impl/LeaderStateImpl.java
@@ -916,13 +916,13 @@ class LeaderStateImpl implements LeaderState {
 
     for (LogAppender logAppender : senders.getSenders()) {
       FollowerInfo followerInfo = logAppender.getFollower();
-      RaftPeerId followerID = followerInfo.getPeer().getId();
-      int followerPriority = conf.getPeer(followerID).getPriority();
-
+      final RaftPeer follower = followerInfo.getPeer();
+      final int followerPriority = follower.getPriority();
       if (followerPriority <= leaderPriority) {
         continue;
       }
 
+      final RaftPeerId followerID = follower.getId();
       final TermIndex leaderLastEntry = server.getState().getLastEntry();
       if (leaderLastEntry == null) {
         LOG.info("{} send StartLeaderElectionRequest to follower:{} on term:{} because follower's priority:{} " +
diff --git a/ratis-server/src/main/java/org/apache/ratis/server/impl/VoteContext.java b/ratis-server/src/main/java/org/apache/ratis/server/impl/VoteContext.java
index 1ef721aa..fde198fe 100644
--- a/ratis-server/src/main/java/org/apache/ratis/server/impl/VoteContext.java
+++ b/ratis-server/src/main/java/org/apache/ratis/server/impl/VoteContext.java
@@ -144,7 +144,11 @@ class VoteContext {
     }
 
     // Check priority
-    final int priority = impl.getRaftConf().getPeer(impl.getId()).getPriority();
+    final RaftPeer peer = conf.getPeer(impl.getId());
+    if (peer == null) {
+      return reject("our server " + impl.getId() + " is not in the conf");
+    }
+    final int priority = peer.getPriority();
     if (priority <= candidate.getPriority()) {
       return log(true, "our priority " + priority + " <= candidate's priority " + candidate.getPriority());
     } else {

@Xushaohong
Copy link
Contributor Author

The conf may be changed later on. Therefore, let's don't check it here and check null in LeaderElection.getHigherPriorityPeers(..) as below.

The check here in the getHigherPriorityPeers is not enough. The possible case I tested is that bootstrapped follower changes role from follower to candidate and start leader election, as it has an empty conf, it will become leader directly due to if (others.isEmpty()) { r = new ResultAndTerm(Result.PASSED, electionTerm);. Once it becomes another leader, it won't receive any requests from the original leader as the term is increased. Meanwhile, the original leader stuck in sending appendEntries to 'follower'.

I sightly changed the check place. PTAL
@szetszwo

@szetszwo
Copy link
Contributor

..., as it has an empty conf ...

This is an invalid test case since the conf (i.e. the group) is incorrect in the beginning. We must either set the group correct before startup or use AdminApi.setConfiguration to change the conf after startup.

@Xushaohong
Copy link
Contributor Author

..., as it has an empty conf ...

This is an invalid test case since the conf (i.e. the group) is incorrect in the beginning. We must either set the group correct before startup or use AdminApi.setConfiguration to change the conf after startup.

The bootstrapped follower could start with empty conf and set conf through leader's appendEntries request or Installsnapshot request. Somehow caused election timeout would let this follower start leader with empty conf.
@szetszwo

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The additional changes are good although the empty-conf test case is invalid. Some comments inlined; see also https://issues.apache.org/jira/secure/attachment/13037236/560_reivew.patch

@Xushaohong
Copy link
Contributor Author

Xushaohong commented Dec 10, 2021

Thanks for all the comments above. I have fixed all. @szetszwo

With NOT_IN_CONF check added in submitRequestAndWaitResult, RaftReconfigurationBaseTest would have some UT failed. I checked it and found it clashes with PRE_VOTE and thus added the phase check.

@szetszwo
Copy link
Contributor

@Xushaohong , thanks for the update.

TestLeaderElectionWithNetty.testTransferLeader failed. I also can reproduce it locally. Could you take a look?

@Xushaohong
Copy link
Contributor Author

@Xushaohong , thanks for the update.

TestLeaderElectionWithNetty.testTransferLeader failed. I also can reproduce it locally. Could you take a look?

Currently, the setConf request would not update the FollowerInfo in LogAppenderBase, so we have to get the latest followerInfo from conf like the previous version. I have reverted it back and added an NPE check.
@szetszwo Could you help confirm it?

@szetszwo
Copy link
Contributor

With NOT_IN_CONF check added in submitRequestAndWaitResult, RaftReconfigurationBaseTest would have some UT failed. I checked it and found it clashes with PRE_VOTE and thus added the phase check.

We should fix the RaftReconfigurationBaseTest but not having the phase check.

@@ -271,6 +272,9 @@ private boolean shouldRun(long electionTerm) {

private ResultAndTerm submitRequestAndWaitResult(Phase phase, RaftConfigurationImpl conf, long electionTerm)
throws InterruptedException {
if (!conf.containsInConf(server.getId()) && phase == Phase.ELECTION) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should fix the unit tests but not adding the phase check here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should fix the unit tests but not adding the phase check here.

OK, let's find out what's wrong with UT. @szetszwo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you help approve the workflow? : )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Xushaohong , the build finished. Please take a look.

@Xushaohong
Copy link
Contributor Author

I reviewed the failure UT of testRemovePeers. The fault is due to after removing the peers out of the conf, the division which should quit the raft group will become a candidate and start the election. During the prevote process, originally it will send out the request and then the other peers should send back should shutdown to turn this server down. But for our case, we just return new ResultAndTerm(Result.NOT_IN_CONF), and won't go to the SHUTDOWN branch.
@szetszwo Shall we move up the line case NOT_IN_CONF: above case SHUTDOWN or still use my original phase check?

@szetszwo
Copy link
Contributor

... Shall we move up the line case NOT_IN_CONF: above case SHUTDOWN or still use my original phase check?

This is a good idea. Let's shutdown the server when it is not in conf. Thanks.

Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants