HDDS-5202. Use scm#checkLeader before processing client requests. #2229
bharatviswa504 merged 2 commits into apache:master
Should we separate NotLeaderException and LeaderNotReadyException? The client should fail over for the former and retry the same SCM for the latter.
I don't know whether Ratis has a leader-lease feature yet; with it, SCM would not need to perform this check every time.
It's not implemented yet, tracked by https://issues.apache.org/jira/browse/RATIS-1273
The reason for adding this check is that read requests should not be processed before the leader becomes ready.
We do not want to serve reads from an SCM DB that is not up to date. Writes go through Ratis, so this is already taken care of for them.
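To illustrate the read-path concern, here is a minimal sketch of a guard that rejects reads until the leader check passes. The names ReadGuard and NotLeaderOrNotReadyException are hypothetical and exist only for this example; the BooleanSupplier stands in for scm#checkLeader. This is not the code in the patch.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

/** Hypothetical error type for this sketch; not an Ozone or Ratis class. */
final class NotLeaderOrNotReadyException extends RuntimeException {
  NotLeaderOrNotReadyException(String message) {
    super(message);
  }
}

/** Hypothetical guard placed in front of read handlers. */
final class ReadGuard {
  // Stand-in for scm#checkLeader (assumption: true only when leader and ready).
  private final BooleanSupplier checkLeader;

  ReadGuard(BooleanSupplier checkLeader) {
    this.checkLeader = checkLeader;
  }

  /** Runs the read only when the leader check passes; otherwise rejects it. */
  <T> T serveRead(Supplier<T> read) {
    if (!checkLeader.getAsBoolean()) {
      // Reject rather than answer from a DB that may still be behind.
      throw new NotLeaderOrNotReadyException("SCM is not leader or not ready");
    }
    return read.get();
  }
}
```

Writes need no such guard in this sketch because, as noted above, they already go through Ratis.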
Should we separate NotLeaderException and LeaderNotReadyException? The client should fail over for the former and retry the same SCM for the latter.
Good idea, let me update based on that.
Should we separate NotLeaderException and LeaderNotReadyException? The client should fail over for the former and retry the same SCM for the latter.
@GlenGeng The client already has suggested-leader handling. Even though SCM throws a NotLeaderException to the client, it passes the leader address along with it. In the LeaderNotReady case that address is the same node's address, so the retry will happen on the same node. So I believe we are good here. (We can keep this simple, since we have suggested-leader handling in SCM.)
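A rough sketch of the suggested-leader argument, under stated assumptions: NotLeaderError and SuggestedLeaderRetry are hypothetical names for this example and are not the actual Ozone failover proxy provider. The point is only that a single error type carrying a leader address covers both cases: a different address causes failover, and the same address causes a retry on the same node.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical stand-in for the not-leader error returned to the client. */
final class NotLeaderError {
  private final String suggestedLeader; // null when the server knows no leader

  NotLeaderError(String suggestedLeader) {
    this.suggestedLeader = suggestedLeader;
  }

  Optional<String> suggestedLeader() {
    return Optional.ofNullable(suggestedLeader);
  }
}

/** Hypothetical client-side target selection. */
final class SuggestedLeaderRetry {
  private SuggestedLeaderRetry() { }

  static String nextTarget(List<String> scmNodes, String current,
      NotLeaderError error) {
    // Prefer the suggested leader when the server supplied a known node.
    // If the leader is merely not ready, this is the current node again.
    Optional<String> suggested = error.suggestedLeader();
    if (suggested.isPresent() && scmNodes.contains(suggested.get())) {
      return suggested.get();
    }
    // No usable suggestion: fall back to round-robin failover.
    int index = scmNodes.indexOf(current);
    return scmNodes.get((index + 1) % scmNodes.size());
  }
}
```

With nodes scm1, scm2 and scm3, an error from scm1 that suggests scm1 keeps the client on scm1, while one that suggests scm3 moves it to scm3.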
public boolean checkLeader() {
  // For NON-HA setup, the node will always be the leader
  if (!SCMHAUtils.isSCMHAEnabled(configuration)) {
    Preconditions.checkArgument(scmContext.isLeader());
    return true;
  } else {
    // FOR HA setup, the node has to be the leader and ready to serve
    // requests.
    return scmContext.isLeader() && getScmHAManager().getRatisServer()
        .getDivision().getInfo().isLeaderReady();
  }
}

public boolean isLeaderReady() {
  return this.isLeader() && RaftServerImpl.this.getRole().isLeaderReady();
}
In checkLeader, the isLeaderReady above is called, and it in turn calls another isLeader (not scmContext.isLeader()), which is implemented by Ratis.
scmContext.isLeader is updated through a notification from Ratis, so here we could use just getScmHAManager().getRatisServer().getDivision().getInfo().isLeaderReady() in checkLeader(); I think that is enough. What do you think?
Yes, we can use getScmHAManager().getRatisServer().getDivision().getInfo().isLeaderReady() in checkLeader.
Thanks for the suggestion, I will change it.
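For reference, a self-contained sketch of what the simplified check could look like. This is an illustration under assumptions, not the merged code: the LeaderCheck class is hypothetical, and the two BooleanSupplier fields stand in for scmContext.isLeader() and for the Ratis isLeaderReady() call chain quoted in this thread.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical holder for the leader check; stand-ins replace SCM and Ratis. */
final class LeaderCheck {
  private final boolean haEnabled;
  // Stand-in for scmContext.isLeader() (assumption for this sketch).
  private final BooleanSupplier contextIsLeader;
  // Stand-in for ...getDivision().getInfo().isLeaderReady() (assumption).
  private final BooleanSupplier ratisLeaderReady;

  LeaderCheck(boolean haEnabled, BooleanSupplier contextIsLeader,
      BooleanSupplier ratisLeaderReady) {
    this.haEnabled = haEnabled;
    this.contextIsLeader = contextIsLeader;
    this.ratisLeaderReady = ratisLeaderReady;
  }

  boolean checkLeader() {
    if (!haEnabled) {
      // Non-HA: the node is always the leader; mirror the precondition check.
      if (!contextIsLeader.getAsBoolean()) {
        throw new IllegalArgumentException("non-HA SCM must be leader");
      }
      return true;
    }
    // HA: isLeaderReady already implies isLeader on the Ratis side,
    // so the extra scmContext.isLeader() call is dropped.
    return ratisLeaderReady.getAsBoolean();
  }
}
```

In the HA branch the result depends only on the Ratis-side readiness, which is the simplification proposed above.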
Updated, PTAL.
Thank You @GlenGeng and @JacksonYao287 for the review.
+1.
Thanks @bharatviswa504 for the work. Thanks @GlenGeng for the review.
…pache#2229) (cherry picked from commit f8a06e0) Change-Id: I514fb26992a5b562dacc1768b721940cb03c6c03
What changes were proposed in this pull request?
The SCM server should start accepting requests only when it is the leader and isLeaderReady is true.
The isLeaderReady check is also needed because the state machine must apply all committed log transactions before the SCM starts accepting requests.
So, use scm#checkLeader instead of scmContext#isLeader.
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-5202
How was this patch tested?
Existing CI