Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ private SCMBlockLocationResponse.Builder createSCMBlockResponse(
@Override
public SCMBlockLocationResponse send(RpcController controller,
SCMBlockLocationRequest request) throws ServiceException {
if (!scm.getScmContext().isLeader()) {
if (!scm.checkLeader()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we separate NotLeaderException and LeaderNotReadyException ? client should failover for the former one, and retry the same SCM for the latter one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know whether ratis now has the feature of leader-lease, with which scm will not need to check this every time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not implemented yet, tracked by https://issues.apache.org/jira/browse/RATIS-1273

Copy link
Contributor Author

@bharatviswa504 bharatviswa504 May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason behind adding this check is that read requests should not be processed before becoming leader becomes ready.

The reason is We do not want to serve the reads based on not up-to-date SCM DB. For writes as this is going via ratis, this is already taken care.

Copy link
Contributor Author

@bharatviswa504 bharatviswa504 May 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we separate NotLeaderException and LeaderNotReadyException ? client should failover for the former one, and retry the same SCM for the latter one

Good idea, let me update based on that.

Copy link
Contributor Author

@bharatviswa504 bharatviswa504 May 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we separate NotLeaderException and LeaderNotReadyException ? client should failover for the former one, and retry the same SCM for the latter one.

@GlenGeng As we have suggested Leader handling in Client, even though it throws error NotLeader Exception to the client, but it passes the leaderAddress in the case of LeaderNotReady it will be the same node address so retry will happen on the same Node. So, I believe we are good here. (We can keep simple as we have suggested leader handling in SCM)

RatisUtil.checkRatisException(
scm.getScmHAManager().getRatisServer().triggerNotLeaderException(),
scm.getBlockProtocolRpcPort(), scm.getScmId());
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -145,7 +145,7 @@ public StorageContainerLocationProtocolServerSideTranslatorPB(
public ScmContainerLocationResponse submitRequest(RpcController controller,
ScmContainerLocationRequest request) throws ServiceException {
// not leader or not belong to admin command.
if (!scm.getScmContext().isLeader()
if (!scm.checkLeader()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  public boolean checkLeader() {
    // For NON-HA setup, the node will always be the leader
    if (!SCMHAUtils.isSCMHAEnabled(configuration)) {
      Preconditions.checkArgument(scmContext.isLeader());
      return true;
    } else {
      // FOR HA setup, the node has to be the leader and ready to serve
      // requests.
      return scmContext.isLeader() && getScmHAManager().getRatisServer()
          .getDivision().getInfo().isLeaderReady();
    }
  }

    public boolean isLeaderReady() {
      return this.isLeader() && RaftServerImpl.this.getRole().isLeaderReady();
    }

in checkLeader , the above isLeaderReady will be called and another isLeader will be called ( not scmcontext.isLeader()), which is implemented by ratis.

scmcontext.isLeader is notified and thus changed by ratis. so, maybe here we can just only use getScmHAManager().getRatisServer().getDivision().getInfo().isLeaderReady() to checkLeader(), and I think that is enough. what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya we can use getScmHAManager().getRatisServer().getDivision().getInfo().isLeaderReady() in checkLeader.
Thanks for the suggestion, I will change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, PTAL.

&& !ADMIN_COMMAND_TYPE.contains(request.getCmdType())) {
RatisUtil.checkRatisException(
scm.getScmHAManager().getRatisServer().triggerNotLeaderException(),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -1471,8 +1471,8 @@ public boolean checkLeader() {
} else {
// FOR HA setup, the node has to be the leader and ready to serve
// requests.
return scmContext.isLeader() && getScmHAManager().getRatisServer()
.getDivision().getInfo().isLeaderReady();
return getScmHAManager().getRatisServer().getDivision().getInfo()
.isLeaderReady();
}
}

Expand Down