HDDS-5202. Use scm#checkLeader before processing client requests . by bharatviswa504 · Pull Request #2229 · apache/ozone

bharatviswa504 · 2021-05-10T10:27:11Z

What changes were proposed in this pull request?

SCM server should start accepting requests only when it is the leader and isLeaderReady is true.

isLeaderReady is also needed because the state machine must apply all committed log transactions before the server starts accepting requests.

So, instead of scmContext#isLeader, use scm#checkLeader.
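As a minimal sketch of the idea (not the actual translator code in this patch), the gate in front of a request handler could look like the following. `LeaderGuard` and `NotLeaderReadyException` are hypothetical names used only for illustration; the `checkLeader` supplier stands in for scm#checkLeader.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

// Hypothetical exception type, used only in this sketch.
class NotLeaderReadyException extends RuntimeException {
  NotLeaderReadyException(String message) {
    super(message);
  }
}

// Hypothetical guard that runs a handler only when the leader check passes.
class LeaderGuard {
  private final BooleanSupplier checkLeader;

  LeaderGuard(BooleanSupplier checkLeader) {
    this.checkLeader = checkLeader;
  }

  // Rejects the request up front if this node is not a ready leader,
  // so reads are never served from a state machine that is still catching up.
  <T> T process(Supplier<T> handler) {
    if (!checkLeader.getAsBoolean()) {
      throw new NotLeaderReadyException(
          "SCM is not the leader, or the leader is not ready yet");
    }
    return handler.get();
  }
}
```

A caller would wrap each request, for example `guard.process(() -> handleRequest(request))`, where `handleRequest` is whatever the server-side translator dispatches to.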

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5202

How was this patch tested?

Existing CI

GlenGeng-awx · 2021-05-10T12:27:27Z

...java/org/apache/hadoop/hdds/scm/protocol/ScmBlockLocationProtocolServerSideTranslatorPB.java

Should we separate NotLeaderException and LeaderNotReadyException? The client should fail over for the former and retry the same SCM for the latter.

I don't know whether Ratis now has a leader-lease feature, with which SCM would not need to check this every time.

It's not implemented yet, tracked by https://issues.apache.org/jira/browse/RATIS-1273

The reason for adding this check is that read requests should not be processed before the leader becomes ready.

We do not want to serve reads from an SCM DB that is not up to date. Writes go through Ratis, so this is already taken care of for them.

Should we separate NotLeaderException and LeaderNotReadyException? The client should fail over for the former and retry the same SCM for the latter.

Good idea, let me update based on that.

Should we separate NotLeaderException and LeaderNotReadyException? The client should fail over for the former and retry the same SCM for the latter.

@GlenGeng As we have suggested-leader handling in the client, even though the server throws NotLeaderException to the client, it also passes the leader address. In the LeaderNotReady case this is the same node's address, so the retry will happen on the same node. So I believe we are good here. (We can keep it simple, as we already have suggested-leader handling in SCM.)
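To illustrate the argument, here is a rough sketch of suggested-leader handling on the client side. All names (`SuggestedLeaderException`, `SuggestedLeaderClient`, the `rpc` function) are hypothetical and not the real Ozone client classes; backoff and retry policy are left out. The point is only that a single exception type is enough: a follower suggests another node, while a not-yet-ready leader suggests itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical exception carrying the address the server suggests as leader.
class SuggestedLeaderException extends RuntimeException {
  final String suggestedLeader;

  SuggestedLeaderException(String suggestedLeader) {
    super("Not leader or leader not ready, suggested leader: " + suggestedLeader);
    this.suggestedLeader = suggestedLeader;
  }
}

// Hypothetical client that always retries on the suggested leader address.
class SuggestedLeaderClient {
  // Addresses tried so far, kept only to make the behavior observable.
  final List<String> attempts = new ArrayList<>();

  <T> T call(String initialAddress, int maxAttempts, Function<String, T> rpc) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    String target = initialAddress;
    SuggestedLeaderException last = null;
    for (int i = 0; i < maxAttempts; i++) {
      attempts.add(target);
      try {
        return rpc.apply(target);
      } catch (SuggestedLeaderException e) {
        last = e;
        // Follower: the suggestion points at another node (failover).
        // Leader not ready: the suggestion is the same node (plain retry).
        target = e.suggestedLeader;
      }
    }
    throw last;
  }
}
```

With a follower `scm1` and a leader `scm2` that is not ready on the first call, the client would try `scm1`, then `scm2`, then `scm2` again.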

JacksonYao287 · 2021-05-12T07:13:34Z

.../apache/hadoop/hdds/scm/protocol/StorageContainerLocationProtocolServerSideTranslatorPB.java

public boolean checkLeader() {
  // For NON-HA setup, the node will always be the leader
  if (!SCMHAUtils.isSCMHAEnabled(configuration)) {
    Preconditions.checkArgument(scmContext.isLeader());
    return true;
  } else {
    // FOR HA setup, the node has to be the leader and ready to serve
    // requests.
    return scmContext.isLeader() && getScmHAManager().getRatisServer()
        .getDivision().getInfo().isLeaderReady();
  }
}

public boolean isLeaderReady() {
  return this.isLeader() && RaftServerImpl.this.getRole().isLeaderReady();
}

In checkLeader, the isLeaderReady shown above is called, and it in turn calls another isLeader (not scmContext.isLeader()) that is implemented by Ratis.

scmContext.isLeader is notified, and thus changed, by Ratis. So maybe here we can use only getScmHAManager().getRatisServer().getDivision().getInfo().isLeaderReady() in checkLeader(), and I think that is enough. What do you think?

Ya we can use getScmHAManager().getRatisServer().getDivision().getInfo().isLeaderReady() in checkLeader.
Thanks for the suggestion, I will change it.

Updated, PTAL.
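For reference, a sketch of what the simplified check could look like, assuming (as in the Ratis snippet quoted above) that isLeaderReady() already implies isLeader(). `LeaderInfo`, `FakeDivisionInfo` and `LeaderChecks` are hypothetical stand-ins, not the real Ratis or SCM types, and the non-HA precondition from the original method is omitted for brevity.

```java
// Hypothetical stand-in for the Ratis division info used in checkLeader.
interface LeaderInfo {
  boolean isLeader();

  boolean isLeaderReady();
}

// Fake implementation mirroring the quoted Ratis logic:
// ready means "is leader AND the leader role reports ready".
class FakeDivisionInfo implements LeaderInfo {
  private final boolean leader;
  private final boolean roleReady;

  FakeDivisionInfo(boolean leader, boolean roleReady) {
    this.leader = leader;
    this.roleReady = roleReady;
  }

  @Override
  public boolean isLeader() {
    return leader;
  }

  @Override
  public boolean isLeaderReady() {
    return isLeader() && roleReady;
  }
}

class LeaderChecks {
  // Simplified check: in HA mode, readiness alone decides, because
  // isLeaderReady() can only be true on a leader.
  static boolean checkLeader(boolean haEnabled, LeaderInfo info) {
    if (!haEnabled) {
      // Non-HA: the single node is always treated as the leader.
      return true;
    }
    return info.isLeaderReady();
  }
}
```

Under that assumption, the extra scmContext.isLeader() conjunct adds nothing in HA mode: for every combination of leader and ready flags, `checkLeader(true, info)` equals `info.isLeader() && info.isLeaderReady()`.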

bharatviswa504 · 2021-05-13T10:20:07Z

Thank You @GlenGeng and @JacksonYao287 for the review.
I have addressed/replied to the review comments.

GlenGeng-awx · 2021-05-13T12:03:50Z

+1.
Thanks @bharatviswa504 for the work. Thanks @JacksonYao287 for the review. Waiting for CI.

JacksonYao287 · 2021-05-13T12:23:52Z

Thanks @bharatviswa504 for the work. Thanks @GlenGeng for the review.
LGTM. +1

bharatviswa504 · 2021-05-13T14:19:52Z

Thank You @GlenGeng and @JacksonYao287 for the review.

…pache#2229) (cherry picked from commit f8a06e0) Change-Id: I514fb26992a5b562dacc1768b721940cb03c6c03

bharatviswa504 requested a review from bshashikant May 10, 2021 10:27

bharatviswa504 added the scm-ha label May 10, 2021

GlenGeng-awx reviewed May 10, 2021

JacksonYao287 reviewed May 12, 2021

bharatviswa504 added 2 commits May 13, 2021 15:48

HDDS-5202. Use scm#checkLeader before processing client requests .

df1000c

fix review comments

0983aaf

bharatviswa504 force-pushed the HDDS-5202 branch from 7741c01 to 0983aaf Compare May 13, 2021 10:18

bharatviswa504 requested a review from GlenGeng-awx May 13, 2021 10:18

bharatviswa504 merged commit f8a06e0 into apache:master May 13, 2021
