
HDDS-10972. Reduce the default watch timeout configuration in DatanodeRatisServerConfig #6772

Merged: 2 commits merged into apache:master on Jun 6, 2024

Conversation

@ivandika3 (Contributor) commented Jun 4, 2024

What changes were proposed in this pull request?

Reduce the default duration of the Ratis server watch timeout hdds.ratis.raft.server.watch.timeout to reduce the amount of time the client needs to wait during a two-way commit (i.e. when one DN is not reachable).

The watch timeout configuration bounds how long a watch request will wait during a block write. Let's set it to 10s as a good starting point.
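For illustration only (not part of this patch): a minimal sketch of overriding the key programmatically, assuming OzoneConfiguration's Hadoop-style setTimeDuration API; the value shown is the 10s starting point discussed here.

```java
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.hdds.conf.OzoneConfiguration;

public class WatchTimeoutExample {
  public static void main(String[] args) {
    OzoneConfiguration conf = new OzoneConfiguration();
    // Override the Ratis server watch timeout; clusters with heavier write
    // loads may prefer a larger value (see the review discussion below).
    conf.setTimeDuration("hdds.ratis.raft.server.watch.timeout",
        10, TimeUnit.SECONDS);
    System.out.println(conf.get("hdds.ratis.raft.server.watch.timeout"));
  }
}
```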

See the related tickets for context.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-10972

How was this patch tested?

Existing tests (thanks to @smengcl for validating the CI earlier). Clean CI: https://github.com/ivandika3/ozone/actions/runs/9365515025

Tested with the ozone-ha docker-compose environment with one DN down (using "docker container pause").

Without the watch timeout configuration change, the client needs to wait about 3 minutes (the default hdds.ratis.raft.server.watch.timeout of 180s):

```
time ozone sh key put /vol1/bucket1/key1 ~/LICENSE.txt --type RATIS -r THREE

real    3m3.037s
user    0m9.957s
sys     0m1.106s
```

With hdds.ratis.raft.server.watch.timeout=30s, the client needs to wait around 30s:

```
time ozone sh key put /vol1/bucket1/key1 ~/LICENSE.txt --type RATIS -r THREE

real    0m33.897s
user    0m9.990s
sys     0m1.346s
```

With hdds.ratis.raft.server.watch.timeout=10s, the client needs to wait around 10s:

```
time ozone sh key put /vol1/bucket1/key1 ~/LICENSE.txt --type RATIS -r THREE

real    0m13.319s
user    0m9.966s
sys     0m1.102s
```

@ivandika3 ivandika3 marked this pull request as ready for review June 4, 2024 12:50
@ChenSammi (Contributor) commented
@ivandika3, thanks for proposing this change. I also feel 180s is a bit too long, so changing it to a smaller default value is good. However, 10s seems too aggressive: in a write-heavy environment, even with all three DNs healthy, 10s is easy to hit, which means most write requests' watch will fall back to a MAJORITY_COMMITTED watch.
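A rough sketch of the ALL_COMMITTED-to-MAJORITY_COMMITTED fallback idea mentioned above (not Ozone's actual client code; it assumes Ratis's RaftClient async watch API):

```java
import java.util.concurrent.CompletableFuture;
import org.apache.ratis.client.RaftClient;
import org.apache.ratis.proto.RaftProtos.ReplicationLevel;
import org.apache.ratis.protocol.RaftClientReply;

final class WatchFallbackSketch {
  // Wait for the log entry at `index` to be replicated on all peers.
  // If that watch fails (e.g. the watch timeout fires because one DN is
  // unreachable), fall back to a watch that only requires a majority.
  static CompletableFuture<RaftClientReply> watchWithFallback(
      RaftClient client, long index) {
    return client.async().watch(index, ReplicationLevel.ALL_COMMITTED)
        .exceptionally(ignored -> null)
        .thenCompose(reply -> reply != null
            ? CompletableFuture.completedFuture(reply)
            : client.async().watch(index, ReplicationLevel.MAJORITY_COMMITTED));
  }
}
```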

This review thread is on the changed default in DatanodeRatisServerConfig:

```diff
   )
-  private long watchTimeOut = Duration.ofSeconds(180).toMillis();
+  private long watchTimeOut = Duration.ofSeconds(10).toMillis();
```
Contributor commented

BTW, it looks like this getWatchTimeOut() is not called anywhere.

Contributor Author (@ivandika3) commented

Thanks for the review. Yes, getWatchTimeOut is not called anywhere, which confused me at first too.

However, the configuration hdds.ratis.raft.server.watch.timeout is processed in RatisHelper#createRaftServerProperties, which trims the hdds.ratis. prefix and sets the remaining key in the Ratis properties. I previously checked that changing the configuration actually updates the Ratis server watch timeout.
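A minimal sketch of that prefix-trimming idea (not the actual RatisHelper code; the class and method names here are hypothetical):

```java
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.ratis.conf.RaftProperties;

final class RaftPropertiesMappingSketch {
  private static final String HDDS_RATIS_PREFIX = "hdds.ratis.";

  // Copy every hdds.ratis.* key into RaftProperties with the prefix removed.
  static RaftProperties toRaftProperties(Configuration ozoneConf) {
    RaftProperties raftProps = new RaftProperties();
    for (Map.Entry<String, String> entry : ozoneConf) {
      String key = entry.getKey();
      if (key.startsWith(HDDS_RATIS_PREFIX)) {
        // e.g. hdds.ratis.raft.server.watch.timeout -> raft.server.watch.timeout
        raftProps.set(key.substring(HDDS_RATIS_PREFIX.length()), entry.getValue());
      }
    }
    return raftProps;
  }
}
```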

This means that some calls to DatanodeRatisServerConfig in XceiverServerRatis might not be necessary (e.g. DatanodeRatisServerConfig#getClientPoolSize). However, we need to double-check before removing these calls, since some of the DatanodeRatisServerConfig property keys do not correspond to any Raft server property keys and need to be set explicitly (e.g. DatanodeRatisServerConfig#getStreamRequestThreads).

We can remove the getWatchTimeOut if needed.

Contributor commented

I see. Thanks for the explanation. It's OK to keep the getWatchTimeOut.

@ivandika3 (Contributor Author) commented Jun 5, 2024

@ChenSammi Thanks for the review.

> However, 10s seems too aggressive: in a write-heavy environment, even with all three DNs healthy, 10s is easy to hit, which means most write requests' watch will fall back to a MAJORITY_COMMITTED watch.

I see, we can increase it to a higher value (e.g. 30s). Currently I don't think our cluster has encountered a workload where a normal block write lasts more than 10s. Could you suggest a reasonable value based on the write-heavy workload? Thank you.

@ChenSammi ChenSammi requested a review from szetszwo June 5, 2024 06:05
@szetszwo (Contributor) left a comment

@ivandika3 , thanks a lot for working on this!

+1 the change looks good.

@ChenSammi (Contributor) commented
> Could you suggest a reasonable value based on the write-heavy workload?

Here is some data I collected from a stress write test a while ago:

[image: stress write test results]

Based on this data, I would suggest using 30s too.

@ivandika3 (Contributor Author) commented
@ChenSammi Thank you for providing the data. Updated to 30s. PTAL.

@ChenSammi (Contributor) left a comment

Thanks @ivandika3.

@ivandika3 ivandika3 merged commit d11752f into apache:master Jun 6, 2024
39 checks passed
@ivandika3 (Contributor Author) commented
Thank you for the reviews @szetszwo @ChenSammi, merged.

@ivandika3 ivandika3 self-assigned this Jun 6, 2024
@ChenSammi (Contributor) commented
@ivandika3, after the patch is merged, you can resolve the JIRA https://issues.apache.org/jira/browse/HDDS-10972 and add "1.5.0" as the fixed version.

@ivandika3 (Contributor Author) commented
Thanks for the reminder @ChenSammi. Updated the ticket.

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Jun 17, 2024