
HDDS-8494. Adjust replication queue limits for out-of-service nodes #4645

Merged: 7 commits into apache:master on May 4, 2023

Conversation

adoroszlai (Contributor)

What changes were proposed in this pull request?

When a datanode enters a decommissioning state, it increases the size of the replication supervisor thread pool; when the node returns to the IN_SERVICE state, it reverts to the lower thread pool limit.

Similarly, when scheduling commands, SCM can allocate more commands to a decommissioning host, since that host should process them more quickly due to its lower load and larger thread pool.

  • Scale the size of executor thread pool and command queue for replication in datanode if state changes between in-service and out-of-service
  • Similarly scale the limit of pending replication commands at SCM
  • Simplify TestReplicationSupervisor#testMaxQueueSize to avoid the use of thread pool (possible source of intermittent failures recently 1, 2, 3)

https://issues.apache.org/jira/browse/HDDS-8494

How was this patch tested?

Added unit test.

Tested in an ozone compose environment with 6 nodes: created RATIS and EC keys, then decommissioned and recommissioned one of the datanodes.

```
2023-05-02 16:37:20,303 [Command processor thread] INFO replication.ReplicationSupervisor: Node state updated to DECOMMISSIONING, scaling executor pool size to 20
...
2023-05-02 16:39:16,353 [Command processor thread] INFO replication.ReplicationSupervisor: Node state updated to IN_SERVICE, scaling executor pool size to 10
```
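The behaviour in the log above can be sketched with a plain `ThreadPoolExecutor`. This is a minimal illustration of the idea (base pool of 10, scale factor of 2); the class, enum, and method names below are assumptions for the sketch, not the actual Ozone code:

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: grow the replication executor pool while the node is
// out of service, and shrink it back when the node returns to IN_SERVICE.
public class ReplicationExecutorScaler {
  public enum NodeState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE }

  private final int basePoolSize;
  private final int scaleFactor;
  private final ThreadPoolExecutor executor;

  public ReplicationExecutorScaler(int basePoolSize, int scaleFactor) {
    this.basePoolSize = basePoolSize;
    this.scaleFactor = scaleFactor;
    this.executor = new ThreadPoolExecutor(basePoolSize, basePoolSize,
        60, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
  }

  /** Resize the pool when the node's operational state changes. */
  public void onStateChange(NodeState newState) {
    int target = (newState == NodeState.IN_SERVICE)
        ? basePoolSize
        : basePoolSize * scaleFactor;
    if (target > executor.getMaximumPoolSize()) {
      // Growing: raise the maximum first so core <= max always holds.
      executor.setMaximumPoolSize(target);
      executor.setCorePoolSize(target);
    } else {
      // Shrinking: lower the core first for the same reason.
      executor.setCorePoolSize(target);
      executor.setMaximumPoolSize(target);
    }
  }

  public int getPoolSize() {
    return executor.getCorePoolSize();
  }
}
```

The ordering of the two setter calls matters because `ThreadPoolExecutor` rejects a core size larger than the current maximum.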

@adoroszlai adoroszlai self-assigned this May 3, 2023
@adoroszlai adoroszlai requested a review from sodonnel May 3, 2023 10:36
@adoroszlai adoroszlai marked this pull request as ready for review May 3, 2023 10:54
@sodonnel (Contributor) commented May 3, 2023

Change looks good. The only two suggestions I have:

  1. I wonder if we should make the "scaling factor" by which we increase the limit and thread pool configurable, with a default of 2? If we make it configurable, should it be a decimal rather than an integer so we can scale by 1.5, 2.5, etc.? I guess the config would need to apply on both the DN and RM, so I am not 100% sure where we should define it. It would be a shame to need two configs, as they could get out of sync.

  2. Might be good to also add a test based on testSendThrottledReplicateContainerCommand in the TestReplicationManager class to validate that the decommissioning host is picked as a target when all nodes are over the original limit. This is kind of covered in the excluded nodes test, but excluded nodes are only updated when a new command pushes them over the limit. This test would ensure decommissioning nodes are still picked if they are over the original limit but under the extended limit.
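The decimal scaling factor suggested above might look something like the following. This is a hypothetical sketch, with assumed names, of one shared helper that both the DN and RM could call so the two sides cannot drift apart:

```java
// Hypothetical illustration of a decimal scaling factor shared by the
// datanode and the Replication Manager; not the committed implementation.
public final class ReplicationScaling {
  private ReplicationScaling() { }

  /**
   * Scale a base limit (thread pool size or pending-command limit) by a
   * decimal factor, rounding up so a factor like 1.5 still grows the limit.
   */
  public static int scaledLimit(int baseLimit, double factor) {
    return (int) Math.ceil(baseLimit * factor);
  }
}
```

Rounding up rather than truncating avoids a fractional factor ever leaving a small base limit unchanged.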

@adoroszlai (Contributor, Author)

Thanks @sodonnel for the review. Added a test for testSendThrottledReplicateContainerCommand. I'll add the config in a follow-up task; I need more time to think about it.

@sodonnel (Contributor) left a comment


LGTM, thanks for adding the test and for cleaning up the repeated code in the existing tests!

@adoroszlai adoroszlai merged commit 74adc9a into apache:master May 4, 2023
27 checks passed
@adoroszlai adoroszlai deleted the HDDS-8494 branch May 4, 2023 14:33