HDDS-10612. Add Robot test to verify Container Balancer for RATIS containers #6457

Merged: 3 commits, Apr 2, 2024

@afilpp afilpp commented Mar 29, 2024

What changes were proposed in this pull request?

HDDS-10612. Add Robot test to verify Container Balancer for RATIS containers

Currently, Container Balancer is covered only by unit tests; there are no acceptance tests at all. At a minimum, we should add a Robot test to verify Container Balancer for RATIS containers. In the future, we should probably also add a Robot test for the EC case.

Test case:

  1. Move 1 datanode to maintenance mode (the test uses 4 datanodes).
  2. Create multiple keys (after loading the data, we check that 3 datanodes are ~60% utilized and that the one in maintenance mode is empty).
  3. Start recommissioning the datanode (wait until recommissioning is completed).
  4. Start the container balancer (wait until the container balancer is completed).
  5. Check the results (after balancing, all 4 datanodes should show approximately the same data distribution).
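
The check in step 5 can be illustrated with a small Python sketch. This is not the Robot test from this patch; the function name `is_balanced`, the datanode names, the utilization figures, and the 10% threshold are all assumptions chosen for illustration, and the sketch assumes every datanode has the same capacity so that a plain mean is a fair cluster average.

```python
from typing import Dict


def is_balanced(utilization: Dict[str, float], threshold: float = 0.10) -> bool:
    """Return True if every datanode is within `threshold` of the mean utilization.

    `utilization` maps a (hypothetical) datanode name to its used fraction, 0.0-1.0.
    The threshold value is an assumption for this sketch, not a value from the patch.
    """
    average = sum(utilization.values()) / len(utilization)
    return all(abs(used - average) <= threshold for used in utilization.values())


# Illustrative state after step 2: three datanodes ~60% utilized, one empty.
before_balancing = {"dn1": 0.60, "dn2": 0.60, "dn3": 0.60, "dn4": 0.00}

# Illustrative state after step 4: data spread roughly evenly over all four.
after_balancing = {"dn1": 0.46, "dn2": 0.44, "dn3": 0.45, "dn4": 0.45}

print(is_balanced(before_balancing))
print(is_balanced(after_balancing))
```

With these made-up numbers, the first state fails the check because the empty datanode is far below the mean, and the second passes.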

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-10612

How was this patch tested?

Added Robot test

@adoroszlai adoroszlai self-requested a review March 29, 2024 13:59

@adoroszlai adoroszlai left a comment

Thanks @afilpp for the patch. Overall looks good, added some minor comments.

It would be nice to create the environment as an add-on for ozone-ha instead of a completely separate one, but we can check if it's feasible in a follow-up task.

@adoroszlai adoroszlai left a comment

Thanks @afilpp for updating the patch, LGTM.

@myskov myskov left a comment

LGTM, @siddhantsangwan please take a look

@siddhantsangwan siddhantsangwan left a comment

The robot test logic LGTM.

@myskov myskov merged commit 129cdc1 into apache:master Apr 2, 2024
25 checks passed

@ivandika3 ivandika3 commented Apr 4, 2024

There seems to be an intermittent failure in the acceptance test:

https://github.com/apache/ozone/actions/runs/8546730074/job/23418032793

@afilpp Could you take a look?

Edit: See the comment in HDDS-10612 for a possible root cause.

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 16, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Sep 18, 2024