HDDS-10644. Intermittent failure in testBalancer.robot #6481
Thank you for the fix. LGTM.
I left some questions and comments, but there is no need to update the patch unless you think it is necessary.
${output} =    Execute    ozone admin container list --state OPEN
Should Be Empty    ${output}
According to ContainerBalancerSelectionCriteria#shouldBeExcluded, a container must be CLOSED to be chosen (ContainerBalancerCriteria#isContainerClosed). The "All container is closed" check only verifies that there are no OPEN containers. Therefore, I think there is a chance that a container is still CLOSING when the container balancer starts, in which case it will be excluded during the container balancer iteration.
However, I think the chance is very small, since the time between the close container command and the start of the container balancer should be long enough for the container to be closed. So it should be fine as it is now.
Execute    ozone admin container close "${container}"
EXIT FOR LOOP IF    "${container}" == "${EMPTY}"
${message} =    Execute And Ignore Error    ozone admin container close "${container}"
Run Keyword If    '${message}' != '${EMPTY}'    Should Contain    ${message}    is in closing state
${output} =    Execute    ozone admin container info "${container}"
Should contain    ${output}    CLOS
Is the reason for matching CLOS to include both CLOSING and CLOSED?
Yes, for this test we don't need to wait for the containers to be completely closed (this may take too long). Therefore we are happy with both "closed" and "closing" statuses.
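To make the matching behaviour concrete, here is a minimal Python sketch of what a substring check on CLOS accepts. It only models the logic of the Should contain line; the sample strings are hypothetical, since the real output format of ozone admin container info is not shown in this thread.

```python
def is_closing_or_closed(info_output: str) -> bool:
    """Model of `Should contain ${output} CLOS`: a plain substring check."""
    return "CLOS" in info_output


# Hypothetical sample outputs, used only to illustrate the matching rule.
samples = {
    "state: OPEN": False,     # no "CLOS" substring, so the check would fail
    "state: CLOSING": True,   # close requested but not yet finished
    "state: CLOSED": True,    # fully closed
}

for text, expected in samples.items():
    assert is_closing_or_closed(text) is expected
```

Note that a substring check like this also accepts any other state name that happens to contain CLOS, which is acceptable here because the test only needs to confirm that the close request was taken into account.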
Thank you for the explanation.
Thanks @afilpp for the patch, LGTM
@ivandika3 thank you for reviewing the patch
(cherry picked from commit 6b92a37)
What changes were proposed in this pull request?
HDDS-10644. Intermittent failure in testBalancer.robot
The problem was a delay between the close container event being sent to the event queue and that event being processed.
To improve test stability, we need to ignore the exception caused by a duplicate container close request so that it doesn't cause the acceptance test to fail.
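The tolerant close step can be sketched in Python roughly as follows. This is an illustration of the logic in the Robot lines of the patch, not the actual test code: the run callable stands in for Execute And Ignore Error, and the fake runner and its full message text are hypothetical (only the substring "is in closing state" comes from the patch).

```python
from typing import Callable


def close_container_tolerantly(container: str, run: Callable[[str], str]) -> None:
    """Close a container, tolerating only the 'already closing' error.

    `run` is a stand-in for Execute And Ignore Error: it takes a command
    line and returns the command's output (empty when nothing is reported).
    """
    message = run(f'ozone admin container close "{container}"')
    # Empty output means the close request was accepted without complaint.
    # Any other output must be the duplicate-close message; anything else
    # is a real failure and should still fail the test.
    if message != "" and "is in closing state" not in message:
        raise AssertionError(f"unexpected close error: {message}")


# Hypothetical runner simulating a duplicate close request.
def fake_duplicate_close(command: str) -> str:
    return "Container #1 is in closing state"


close_container_tolerantly("1", fake_duplicate_close)  # does not raise
```

The point of checking the message text, rather than ignoring every error, is that unrelated failures of the close command still surface in the acceptance test.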
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-10644
How was this patch tested?
The test passed more than 10 times in a row.