Fix flaky deployment recovery test #10938
Merged
This test was flaky. After analyzing why, we came to the conclusion that the test does not currently test what is intended, which would be verifying that a deployment works after recovering from a failed deployment partition. We still want a test for this, but it proves difficult to do this in the engine tests. Instead, we will create a new integration test which takes care of this.
saig0
approved these changes
Nov 8, 2022
@remcowesterhoud Looks good. 👍
I have some minor suggestions. Please have a look.
qa/integration-tests/src/test/java/io/camunda/zeebe/it/clustering/DeploymentClusteredTest.java
This test verifies that a deployment gets redistributed when the deployment partition restarts. To achieve this we pause one of the partitions of the cluster. This makes sure we don't process the deployment there before we stop the deployment partition. Once we've verified that the 3rd partition has finished its deployment as expected, we resume the second partition and restart the deployment partition. After this the test verifies that the deployment is completed as expected.
remcowesterhoud force-pushed the 10731_deployment_recovery_test_flaky branch from 2bf1ce8 to f43905e, November 8, 2022 14:21
bors merge
Build succeeded.
Successfully created backport PR #10942
Successfully created backport PR #10943
ghost pushed a commit that referenced this pull request, Nov 8, 2022
remcowesterhoud added the version:8.1.4 label (Marks an issue as being completely or in parts released in 8.1.4), Nov 22, 2022
This pull request was closed.
Description
The MultiPartitionDeploymentRecoveryTest is flaky. After analyzing why, we came to the conclusion that the test does not currently test what is intended, which would be verifying that a deployment works after recovering from a failed deployment partition.
To fix this flakiness we have deleted the test and replaced it with a different integration test. The new test verifies that a deployment gets redistributed when the deployment partition restarts. To achieve this we pause one of the partitions of the cluster (the second partition). This makes sure we don't process the deployment there before we stop the deployment partition.
Once we've verified that the 3rd partition has finished its deployment as expected, we resume the second partition and restart the deployment partition. After this the test verifies that the deployment is completed as expected.
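The sequence of steps in the new test can be illustrated with a small in-memory model. This is only a sketch and not the code in DeploymentClusteredTest: the `Cluster` class, its methods, the resource name `process.bpmn`, and the choice of partition 1 as the deployment partition are all hypothetical, and the model assumes that a paused partition simply does not accept a distribution and that the deployment partition re-sends every unacknowledged distribution when it restarts.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model for illustration only; none of these names are Zeebe APIs.
class DeploymentRedistributionSketch {

  // Assumption: partition 1 acts as the deployment partition.
  static final int DEPLOYMENT_PARTITION = 1;

  static class Cluster {
    private final int partitionCount;
    private final Set<Integer> paused = new HashSet<>();
    // Resources each partition has received so far.
    private final Map<Integer, Set<String>> deployed = new HashMap<>();
    // State kept by the deployment partition: partitions that still need the resource.
    private final Map<String, Set<Integer>> pending = new HashMap<>();
    private boolean deploymentPartitionRunning = true;

    Cluster(int partitionCount) {
      this.partitionCount = partitionCount;
      for (int p = 1; p <= partitionCount; p++) {
        deployed.put(p, new HashSet<>());
      }
    }

    void pause(int partition) { paused.add(partition); }

    void resume(int partition) { paused.remove(partition); }

    void stopDeploymentPartition() { deploymentPartitionRunning = false; }

    // Assumption: on restart, everything still pending is sent again.
    void restartDeploymentPartition() {
      deploymentPartitionRunning = true;
      for (String resource : pending.keySet()) {
        distribute(resource);
      }
    }

    void deploy(String resource) {
      if (!deploymentPartitionRunning) {
        throw new IllegalStateException("deployment partition is stopped");
      }
      deployed.get(DEPLOYMENT_PARTITION).add(resource);
      Set<Integer> others = new TreeSet<>();
      for (int p = 1; p <= partitionCount; p++) {
        if (p != DEPLOYMENT_PARTITION) {
          others.add(p);
        }
      }
      pending.put(resource, others);
      distribute(resource);
    }

    // Sends the resource to every pending partition; paused partitions stay pending.
    private void distribute(String resource) {
      Iterator<Integer> it = pending.get(resource).iterator();
      while (it.hasNext()) {
        int partition = it.next();
        if (paused.contains(partition)) {
          continue;
        }
        deployed.get(partition).add(resource);
        it.remove();
      }
    }

    boolean hasDeployment(int partition, String resource) {
      return deployed.get(partition).contains(resource);
    }

    boolean isFullyDistributed(String resource) {
      return pending.containsKey(resource) && pending.get(resource).isEmpty();
    }
  }

  public static void main(String[] args) {
    Cluster cluster = new Cluster(3);
    cluster.pause(2);                      // keep partition 2 from processing the deployment
    cluster.deploy("process.bpmn");
    System.out.println("partition 3 has deployment: " + cluster.hasDeployment(3, "process.bpmn"));
    System.out.println("partition 2 has deployment: " + cluster.hasDeployment(2, "process.bpmn"));
    cluster.stopDeploymentPartition();
    cluster.resume(2);
    cluster.restartDeploymentPartition();  // pending distribution is sent again
    System.out.println("fully distributed: " + cluster.isFullyDistributed("process.bpmn"));
  }
}
```

In this model the deployment only completes because the restart re-sends the pending distribution, which is the behavior the description says the new integration test is meant to verify.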
Related issues
closes #10731
Definition of Done
Not all items need to be done depending on the issue and the pull request.
Code changes:
backport stable/1.3
) to the PR, in case that fails you need to create backports manually.
Testing:
Documentation:
Please refer to our review guidelines.