
Fix flaky deployment recovery test #10938

Merged
2 commits merged into main from 10731_deployment_recovery_test_flaky on Nov 8, 2022

Conversation

remcowesterhoud
Contributor

Description

The MultiPartitionDeploymentRecoveryTest is flaky. After analyzing why, we came to the conclusion that this test does not currently test what is intended, which would be verifying that a deployment works after recovering from a failed deployment partition.

In order to fix this flakiness we have deleted this test and replaced it with a different integration test. This test verifies that a deployment gets redistributed in the event of a deployment partition restart. In order to achieve this we pause one of the partitions of the cluster. This makes sure we don't process the deployment before we stop the deployment partition.

Once we've verified that the third partition has finished its deployment as expected, we resume the second partition and restart the deployment partition. After this the test verifies that the deployment is completed as expected.
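For illustration, here is a rough sketch of the test flow described above. The `TestCluster` type and its pause/resume/restart/await helpers are hypothetical placeholders for the clustering test utilities in the codebase, not the actual API used in this PR; only the deploy command chain is the regular Zeebe Java client API.

```java
// Sketch of the described test flow only. TestCluster and its helpers are
// hypothetical placeholders, not the actual test utilities used in this PR.
import io.camunda.zeebe.client.ZeebeClient;
import org.junit.jupiter.api.Test;

final class DeploymentRedistributionTestSketch {

  private final TestCluster cluster = TestCluster.withPartitions(3); // hypothetical helper
  private final ZeebeClient client = cluster.newClient();            // hypothetical helper

  @Test
  void shouldRedistributeDeploymentAfterDeploymentPartitionRestart() {
    // given: partition 2 is paused so it cannot process the deployment yet
    cluster.pauseProcessing(2);

    // when: a process is deployed via the deployment partition (partition 1)
    client.newDeployResourceCommand()
        .addResourceFromClasspath("process.bpmn")
        .send()
        .join();

    // partition 3 receives and applies the distributed deployment
    cluster.awaitDeploymentDistributedTo(3);

    // resume partition 2 and restart the deployment partition
    cluster.resumeProcessing(2);
    cluster.restartDeploymentPartition();

    // then: the deployment is eventually redistributed to partition 2 as well
    cluster.awaitDeploymentDistributedTo(2);
  }
}
```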

Related issues

closes #10731

Definition of Done

Depending on the issue and the pull request, not all items need to be done.

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR; if that fails, you need to create the backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with future versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

This test was flaky. After analyzing why, we came to the conclusion that this test does not currently test what is intended, which would be verifying that a deployment works after recovering from a failed deployment partition.

We still want a test for this, but it proves difficult to do this in the engine tests. Instead, we will create a new integration test which takes care of this.
Member

@saig0 left a comment


@remcowesterhoud Looks good. 👍

I have some minor suggestions. Please have a look.

@github-actions
Contributor

github-actions bot commented Nov 8, 2022

Test Results

   949 files (-1)      949 suites (-1)    1h 38m 50s ⏱️ (+5m 39s)
 7 649 tests (-111)  7 642 ✔️ (-88)   7 💤 (±0)   0 (±0)
 7 847 runs (-111)   7 838 ✔️ (-88)   9 💤 (±0)   0 (±0)

Results for commit f43905e. ± Comparison against base commit b1bb9e1.

♻️ This comment has been updated with latest results.

This test verifies that a deployment gets redistributed in the event of a deployment partition restart. In order to achieve this we pause one of the partitions of the cluster. This makes sure we don't process the deployment before we stop the deployment partition.

Once we've verified that the third partition has finished its deployment as expected, we resume the second partition and restart the deployment partition. After this the test verifies that the deployment is completed as expected.
remcowesterhoud force-pushed the 10731_deployment_recovery_test_flaky branch from 2bf1ce8 to f43905e on November 8, 2022 14:21
@remcowesterhoud
Contributor Author

bors merge

@ghost

ghost commented Nov 8, 2022

Build succeeded:

  • Test summary

ghost merged commit 12fddeb into main on Nov 8, 2022
ghost deleted the 10731_deployment_recovery_test_flaky branch on November 8, 2022 14:45
@backport-action
Collaborator

Successfully created backport PR #10942 for stable/8.0.

@backport-action
Collaborator

Successfully created backport PR #10943 for stable/8.1.

ghost pushed a commit that referenced this pull request Nov 8, 2022
10943: [Backport stable/8.1] Fix flaky deployment recovery test r=remcowesterhoud a=backport-action

# Description
Backport of #10938 to `stable/8.1`.

relates to #10731

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
remcowesterhoud added the version:8.1.4 label (marks an issue as being completely or in parts released in 8.1.4) on Nov 22, 2022