Fix flaky MultiPartitionDeploymentLifecycleTest #10065

Merged
merged 2 commits into from
Aug 17, 2022

remcowesterhoud
Contributor

@remcowesterhoud remcowesterhoud commented Aug 15, 2022

Description

The test was flaky because it asserted that the COMPLETE commands and COMPLETED events were written in order.
This is not always the case. The COMPLETE commands are the result of partitions 2 and 3 sending the command to partition 1. There is no guarantee that partition 1 receives both commands before it starts processing them.
As long as we assert that the COMPLETE command of partition X is received before the COMPLETED event of partition X, the test should be accurate without being flaky.
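
The per-partition check described above can be sketched as follows. This is only an illustration under assumptions: the `Entry` class, the string intents, and the `completeBeforeCompleted` helper are hypothetical stand-ins and not the test's actual code or the Zeebe test API. It shows that both possible interleavings on partition 1 satisfy the relaxed assertion.

```java
import java.util.List;

public class PerPartitionOrdering {

  // Hypothetical, simplified stand-in for an entry written on partition 1.
  static final class Entry {
    final String intent;   // "COMPLETE" (command) or "COMPLETED" (event)
    final int partitionId; // the partition the entry refers to

    Entry(String intent, int partitionId) {
      this.intent = intent;
      this.partitionId = partitionId;
    }
  }

  // Position of the first entry with the given intent and partition, or -1 if absent.
  static int indexOf(List<Entry> log, String intent, int partitionId) {
    for (int i = 0; i < log.size(); i++) {
      Entry e = log.get(i);
      if (e.intent.equals(intent) && e.partitionId == partitionId) {
        return i;
      }
    }
    return -1;
  }

  // True if both entries exist and COMPLETE precedes COMPLETED for this partition.
  static boolean completeBeforeCompleted(List<Entry> log, int partitionId) {
    int command = indexOf(log, "COMPLETE", partitionId);
    int event = indexOf(log, "COMPLETED", partitionId);
    return command >= 0 && command < event;
  }

  public static void main(String[] args) {
    // Both commands arrive before partition 1 starts processing.
    List<Entry> batched = List.of(
        new Entry("COMPLETE", 2), new Entry("COMPLETE", 3),
        new Entry("COMPLETED", 2), new Entry("COMPLETED", 3));
    // Partition 2's command is processed before partition 3's command arrives.
    List<Entry> interleaved = List.of(
        new Entry("COMPLETE", 2), new Entry("COMPLETED", 2),
        new Entry("COMPLETE", 3), new Entry("COMPLETED", 3));

    for (int partition : new int[] {2, 3}) {
      System.out.println(completeBeforeCompleted(batched, partition));
      System.out.println(completeBeforeCompleted(interleaved, partition));
    }
  }
}
```

An assertion on the global sequence would accept only one of the two logs, while the per-partition check accepts both and still rejects a COMPLETED event that appears without a preceding COMPLETE command for the same partition.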

Related issues

closes #9964

Definition of Done

Not all items need to be done depending on the issue and the pull request.

Code changes:

  • The changes are backwards compatible with previous versions
  • If it fixes a bug then PRs are created to backport the fix to the last two minor versions. You can trigger a backport by assigning labels (e.g. backport stable/1.3) to the PR. If that fails, you need to create the backports manually.

Testing:

  • There are unit/integration tests that verify all acceptance criteria of the issue
  • New tests are written to ensure backwards compatibility with further versions
  • The behavior is tested manually
  • The change has been verified by a QA run
  • The impact of the changes is verified by a benchmark

Documentation:

  • The documentation is updated (e.g. BPMN reference, configuration, examples, get-started guides, etc.)
  • New content is added to the release announcement
  • If the PR changes how BPMN processes are validated (e.g. support new BPMN element) then the Camunda modeling team should be informed to adjust the BPMN linting.

Please refer to our review guidelines.

@remcowesterhoud remcowesterhoud marked this pull request as ready for review August 15, 2022 10:17
@github-actions
Contributor

github-actions bot commented Aug 15, 2022

Test Results

847 files (+1), 847 suites (+1), duration 1h 41m 19s ⏱️ (+2m 41s)
6 571 tests (+210): 6 559 ✔️ (+209), 11 💤 (±0), 1 failed (+1)
6 755 runs (+210): 6 743 ✔️ (+209), 11 💤 (±0), 1 failed (+1)

For more details on these failures, see this check.

Results for commit 907b0e4. ± Comparison against base commit 403c25a.

♻️ This comment has been updated with latest results.

@remcowesterhoud
Contributor Author

@pihme Just pinging you to make sure you've seen this PR

@pihme
Contributor

pihme commented Aug 17, 2022

Thanks for the reminder. I wasn't aware of it.

Contributor

@pihme pihme left a comment


Great improvement, but I think it can be improved further

@pihme
Contributor

pihme commented Aug 17, 2022

Oh, and maybe we want to backport this

@remcowesterhoud
Contributor Author

@pihme I have removed all the hardcoded indexes, please have another look!

@remcowesterhoud
Contributor Author

We can backport it. The test wasn't flaky in previous versions though, as the flakiness was introduced with the new inter-partition command sending.

Contributor

@pihme pihme left a comment


lgtm 🎉

@remcowesterhoud
Contributor Author

bors merge

@zeebe-bors-camunda
Contributor

Build succeeded:

@zeebe-bors-camunda zeebe-bors-camunda bot merged commit 099d016 into main Aug 17, 2022
@zeebe-bors-camunda zeebe-bors-camunda bot deleted the 9964_falky_test branch August 17, 2022 10:21
@backport-action
Collaborator

Backport failed for stable/1.3, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally.

git fetch origin stable/1.3
git worktree add -d .worktree/backport-10065-to-stable/1.3 origin/stable/1.3
cd .worktree/backport-10065-to-stable/1.3
git checkout -b backport-10065-to-stable/1.3
ancref=$(git merge-base 396f1854c7da6ae6866897200af34b797379b85d 369e5a1edbcdcf08cca41b464faa7aa3602ce72b)
git cherry-pick -x $ancref..369e5a1edbcdcf08cca41b464faa7aa3602ce72b

@backport-action
Collaborator

Successfully created backport PR #10091 for stable/8.0.

zeebe-bors-camunda bot added a commit that referenced this pull request Aug 17, 2022
10091: [Backport stable/8.0] Fix flaky MultiPartitionDeploymentLifecycleTest r=remcowesterhoud a=backport-action

# Description
Backport of #10065 to `stable/8.0`.

relates to #9964

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
zeebe-bors-camunda bot added a commit that referenced this pull request Aug 17, 2022
10098: Backport 10065 to stable/1.3 r=remcowesterhoud a=remcowesterhoud

# Description
Backport of #10065 to `stable/1.3`.

relates to #9964

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
zeebe-bors-camunda bot added a commit that referenced this pull request Aug 23, 2022
10091: [Backport stable/8.0] Fix flaky MultiPartitionDeploymentLifecycleTest r=remcowesterhoud a=backport-action

# Description
Backport of #10065 to `stable/8.0`.

relates to #9964

Co-authored-by: Remco Westerhoud <remco@westerhoud.nl>
@saig0 saig0 added release/8.0.8 version:1.3.14 Marks an issue as being completely or in parts released in 1.3.14 labels Sep 1, 2022
Labels
version:1.3.14 Marks an issue as being completely or in parts released in 1.3.14
Development

Successfully merging this pull request may close these issues.

Flaky MultiPartitionDeploymentLifecycleTest.shouldTestLifecycle
4 participants