ReplayStateRandomizedPropertyTest failed on message event subprocess #7778
Comments
I tried to reproduce it locally with the given seed. On version The test fails on version Edit: 🤔 a seed from a different version may result in a different test case. If the process generator changes across versions, then it may create a different process for the same seed. |
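To illustrate the point about seeds and generator versions, here is a minimal sketch. The class `SeedVersionDemo`, its two element tables, and the `generate` method are hypothetical stand-ins invented for this example; they are not the actual process generator. The sketch assumes a generator that maps seeded random draws onto a table of element types, and shows that changing the table between versions changes the generated process even though the seed and the random draws stay the same.

```java
import java.util.Random;

class SeedVersionDemo {
  // Hypothetical element tables for two generator versions (assumption, not the real generator).
  // V2 is a rotation of V1, so every index maps to a different element name.
  static final String[] ELEMENTS_V1 = {"serviceTask", "messageEvent", "subprocess"};
  static final String[] ELEMENTS_V2 = {"subprocess", "serviceTask", "messageEvent"};

  // Builds a toy "process" of four elements from a seed and an element table.
  static String generate(long seed, String[] elements) {
    Random random = new Random(seed); // same seed => same sequence of draws
    StringBuilder process = new StringBuilder();
    for (int i = 0; i < 4; i++) {
      if (i > 0) {
        process.append(" -> ");
      }
      process.append(elements[random.nextInt(elements.length)]);
    }
    return process.toString();
  }

  public static void main(String[] args) {
    long seed = 3582242324797643651L;
    // Same seed and same version: reproducible.
    System.out.println("v1: " + generate(seed, ELEMENTS_V1));
    System.out.println("v1: " + generate(seed, ELEMENTS_V1));
    // Same seed, different version: a different process.
    System.out.println("v2: " + generate(seed, ELEMENTS_V2));
  }
}
```

Under this assumption, a seed only identifies a test case together with the generator version that produced it, which is why a seed reported on one branch may not reproduce the failure on another.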
Let's put aside the issue of the generator versions (which, yeah, we should talk about 😅) - do we suspect this is a real bug that still occurs in 1.2.0? And what would be the impact? |
I checked the test case again. It is a bug in the process generator. The event subprocess should be triggered, but the process instance has already passed all wait states.
|
Please also look at the info/cases provided in #8266 |
I did some more digging into this test. The only way I could reproduce this was by checking out the stable 1.0 branch and using the seeds mentioned in this issue.
Since it's been a while since this specific issue occurred, and the cases reported by Nico and Ole are not reproducible (and don't seem to be flaky), there's not much else I can do. If this issue occurs again, we should tackle it sooner so that we don't run into the same reproduction challenges. Closing as discussed with @npepinpe |
It happened again I think, or at least something quite similar. I've saved the build forever, so please delete it once done: https://ci.zeebe.camunda.cloud/job/camunda-cloud/job/zeebe/job/staging/1517/
Test and seed: shouldRestoreStateAtEachStepInExecution[TestDataRecord{processSeed=3582242324797643651, executionPathSeed=-4134109124413832626}] – ReplayStateRandomizedPropertyTest(random-testrun)
Failures
Logs
|
EDIT: Please ignore, it works as expected |
This is then a bug, correct? Could you evaluate its severity so we can triage it as such? |
Ah no sorry, it's clearly Friday. The |
Potentially interesting diff is the
It feels like we may have a race condition in when we collect the state to compare. If so, it makes sense that the replayed state is a bit further along than the processing state. It would also mean that this is not a bug, but just a flaky test.

The test code seems to match this idea: it pauses the engine (which in turn pauses the stream processor), but doesn't verify that the stream processor has fully paused before collecting the state. Pausing the stream processor only means that it will finish processing the current command and then halt at the point where it would normally read the next command. I think this could easily cause problems depending on how much of the command was already processed (some state changes applied, but not yet all), assuming the state collection happens on the same db transaction. I think it does (only the query API uses its own separate transaction context).

So I see 2 solutions:
I'll end my day here, but just wanted to quickly note my findings. |
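The race described in the comment above can be sketched in isolation. The following is a toy model, not Zeebe code: `ToyProcessor`, `requestPause`, `awaitPaused`, and `collectState` are hypothetical names, and the sketch assumes that each command applies three state changes and that a pause request only takes effect between commands. It shows one possible remedy implied by the analysis, namely waiting until the processor has confirmed that it is paused before collecting state; it is not necessarily one of the two solutions the comment refers to.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

class PauseBeforeCollect {
  static final int CHANGES_PER_COMMAND = 3;

  /** Hypothetical stand-in for a stream processor (assumption, not Zeebe's class). */
  static final class ToyProcessor implements Runnable {
    private final AtomicBoolean pauseRequested = new AtomicBoolean(false);
    private final CountDownLatch paused = new CountDownLatch(1);
    private final AtomicInteger stateChanges = new AtomicInteger();
    private final int commands;

    ToyProcessor(int commands) {
      this.commands = commands;
    }

    @Override
    public void run() {
      // The pause flag is only checked between commands, so the current
      // command always finishes all of its state changes first.
      for (int c = 0; c < commands && !pauseRequested.get(); c++) {
        for (int i = 0; i < CHANGES_PER_COMMAND; i++) {
          stateChanges.incrementAndGet();
          sleepBriefly();
        }
      }
      paused.countDown(); // signal that no command is in flight anymore
    }

    void requestPause() {
      pauseRequested.set(true);
    }

    boolean awaitPaused(long timeout, TimeUnit unit) throws InterruptedException {
      return paused.await(timeout, unit);
    }

    int collectState() {
      return stateChanges.get();
    }

    private static void sleepBriefly() {
      try {
        Thread.sleep(1);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  /** Requests a pause, waits for it to take effect, and only then collects state. */
  static int collectAfterFullPause() {
    ToyProcessor processor = new ToyProcessor(1000);
    Thread worker = new Thread(processor);
    worker.start();
    try {
      Thread.sleep(5); // let some commands run
      processor.requestPause();
      // Collecting here, without the wait below, could observe a half-applied command.
      if (!processor.awaitPaused(5, TimeUnit.SECONDS)) {
        throw new IllegalStateException("processor did not pause in time");
      }
      return processor.collectState();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // Always a multiple of CHANGES_PER_COMMAND because only whole commands are observed.
    System.out.println("state changes: " + collectAfterFullPause());
  }
}
```

In this model, the collected count is always a whole number of commands once `awaitPaused` has returned, whereas reading the counter right after `requestPause` could land in the middle of a command.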
@korthout let's talk about that next week |
Possibly another case: https://ci.zeebe.camunda.cloud/job/camunda-cloud/job/zeebe/job/zell-merge-appender/3/ |
8702: [Backport stable/1.3] Unflake ReplayStateRandomizedPropertyTest r=saig0 a=korthout
## Description
Backports #8698 to `stable/1.3`. Was manually cherry-picked, but no conflicts had to be resolved.
## Related issues
closes #7778
Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
Co-authored-by: Nico Korthout <korthout@users.noreply.github.com>
8703: [Backport stable/1.2] Unflake ReplayStateRandomizedPropertyTest r=saig0 a=korthout
## Description
Backports #8698 to `stable/1.2`. Was manually cherry-picked, conflicts needed to be resolved in a9e2096.
## Related issues
closes #7778
Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
Co-authored-by: Nico Korthout <korthout@users.noreply.github.com>
Summary
Failures
shouldRestoreStateAtEachStepInExecution[TestDataRecord{processSeed=5617794289662345787, executionPathSeed=5385515384750986693}] – ReplayStateRandomizedPropertyTest(java-testrun)
Hypotheses
Logs