Possible Race condition on canceling a workflow instance #4352
Labels
kind/bug
Categorizes an issue or PR as a bug
scope/broker
Marks an issue or PR to appear in the broker section of the changelog
severity/mid
Marks a bug as having a noticeable impact but with a known workaround
support
Marks an issue as related to a customer support request
Describe the bug
A cloud user reported that Operate seemed to get out of sync. The real problem, however, was that the workflow instance got stuck while it was being canceled.
Imagine the following workflow:
Task B is completed, and while it is completing and taking the next sequence flow, the user cancels the workflow instance. The cancellation may then fail to clean up all scopes correctly, and the instance gets stuck. In our case only Task A and the sub process were terminated correctly. The multi-instance and the workflow instance were still alive.
To Reproduce
This is also reproducible with an engine unit test and the following process:
multiBug.bpmn.txt
Test
Be aware that this is a race condition, which means that the test might not fail on the first try.
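Because the failure depends on timing, one way to surface it is to run the same scenario repeatedly until it fails. The following is a minimal sketch in plain Java. `RepeatUntilFailure` and `firstFailingAttempt` are hypothetical names, not part of the engine or of JUnit, and the scenario body (deploying the process, completing Task B, canceling the instance) is assumed to be supplied by the caller.

```java
import java.util.OptionalInt;

public final class RepeatUntilFailure {

  private RepeatUntilFailure() {}

  /**
   * Runs the scenario up to maxAttempts times.
   * Returns the 1-based attempt number of the first run that threw,
   * or an empty OptionalInt if every run passed.
   */
  public static OptionalInt firstFailingAttempt(int maxAttempts, Runnable scenario) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        scenario.run();
      } catch (RuntimeException | AssertionError e) {
        // The scenario failed on this attempt; report it and stop.
        return OptionalInt.of(attempt);
      }
    }
    return OptionalInt.empty();
  }
}
```

An empty result only means the race was not hit within the given number of attempts. It does not show that the bug is absent.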
Expected behavior
The workflow instance can be terminated without any problems.
Log/Stacktrace
We extracted the records from the failed scenario. You can find them in
records.txt
We see that only the task and the sub process are terminated, and that the sequence flow after Task B is taken. The same sequence flow actually seems to be taken twice, but with different scope ids, which may be related to the problem. Be aware that we can't share the actual BPMN process here, in order to protect our user. This means the records above don't match the model shown in the example.
However, if we run the test, we see similar output:
output-test.txt
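To check the observation that one sequence flow appears to be taken twice under different scope ids, the records can be grouped by element id. The following is a rough sketch only. The `Rec` class is a hypothetical, simplified stand-in for an exported record, and the field names `intent`, `elementId` and `flowScopeKey` as well as the intent string `SEQUENCE_FLOW_TAKEN` are assumptions about the record layout, since the attached files are not reproduced here.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public final class DuplicateFlowCheck {

  private DuplicateFlowCheck() {}

  /** Hypothetical, simplified view of one record (assumed layout). */
  public static final class Rec {
    final String intent;
    final String elementId;
    final long flowScopeKey;

    public Rec(String intent, String elementId, long flowScopeKey) {
      this.intent = intent;
      this.elementId = elementId;
      this.flowScopeKey = flowScopeKey;
    }
  }

  /**
   * Returns, for every sequence flow that was taken under more than one
   * scope, the set of distinct scope keys it was taken under.
   */
  public static Map<String, Set<Long>> flowsTakenInSeveralScopes(List<Rec> records) {
    Map<String, Set<Long>> scopesByFlow = new TreeMap<>();
    for (Rec r : records) {
      // Only "taken" events of sequence flows are of interest here.
      if ("SEQUENCE_FLOW_TAKEN".equals(r.intent)) {
        scopesByFlow.computeIfAbsent(r.elementId, k -> new TreeSet<>()).add(r.flowScopeKey);
      }
    }
    // Drop flows that were only ever taken under a single scope.
    scopesByFlow.values().removeIf(scopes -> scopes.size() < 2);
    return scopesByFlow;
  }
}
```

A non-empty result would match what we see in the records, but it does not by itself explain why the multi-instance and the workflow instance stay alive.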
Environment: