IllegalStateException when writing decision evaluation event #9272

remcowesterhoud · 2022-05-02T15:08:18Z

Describe the bug

When the engine tries to write the decision evaluation event, an IllegalStateException is thrown. This happens because searching for decisions by decision requirements key returns multiple results with the same decision id:

    final var decisionKeysByDecisionId =
        decisionState
            .findDecisionsByDecisionRequirementsKey(decision.getDecisionRequirementsKey())
            .stream()
            .collect(
                Collectors.toMap(
                    persistedDecision -> bufferAsString(persistedDecision.getDecisionId()),
                    DecisionInfo::new));

These duplicate decision ids cause the toMap collector to fail, as no merge function is provided.

The decisions that are found all have different versions.
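The failure can be reproduced outside the engine. The following is a minimal sketch, not the engine code: the `Decision` record is a hypothetical stand-in for the persisted decision, and only the decision id, keys, and versions are taken from the stack trace in this report.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DuplicateKeyDemo {

  // Hypothetical stand-in for a persisted decision; not the real Zeebe type.
  record Decision(String decisionId, long key, int version) {}

  // Same shape as the failing code: two-argument toMap, so no merge function.
  static Map<String, Decision> indexByDecisionId(List<Decision> decisions) {
    return decisions.stream()
        .collect(Collectors.toMap(Decision::decisionId, d -> d));
  }

  public static void main(String[] args) {
    // Two decisions that share an id but differ in key and version.
    List<Decision> found =
        List.of(
            new Decision("translate", 2251799813685250L, 1),
            new Decision("translate", 2251799813685255L, 3));
    try {
      indexByDecisionId(found);
    } catch (IllegalStateException e) {
      // toMap rejects the second "translate" entry with a duplicate key error.
      System.out.println(e.getMessage());
    }
  }
}
```

The two-argument `Collectors.toMap` treats any repeated key as an error, so it is only safe when the decision ids in the stream are guaranteed to be unique.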

To Reproduce

This issue was a challenge to reproduce, but I found a way. It requires two DRDs that both contain a decision with the same id, and a process that contains a business rule task referencing this decision id.

Repro files.zip

Next follow these steps:

1. Deploy translateDay.dmn
2. Deploy translateMonth.dmn
3. Without making any changes, redeploy translateDay.dmn
4. Deploy translateProcess.dmn
5. Start a PI: zbctl create instance translateProcess --insecure --variables '{"day":"monday","month":"april"}'

At this point an exception should be thrown.

Expected behavior

No exception should occur.

Log/Stacktrace

Full Stacktrace

java.lang.IllegalStateException: Duplicate key translate (attempted merging values DecisionInfo[key=2251799813685250, version=1] and DecisionInfo[key=2251799813685255, version=3])
	at java.util.stream.Collectors.duplicateKeyException(Collectors.java:135) ~[?:?]
	at java.util.stream.Collectors.lambda$uniqKeysMapAccumulator$1(Collectors.java:182) ~[?:?]
	at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169) ~[?:?]
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1625) ~[?:?]
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509) ~[?:?]
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499) ~[?:?]
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921) ~[?:?]
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:682) ~[?:?]
	at io.camunda.zeebe.engine.processing.bpmn.behavior.BpmnDecisionBehavior.writeDecisionEvaluationEvent(BpmnDecisionBehavior.java:233) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.behavior.BpmnDecisionBehavior.lambda$evaluateDecision$3(BpmnDecisionBehavior.java:114) ~[classes/:?]
	at io.camunda.zeebe.util.Either$Right.flatMap(Either.java:366) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.behavior.BpmnDecisionBehavior.evaluateDecision(BpmnDecisionBehavior.java:109) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor$CalledDecisionBehavior.lambda$onActivate$0(BusinessRuleTaskProcessor.java:89) ~[classes/:?]
	at io.camunda.zeebe.util.Either$Right.flatMap(Either.java:366) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor$CalledDecisionBehavior.onActivate(BusinessRuleTaskProcessor.java:89) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor.onActivate(BusinessRuleTaskProcessor.java:40) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor.onActivate(BusinessRuleTaskProcessor.java:21) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processEvent$2(BpmnStreamProcessor.java:128) ~[classes/:?]
	at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:381) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processEvent(BpmnStreamProcessor.java:127) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processRecord$0(BpmnStreamProcessor.java:110) ~[classes/:?]
	at io.camunda.zeebe.util.Either$Right.ifRightOrLeft(Either.java:381) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processRecord(BpmnStreamProcessor.java:107) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.TypedRecordProcessor.processRecord(TypedRecordProcessor.java:58) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.lambda$processInTransaction$3(ProcessingStateMachine.java:300) ~[classes/:?]
	at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run(ZeebeTransaction.java:84) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.processInTransaction(ProcessingStateMachine.java:290) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.processCommand(ProcessingStateMachine.java:253) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.tryToReadNextRecord(ProcessingStateMachine.java:213) ~[classes/:?]
	at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.readNextRecord(ProcessingStateMachine.java:189) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorJob.invoke(ActorJob.java:79) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorJob.execute(ActorJob.java:44) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorTask.execute(ActorTask.java:122) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask(ActorThread.java:97) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorThread.doWork(ActorThread.java:80) ~[classes/:?]
	at io.camunda.zeebe.util.sched.ActorThread.run(ActorThread.java:189) ~[classes/:?]

Environment:

OS: Cloud
Zeebe Version: Seen on 8.0.0, also happens on latest main

remcowesterhoud · 2022-05-02T15:16:40Z

My hypothesis on what happens:

1. translateDay is deployed. A new key is generated for the DRG. The translate decision is deployed with version 1.
2. translateMonth is deployed. A new key is generated for the DRG. The translate decision already exists, so the version gets incremented to version 2.
3. translateDay gets redeployed. There were no changes, so the DRG keeps the same key. The translate decision is different from the one in version 2, so the version gets incremented to version 3.
4. When we search for the decisions by the DRG key, we find two, because the key didn't change on the redeploy: versions 1 and 3.
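The hypothesis can be sketched as a toy model. Everything in it is an assumption made for illustration: the class and method names are hypothetical, a DRG is identified by its resource name, and a decision gets a new version whenever the latest version of its id belongs to a different DRG key. It does not reproduce the engine's actual deployment logic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DeploymentHypothesis {

  // Hypothetical model of a deployed decision.
  record Decision(String decisionId, int version, long drgKey) {}

  private long nextKey = 1;
  private final Map<String, Long> drgKeyByResource = new HashMap<>();
  private final Map<String, Decision> latestByDecisionId = new HashMap<>();
  private final List<Decision> decisions = new ArrayList<>();

  void deploy(String resourceName, String decisionId) {
    // Assumption: an unchanged resource keeps its DRG key on redeploy.
    Long existing = drgKeyByResource.get(resourceName);
    long drgKey = existing != null ? existing : nextKey++;
    drgKeyByResource.put(resourceName, drgKey);

    Decision latest = latestByDecisionId.get(decisionId);
    if (latest != null && latest.drgKey() == drgKey) {
      return; // latest version already belongs to this DRG: treated as a duplicate
    }
    // Otherwise the decision counts as changed and gets the next version.
    int version = latest == null ? 1 : latest.version() + 1;
    Decision deployed = new Decision(decisionId, version, drgKey);
    decisions.add(deployed);
    latestByDecisionId.put(decisionId, deployed);
  }

  // Versions of all decisions stored under the DRG key of the given resource.
  List<Integer> versionsForResource(String resourceName) {
    long drgKey = drgKeyByResource.get(resourceName);
    return decisions.stream()
        .filter(d -> d.drgKey() == drgKey)
        .map(Decision::version)
        .toList();
  }

  public static void main(String[] args) {
    DeploymentHypothesis state = new DeploymentHypothesis();
    state.deploy("translateDay.dmn", "translate");
    state.deploy("translateMonth.dmn", "translate");
    state.deploy("translateDay.dmn", "translate");
    System.out.println(state.versionsForResource("translateDay.dmn")); // prints [1, 3]
  }
}
```

Under these assumptions, the lookup by DRG key returns two decisions with the same id, which matches the versions 1 and 3 reported in the stack trace.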

It is a somewhat strange situation that occurs when multiple decisions share the same id. I see a couple of solutions here:

1. We don't allow a DRG to be deployed if it contains decisions with an id that's already being used. This causes unpredictable behaviour anyway, as it's unclear which of the decisions gets called. It seems like it is always the latest deployed (highest version).
2. We generate a new key for the DRG when a change in one of the decisions has been detected.
3. We add a merge function which grabs the latest version available.
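As an illustration of the merge-function option, here is a minimal sketch that keeps the highest version per decision id. The `DecisionInfo` record below is a simplified, hypothetical stand-in with an extra `decisionId` field; it is not the engine's `DecisionInfo`, and the method name is made up for this example.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class LatestVersionMerge {

  // Hypothetical, simplified stand-in; not the engine's DecisionInfo.
  record DecisionInfo(String decisionId, long key, int version) {}

  // Three-argument toMap: on a duplicate id, keep the entry with the higher version.
  static Map<String, DecisionInfo> latestByDecisionId(List<DecisionInfo> decisions) {
    return decisions.stream()
        .collect(
            Collectors.toMap(
                DecisionInfo::decisionId,
                d -> d,
                (a, b) -> a.version() >= b.version() ? a : b));
  }

  public static void main(String[] args) {
    List<DecisionInfo> found =
        List.of(
            new DecisionInfo("translate", 2251799813685250L, 1),
            new DecisionInfo("translate", 2251799813685255L, 3));
    // Only the version 3 entry remains for "translate".
    System.out.println(latestByDecisionId(found).get("translate").version()); // prints 3
  }
}
```

This avoids the exception without touching key or version generation, but it silently hides the older decision that is still stored under the same DRG key.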

Thoughts @korthout & @saig0?

remcowesterhoud · 2022-05-02T15:19:04Z

Marked as mid severity as it is an edge-case that's easily resolvable if it does occur.

korthout · 2022-05-03T14:01:21Z

Nice work figuring out what happened @remcowesterhoud 🎉 I really like it

> We don't allow a DRG to be deployed if it contains decisions with an id that's already being used. This causes unpredictable behaviour anyway as it's unclear which of the decision gets called. It seems like it is always the latest deployed (highest version).

I don't think it's unpredictable behavior. We wanted it to evaluate the latest deployed version of the decision with that decision id. This is similar to a call activity, which starts an instance of the latest deployed process with the given process id. I understand that these are somewhat different because the decision is inside another layer (the DRG), but I think that keeping the behavior aligned between processes and decisions keeps things easiest to understand for users.

In addition, we should not reject DRGs with decision ids that are already in use, because accepting them allows a user to replace an existing decision, which might be useful as long as we don't have an API to undeploy decisions.

> We generate a new key for the DRG when a change in one of the decisions has been detected.

Again, I would align with Process deployments. Please correct me if I'm wrong, but I think deploying a process a second time with some differences leads to the same key with a different version.

> We add a merge function which grabs the latest version available.

I think we should do this as it fixes the problem without changing the key and version logic 👍

saig0 · 2022-05-04T03:28:34Z

> Please correct me if I'm wrong, but I think deploying a process a second time with some differences leads to the same key with a different version.

No. The key of a process is unique. A new version of the process gets a new key.

remcowesterhoud · 2022-05-04T07:54:57Z

@saig0 Would you prefer option 2 over option 3 in this case?

saig0 · 2022-05-04T11:21:22Z

> Would you prefer option 2 over option 3 in this case?

Yes. I prefer to generate a new key and increase the version of the DRG if the version of one of the containing decisions is increased.

9475: [Backport stable/8.0] Increase DRG version if another DRG was deployed with the same decisions r=saig0 a=backport-action

Description: Backport of #9466 to `stable/8.0`. Relates to #9272.

Co-authored-by: Philipp Ossler <philipp.ossler@gmail.com>

remcowesterhoud added the kind/bug, area/reliability, and team/process-automation labels May 2, 2022

remcowesterhoud added the severity/mid label May 2, 2022

npepinpe added this to the 8.1 milestone May 3, 2022

menski assigned saig0 May 13, 2022

saig0 mentioned this issue May 31, 2022: Increase DRG version if another DRG was deployed with the same decisions #9466 (merged)

zeebe-bors-camunda bot closed this as completed in e848940 Jun 1, 2022

backport-action mentioned this issue Jun 1, 2022: [Backport stable/8.0] Increase DRG version if another DRG was deployed with the same decisions #9475 (merged)

remcowesterhoud added the version:8.1.0-alpha2 label Jun 7, 2022

npepinpe added the release/8.0.4 label Jul 4, 2022

Zelldon added the version:8.1.0 label Oct 4, 2022
