Batch processing might drop responses #11848

Closed
korthout opened this issue Feb 28, 2023 · 4 comments · Fixed by #11865
Labels: area/observability, area/reliability, component/stream-platform, kind/bug, severity/high, target:8.2, version:8.1.10, version:8.2.0

Comments

@korthout
Member

korthout commented Feb 28, 2023

Describe the bug

I encountered this bug when a test in zeebe-process-test became flaky.

The test creates a process instance and waits for the result. The process instance can only be completed by publishing a message. Depending on which command arrives at the engine first, the publish message RPC may or may not receive a response.

After talking to @Zelldon and @oleschoenburg about this, we believe this problem arises because of batch processing. The implementation assumes that only a single response will be written by the Engine in a batch of processed commands. This assumption is invalid.
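
To make the suspected failure mode concrete, here is a minimal sketch in plain Java. It is not Zeebe's code: `SingleSlotBatchResult`, `Response`, the request ids, and the payload strings are all hypothetical and only model the assumption that a batch writes a single response. The order of the two writes is one plausible reading of the record stream in this report.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the flawed assumption; none of these types exist in Zeebe. */
public class SingleResponseSlotSketch {

  /** A response addressed to the client request that sent a command. */
  static final class Response {
    final long requestId;
    final String payload;

    Response(long requestId, String payload) {
      this.requestId = requestId;
      this.payload = payload;
    }
  }

  /** Batch result with room for exactly one response. */
  static final class SingleSlotBatchResult {
    private Response response;

    void writeResponse(Response newResponse) {
      // A second write silently replaces the first one.
      response = newResponse;
    }

    List<Response> responsesToSend() {
      List<Response> toSend = new ArrayList<>();
      if (response != null) {
        toSend.add(response);
      }
      return toSend;
    }
  }

  /** Processes two client commands in one batch and returns the responses that get sent. */
  static List<Response> processBatch() {
    SingleSlotBatchResult result = new SingleSlotBatchResult();
    // Request 2 (publish message) is processed and answered ...
    result.writeResponse(new Response(2L, "message published"));
    // ... then its follow-up commands complete the process instance in the same batch,
    // which writes the awaited result for request 1 (create instance with result).
    result.writeResponse(new Response(1L, "process instance result"));
    return result.responsesToSend();
  }

  public static void main(String[] args) {
    for (Response sent : processBatch()) {
      System.out.println("request " + sent.requestId + " <- " + sent.payload);
    }
  }
}
```

In this model only request 1 is answered. The response for request 2 is overwritten, so its caller would wait until the request deadline.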

To Reproduce

@Test
void shouldPublishMessage() {
  // given
  zeebeClient
      .newDeployCommand()
      .addProcessModel(
          Bpmn.createExecutableProcess("process")
              .startEvent()
              .intermediateCatchEvent()
              .message(message -> message.name("a").zeebeCorrelationKeyExpression("key"))
              .endEvent()
              .done(),
          "process.bpmn")
      .send()
      .join();

  final ZeebeFuture<ProcessInstanceResult> processInstanceResult =
      zeebeClient
          .newCreateInstanceCommand()
          .bpmnProcessId("process")
          .latestVersion()
          .variables(Map.of("key", "key-1"))
          .withResult()
          .send();

  // when
  zeebeClient
      .newPublishMessageCommand()
      .messageName("a")
      .correlationKey("key-1")
      .variables(Map.of("message", "correlated"))
      .send()
      // test will fail here due to timeout (no response is written)
      .join();

  // then
  assertThat(processInstanceResult.join().getVariablesAsMap())
      .containsEntry("message", "correlated");


  RecordStream.of(zeebeEngine.getRecordStreamSource()).print(true);
}

Expected behavior

🏗️ With batch processing enabled, a call to publish a message should receive a response, even while a call to create a process instance with result is being processed. That is, batch processing should allow sending multiple responses (at most one per processed command).
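
The expected behavior can be sketched as a batch result that keeps one response per client request instead of a single slot. This is a hypothetical plain-Java model, not the change merged in #11865: `BatchResult`, the request ids, and the payload strings are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model of the expected behavior; not the change merged in #11865. */
public class MultiResponseBatchSketch {

  /** Batch result that keeps one response per client request. */
  static final class BatchResult {
    private final Map<Long, String> responsesByRequestId = new LinkedHashMap<>();

    void writeResponse(long requestId, String payload) {
      String previous = responsesByRequestId.putIfAbsent(requestId, payload);
      if (previous != null) {
        // Enforce "at most one response per processed command".
        throw new IllegalStateException("request " + requestId + " already has a response");
      }
    }

    Map<Long, String> responsesToSend() {
      // Return a copy so callers cannot modify the batch result.
      return new LinkedHashMap<>(responsesByRequestId);
    }
  }

  /** Processes two client commands in one batch and returns every response written. */
  static Map<Long, String> processBatch() {
    BatchResult result = new BatchResult();
    // Request 2: the publish message command is acknowledged.
    result.writeResponse(2L, "message published");
    // Request 1: the process instance completes in the same batch and returns its result.
    result.writeResponse(1L, "process instance result");
    return result.responsesToSend();
  }

  public static void main(String[] args) {
    processBatch().forEach((requestId, payload) ->
        System.out.println("request " + requestId + " <- " + payload));
  }
}
```

Keying by request id means both callers are answered, while a second response for the same request is rejected rather than silently replacing the first.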

🔧 We should have a test for this case in Zeebe, not just in zeebe-process-test.

Log/Stacktrace

Full Stacktrace

===== Test failed! Printing records from the stream:
[main] INFO io.camunda.zeebe.process.test.filters.logger.IncidentLogger - 
[main] INFO io.camunda.zeebe.process.test.filters.logger.RecordStreamLogger - 
The following records have been recorded during this test:
| COMMAND             DEPLOYMENT                         CREATE                        | (Processes: [process.bpmn])
| EVENT               PROCESS                            CREATED                       | (Process: process.bpmn)
| EVENT               DEPLOYMENT                         CREATED                       | (Processes: [process.bpmn])
| EVENT               DEPLOYMENT                         FULLY_DISTRIBUTED             | 
| COMMAND             PROCESS_INSTANCE_CREATION          CREATE_WITH_AWAITING_RESULT   | (Process id: process), (Variables: [key -> key-1]), (default start)
| COMMAND             MESSAGE                            PUBLISH                       | (Message name: a), (Correlation key: key-1), (Variables: [message -> correlated])
| EVENT               VARIABLE                           CREATED                       | (Name: key), (Value: "key-1")
| COMMAND             PROCESS_INSTANCE                   ACTIVATE_ELEMENT              | (Element id: process), (Element type: PROCESS), (Event type: UNSPECIFIED), (Process id: process)
| EVENT               PROCESS_INSTANCE_CREATION          CREATED                       | (Process id: process), (Variables: [key -> key-1]), (default start)
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATING            | (Element id: process), (Element type: PROCESS), (Event type: UNSPECIFIED), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATED             | (Element id: process), (Element type: PROCESS), (Event type: UNSPECIFIED), (Process id: process)
| COMMAND             PROCESS_INSTANCE                   ACTIVATE_ELEMENT              | (Element id: startEvent_c1396063-2126-4e7d-9b23-728433976c77), (Element type: START_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATING            | (Element id: startEvent_c1396063-2126-4e7d-9b23-728433976c77), (Element type: START_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATED             | (Element id: startEvent_c1396063-2126-4e7d-9b23-728433976c77), (Element type: START_EVENT), (Event type: NONE), (Process id: process)
| COMMAND             PROCESS_INSTANCE                   COMPLETE_ELEMENT              | (Element id: startEvent_c1396063-2126-4e7d-9b23-728433976c77), (Element type: START_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETING            | (Element id: startEvent_c1396063-2126-4e7d-9b23-728433976c77), (Element type: START_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETED             | (Element id: startEvent_c1396063-2126-4e7d-9b23-728433976c77), (Element type: START_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   SEQUENCE_FLOW_TAKEN           | (Element id: sequenceFlow_044ea2f3-d53b-42d3-80b0-12a42e8c0fb9), (Element type: SEQUENCE_FLOW), (Event type: UNSPECIFIED), (Process id: process)
| COMMAND             PROCESS_INSTANCE                   ACTIVATE_ELEMENT              | (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Element type: INTERMEDIATE_CATCH_EVENT), (Event type: MESSAGE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATING            | (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Element type: INTERMEDIATE_CATCH_EVENT), (Event type: MESSAGE), (Process id: process)
| EVENT               PROCESS_MESSAGE_SUBSCRIPTION       CREATING                      | (Message name: a), (Correlation key: key-1), (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), 
| COMMAND             MESSAGE_SUBSCRIPTION               CREATE                        | (Message name: a), (Correlation key: key-1), 
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATED             | (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Element type: INTERMEDIATE_CATCH_EVENT), (Event type: MESSAGE), (Process id: process)
| EVENT               MESSAGE_SUBSCRIPTION               CREATED                       | (Message name: a), (Correlation key: key-1), 
| COMMAND             PROCESS_MESSAGE_SUBSCRIPTION       CREATE                        | (Message name: a), 
| EVENT               PROCESS_MESSAGE_SUBSCRIPTION       CREATED                       | (Message name: a), (Correlation key: key-1), (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), 
| EVENT               MESSAGE                            PUBLISHED                     | (Message name: a), (Correlation key: key-1), (Variables: [message -> correlated])
| EVENT               MESSAGE_SUBSCRIPTION               CORRELATING                   | (Message name: a), (Correlation key: key-1), (Variables: [message -> correlated])
| COMMAND             PROCESS_MESSAGE_SUBSCRIPTION       CORRELATE                     | (Message name: a), (Variables: [message -> correlated])
| EVENT               PROCESS_MESSAGE_SUBSCRIPTION       CORRELATED                    | (Message name: a), (Correlation key: key-1), (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Variables: [message -> correlated])
| EVENT               PROCESS_EVENT                      TRIGGERING                    | (Target element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Variables: [message -> correlated])
| COMMAND             PROCESS_INSTANCE                   COMPLETE_ELEMENT              | (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Element type: INTERMEDIATE_CATCH_EVENT), (Event type: MESSAGE), (Process id: process)
| COMMAND             MESSAGE_SUBSCRIPTION               CORRELATE                     | (Message name: a), (Correlation key: ), 
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETING            | (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Element type: INTERMEDIATE_CATCH_EVENT), (Event type: MESSAGE), (Process id: process)
| EVENT               VARIABLE                           CREATED                       | (Name: message), (Value: "correlated")
| EVENT               PROCESS_MESSAGE_SUBSCRIPTION       DELETING                      | (Message name: a), (Correlation key: key-1), (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), 
| COMMAND             MESSAGE_SUBSCRIPTION               DELETE                        | (Message name: a), (Correlation key: ), 
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETED             | (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), (Element type: INTERMEDIATE_CATCH_EVENT), (Event type: MESSAGE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   SEQUENCE_FLOW_TAKEN           | (Element id: sequenceFlow_ab3a104b-84af-4391-96f0-8127e429aa1e), (Element type: SEQUENCE_FLOW), (Event type: UNSPECIFIED), (Process id: process)
| COMMAND             PROCESS_INSTANCE                   ACTIVATE_ELEMENT              | (Element id: endEvent_e95246b3-d75f-4935-8025-7e47a039acff), (Element type: END_EVENT), (Event type: NONE), (Process id: process)
| EVENT               MESSAGE_SUBSCRIPTION               CORRELATED                    | (Message name: a), (Correlation key: key-1), (Variables: [message -> correlated])
| EVENT               MESSAGE_SUBSCRIPTION               DELETED                       | (Message name: a), (Correlation key: key-1), 
| COMMAND             PROCESS_MESSAGE_SUBSCRIPTION       DELETE                        | (Message name: a), 
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATING            | (Element id: endEvent_e95246b3-d75f-4935-8025-7e47a039acff), (Element type: END_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_ACTIVATED             | (Element id: endEvent_e95246b3-d75f-4935-8025-7e47a039acff), (Element type: END_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETING            | (Element id: endEvent_e95246b3-d75f-4935-8025-7e47a039acff), (Element type: END_EVENT), (Event type: NONE), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETED             | (Element id: endEvent_e95246b3-d75f-4935-8025-7e47a039acff), (Element type: END_EVENT), (Event type: NONE), (Process id: process)
| COMMAND             PROCESS_INSTANCE                   COMPLETE_ELEMENT              | (Element id: process), (Element type: PROCESS), (Event type: UNSPECIFIED), (Process id: process)
| EVENT               PROCESS_MESSAGE_SUBSCRIPTION       DELETED                       | (Message name: a), (Correlation key: key-1), (Element id: intermediateCatchEvent_16fb6e2f-46f6-480c-8412-fbdeb761ed3f), 
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETING            | (Element id: process), (Element type: PROCESS), (Event type: UNSPECIFIED), (Process id: process)
| EVENT               PROCESS_INSTANCE                   ELEMENT_COMPLETED             | (Element id: process), (Element type: PROCESS), (Event type: UNSPECIFIED), (Process id: process)

io.camunda.zeebe.client.api.command.ClientStatusException: deadline exceeded after 9.999819664s. [closed=[], open=[[remote_addr=/127.0.0.1:52797]]]

	at io.camunda.zeebe.client.impl.ZeebeClientFutureImpl.transformExecutionException(ZeebeClientFutureImpl.java:93)
	at io.camunda.zeebe.client.impl.ZeebeClientFutureImpl.join(ZeebeClientFutureImpl.java:50)
	at io.camunda.zeebe.process.test.engine.EngineClientTest.shouldPublishMessage(EngineClientTest.java:164)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
	at java.base/java.lang.reflect.Method.invoke(Method.java:578)
	at org.junit.platform.commons.util.ReflectionUtils.invokeMethod(ReflectionUtils.java:727)
	at org.junit.jupiter.engine.execution.MethodInvocation.proceed(MethodInvocation.java:60)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$ValidatingInvocation.proceed(InvocationInterceptorChain.java:131)
	at org.junit.jupiter.engine.extension.TimeoutExtension.intercept(TimeoutExtension.java:156)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestableMethod(TimeoutExtension.java:147)
	at org.junit.jupiter.engine.extension.TimeoutExtension.interceptTestMethod(TimeoutExtension.java:86)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker$ReflectiveInterceptorCall.lambda$ofVoidMethod$0(InterceptingExecutableInvoker.java:103)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.lambda$invoke$0(InterceptingExecutableInvoker.java:93)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain$InterceptedInvocation.proceed(InvocationInterceptorChain.java:106)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.proceed(InvocationInterceptorChain.java:64)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.chainAndInvoke(InvocationInterceptorChain.java:45)
	at org.junit.jupiter.engine.execution.InvocationInterceptorChain.invoke(InvocationInterceptorChain.java:37)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:92)
	at org.junit.jupiter.engine.execution.InterceptingExecutableInvoker.invoke(InterceptingExecutableInvoker.java:86)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.lambda$invokeTestMethod$7(TestMethodTestDescriptor.java:217)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.invokeTestMethod(TestMethodTestDescriptor.java:213)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:138)
	at org.junit.jupiter.engine.descriptor.TestMethodTestDescriptor.execute(TestMethodTestDescriptor.java:68)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:151)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.invokeAll(SameThreadHierarchicalTestExecutorService.java:41)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$6(NodeTestTask.java:155)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$8(NodeTestTask.java:141)
	at org.junit.platform.engine.support.hierarchical.Node.around(Node.java:137)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.lambda$executeRecursively$9(NodeTestTask.java:139)
	at org.junit.platform.engine.support.hierarchical.ThrowableCollector.execute(ThrowableCollector.java:73)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.executeRecursively(NodeTestTask.java:138)
	at org.junit.platform.engine.support.hierarchical.NodeTestTask.execute(NodeTestTask.java:95)
	at org.junit.platform.engine.support.hierarchical.SameThreadHierarchicalTestExecutorService.submit(SameThreadHierarchicalTestExecutorService.java:35)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestExecutor.execute(HierarchicalTestExecutor.java:57)
	at org.junit.platform.engine.support.hierarchical.HierarchicalTestEngine.execute(HierarchicalTestEngine.java:54)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:147)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:127)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:90)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.lambda$execute$0(EngineExecutionOrchestrator.java:55)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.withInterceptedStreams(EngineExecutionOrchestrator.java:102)
	at org.junit.platform.launcher.core.EngineExecutionOrchestrator.execute(EngineExecutionOrchestrator.java:54)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:114)
	at org.junit.platform.launcher.core.DefaultLauncher.execute(DefaultLauncher.java:86)
	at org.junit.platform.launcher.core.DefaultLauncherSession$DelegatingLauncher.execute(DefaultLauncherSession.java:86)
	at org.junit.platform.launcher.core.SessionPerRequestLauncher.execute(SessionPerRequestLauncher.java:53)
	at com.intellij.junit5.JUnit5IdeaTestRunner.startRunnerWithArgs(JUnit5IdeaTestRunner.java:57)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
	at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:18)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
Caused by: java.util.concurrent.ExecutionException: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999819664s. [closed=[], open=[[remote_addr=/127.0.0.1:52797]]]
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
	at io.camunda.zeebe.client.impl.ZeebeClientFutureImpl.join(ZeebeClientFutureImpl.java:48)
	... 69 more
Caused by: io.grpc.StatusRuntimeException: DEADLINE_EXCEEDED: deadline exceeded after 9.999819664s. [closed=[], open=[[remote_addr=/127.0.0.1:52797]]]
	at io.grpc.Status.asRuntimeException(Status.java:539)
	at io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:487)
	at io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:576)
	at io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:70)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInternal(ClientCallImpl.java:757)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:736)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
	at java.base/java.lang.Thread.run(Thread.java:1589)
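
The trace shows a deadline error rather than a rejection because a dropped response leaves the caller's future pending until the deadline expires. The following plain-Java sketch illustrates that symptom with `CompletableFuture` only; it does not use the Zeebe client or gRPC, and the class name, method name, and the 100 ms timeout are assumptions made for illustration (the trace above shows a deadline of roughly 10 s).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration of how a dropped response surfaces as a timeout. */
public class DroppedResponseTimeoutSketch {

  /** Returns true if no response arrives before the timeout elapses. */
  static boolean timesOut(CompletableFuture<String> pendingResponse, long timeoutMillis) {
    try {
      pendingResponse.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return false;
    } catch (TimeoutException e) {
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    // A request whose response was written completes normally.
    CompletableFuture<String> answered = CompletableFuture.completedFuture("message published");
    // A request whose response was dropped is never completed by anyone.
    CompletableFuture<String> dropped = new CompletableFuture<>();

    System.out.println("answered request times out: " + timesOut(answered, 100));
    System.out.println("dropped response times out: " + timesOut(dropped, 100));
  }
}
```

The first line prints `false` and the second prints `true`: nothing fails on the server side, the caller simply never hears back.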

Environment:

  • Zeebe Version: Current snapshot (likely also in the upcoming 8.2.0-alpha5)
@korthout korthout added the kind/bug, severity/high, area/reliability, area/observability, and component/stream-platform labels Feb 28, 2023
@korthout
Member Author

🔧 We should have a test for this case in Zeebe, not just in zeebe-process-test.

This is the part that Zeebe Process Automation can do.

@korthout
Member Author

Warning
When this bug is resolved in Zeebe, we must re-enable the flaky test in zeebe-process-test. This test was disabled in camunda/zeebe-process-test#677

@lenaschoenburg
Member

The problem is a bit wider even. CreateProcessInstanceWithResult can throw away any response that led to the completion of a process instance. If the process ends with a job, the job worker completing this job will not receive a response acknowledging the completion.

@Zelldon
Member

Zelldon commented Feb 28, 2023

@oleschoenburg and I will look at it tomorrow.

@Zelldon Zelldon added the target:8.2 label Feb 28, 2023
ghost pushed a commit to camunda/zeebe-process-test that referenced this issue Feb 28, 2023
677: Sync with Zeebe SNAPSHOT r=koevskinikola a=korthout

## Description

There were several issues stacked that stopped zeebe-process-test from passing the CI against the latest Zeebe SNAPSHOT version. I've resolved these:

- Add while true starting at key to InMemoryDbColumnFamily
- Add delete resource RPC gateway implementation
- Add logger for `COMMAND_DISTRIBUTION` records
- Disable a flaky test due to bug camunda/camunda#11848

I suggest reviewing this commit by commit, as the commits go into more detail.

## Related issues

closes #673 


## Definition of Done

_Not all items need to be done depending on the issue and the pull request._

Code changes:
* [x] The changes are backwards compatible with previous versions
* [ ] If it fixes a bug then PRs are created to backport the fix

Testing:
* [x] There are unit/integration tests that verify all acceptance criteria of the issue
* [ ] New tests are written to ensure backwards compatibility with further versions
* [ ] The behavior is tested manually

Documentation:
* [ ] Javadoc has been written
* [ ] The documentation is updated


Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
@Zelldon Zelldon changed the title from "Publish message doesn't always receive a response" to "Batch processing might drop responses" Feb 28, 2023
@ghost ghost closed this as completed in 42773d6 Mar 1, 2023
ghost pushed a commit that referenced this issue Mar 1, 2023
11867: [Backport 8.1] fix: never drop responses during batch processing r=oleschoenburg a=Zelldon

## Description

Backported #11865. I had to resolve several merge conflicts caused by structural changes that were made on main but not on 8.1.

I also had to adjust another test, which was removed on main but still exists on 8.1: [test: adjust test case](aae1a80)

## Related issues

closes #11848



Co-authored-by: Ole Schönburg <ole.schoenburg@gmail.com>
Co-authored-by: Christopher Zell <zelldon91@googlemail.com>
korthout added a commit that referenced this issue Mar 10, 2023
These two cases failed recently due to a bug introduced with batch
processing [1].

The situation happened because multiple commands were being processed
and responded to in the same batch, while the batch processing assumed
only a single response would be written.

As this affected behavior visible to users, it makes sense to introduce
QA integration tests that showcase that
CreateProcessInstanceWithResult can actually respond when the process
instance is completed by another user command:
- complete job
- publish message

See:
 [1] #11848
ghost pushed a commit that referenced this issue Mar 20, 2023
11993: Add QA tests that verify CreateProcessInstanceWithResult responses r=korthout a=korthout

## Description

These two cases failed recently due to a bug introduced with batch processing [1].

As this affected behavior visible to users, it makes sense to introduce QA integration tests that showcase that the `CreateProcessInstanceWithResult` can actually respond when the process instance is completed by another user command.

## Related issues

relates to #11848



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
backport-action pushed a commit that referenced this issue Mar 20, 2023
These two cases failed recently due to a bug introduced with batch
processing [1].

The situation happened because multiple commands were being processed
and responded to in the same batch, while the batch processing assumed
only a single response would be written.

As this affected behavior visible to users, it makes sense to introduce
QA integration tests that showcase that
CreateProcessInstanceWithResult can actually respond when the process
instance is completed by another user command:
- complete job
- publish message

See:
 [1] #11848
(cherry picked from commit b490894)
ghost pushed a commit that referenced this issue Mar 20, 2023
12072: [Backport stable/8.0] Add QA tests that verify CreateProcessInstanceWithResult responses r=korthout a=backport-action

# Description
Backport of #11993 to `stable/8.0`.

relates to #11848

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
ghost pushed a commit that referenced this issue Mar 20, 2023
12073: [Backport stable/8.1] Add QA tests that verify CreateProcessInstanceWithResult responses r=korthout a=backport-action

# Description
Backport of #11993 to `stable/8.1`.

relates to #11848

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
ghost pushed a commit that referenced this issue Mar 21, 2023
12073: [Backport stable/8.1] Add QA tests that verify CreateProcessInstanceWithResult responses r=korthout a=backport-action

# Description
Backport of #11993 to `stable/8.1`.

relates to #11848

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
ghost pushed a commit that referenced this issue Mar 21, 2023
12072: [Backport stable/8.0] Add QA tests that verify CreateProcessInstanceWithResult responses r=korthout a=backport-action

# Description
Backport of #11993 to `stable/8.0`.

relates to #11848

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
@npepinpe npepinpe added the version:8.2.0, version:8.1.10, and release/8.0.13 labels Apr 5, 2023
This issue was closed.
4 participants