Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NPE: Cannot invoke "String.getBytes(java.nio.charset.Charset)" because "value" is null #9344

Closed
Zelldon opened this issue May 10, 2022 · 9 comments · Fixed by #9351
Closed
Assignees
Labels
kind/bug Categorizes an issue or PR as a bug severity/high Marks a bug as having a noticeable impact on the user with no known workaround version:8.1.0-alpha2 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0

Comments

@Zelldon
Copy link
Member

Zelldon commented May 10, 2022

Describe the bug

Looks similar to #9269. It seems something in the decision evaluation is null, looks like the .setFailedDecisionId is failing. Happening on prod with Camunda 8.0.0.

Error group https://console.cloud.google.com/errors/detail/CKWO74bjv6a2owE;service=zeebe;time=P7D?project=camunda-cloud-240911

To Reproduce

Backup of logs and models can be found here https://drive.google.com/drive/u/0/folders/1WGHcsqwXDGFbGMBqrACaDfW_vjkVnMEQ

Log details https://console.cloud.google.com/logs/query;query=%0AlogName:%22stdout%22%0Aresource.type%3D%22k8s_container%22%0Aresource.labels.namespace_name%3D%22db0e535c-9198-4185-a526-c81f610969e8-zeebe%22%0Aresource.labels.project_id%3D%22camunda-cloud-240911%22%0Aresource.labels.pod_name%3D%22zeebe-0%22%0Aresource.labels.location%3D%22europe-west1%22%0Aresource.labels.container_name%3D%22zeebe%22%0Aresource.labels.cluster_name%3D%22production-worker-1%22;timeRange=2022-05-10T10:59:04.524Z%2F2022-05-10T11:59:04.524Z;pinnedLogId=2022-05-10T11:28:34.524275289Z%2Fmtmd0orxighigvni;cursorTimestamp=2022-05-10T11:15:46.967095281Z?project=camunda-cloud-240911

Expected behavior
No NPE.

Log/Stacktrace

Full Stacktrace

java.lang.NullPointerException: Cannot invoke "String.getBytes(java.nio.charset.Charset)" because "value" is null

at io.camunda.zeebe.util.StringUtil.getBytes ( [io/camunda.zeebe.util/StringUtil.java:39](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util%2FStringUtil.java&line=39&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.StringUtil.getBytes ( [io/camunda.zeebe.util/StringUtil.java:35](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util%2FStringUtil.java&line=35&project=camunda-cloud-240911) )
at io.camunda.zeebe.msgpack.property.StringProperty.setValue ( [io/camunda.zeebe.msgpack.property/StringProperty.java:30](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.msgpack.property%2FStringProperty.java&line=30&project=camunda-cloud-240911) )
at io.camunda.zeebe.protocol.impl.record.value.decision.DecisionEvaluationRecord.setFailedDecisionId ( [io/camunda.zeebe.protocol.impl.record.value.decision/DecisionEvaluationRecord.java:247](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.protocol.impl.record.value.decision%2FDecisionEvaluationRecord.java&line=247&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.behavior.BpmnDecisionBehavior.writeDecisionEvaluationEvent ( [io/camunda.zeebe.engine.processing.bpmn.behavior/BpmnDecisionBehavior.java:254](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.behavior%2FBpmnDecisionBehavior.java&line=254&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.behavior.BpmnDecisionBehavior.lambda$evaluateDecision$3 ( [io/camunda.zeebe.engine.processing.bpmn.behavior/BpmnDecisionBehavior.java:114](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.behavior%2FBpmnDecisionBehavior.java&line=114&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.Either$Right.flatMap ( [io/camunda.zeebe.util/Either.java:366](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util%2FEither.java&line=366&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.behavior.BpmnDecisionBehavior.evaluateDecision ( [io/camunda.zeebe.engine.processing.bpmn.behavior/BpmnDecisionBehavior.java:109](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.behavior%2FBpmnDecisionBehavior.java&line=109&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor$CalledDecisionBehavior.lambda$onActivate$0 ( [io/camunda.zeebe.engine.processing.bpmn.task/BusinessRuleTaskProcessor.java:89](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.task%2FBusinessRuleTaskProcessor.java&line=89&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.Either$Right.flatMap ( [io/camunda.zeebe.util/Either.java:366](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util%2FEither.java&line=366&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor$CalledDecisionBehavior.onActivate ( [io/camunda.zeebe.engine.processing.bpmn.task/BusinessRuleTaskProcessor.java:89](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.task%2FBusinessRuleTaskProcessor.java&line=89&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor.onActivate ( [io/camunda.zeebe.engine.processing.bpmn.task/BusinessRuleTaskProcessor.java:40](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.task%2FBusinessRuleTaskProcessor.java&line=40&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.task.BusinessRuleTaskProcessor.onActivate ( [io/camunda.zeebe.engine.processing.bpmn.task/BusinessRuleTaskProcessor.java:21](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn.task%2FBusinessRuleTaskProcessor.java&line=21&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processEvent$2 ( [io/camunda.zeebe.engine.processing.bpmn/BpmnStreamProcessor.java:128](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn%2FBpmnStreamProcessor.java&line=128&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.Either$Right.ifRightOrLeft ( [io/camunda.zeebe.util/Either.java:381](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util%2FEither.java&line=381&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processEvent ( [io/camunda.zeebe.engine.processing.bpmn/BpmnStreamProcessor.java:127](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn%2FBpmnStreamProcessor.java&line=127&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.lambda$processRecord$0 ( [io/camunda.zeebe.engine.processing.bpmn/BpmnStreamProcessor.java:110](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn%2FBpmnStreamProcessor.java&line=110&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.Either$Right.ifRightOrLeft ( [io/camunda.zeebe.util/Either.java:381](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util%2FEither.java&line=381&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.bpmn.BpmnStreamProcessor.processRecord ( [io/camunda.zeebe.engine.processing.bpmn/BpmnStreamProcessor.java:107](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.bpmn%2FBpmnStreamProcessor.java&line=107&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.streamprocessor.TypedRecordProcessor.processRecord ( [io/camunda.zeebe.engine.processing.streamprocessor/TypedRecordProcessor.java:54](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.streamprocessor%2FTypedRecordProcessor.java&line=54&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.lambda$processInTransaction$3 ( [io/camunda.zeebe.engine.processing.streamprocessor/ProcessingStateMachine.java:300](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.streamprocessor%2FProcessingStateMachine.java&line=300&project=camunda-cloud-240911) )
at io.camunda.zeebe.db.impl.rocksdb.transaction.ZeebeTransaction.run ( [io/camunda.zeebe.db.impl.rocksdb.transaction/ZeebeTransaction.java:84](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.db.impl.rocksdb.transaction%2FZeebeTransaction.java&line=84&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.processInTransaction ( [io/camunda.zeebe.engine.processing.streamprocessor/ProcessingStateMachine.java:290](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.streamprocessor%2FProcessingStateMachine.java&line=290&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.processCommand ( [io/camunda.zeebe.engine.processing.streamprocessor/ProcessingStateMachine.java:253](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.streamprocessor%2FProcessingStateMachine.java&line=253&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.tryToReadNextRecord ( [io/camunda.zeebe.engine.processing.streamprocessor/ProcessingStateMachine.java:213](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.streamprocessor%2FProcessingStateMachine.java&line=213&project=camunda-cloud-240911) )
at io.camunda.zeebe.engine.processing.streamprocessor.ProcessingStateMachine.readNextRecord ( [io/camunda.zeebe.engine.processing.streamprocessor/ProcessingStateMachine.java:189](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.engine.processing.streamprocessor%2FProcessingStateMachine.java&line=189&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.sched.ActorJob.invoke ( [io/camunda.zeebe.util.sched/ActorJob.java:79](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util.sched%2FActorJob.java&line=79&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.sched.ActorJob.execute ( [io/camunda.zeebe.util.sched/ActorJob.java:44](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util.sched%2FActorJob.java&line=44&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.sched.ActorTask.execute ( [io/camunda.zeebe.util.sched/ActorTask.java:122](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util.sched%2FActorTask.java&line=122&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.sched.ActorThread.executeCurrentTask ( [io/camunda.zeebe.util.sched/ActorThread.java:97](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util.sched%2FActorThread.java&line=97&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.sched.ActorThread.doWork ( [io/camunda.zeebe.util.sched/ActorThread.java:80](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util.sched%2FActorThread.java&line=80&project=camunda-cloud-240911) )
at io.camunda.zeebe.util.sched.ActorThread.run ( [io/camunda.zeebe.util.sched/ActorThread.java:189](https://console.cloud.google.com/debug?referrer=fromlog&file=io%2Fcamunda.zeebe.util.sched%2FActorThread.java&line=189&project=camunda-cloud-240911) )

Environment:

  • OS: SaaS
  • Zeebe Version: 8.0.0
  • Configuration:
@Zelldon Zelldon added kind/bug Categorizes an issue or PR as a bug team/process-automation labels May 10, 2022
@Zelldon
Copy link
Member Author

Zelldon commented May 10, 2022

Looking at the code it seems to fail because the decision result is a failure

https://github.com/camunda/zeebe/blob/main/engine/src/main/java/io/camunda/zeebe/engine/processing/bpmn/behavior/BpmnDecisionBehavior.java#L249-L254

but the failed decision id is not set.

Looking at the error not sure whether it makes sense to extract the dmn? Probably the bpmn is enough, since it fails on activation?

Expected to successfully process record 'LoggedEvent [type=0, version=16384, streamId=1, position=1420880, key=2251799814395374, timestamp=1652182114486, sourceEventPosition=1420873] RecordMetadata{recordType=COMMAND, intentValue=255, intent=ACTIVATE_ELEMENT, requestStreamId=-2147483648, requestId=-1, protocolVersion=3, valueType=PROCESS_INSTANCE, rejectionType=NULL_VAL, rejectionReason=, brokerVersion=8.0.0}' with processor, but caught an exception. Skip this record.

Expected to process record 'TypedEventImpl{metadata=RecordMetadata{recordType=COMMAND, intentValue=255, intent=ACTIVATE_ELEMENT, requestStreamId=-2147483648, requestId=-1, protocolVersion=3, valueType=PROCESS_INSTANCE, rejectionType=NULL_VAL, rejectionReason=, brokerVersion=8.0.0}, value={"bpmnProcessId":"OrderProcess","version":7,"processDefinitionKey":2251799814395026,"processInstanceKey":2251799814395186,"elementId":"Activity_01fqkhq","flowScopeKey":2251799814395186,"bpmnElementType":"BUSINESS_RULE_TASK","parentProcessInstanceKey":-1,"parentElementInstanceKey":-1}}' without errors, but exception occurred with message 'Cannot invoke "String.getBytes(java.nio.charset.Charset)" because "value" is null'.

@korthout
Copy link
Member

Looking at the stacktrace, it failed on BpmnDecisionBehavior.java#L254.

It makes sense to have both the BPMN and the DMN because the decision is evaluated on activation of the business rule task, which is when the NPE was thrown.

@korthout
Copy link
Member

I was quickly trying to determine the severity, but I don't have all the info necessary.

@Zelldon what is the impact of this NPE? Does it block processing or is this covered by our process instance blocklist?

@Zelldon
Copy link
Member Author

Zelldon commented May 10, 2022

The exception is catched and our usual error handling is done as you can see here, based on the logs #9344 (comment)

Means instance is blacklisted https://console.cloud.google.com/logs/query;query=%0AlogName:%22stdout%22%0Aresource.type%3D%22k8s_container%22%0Aresource.labels.namespace_name%3D%22db0e535c-9198-4185-a526-c81f610969e8-zeebe%22%0Aresource.labels.project_id%3D%22camunda-cloud-240911%22%0Aresource.labels.pod_name%3D%22zeebe-0%22%0Aresource.labels.location%3D%22europe-west1%22%0Aresource.labels.container_name%3D%22zeebe%22%0Aresource.labels.cluster_name%3D%22production-worker-1%22;timeRange=2022-05-10T10:59:04.524Z%2F2022-05-10T11:59:04.524Z;pinnedLogId=2022-05-10T11:28:34.525818277Z%2F6yim0aeo29wczy6m;cursorTimestamp=2022-05-10T11:28:54.628999198Z?project=camunda-cloud-240911

What we can see as well is that multiple instances are being blacklisted because of this

https://console.cloud.google.com/logs/query;query=%0AlogName:%22stdout%22%0Aresource.type%3D%22k8s_container%22%0Aresource.labels.namespace_name%3D%22db0e535c-9198-4185-a526-c81f610969e8-zeebe%22%0Aresource.labels.project_id%3D%22camunda-cloud-240911%22%0Aresource.labels.pod_name%3D%22zeebe-0%22%0Aresource.labels.location%3D%22europe-west1%22%0Aresource.labels.container_name%3D%22zeebe%22%0Aresource.labels.cluster_name%3D%22production-worker-1%22%0AjsonPayload.message%3D~%22Blacklist%20process%20instance%22;timeRange=2022-05-04T22:59:04.524Z%2F2022-05-10T12:59:04.524Z;pinnedLogId=2022-05-10T11:28:34.525818277Z%2F6yim0aeo29wczy6m;cursorTimestamp=2022-05-10T11:28:34.525818277Z?project=camunda-cloud-240911

@korthout
Copy link
Member

korthout commented May 10, 2022

@Zelldon the provided models in the link above, are those complete? Or could there be other versions deployed that are not in this folder?

EDIT: I have tried to reproduce the problem, but was unable to so far.

@korthout korthout added the severity/high Marks a bug as having a noticeable impact on the user with no known workaround label May 10, 2022
@Zelldon
Copy link
Member Author

Zelldon commented May 10, 2022

@korthout for the DMN i downloaded all versions. For bpmn only the failed one.

@korthout
Copy link
Member

At this time I'll mark it as severity/high because unless we know the cause, we can't know a workaround

severity/high: Marks a bug as having a noticeable impact on the user with no known workaround

@korthout
Copy link
Member

I've started a research document for further analysis of the data.

@korthout
Copy link
Member

@saig0 and I were unable to reproduce the issue from a user's perspective, but the error is clear when looking at the code. There are 2 cases where the decisionId could be null. For both situations, we can provide a reasonable value instead. So we can fix the issue rather easily.

@korthout korthout self-assigned this May 11, 2022
zeebe-bors-camunda bot added a commit that referenced this issue May 11, 2022
9351: Fix NPE because of missing decisionId of failed decision evaluation r=korthout a=korthout

## Description

<!-- Please explain the changes you made here. -->

When the decision evaluation fails, a DECISION_EVALUATION record with FAILED intent is written, containing relevant information about the failure. This includes the id of the decision that failed. However, in 2 cases it might happen that a `null` value would be used, which cannot be written into a record, leading to a NullPointerException (NPE).

Both NPEs can be avoided by not using `null`, but using reasonable defaults instead. In both cases, the id of the decision that was intended to be evaluated can be used as the replacement of this `null`.

While we were able to discover the NPEs' causes from the code and the stack traces, we were unable to reproduce the situation from a user's perspective. I've thus left out a regression test for it.

## Related issues

<!-- Which issues are closed by this PR or are related -->

closes #9344 



Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue May 12, 2022
9364: [Backport stable/8.0] Fix NPE because of missing decisionId of failed decision evaluation r=korthout a=github-actions[bot]

# Description
Backport of #9351 to `stable/8.0`.

relates to #9344

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
zeebe-bors-camunda bot added a commit that referenced this issue May 12, 2022
9364: [Backport stable/8.0] Fix NPE because of missing decisionId of failed decision evaluation r=korthout a=github-actions[bot]

# Description
Backport of #9351 to `stable/8.0`.

relates to #9344

Co-authored-by: Nico Korthout <nico.korthout@camunda.com>
@Zelldon Zelldon added the version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0 label Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes an issue or PR as a bug severity/high Marks a bug as having a noticeable impact on the user with no known workaround version:8.1.0-alpha2 version:8.1.0 Marks an issue as being completely or in parts released in 8.1.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants