
Expected to find job with key, but no job found #6016

Closed
deepthidevaki opened this issue Dec 15, 2020 · 15 comments
Labels
kind/bug Categorizes an issue or PR as a bug scope/broker Marks an issue or PR to appear in the broker section of the changelog severity/low Marks a bug as having little to no noticeable impact for the user

Comments

@deepthidevaki
Contributor

Describe the bug

E 2020-12-14T19:08:36.656982Z Expected to find job with key 2251799821310810, but no job found 
E 2020-12-14T19:08:36.662935Z Expected to find job with key 2251799821310836, but no job found 
E 2020-12-14T19:08:36.663326Z Expected to find job with key 2251799821310837, but no job found 
E 2020-12-14T19:08:36.663534Z Expected to find job with key 2251799821310838, but no job found 
E 2020-12-14T19:08:36.663810Z Expected to find job with key 2251799821310839, but no job found 
E 2020-12-14T19:08:36.664011Z Expected to find job with key 2251799821310840, but no job found 
E 2020-12-14T19:08:36.664190Z Expected to find job with key 2251799821310841, but no job found 
E 2020-12-14T19:08:36.664447Z Expected to find job with key 2251799821310842, but no job found 
E 2020-12-14T19:08:36.664742Z Expected to find job with key 2251799821310843, but no job found 
E 2020-12-14T19:08:36.664946Z Expected to find job with key 2251799821310844, but no job found 
E 2020-12-14T19:08:36.665131Z Expected to find job with key 2251799821310810, but no job found 
E 2020-12-14T19:08:36.665295Z Expected to find job with key 2251799821310836, but no job found 
E 2020-12-14T19:08:36.665451Z Expected to find job with key 2251799821310837, but no job found 
E 2020-12-14T19:08:36.665643Z Expected to find job with key 2251799821310838, but no job found 
E 2020-12-14T19:08:36.665880Z Expected to find job with key 2251799821310839, but no job found 
E 2020-12-14T19:08:36.666101Z Expected to find job with key 2251799821310840, but no job found 
E 2020-12-14T19:08:36.667479Z Expected to find job with key 2251799821310841, but no job found 
E 2020-12-14T19:08:36.668035Z Expected to find job with key 2251799821310842, but no job found 
E 2020-12-14T19:08:36.668297Z Expected to find job with key 2251799821310843, but no job found 
E 2020-12-14T19:08:36.668501Z Expected to find job with key 2251799821310844, but no job found 

Version 0.25.3
Report location: visitJob (JobState.java:257)

logs

Environment:

  • Camunda cloud
  • Zeebe Version: 0.25.3
@deepthidevaki deepthidevaki added kind/bug Categorizes an issue or PR as a bug scope/broker Marks an issue or PR to appear in the broker section of the changelog labels Dec 15, 2020
@deepthidevaki
Contributor Author

On a first look, this looks like a bug. If it is not, then we should lower the log level.

@Zelldon
Member

Zelldon commented Dec 15, 2020

Duplicate of #3765

@deepthidevaki
Contributor Author

I don't think this is a duplicate of #3765.
This log occurs when iterating over jobs, which happens either when activating jobs (JobState#forEachActivatableJobs) or when timing out jobs after their deadline (JobState#forEachTimedOutEntry). When a job is completed, as far as I know, we don't iterate over the jobs.
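To illustrate the suspected failure mode, here is a deliberately simplified sketch (not Zeebe's actual implementation; the class, maps, and method names below are hypothetical): the primary job store and a secondary "activatable jobs" index are separate structures, so a key can remain in the index after the job record was deleted, and iteration then fails the lookup and logs the message seen above.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical simplification of JobState: jobs and the activatable index
// are separate structures, so the index can hold a dangling key if the job
// record is removed without cleaning up the index entry.
public class DanglingJobIndex {
    static Map<Long, String> jobs = new HashMap<>();  // key -> job record
    static Set<Long> activatable = new HashSet<>();   // secondary index

    static void createJob(long key, String record) {
        jobs.put(key, record);
        activatable.add(key);
    }

    // Deleting the record but not the index entry leaves garbage behind.
    static void deleteJobRecordOnly(long key) {
        jobs.remove(key);
    }

    // Mirrors the pattern of JobState#forEachActivatableJobs: iterate the
    // index, then look up the primary record for each key.
    static int visitActivatableJobs(StringBuilder log) {
        int visited = 0;
        for (long key : activatable) {
            String record = jobs.get(key);
            if (record == null) {
                log.append("Expected to find job with key ").append(key)
                   .append(", but no job found\n");
            } else {
                visited++;
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        createJob(1L, "job-1");
        createJob(2L, "job-2");
        deleteJobRecordOnly(2L); // simulate the suspected bug
        StringBuilder log = new StringBuilder();
        System.out.println(visitActivatableJobs(log)); // 1 job still found
        System.out.print(log); // NOT_FOUND-style message for key 2
    }
}
```

Whether Zeebe actually leaves such dangling index entries (garbage) or loses the job record itself is exactly the distinction raised later in this thread.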

@deepthidevaki
Contributor Author

I would categorize it as a bug. I couldn't find a case where this is expected to happen. But I would assign severity low as the error went away shortly and we did not observe any other issues related to it.

@deepthidevaki deepthidevaki added severity/low Marks a bug as having little to no noticeable impact for the user Status: Needs Priority and removed Status: Needs Triage labels Dec 15, 2020
@npepinpe
Member

This occurred on the gameday cluster, so postponed until tomorrow after the game day in case it's related. Let's not solve the incident before the gameday occurs 😅

@pihme
Contributor

pihme commented Dec 16, 2020

Not sure it is related to the game day. It is not deliberate, I can tell that much.

Please wait for Game Day. After Game Day I can also give full sources, which might help in diagnosing the problem.

@pihme
Contributor

pihme commented Dec 16, 2020

So from the client side there is nothing out of the ordinary. I think this is a Zeebe-internal phenomenon. It would be interesting to see a stack trace, in particular whether it is called from forEachTimedOutEntry or forEachActivatableJobs.

@npepinpe
Member

Please verify whether we're creating garbage (i.e. leaving dangling references) or we're actually "losing" jobs or inserting the wrong references. The latter is high priority, but the former isn't. In case of doubt I would assume the worst, and then we can verify and reprioritize (or just fix it if the cause is obvious at that point).

@npepinpe npepinpe added this to Ready in Zeebe Dec 16, 2020
@deepthidevaki deepthidevaki removed their assignment Dec 17, 2020
@deepthidevaki
Contributor Author

> So from the client side there is nothing out of the ordinary. I think this is a Zeebe-internal phenomenon. Would be interesting to see a stack trace, in particular whether it is called from forEachTimedOutEntry or forEachActivateableJob.

Unfortunately there is no stack trace available to determine whether it is called from forEachTimedOutEntry or forEachActivatableJobs.

@deepthidevaki
Contributor Author

As a user, Peter did not observe any issues. We also don't see any stuck workflows in Operate. So I think we can safely assume that the severity is low.

@npepinpe npepinpe removed this from Ready in Zeebe Dec 17, 2020
@npepinpe
Member

npepinpe commented Mar 3, 2021

Occurred in the last CW09 benchmark: https://console.cloud.google.com/errors/CKe9pf-Tqe-ZZw?service=zeebe-broker&version=medic-cw-09-cf1627167-benchmark&time=P1D&project=zeebe-io

Seems to occur in bursts, but the cluster seems healthy and the workers aren't reporting any errors.

@npepinpe
Member

I'm closing for now as we haven't noticed any negative effects, so I wouldn't invest time here at the moment.

@namero999

We are experiencing this on 8.3.3

Was it ever discovered what this is about? Should we be worried (as in silent job loss)?

@npepinpe
Member

npepinpe commented Dec 7, 2023

This can happen normally if your jobs time out while another worker is working on them. So for example, I have a job of type A, and two different workers foo and bar subscribed to receive jobs of type A. The job has an activation timeout of 5 seconds. Zeebe sends the job to worker foo. That worker is slow for some reason, and takes more than 5 seconds to complete the job. After 5 seconds, Zeebe assumes worker foo timed out, and sends the job to worker bar. But maybe worker foo completes the job after 5.5 seconds! So now the job is completed and effectively doesn't exist anymore in Zeebe. But worker bar has received it, and completes it after 7 seconds. Then you get the NOT_FOUND error.

This is not an error from Zeebe's point of view because Zeebe is an at-least-once system.
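The timeout race described above can be sketched as a minimal simulation (hypothetical class and names; not the Zeebe client API): once the slow worker's late completion removes the job, the second worker's completion attempt finds nothing and gets a NOT_FOUND result, which is the expected behavior of an at-least-once system.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the race: worker foo is slow, the job times out and
// is re-activated for worker bar, then foo's late completion removes the job
// first, so bar's completion attempt fails with NOT_FOUND.
public class JobTimeoutRace {
    enum Result { COMPLETED, NOT_FOUND }

    static Map<Long, String> activeJobs = new HashMap<>();

    static Result complete(long key) {
        return activeJobs.remove(key) != null ? Result.COMPLETED : Result.NOT_FOUND;
    }

    public static void main(String[] args) {
        long key = 2251799821310810L; // a key taken from the logs above
        activeJobs.put(key, "job of type A");

        // t=5.0s: activation timeout fires; the broker re-sends the job to bar.
        // t=5.5s: foo (the slow worker) finishes and completes the job first.
        Result fooResult = complete(key);
        // t=7.0s: bar also completes the same job -> it no longer exists.
        Result barResult = complete(key);

        System.out.println(fooResult + " " + barResult); // COMPLETED NOT_FOUND
    }
}
```

In a real deployment the usual fix is to raise the job timeout above the worker's worst-case processing time, or to make job handlers idempotent so duplicate deliveries are harmless.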

@namero999

I see, thank you. It might be consistent with the fact that we have attempted to run some tests with a high value for max-jobs-active. Currently the pod is outputting lots of logs about these jobs not being found. While it's clear that this is not an issue, should we expect the logs to stop once all the jobs have exhausted their retries?

github-merge-queue bot pushed a commit that referenced this issue Mar 14, 2024