Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Severe JobRunr Exception #210

Closed
gigicapi opened this issue Aug 27, 2021 · 11 comments
Closed

Severe JobRunr Exception #210

gigicapi opened this issue Aug 27, 2021 · 11 comments
Milestone

Comments

@gigicapi
Copy link

SevereJobRunrException occurred in BackgroundJobServer c7c41a6b-21b1-48f4-bdfe-0b6db0e5b120: Could not resolve ConcurrentJobModificationException

Runtime information

  • Timestamp: 2021-08-03T14:38:34.628293Z
  • Location: BackgroundJobServer c7c41a6b-21b1-48f4-bdfe-0b6db0e5b120
  • JobRunr Version: Unable to determine version
  • StorageProvider: org.jobrunr.storage.sql.postgres.PostgresStorageProvider
  • Java Version: 11.0.8
  • Is running from nested jar: true

Background Job Servers

  • BackgroundJobServer id: c7c41a6b-21b1-48f4-bdfe-0b6db0e5b120
    (workerPoolSize: 8, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-03T14:35:03.616552Z, lastHeartbeat: 2021-08-03T14:38:33.618156Z)
  • BackgroundJobServer id: d48fb30d-d346-4251-b1cc-266c15f8761b
    (workerPoolSize: 32, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-03T15:48:29.153592Z, lastHeartbeat: 2021-08-03T16:25:14.119011Z)

Diagnostics from exception

Concurrent modified jobs:

Job id: 5147ca8e-1ff0-4dc1-bf27-527f4b8f4e4b
Local version: 13; Storage version: 14
Local state: DELETED (at 2021-08-03T14:37:04.647416Z) ← PROCESSING (at 2021-08-03T14:37:04.618096Z) ← ENQUEUED (at 2021-08-03T14:30:09.324990Z)
Storage state: DELETED (at 2021-08-03T14:37:03.645100Z) ← PROCESSING (at 2021-08-03T14:36:49.618110Z) ← ENQUEUED (at 2021-08-03T14:30:09.324990Z)

Exception

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
	at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:207)
	at org.jobrunr.server.JobZooKeeper.updateJobsThatAreBeingProcessed(JobZooKeeper.java:146)
	at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:74)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

SevereJobRunrException occurred in BackgroundJobServer a2aa1fd9-f9e7-4fd9-85bc-b6815418158a: Could not resolve ConcurrentJobModificationException

Runtime information

  • Timestamp: 2021-08-27T07:33:48.325740Z
  • Location: BackgroundJobServer a2aa1fd9-f9e7-4fd9-85bc-b6815418158a
  • JobRunr Version: Unable to determine version
  • StorageProvider: org.jobrunr.storage.sql.postgres.PostgresStorageProvider
  • Java Version: 11.0.8
  • Is running from nested jar: true

Background Job Servers

  • BackgroundJobServer id: a2aa1fd9-f9e7-4fd9-85bc-b6815418158a
    (workerPoolSize: 8, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-26T13:38:47.305522Z, lastHeartbeat: 2021-08-27T07:33:47.306681Z)

Diagnostics from exception

Concurrent modified jobs:

Job id: e1c88176-6d16-402d-a9d6-6cd36da09025
Local version: 33; Storage version: 34
Local state: PROCESSING (at 2021-08-27T07:33:48.306977Z) ← ENQUEUED (at 2021-08-27T07:24:20.703606Z)
Storage state: ENQUEUED (at 2021-08-27T07:32:24.144437Z) ← PROCESSING (at 2021-08-27T07:32:18.306966Z) ← ENQUEUED (at 2021-08-27T07:24:20.703606Z)

Exception

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
	at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:207)
	at org.jobrunr.server.JobZooKeeper.updateJobsThatAreBeingProcessed(JobZooKeeper.java:146)
	at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:74)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

SevereJobRunrException occurred in BackgroundJobServer 2af0b21c-d69b-4692-9e0a-01fb18f9d192: Could not resolve ConcurrentJobModificationException

Runtime information

  • Timestamp: 2021-08-27T07:47:55.624642Z
  • Location: BackgroundJobServer 2af0b21c-d69b-4692-9e0a-01fb18f9d192
  • JobRunr Version: Unable to determine version
  • StorageProvider: org.jobrunr.storage.sql.postgres.PostgresStorageProvider
  • Java Version: 11.0.8
  • Is running from nested jar: true

Background Job Servers

  • BackgroundJobServer id: 2af0b21c-d69b-4692-9e0a-01fb18f9d192
    (workerPoolSize: 8, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-27T07:37:54.608860Z, lastHeartbeat: 2021-08-27T07:47:54.610134Z)

Diagnostics from exception

Concurrent modified jobs:

Job id: 8b947651-d5c6-4636-86be-8b081ce9d5ad
Local version: 35; Storage version: 36
Local state: DELETED (at 2021-08-27T07:46:25.650086Z) ← PROCESSING (at 2021-08-27T07:46:25.610568Z) ← ENQUEUED (at 2021-08-27T07:35:21.435744Z)
Storage state: DELETED (at 2021-08-27T07:46:18.045131Z) ← PROCESSING (at 2021-08-27T07:46:10.610596Z) ← ENQUEUED (at 2021-08-27T07:35:21.435744Z)

Exception

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
	at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:207)
	at org.jobrunr.server.JobZooKeeper.updateJobsThatAreBeingProcessed(JobZooKeeper.java:146)
	at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:74)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

SevereJobRunrException occurred in BackgroundJobServer 28efb165-57b7-436b-8d9f-569b10c75fab: Could not resolve ConcurrentJobModificationException

Runtime information

  • Timestamp: 2021-08-27T08:01:18.935729Z
  • Location: BackgroundJobServer 28efb165-57b7-436b-8d9f-569b10c75fab
  • JobRunr Version: Unable to determine version
  • StorageProvider: org.jobrunr.storage.sql.postgres.PostgresStorageProvider
  • Java Version: 11.0.8
  • Is running from nested jar: true

Background Job Servers

  • BackgroundJobServer id: 28efb165-57b7-436b-8d9f-569b10c75fab
    (workerPoolSize: 8, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-27T07:56:17.922490Z, lastHeartbeat: 2021-08-27T08:01:17.923768Z)

Diagnostics from exception

Concurrent modified jobs:

Job id: d8fbbb71-8c6e-4448-b1fd-107c2cd5a352
Local version: 16; Storage version: 17
Local state: PROCESSING (at 2021-08-27T08:01:18.924162Z) ← ENQUEUED (at 2021-08-27T07:49:00.364118Z)
Storage state: ENQUEUED (at 2021-08-27T08:00:00.640437Z) ← PROCESSING (at 2021-08-27T07:59:48.924210Z) ← ENQUEUED (at 2021-08-27T07:49:00.364118Z)

Exception

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
	at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:207)
	at org.jobrunr.server.JobZooKeeper.updateJobsThatAreBeingProcessed(JobZooKeeper.java:146)
	at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:74)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

SevereJobRunrException occurred in BackgroundJobServer 3044c0d2-2d33-4a99-b9cf-d44697c92d40: Could not resolve ConcurrentJobModificationException

Runtime information

  • Timestamp: 2021-08-27T08:23:29.016333900Z
  • Location: BackgroundJobServer 3044c0d2-2d33-4a99-b9cf-d44697c92d40
  • JobRunr Version: 3.2.1
  • StorageProvider: org.jobrunr.storage.sql.postgres.PostgresStorageProvider
  • Java Version: 11
  • Is running from nested jar: false

Background Job Servers

  • BackgroundJobServer id: 3044c0d2-2d33-4a99-b9cf-d44697c92d40
    (workerPoolSize: 64, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-27T08:07:42.916577Z, lastHeartbeat: 2021-08-27T08:23:27.914142Z)

Diagnostics from exception

Concurrent modified jobs:

Job id: ec352de1-18d8-4d51-9246-34c1b5042665
Local version: 42; Storage version: 43
Local state: DELETED (at 2021-08-27T08:21:59.039823100Z) ← PROCESSING (at 2021-08-27T08:21:58.915566500Z) ← ENQUEUED (at 2021-08-27T08:11:38.975843500Z)
Storage state: DELETED (at 2021-08-27T08:21:46.117607700Z) ← PROCESSING (at 2021-08-27T08:21:43.914824300Z) ← ENQUEUED (at 2021-08-27T08:11:38.975843500Z)

Exception

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
	at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:207)
	at org.jobrunr.server.JobZooKeeper.updateJobsThatAreBeingProcessed(JobZooKeeper.java:146)
	at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:74)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)

SevereJobRunrException occurred in BackgroundJobServer 76c47ee8-496b-4427-bb6b-58c8f3ff7545: Could not resolve ConcurrentJobModificationException

Runtime information

  • Timestamp: 2021-08-27T08:29:18.046403800Z
  • Location: BackgroundJobServer 76c47ee8-496b-4427-bb6b-58c8f3ff7545
  • JobRunr Version: 3.2.1
  • StorageProvider: org.jobrunr.storage.sql.postgres.PostgresStorageProvider
  • Java Version: 11
  • Is running from nested jar: false

Background Job Servers

  • BackgroundJobServer id: 76c47ee8-496b-4427-bb6b-58c8f3ff7545
    (workerPoolSize: 64, pollIntervalInSeconds: 15, firstHeartbeat: 2021-08-27T08:27:31.968353Z, lastHeartbeat: 2021-08-27T08:29:16.968562Z)

Diagnostics from exception

Concurrent modified jobs:

Job id: 58d22da3-e774-470f-b870-e3e05d93fae3
Local version: 12; Storage version: 13
Local state: PROCESSING (at 2021-08-27T08:29:17.968708200Z) ← ENQUEUED (at 2021-08-27T08:27:48.102409800Z) ← SCHEDULED (at 2021-08-27T08:27:33.235105100Z)
Storage state: ENQUEUED (at 2021-08-27T08:27:50.147495Z) ← PROCESSING (at 2021-08-27T08:27:48.466817900Z) ← ENQUEUED (at 2021-08-27T08:27:48.102409800Z)

Exception

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
	at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:207)
	at org.jobrunr.server.JobZooKeeper.updateJobsThatAreBeingProcessed(JobZooKeeper.java:146)
	at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:74)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.base/java.lang.Thread.run(Thread.java:834)
@rdehuyss
Copy link
Contributor

@gigicapi which JobRunr version are you using?

@gigicapi
Copy link
Author

@rdehuyss Thank you for responding me.
I'm using version JobRunr 3.2.1

@rdehuyss
Copy link
Contributor

Your jobs don't follow the normal JobRunr lifecycle. Do you manually delete jobs? Or do you have custom jobfilters that interact with the state of the jobs?

@rdehuyss
Copy link
Contributor

Can you setup a repo that reproduces this?

rdehuyss added a commit that referenced this issue Aug 29, 2021
Add support for AllowedStateChanges to limit bugs see #210
@pbilstein
Copy link

pbilstein commented Aug 30, 2021

Hi,

I did some testing with jobrunnr (3.2.1) and ran into a similar (slightly different) problem.

We would like to use jobrunnr to manage delayed job executions and due to our requirements it might be necessary to cancel and restart these jobs in a concurrent manner triggered by different events.

Here is our exception stacktrace

2021-08-30 11:10:05.508  WARN [pool-9-thread-2] org.jobrunr.server.JobZooKeeper          : JobRunr encountered a problematic exception. Please create a bug report (if possible, provide the code to reproduce this and the stacktrace) - Processing will continue.

org.jobrunr.SevereJobRunrException: Could not resolve ConcurrentJobModificationException
        at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:208) ~[jobrunr-3.1.2.jar:na]
        at org.jobrunr.server.JobZooKeeper.processJobList(JobZooKeeper.java:192) ~[jobrunr-3.1.2.jar:na]
        at org.jobrunr.server.JobZooKeeper.checkForScheduledJobs(JobZooKeeper.java:114) ~[jobrunr-3.1.2.jar:na]
        at org.jobrunr.server.JobZooKeeper.runMasterTasksIfCurrentServerIsMaster(JobZooKeeper.java:94) ~[jobrunr-3.1.2.jar:na]
        at org.jobrunr.server.JobZooKeeper.run(JobZooKeeper.java:76) ~[jobrunr-3.1.2.jar:na]
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[na:na]
        at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) ~[na:na]
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[na:na]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[na:na]
        at java.base/java.lang.Thread.run(Thread.java:831) ~[na:na]

Unfortunately I cannot provide a reproducing repository easily right now, but probably it helps finding the issue.

@rdehuyss
Copy link
Contributor

@pbilstein To be honest, JobRunr wasn't meant to be used that way (a lot of cancelling and restarting) as it requires a lot of extra safety guards. JobRunr 4.0, now in RC1 will already add extra checks that statechanges are valid.

If I would add support for your requirement to the Pro version, would you be willing to take a subscription?

@pbilstein
Copy link

@pbilstein To be honest, JobRunr wasn't meant to be used that way (a lot of cancelling and restarting) as it requires a lot of extra safety guards. JobRunr 4.0, now in RC1 will already add extra checks that statechanges are valid.

If I would add support for your requirement to the Pro version, would you be willing to take a subscription?

@rdehuyss thank for your answer. I think we would rather try some other solutions first, because we really only need a very limited feature set here and it does not make much sence for us to subscribe for the Pro version for it.

@rdehuyss
Copy link
Contributor

@pbilstein - just to be sure, I would like some extra information about your use case: are you defining the id's of your job yourself?

The problem you are facing is that a job that was scheduled is being enqueued but it already exists and I don't immediately see how that can happen.

@pbilstein
Copy link

pbilstein commented Aug 31, 2021

@rdehuyss - Ok.. no, we do not set job ids on our own. Pseudo code is like this:

JobId jobId = jobScheduler.schedule(Instant.now().plusSeconds(foo.getWaitTimeSeconds()), () -> this.check());

jobId is stored as string (in bar object) and then in two other code blocks/threads we do this:

BackgroundJob.delete(UUID.fromString(bar.getJobId()));
jobScheduler.delete(bar.getJobId());

Ah... wait! We do this last deletion inside of check method which is called by jobrunnr. Could this be the cause?

EDIT: problem is, that it's very hard to reproduce the issue und the stacktrace is of another thread, so I cannot tell if this is the reason

rdehuyss added a commit that referenced this issue Aug 31, 2021
@rdehuyss
Copy link
Contributor

@pbilstein , yours is a legit bug. Will be solved in 4.0.0-RC2.

@rdehuyss rdehuyss added this to the 4.0 milestone Aug 31, 2021
@rdehuyss
Copy link
Contributor

Solved in 4.0.0-RC2 (somewhere tomorrow in Maven Central)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants