
Repeating job with very short interval triggers exception on shutdown: 'RuntimeError: cannot schedule new futures after shutdown' #285

Closed
awlodge opened this issue Feb 3, 2018 · 22 comments

@awlodge commented Feb 3, 2018

This code hits the problem pretty reliably (~75% of the time on Python 2, but only ~60% of the time on Python 3, for some reason):

from apscheduler.schedulers.background import BackgroundScheduler
import time
import logging

logging.basicConfig()

def dummy_job():
    print("test")

scheduler = BackgroundScheduler()
executors = {
    'default': {
        'type': 'threadpool',
        'max_workers': 20
    }
}
scheduler.configure(executors=executors)
scheduler.start()

scheduler.add_job(func=dummy_job, trigger='interval', seconds=0.05)
time.sleep(0.5)
scheduler.shutdown()

This code gives the following output when run:

test
test
test
test
test
test
test
test
test
ERROR:apscheduler.scheduler:Error submitting job "dummy_job (trigger: interval[0:00:00.050000], next run at: 2018-02-03 12:40:57 GMT)" to executor "default"
Traceback (most recent call last):
  File "/data/build/ext/apscheduler/apscheduler/schedulers/base.py", line 960, in _process_jobs
    executor.submit_job(job, run_times)
  File "/data/build/ext/apscheduler/apscheduler/executors/base.py", line 71, in submit_job
    self._do_submit_job(job, run_times)
  File "/data/build/ext/apscheduler/apscheduler/executors/pool.py", line 22, in _do_submit_job
    f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
  File "/usr/lib64/python3.3/concurrent/futures/thread.py", line 97, in submit
    raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown

This is a regression introduced by #268: reverting that fix makes the problem stop occurring entirely.

The reason we are hitting this is that we have a repeating job running at a 100ms interval whenever the scheduler is running. Since upgrading to v3.5.1 we hit this error every time the scheduler is stopped.

I'm not sure what the right fix is; obviously you can't just revert #268. Maybe BaseScheduler._process_jobs could check whether the scheduler is running before it does anything else? See the sketch below.
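A minimal sketch of that guard, written as an illustrative subclass (this is not a tested patch and not the actual upstream fix; STATE_RUNNING is the real state constant from apscheduler.schedulers.base):

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.schedulers.base import STATE_RUNNING

class GuardedScheduler(BackgroundScheduler):
    """Illustration of the suggested guard; not APScheduler's actual code."""

    def _process_jobs(self):
        # Skip the whole pass if the scheduler is not running, so nothing
        # gets submitted to an already-shut-down executor pool.
        if self.state != STATE_RUNNING:
            self._logger.debug('Scheduler not running; skipping job processing')
            return None
        return super()._process_jobs()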

@agronholm (Owner) commented Feb 3, 2018

I knew there was a good reason why I made it hold both locks at once!

@tkremeyer

Hi, since this issue has been open for quite some time now and it does not seem trivial to get the locks right, what do you think of this workaround for now:

I think the desired behavior of the _process_jobs method when the scheduler is stopped is to leave everything as if no job submission had ever been attempted. Therefore, we could:

  • catch the error the executor raises when it is already shut down, log it only at a low log level, and leave the job's next run time unchanged, so that the job execution is not effectively skipped.
  • check the scheduler state on each loop iteration in _process_jobs. Should it change to STATE_PAUSED or STATE_STOPPED, abort the loop early and do not try to schedule the remaining jobs. That way, at most one job submission will be attempted after the scheduler stops.

To make this generic, the executor would probably have to catch the executor-specific exception (e.g. RuntimeError for ThreadPoolExecutor) in its _do_submit_job method and raise an APScheduler-specific exception instead (something like ExecutorNotRunningError), as sketched below.
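A rough, self-contained sketch of that wrapping, in Python 3 syntax (ExecutorNotRunningError and PoolExecutor are invented names for illustration; the real APScheduler classes are structured differently):

from concurrent.futures import ThreadPoolExecutor


class ExecutorNotRunningError(Exception):
    """Raised when a job is submitted to an executor that has shut down."""


class PoolExecutor:
    """Simplified stand-in for APScheduler's pool-based executor."""

    def __init__(self, max_workers=10):
        self._pool = ThreadPoolExecutor(max_workers)

    def _do_submit_job(self, fn, *args):
        try:
            return self._pool.submit(fn, *args)
        except RuntimeError as exc:
            # concurrent.futures raises a bare RuntimeError
            # ('cannot schedule new futures after shutdown') here; convert it
            # to a library-specific exception the scheduler can handle.
            raise ExecutorNotRunningError(str(exc)) from exc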

Does this make sense?

@filwillian

How about removing the job, once it is done, before the shutdown:

...
scheduler.configure(executors=executors)
scheduler.start()

my_job = scheduler.add_job(func=dummy_job, trigger='interval', seconds=0.05)
time.sleep(0.5)
my_job.remove()
scheduler.shutdown()
...

@agronholm (Owner)

I am in the process of doing a major refactoring on the codebase and I will fix this one way or the other in v4.0.

@filwillian I have no idea what you're suggesting here.

@gshashank

@agronholm: I am experiencing the same issue after shutting down the scheduler and then starting it again. It is not able to run the schedules that are persisted. Please suggest an alternative approach to resolve this issue, if there is one.

@agronholm (Owner)

@gshashank what is the reason you are shutting down and then restarting it within the same process? Or am I misunderstanding something?

@gshashank

@agronholm: I am trying to understand how persistence works and to test that scenario. If my system crashes for some reason, then once it is back up and running it should be able to pick up all the schedules it missed.
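For reference, a minimal sketch of that pattern (the job name, ID, and misfire_grace_time value are just examples): with a persistent job store, jobs survive a process restart, and misfire_grace_time controls whether runs missed during the downtime still fire on startup.

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore


def my_task():
    print("running")


scheduler = BackgroundScheduler(
    jobstores={'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')})

# replace_existing=True lets the job be re-registered on every startup
# without duplicating it; misfire_grace_time (in seconds) allows runs missed
# while the process was down to still fire once it is back up.
scheduler.add_job(my_task, 'interval', minutes=5, id='my_task',
                  replace_existing=True, misfire_grace_time=3600)
scheduler.start()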

@agronholm (Owner)

Oh, so you're not shutting it down manually and restarting within the same process? That's a different scenario then. What exactly happens in your case?

@gshashank commented Aug 5, 2020

The way I am testing it: I explicitly call scheduler.shutdown() to stop all the schedules, and then call scheduler.start() expecting all the schedules to start again, which was not happening; it threw the error instead. What does "same process" mean? In my case the server might crash. Below is my code; I am scheduling from a list that holds the time intervals.

# -*- coding: utf-8 -*-
import time

from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.jobstores.sqlalchemy import SQLAlchemyJobStore
from apscheduler.executors.pool import ThreadPoolExecutor, ProcessPoolExecutor

jobstores = {
    'default': SQLAlchemyJobStore(url='sqlite:///jobs.sqlite')
}
executors = {
    'default': ThreadPoolExecutor(20),
    'processpool': ProcessPoolExecutor(5)
}
job_defaults = {
    'coalesce': False,
    'max_instances': 3
}


def testTrigger(schedule):
    print("Scheduler executing for " + str(schedule))


if __name__ == '__main__':
    scheduler = BackgroundScheduler(jobstores=jobstores, executors=executors,
                                    job_defaults=job_defaults)

    # Each list entry doubles as the job's interval in seconds and its ID suffix.
    scheduleList = [1, 2]
    for schedule in scheduleList:
        scheduler.add_job(testTrigger, args=[schedule], trigger='interval',
                          seconds=schedule, id='schedule_' + str(schedule),
                          replace_existing=True)

    if scheduler.running:
        print("Scheduler Instance is already running")
    else:
        print("Starting Scheduler Instance")
        scheduler.start()

    # Keep the main thread alive so the BackgroundScheduler's threads can run.
    try:
        while True:
            time.sleep(1)
    except (KeyboardInterrupt, SystemExit):
        scheduler.shutdown(wait=True)

@gshashank

@agronholm: Calling scheduler.shutdown() and then scheduler.start() triggers this issue, but when I completely close the application mid-execution, reopen it, and then call scheduler.start(), it seems to work fine.
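If it helps, the failing pattern boils down to something like this (a minimal sketch; the job itself is arbitrary):

from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
scheduler.add_job(print, 'interval', seconds=1, args=['tick'])
scheduler.start()
# ... later ...
scheduler.shutdown()

# Restarting the same scheduler object in the same process: this is the
# sequence that produces 'cannot schedule new futures after shutdown',
# apparently because the executor's thread pool was shut down and is not
# recreated on the second start().
scheduler.start()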

@agronholm (Owner)

Calling scheduler.shutdown() and then scheduler.start() triggers this issue, but when I completely close the application mid-execution, reopen it, and then call scheduler.start(), it seems to work fine.

So this only happens in your test script and not in production? I understood from your message that you were having a production issue.

@gshashank commented Aug 5, 2020

@agronholm: I am exploring this library and am still in development. I am making sure to test all scenarios before this goes into production. Let me know if my understanding is correct:

@agronholm: Calling scheduler.shutdown() and then scheduler.start() triggers this issue, but when I completely close the application mid-execution, reopen it, and then call scheduler.start(), it seems to work fine.

@bcm0 commented Aug 8, 2020

I'm looking forward to a working shutdown/start cycle.
Until then, I loop over all the jobs to pause and resume them; the outcome is about the same, I think.
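Roughly like this, using the public Job.pause()/Job.resume() API (a sketch; adapt it to your own scheduler setup):

from apscheduler.schedulers.background import BackgroundScheduler


def pause_all_jobs(scheduler: BackgroundScheduler) -> None:
    # Pause every job instead of shutting the scheduler down, so the
    # executor's thread pool stays alive.
    for job in scheduler.get_jobs():
        job.pause()


def resume_all_jobs(scheduler: BackgroundScheduler) -> None:
    for job in scheduler.get_jobs():
        job.resume()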

@huangsam

I've seen this issue pop up multiple times in production when our team restarts the APScheduler component of our system. For context, here's the code that we wrote:

# Imports and logger setup added for completeness; the original snippet elided them.
import logging

import pytz
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.interval import IntervalTrigger

log = logging.getLogger(__name__)

job_request_scheduler = BackgroundScheduler()

# ...

_every_min = IntervalTrigger(minutes=1, timezone=pytz.utc)

# ...

def soft_ping():
    log.info("Soft ping")

# ...

job_request_scheduler.add_job(soft_ping, _every_min, replace_existing=True,
                              id=soft_ping.__name__)

And here's the log error that we encounter when terminating the APScheduler component as part of our deployment upgrade/rollback process:

2021-04-28 22:17:27,353 [21:139946559825664] [apscheduler.scheduler] [ERROR] Error submitting job "soft_ping (trigger: interval[0:01:00], next run at: 2021-04-28 22:17:27 UTC)" to executor "default"
Traceback (most recent call last):
  File ".../apscheduler/schedulers/base.py", line 974, in _process_jobs
    executor.submit_job(job, run_times)
  File ".../apscheduler/executors/base.py", line 71, in submit_job
    self._do_submit_job(job, run_times)
  File ".../apscheduler/executors/pool.py", line 22, in _do_submit_job
    f = self._pool.submit(run_job, job, job._jobstore_alias, run_times, self._logger.name)
  File ".../python3.7/concurrent/futures/thread.py", line 163, in submit
    raise RuntimeError('cannot schedule new futures after shutdown')
RuntimeError: cannot schedule new futures after shutdown
...

@riyadparvez

Is there any update on this issue? I recently encountered this on Python 3.9.

@agronholm (Owner)

What does this have to do with Mycroft?

@Hitmare commented Feb 8, 2022

What does this have to do with Mycroft?

My bad, I mistook this GitHub thread for a Mycroft one.

@agronholm (Owner)

I've fixed this in the 3.x branch now (at least I can't reproduce it anymore). A release is imminent, but feel free to test it already.

@riyadparvez

Thank you for fixing it. Do you know when you are planning on releasing?

@karbulot

I'm still experiencing this issue on 3.9.1.

@agronholm (Owner)

@karbulot please provide repro instructions.

@stepacool

Still getting this in 3.9 with Flask-APScheduler when restarting the project with docker-compose up (for newer image versions). I also very often get a SchedulerAlreadyRunningError exception.
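For what it's worth, guarding the start call is a common mitigation for the second error (a sketch only, assuming a scheduler object named scheduler; it doesn't address the underlying shutdown race):

# Avoid SchedulerAlreadyRunningError when initialization code may run twice
# (e.g. after a hot reload): only start the scheduler if it isn't running yet.
if not scheduler.running:
    scheduler.start()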
