Fix graceful shutdown to wait for download/install worker threads #102

Merged
lstein merged 3 commits into lstein/feature/nicer-shutdown from copilot/fix-worker-thread-shutdown
Feb 28, 2026

Copilot AI commented Feb 28, 2026

Summary

On KeyboardInterrupt, the shutdown path called os._exit(0) unconditionally, killing the process before download or model install threads could complete or cancel their in-flight work.

Replace os._exit(0) with ApiDependencies.shutdown(), which propagates through invoker.stop() → each service's stop():

  • Download queue: cancels active jobs, signals workers via stop event, joins threads
  • Model install service: clears pending jobs, joins install thread

The process then exits naturally; all worker threads are daemon=True so they won't block exit if shutdown is interrupted again.
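
The stop sequence described above (signal workers via a stop event, then join the threads) can be sketched roughly as follows. This is a minimal illustration, not InvokeAI's implementation: the class `SketchDownloadQueue`, its attributes, and the simulated chunked work are all hypothetical names invented for the example.

```python
import queue
import threading
import time


class SketchDownloadQueue:
    """Hypothetical stand-in for a worker-backed service (not InvokeAI's class)."""

    def __init__(self, workers=2, chunks=50):
        self._jobs = queue.Queue()
        self._stop_event = threading.Event()
        self._threads = []
        self._workers = workers
        self._chunks = chunks
        self.cancelled = []
        self.completed = []

    def start(self):
        self._stop_event.clear()
        for _ in range(self._workers):
            # daemon=True: a stuck worker cannot block interpreter exit
            t = threading.Thread(target=self._run, daemon=True)
            t.start()
            self._threads.append(t)

    def submit(self, job):
        self._jobs.put(job)

    def _run(self):
        while not self._stop_event.is_set():
            try:
                job = self._jobs.get(timeout=0.05)
            except queue.Empty:
                continue
            # Simulated chunked transfer: check the stop event between chunks
            for _ in range(self._chunks):
                if self._stop_event.is_set():
                    self.cancelled.append(job)
                    break
                time.sleep(0.01)
            else:
                self.completed.append(job)

    def stop(self, timeout=5.0):
        # Signal every worker, then wait for each one to finish its current chunk
        self._stop_event.set()
        for t in self._threads:
            t.join(timeout)
        self._threads.clear()
```

Because each worker checks the event between chunks, `stop()` returns shortly after the in-flight chunk finishes instead of waiting for the whole job, and the join timeout bounds the wait if a worker is stuck.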

# Before
except KeyboardInterrupt:
    logger.info("InvokeAI shutting down...")
    os._exit(0)  # kills in-flight downloads/installs immediately

# After
except KeyboardInterrupt:
    logger.info("InvokeAI shutting down...")
    from invokeai.app.api.dependencies import ApiDependencies
    ApiDependencies.shutdown()  # cancels/joins worker threads cleanly

To prevent a double-shutdown error, both stop() methods are now idempotent — they return silently if the service is not running. This is necessary because uvicorn's graceful shutdown triggers the FastAPI lifespan which calls ApiDependencies.shutdown(), and then a KeyboardInterrupt can still propagate from run_until_complete() into the except block, causing a second ApiDependencies.shutdown() call. Making stop() a no-op when already stopped (or never started) handles both this double-shutdown case and early interrupts where services were never fully initialized.
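
As a rough sketch of the idempotent pattern described above, assuming a service that tracks a `_running` flag: the class and its other attributes are hypothetical and only illustrate the guard, not the actual code in `download_default.py` or `model_install_default.py`.

```python
import threading


class SketchInstallService:
    """Hypothetical service used only to illustrate an idempotent stop()."""

    def __init__(self):
        self._running = False
        self._stop_event = threading.Event()
        self._thread = None
        self.effective_stops = 0  # counts stop() calls that did real work

    def start(self):
        if self._running:
            return
        self._stop_event.clear()
        # The worker simply blocks until the stop event is set
        self._thread = threading.Thread(target=self._stop_event.wait, daemon=True)
        self._thread.start()
        self._running = True

    def stop(self):
        # Idempotent guard: no-op when never started or already stopped
        if not self._running:
            return
        self._running = False
        self._stop_event.set()
        self._thread.join()
        self.effective_stops += 1
```

With this guard, calling `stop()` before `start()`, or twice in a row, does nothing on the redundant calls, so a second shutdown pass cannot raise.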

Related Issues / Discussions

QA Instructions

Start a model download, then hit CTRL+C. Confirm:

  1. The download is cleanly cancelled (partial .downloading file left on disk, no corrupted state) rather than being silently killed mid-write.
  2. No Exception: Attempt to stop the install service before it was started error appears.
  3. The process exits promptly without hanging indefinitely.

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

Original prompt

This section details the original issue you should resolve

<issue_title>[bug]: Wait for non-daemon threads to complete work.</issue_title>
<issue_description>### Is there an existing issue for this problem?

  • I have searched the existing issues

Install method

Invoke's Launcher

Operating system

Linux

GPU vendor

Nvidia (CUDA)

GPU model

No response

GPU VRAM

No response

Version number

lstein/feature/nicer-shutdown branch

Browser

No response

System Information

No response

What happened

The shutdown code introduced in this branch does an unconditional os._exit(0) upon keyboard interrupt. If some threads are still doing work, that work will be interrupted. It would be better to wait for all worker threads to complete their work. This is particularly a concern for the model download and install threads.

What you expected to happen

If a worker thread is active, the system should wait for the threads to complete before exiting. I believe that the download and install managers have a "shutdown" call that should do this.

How to reproduce the problem

No response

Additional context

No response

Discord username

No response</issue_description>

Comments on the Issue (you are @copilot in this section)


Instead of immediately force-exiting the process on CTRL+C, call
ApiDependencies.shutdown() to gracefully stop the download and install
manager services, allowing active work to complete or cancel cleanly
before the process exits.

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix shutdown to wait for non-daemon threads" to "Fix graceful shutdown to wait for download/install worker threads" Feb 28, 2026
@lstein lstein marked this pull request as ready for review February 28, 2026 15:34
@lstein lstein self-requested a review as a code owner February 28, 2026 15:34
Owner

lstein commented Feb 28, 2026

@copilot After ^C I am now seeing the error message Exception: Attempt to stop the install service before it was started and things hang indefinitely. Stack trace is:

Traceback (most recent call last):
  File "/home/lstein/invokeai-lstein/.venv/bin/invokeai-web", line 12, in <module>
    sys.exit(run_app())
             ^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/run_app.py", line 111, in run_app
    ApiDependencies.shutdown()
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/api/dependencies.py", line 197, in shutdown
    ApiDependencies.invoker.stop()
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/invoker.py", line 37, in stop
    self.__stop_service(getattr(self.services, service))
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/invoker.py", line 26, in __stop_service
    stop_op(self)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/model_manager/model_manager_default.py", line 69, in stop
    service.stop(invoker)
  File "/home/lstein/Projects/InvokeAI-lstein/invokeai/app/services/model_install/model_install_default.py", line 316, in stop
    raise Exception("Attempt to stop the install service before it was started")
Exception: Attempt to stop the install service before it was started

The same message appears if a model is being downloaded and installed.

When CTRL+C is pressed, uvicorn's graceful shutdown triggers the FastAPI
lifespan which calls ApiDependencies.shutdown(), then a KeyboardInterrupt
propagates from run_until_complete() hitting the except block which tries
to call ApiDependencies.shutdown() a second time.

Change both stop() methods to return silently (instead of raising) when
the service is not running. This handles:
- Double-shutdown: lifespan already stopped the services
- Early interrupt: services were never fully started

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Author

Copilot AI commented Feb 28, 2026

@copilot After ^C I am now seeing the error message Exception: Attempt to stop the install service before it was started and things hang indefinitely. Stack trace is:

Traceba...

The error was caused by a double-shutdown: uvicorn's graceful shutdown triggers the FastAPI lifespan which calls ApiDependencies.shutdown(), and then a KeyboardInterrupt propagates from run_until_complete() hitting the except block which called ApiDependencies.shutdown() a second time. Since _running was already False, the second call raised the exception and left cleanup incomplete (causing the hang).

Fixed in commit 4b70d39 by making stop() idempotent in both download_default.py and model_install_default.py — they now return silently if the service is not running, instead of raising. This handles both the double-shutdown case and early interrupts where services were never fully started.
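
To illustrate why a raising stop() can leave cleanup incomplete, here is a small sketch. It assumes services are stopped one after another in a simple loop, so an exception from one service prevents the later ones from being stopped. `StrictService`, `LenientService`, and `stop_all` are hypothetical names, not the invoker's real code.

```python
class StrictService:
    """Hypothetical service whose stop() raises when it is not running."""

    def __init__(self, name, running):
        self.name = name
        self.running = running

    def stop(self):
        if not self.running:
            raise Exception(f"Attempt to stop the {self.name} service before it was started")
        self.running = False


class LenientService(StrictService):
    """Same service, but stop() is a silent no-op when not running."""

    def stop(self):
        if not self.running:
            return
        self.running = False


def stop_all(services):
    # Assumed shape of the shutdown loop: stop each service in order
    for service in services:
        service.stop()
```

With `StrictService`, an already-stopped service early in the list aborts the loop and every later service stays running. With `LenientService`, the loop reaches all of them.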

Owner

@lstein lstein left a comment

Works as advertised.

@lstein lstein force-pushed the copilot/fix-worker-thread-shutdown branch from 04a83cf to 4b70d39 Compare February 28, 2026 16:08
@lstein lstein merged commit 36150df into lstein/feature/nicer-shutdown Feb 28, 2026
20 of 26 checks passed
@lstein lstein deleted the copilot/fix-worker-thread-shutdown branch February 28, 2026 16:09