forked from invoke-ai/InvokeAI
Sync with remote #9
Merged
Conversation
- Added model_cache_keep_alive config field (minutes, default 0 = infinite)
- Implemented timeout tracking in ModelCache class
- Added _record_activity() to track model usage
- Added _on_timeout() to auto-clear cache when timeout expires
- Added shutdown() method to clean up timers
- Integrated timeout with get(), lock(), unlock(), and put() operations
- Updated ModelManagerService to pass keep_alive parameter
- Added cleanup in stop() method

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
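For context, a minimal sketch of the keep-alive mechanism this commit describes, assuming a heavily simplified `ModelCache`. The method names (`_record_activity`, `_on_timeout`, `shutdown`) follow the commit message; the dict-clearing stands in for the real cache-clearing logic.

```python
import threading


class ModelCache:
    """Simplified cache with an inactivity timeout (sketch, not the real class)."""

    def __init__(self, keep_alive_min: float = 0.0):
        self._keep_alive_sec = keep_alive_min * 60
        self._lock = threading.Lock()
        self._timer: threading.Timer | None = None
        self._models: dict[str, object] = {}

    def _record_activity(self) -> None:
        # Called with self._lock held: cancel any pending timer and restart it.
        if self._keep_alive_sec <= 0:
            return  # 0 means keep models cached indefinitely
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self._keep_alive_sec, self._on_timeout)
        self._timer.daemon = True  # don't block interpreter exit
        self._timer.start()

    def _on_timeout(self) -> None:
        with self._lock:
            self._models.clear()  # stand-in for the real cache-clearing logic

    def put(self, key: str, model: object) -> None:
        with self._lock:
            self._models[key] = model
            self._record_activity()

    def shutdown(self) -> None:
        # Cancel any pending timer so the process can shut down cleanly.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
```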
- Created test_model_cache_timeout.py with comprehensive tests
- Tests timeout clearing behavior
- Tests activity resetting timeout
- Tests no-timeout default behavior
- Tests shutdown canceling timers

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
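A hedged sketch of what such tests can look like, written against the simplified cache above rather than the real `ModelCache`, using a sub-second timeout so they run quickly:

```python
import time


def test_timeout_clears_cache():
    cache = ModelCache(keep_alive_min=0.01)  # 0.01 min = 0.6 s of allowed inactivity
    cache.put("model-a", object())
    time.sleep(1.0)  # exceed the timeout with no further activity
    assert cache._models == {}  # the timer cleared the cache


def test_zero_timeout_keeps_models():
    cache = ModelCache(keep_alive_min=0)  # 0 = keep indefinitely
    cache.put("model-a", object())
    time.sleep(0.5)
    assert "model-a" in cache._models  # nothing was cleared
```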
- Added clarifying comment that _record_activity is called with lock held
- Enhanced double-check in _on_timeout for thread safety
- Added lock protection to shutdown method
- Improved handling of edge cases where timer fires during activity

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
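The double-check mentioned here can be sketched as follows: once the timer thread holds the lock, it re-verifies that no activity arrived between the timer firing and the lock being acquired. The `_last_activity` timestamp is an illustrative name (it assumes `_record_activity` also does `self._last_activity = time.monotonic()`), not necessarily the actual field.

```python
import time


def _on_timeout(self) -> None:
    with self._lock:
        # Double-check: another thread may have recorded activity (and
        # rescheduled the timer) while we were waiting for the lock.
        idle = time.monotonic() - self._last_activity
        if idle < self._keep_alive_sec:
            return  # activity detected; the rescheduled timer will fire later
        self._models.clear()  # stand-in for the real clearing logic
```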
- Remove unused variable in test
- Add clarifying comment for daemon thread setting
- Add detailed comment explaining cache clearing with 1000 GB value
- Improve code documentation

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Add explicit storage_device parameter (cpu)
- Add explicit log_memory_usage parameter from config
- Improves code clarity and configuration transparency

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Remove all trailing whitespace (W293 errors)
- Add debug logging when timeout fires but activity detected
- Add debug logging when timeout fires but cache is empty
- Only log "Clearing model cache" message when actually clearing
- Prevents misleading timeout messages during active generation

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Only log the "Clearing model cache" message when there are actually unlocked models to clear. This prevents the misleading message from appearing during active generation when all models are locked.

Changes:
- Check for unlocked models before logging the clear message
- Add a count of unlocked models to the log message
- Add a debug log when all models are locked
- Improves user experience by avoiding confusing messages

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
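A sketch of the logging guard this commit describes, assuming each cache record exposes an `is_locked` flag (illustrative names; the quoted log text matches the message cited in the PR description below):

```python
def _clear_unlocked(self) -> None:
    unlocked = [key for key, rec in self._models.items() if not rec.is_locked]
    if not unlocked:
        # All models are in use; say so at debug level instead of misleading users.
        self._logger.debug("Keep-alive timeout fired, but all models are locked")
        return
    self._logger.info(f"Clearing {len(unlocked)} unlocked model(s) from cache")
    for key in unlocked:
        del self._models[key]
```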
Configure the mock logger to return a valid log level from getEffectiveLevel() to prevent a TypeError when comparing with the logging.DEBUG constant.

The issue was that ModelCache._log_cache_state() checks `self._logger.getEffectiveLevel() > logging.DEBUG`, and when the logger is a MagicMock without configuration, getEffectiveLevel() returns another MagicMock, causing a TypeError when compared with an int.

Fixes all 4 test failures in test_model_cache_timeout.py

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
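The fix boils down to giving the mock a concrete integer return value; a minimal illustration with the standard library:

```python
import logging
from unittest.mock import MagicMock

mock_logger = MagicMock()
# Without this, getEffectiveLevel() returns another MagicMock, and comparing
# it with logging.DEBUG (an int) raises TypeError.
mock_logger.getEffectiveLevel.return_value = logging.INFO
assert mock_logger.getEffectiveLevel() > logging.DEBUG  # now a plain int comparison
```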
…model-option' into copilot/add-unload-model-option
Add support for alternative diffusers Flow Matching schedulers:
- Euler (default, 1st order)
- Heun (2nd order, better quality, 2x slower)
- LCM (optimized for few steps)

Backend:
- Add schedulers.py with scheduler type definitions and class mapping
- Modify denoise.py to accept an optional scheduler parameter
- Add a scheduler InputField to the flux_denoise invocation (v4.2.0)

Frontend:
- Add fluxScheduler to Redux state and paramsSlice
- Create ParamFluxScheduler component for the Linear UI
- Add the scheduler to buildFLUXGraph for generation
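As a rough sketch of the kind of name-to-class mapping schedulers.py introduces: `FlowMatchEulerDiscreteScheduler` and `FlowMatchHeunDiscreteScheduler` are real diffusers classes, while the LCM entry is deliberately omitted rather than guessed, since its flow-matching class name isn't confirmed here.

```python
from typing import Literal

from diffusers import FlowMatchEulerDiscreteScheduler, FlowMatchHeunDiscreteScheduler

# "lcm" would extend this with a few-step flow-matching scheduler class.
FluxSchedulerType = Literal["euler", "heun"]

FLUX_SCHEDULER_MAP: dict[str, type] = {
    "euler": FlowMatchEulerDiscreteScheduler,  # default, 1st order
    "heun": FlowMatchHeunDiscreteScheduler,    # 2nd order, better quality, ~2x slower
}


def get_scheduler(name: FluxSchedulerType):
    """Instantiate the diffusers scheduler for a user-facing name."""
    return FLUX_SCHEDULER_MAP[name]()
```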
Add support for alternative diffusers Flow Matching schedulers for Z-Image:
- Euler (default) - 1st order, optimized for Z-Image-Turbo (8 steps)
- Heun (2nd order) - better quality, 2x slower
- LCM - optimized for few-step generation

Backend:
- Extend schedulers.py with Z-Image scheduler types and mapping
- Add a scheduler InputField to the z_image_denoise invocation (v1.3.0)
- Refactor the denoising loop to support diffusers schedulers

Frontend:
- Add zImageScheduler to Redux state in paramsSlice
- Create ParamZImageScheduler component for the Linear UI
- Add the scheduler to buildZImageGraph for generation
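For the "refactor the denoising loop to support diffusers schedulers" item, this is the standard diffusers stepping pattern in the abstract; `model` is a stand-in for the Z-Image transformer, not InvokeAI's actual loop:

```python
import torch
from diffusers import FlowMatchEulerDiscreteScheduler


def denoise(model, latents: torch.Tensor, num_steps: int) -> torch.Tensor:
    scheduler = FlowMatchEulerDiscreteScheduler()
    scheduler.set_timesteps(num_steps)
    for t in scheduler.timesteps:
        noise_pred = model(latents, t)  # predict the flow/velocity at timestep t
        latents = scheduler.step(noise_pred, t, latents).prev_sample
    return latents
```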
Changed the default value of model_cache_keep_alive from 0 (indefinite) to 5 minutes as requested. This means models will now be automatically cleared from cache after 5 minutes of inactivity by default, unless users explicitly configure a different value. Users can still set it to 0 in their config to get the old behavior of keeping models indefinitely. Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
The LCM scheduler may have more internal timesteps than user-facing steps, causing user_step to exceed total_steps. This resulted in a progress percentage > 1.0, which caused a pydantic validation error. Fix: only call step_callback when user_step <= total_steps.
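A sketch of the guard, with hypothetical names mirroring the commit message (`user_step`, `total_steps`, `step_callback`):

```python
def report_progress(step_callback, user_step: int, total_steps: int) -> None:
    # LCM-style schedulers can take extra internal steps, so user_step may
    # exceed total_steps; progress > 1.0 would fail pydantic validation.
    if user_step <= total_steps:
        step_callback(user_step / total_steps)
```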
…model-option' into copilot/add-unload-model-option
Move Scheduler handler after MainModel in ImageMetadataHandlers so that base-dependent recall logic (z-image scheduler) works correctly. The Scheduler handler checks `base === 'z-image'` before dispatching the z-image scheduler action, but this check failed when Scheduler ran before MainModel was recalled.
…e previous default behavior
Instead of disabling mutually exclusive model selectors, automatically clear conflicting models when a new selection is made. This applies to VAE, Qwen3 Encoder, and Qwen3 Source selectors - selecting one now clears the others. Also applies same logic during metadata recall.
…8693)

## Summary

Adds a `model_cache_keep_alive_min` config option (minutes, default 5) to automatically clear the model cache after inactivity. Addresses memory contention when running InvokeAI alongside other GPU applications like Ollama.

**Implementation:**

- **Config**: New `model_cache_keep_alive_min` field in `InvokeAIAppConfig` with a 5-minute default
- **ModelCache**: Activity tracking on get/lock/unlock/put operations, threading.Timer for scheduled clearing
- **Thread safety**: Double-check pattern handles race conditions, daemon threads for clean shutdown
- **Integration**: ModelManagerService passes the config to the cache and calls shutdown() on stop
- **Logging**: Smart timeout logging that only shows messages when unlocked models are actually cleared
- **Tests**: Comprehensive unit tests with a properly configured mock logger

**Usage:**

```yaml
# invokeai.yaml
model_cache_keep_alive_min: 10  # Clear after 10 minutes idle
model_cache_keep_alive_min: 0   # Set to 0 for indefinite caching (old behavior)
```

**Key Behavior:**

- **Default timeout**: 5 minutes - models are automatically cleared after 5 minutes of inactivity
- Clearing uses the same logic as the "Clear Model Cache" button (make_room with 1000 GB)
- Only clears **unlocked** models (respects models actively in use during generation)
- The timeout message only appears when models are actually cleared
- Debug logging is available for timeout events when no action is taken
- Prevents misleading log entries during active generation
- Users can set the option to 0 to restore the old indefinite caching behavior

## Related Issues / Discussions

Addresses the enhancement request for automatic model unloading from memory after a period of inactivity.

## QA Instructions

1. **Test default behavior (5-minute timeout)**:
   - Start InvokeAI without explicit config
   - Run a generation
   - Wait 6 minutes with no activity
   - Check the logs for the "Clearing X unlocked model(s) from cache" message
   - Verify the cache is empty
2. **Test custom timeout**:
   - Set `model_cache_keep_alive_min: 0.1` (6 seconds) in the config
   - Load a model (run a generation)
   - Wait 7+ seconds with no activity
   - Check the logs for the "Clearing X unlocked model(s) from cache" message
   - Verify the cache is empty
3. **Test no timeout (old behavior)**:
   - Set `model_cache_keep_alive_min: 0` in the config
   - Run generations and wait extended periods
   - Verify models remain cached indefinitely
4. **Test during active use**:
   - Run continuous generations with any timeout setting
   - Verify no timeout messages appear during active use (models are locked)
   - After generation completes, wait for the timeout and verify unlocked models are cleared

## Merge Plan

N/A - Additive change with sensible defaults. The 5-minute default enables automatic memory management while remaining practical for typical workflows.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_

<details>
<summary>Original prompt</summary>

> *This section details the original issue you should resolve*
>
> <issue_title>[enhancement]: option to unload from memory</issue_title>
> <issue_description>
> ### Is there an existing issue for this?
>
> - [X] I have searched the existing issues
>
> ### Contact Details
>
> ### What should this feature add?
>
> a command line option to unload the model from RAM after a defined period of time
>
> ### Alternatives
>
> running as a container and using Sablier to shut down the container after some time; this has the downside that if traffic isn't seen through the web interface, the container will be shut down even if jobs are running.
>
> ### Additional Content
>
> _No response_
> </issue_description>
>
> ## Comments on the Issue (you are @copilot in this section)
>
> <comments>
> <comment_new><author>@lstein</author><body>
> I am reopening this issue. I'm running ollama and invoke on the same server and I find their memory requirements are frequently clashing. It would be helpful to offer users the option to have the model cache automatically cleared after a fixed amount of inactivity. I would suggest the following:
>
> 1. Introduce a new config file option `model_cache_keep_alive` which specifies, in minutes, how long to keep a model in cache between generations. The default is 0, which means to keep the model in cache indefinitely, as is currently the case.
> 2. If no model generations occur within the timeout period, the model cache is cleared using the same backend code as the "Clear Model Cache" button in the queue tab.
>
> I'm going to assign this to GitHub copilot, partly to test how well it can manage the Invoke code base.
> </body></comment_new>
> </comments>

</details>

- Fixes #6856
## Summary

Add a new "Denoise - Z-Image + Metadata" node (`ZImageDenoiseMetaInvocation`) that extends the Z-Image denoise node with metadata output for image recall functionality. This follows the same pattern as the existing `denoise_latents_meta` (SD1.5/SDXL) and `flux_denoise_meta` (FLUX) nodes.

**Captured metadata:**

- `width` / `height`
- `steps`
- `guidance` (guidance_scale)
- `denoising_start` / `denoising_end`
- `scheduler`
- `model` (transformer)
- `seed`
- `loras` (if applied)

## Related Issues / Discussions

Enables metadata recall for Z-Image generated images, similar to existing support for SD1.5, SDXL, and FLUX models.

## QA Instructions

1. Create a workflow using the new "Denoise - Z-Image + Metadata" node
2. Connect the metadata output to a "Save Image" node
3. Generate an image
4. Check that metadata is saved with the image (visible in the image info panel)
5. Verify all generation parameters are captured correctly

## Merge Plan

Requires the `feature/zimage-scheduler-support` branch (#8705) to be merged first (base branch).

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
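As an illustration of the pattern the Summary describes, the metadata output can be thought of as a payload assembled alongside the denoise result. The keys follow the "Captured metadata" list above; `params` and its attributes are hypothetical stand-ins, not the actual invocation fields.

```python
def build_zimage_metadata(params) -> dict:
    """Collect the generation parameters listed above into a metadata payload."""
    metadata = {
        "width": params.width,
        "height": params.height,
        "steps": params.steps,
        "guidance": params.guidance_scale,
        "denoising_start": params.denoising_start,
        "denoising_end": params.denoising_end,
        "scheduler": params.scheduler,
        "model": params.transformer_model,
        "seed": params.seed,
    }
    if params.loras:  # only recorded when LoRAs are applied
        metadata["loras"] = [lora.key for lora in params.loras]
    return metadata
```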
## Summary

Synchronize with remote invokeai/main.

## Related Issues / Discussions

## QA Instructions

## Merge Plan

## Checklist

- [ ] _Updated `What's New` copy (if doing a release after this PR)_