Skip to content

Conversation

@lstein
Copy link
Owner

@lstein lstein commented Jan 5, 2026

Summary

Synchronize with remote invokeai/main.

Related Issues / Discussions

QA Instructions

Merge Plan

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

Copilot AI and others added 30 commits December 24, 2025 00:16
- Added model_cache_keep_alive config field (minutes, default 0 = infinite)
- Implemented timeout tracking in ModelCache class
- Added _record_activity() to track model usage
- Added _on_timeout() to auto-clear cache when timeout expires
- Added shutdown() method to clean up timers
- Integrated timeout with get(), lock(), unlock(), and put() operations
- Updated ModelManagerService to pass keep_alive parameter
- Added cleanup in stop() method

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Created test_model_cache_timeout.py with comprehensive tests
- Tests timeout clearing behavior
- Tests activity resetting timeout
- Tests no-timeout default behavior
- Tests shutdown canceling timers

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Added clarifying comment that _record_activity is called with lock held
- Enhanced double-check in _on_timeout for thread safety
- Added lock protection to shutdown method
- Improved handling of edge cases where timer fires during activity

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Remove unused variable in test
- Add clarifying comment for daemon thread setting
- Add detailed comment explaining cache clearing with 1000 GB value
- Improve code documentation

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Add explicit storage_device parameter (cpu)
- Add explicit log_memory_usage parameter from config
- Improves code clarity and configuration transparency

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
- Remove all trailing whitespace (W293 errors)
- Add debug logging when timeout fires but activity detected
- Add debug logging when timeout fires but cache is empty
- Only log "Clearing model cache" message when actually clearing
- Prevents misleading timeout messages during active generation

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Only log "Clearing model cache" message when there are actually unlocked
models to clear. This prevents the misleading message from appearing during
active generation when all models are locked.

Changes:
- Check for unlocked models before logging clear message
- Add count of unlocked models in log message
- Add debug log when all models are locked
- Improves user experience by avoiding confusing messages

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
Configure mock logger to return a valid log level for getEffectiveLevel()
to prevent TypeError when comparing with logging.DEBUG constant.

The issue was that ModelCache._log_cache_state() checks
self._logger.getEffectiveLevel() > logging.DEBUG, and when the logger
is a MagicMock without configuration, getEffectiveLevel() returns another
MagicMock, causing a TypeError when compared with an int.

Fixes all 4 test failures in test_model_cache_timeout.py

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
…model-option' into copilot/add-unload-model-option
Add support for alternative diffusers Flow Matching schedulers:
- Euler (default, 1st order)
- Heun (2nd order, better quality, 2x slower)
- LCM (optimized for few steps)

Backend:
- Add schedulers.py with scheduler type definitions and class mapping
- Modify denoise.py to accept optional scheduler parameter
- Add scheduler InputField to flux_denoise invocation (v4.2.0)

Frontend:
- Add fluxScheduler to Redux state and paramsSlice
- Create ParamFluxScheduler component for Linear UI
- Add scheduler to buildFLUXGraph for generation
Add support for alternative diffusers Flow Matching schedulers for Z-Image:
- Euler (default) - 1st order, optimized for Z-Image-Turbo (8 steps)
- Heun (2nd order) - Better quality, 2x slower
- LCM - Optimized for few-step generation

Backend:
- Extend schedulers.py with Z-Image scheduler types and mapping
- Add scheduler InputField to z_image_denoise invocation (v1.3.0)
- Refactor denoising loop to support diffusers schedulers

Frontend:
- Add zImageScheduler to Redux state in paramsSlice
- Create ParamZImageScheduler component for Linear UI
- Add scheduler to buildZImageGraph for generation
Changed the default value of model_cache_keep_alive from 0 (indefinite)
to 5 minutes as requested. This means models will now be automatically
cleared from cache after 5 minutes of inactivity by default, unless
users explicitly configure a different value.

Users can still set it to 0 in their config to get the old behavior
of keeping models indefinitely.

Co-authored-by: lstein <111189+lstein@users.noreply.github.com>
LCM scheduler may have more internal timesteps than user-facing steps,
causing user_step to exceed total_steps. This resulted in progress
percentage > 1.0, which caused a pydantic validation error.

Fix: Only call step_callback when user_step <= total_steps.
Pfannkuchensack and others added 22 commits January 3, 2026 19:50
…model-option' into copilot/add-unload-model-option
Move Scheduler handler after MainModel in ImageMetadataHandlers so that
base-dependent recall logic (z-image scheduler) works correctly. The
Scheduler handler checks `base === 'z-image'` before dispatching the
z-image scheduler action, but this check failed when Scheduler ran
before MainModel was recalled.
Instead of disabling mutually exclusive model selectors, automatically
clear conflicting models when a new selection is made. This applies to
VAE, Qwen3 Encoder, and Qwen3 Source selectors - selecting one now
clears the others. Also applies same logic during metadata recall.
…8693)

## Summary

Adds `model_cache_keep_alive_min` config option (minutes, default 5) to
automatically clear model cache after inactivity. Addresses memory
contention when running InvokeAI alongside other GPU applications like
Ollama.

**Implementation:**
- **Config**: New `model_cache_keep_alive_min` field in
`InvokeAIAppConfig` with 5-minute default
- **ModelCache**: Activity tracking on get/lock/unlock/put operations,
threading.Timer for scheduled clearing
- **Thread safety**: Double-check pattern handles race conditions,
daemon threads for clean shutdown
- **Integration**: ModelManagerService passes config to cache, calls
shutdown() on stop
- **Logging**: Smart timeout logging that only shows messages when
unlocked models are actually cleared
- **Tests**: Comprehensive unit tests with properly configured mock
logger

**Usage:**
```yaml
# invokeai.yaml
model_cache_keep_alive_min: 10  # Clear after 10 minutes idle
model_cache_keep_alive_min: 0   # Set to 0 for indefinite caching (old behavior)
```

**Key Behavior:**
- **Default timeout**: 5 minutes - models are automatically cleared
after 5 minutes of inactivity
- Clearing uses same logic as "Clear Model Cache" button (make_room with
1000GB)
- Only clears **unlocked** models (respects models actively in use
during generation)
- Timeout message only appears when models are actually cleared
- Debug logging available for timeout events when no action is taken
- Prevents misleading log entries during active generation
- Users can set to 0 to restore indefinite caching behavior

## Related Issues / Discussions

Addresses enhancement request for automatic model unloading from memory
after inactivity period.

## QA Instructions

1. **Test default behavior (5-minute timeout)**:
   - Start InvokeAI without explicit config
   - Run a generation
   - Wait 6 minutes with no activity
   - Check logs for "Clearing X unlocked model(s) from cache" message
   - Verify cache is empty

2. **Test custom timeout**:
   - Set `model_cache_keep_alive_min: 0.1` (6 seconds) in config
   - Load a model (run generation)
   - Wait 7+ seconds with no activity
   - Check logs for "Clearing X unlocked model(s) from cache" message
   - Verify cache is empty

3. **Test no timeout (old behavior)**:
   - Set `model_cache_keep_alive_min: 0` in config
   - Run generations and wait extended periods
   - Verify models remain cached indefinitely

4. **Test during active use**:
   - Run continuous generations with any timeout setting
- Verify no timeout messages appear during active use (models are
locked)
- After generation completes, wait for timeout and verify unlocked
models are cleared

## Merge Plan

N/A - Additive change with sensible defaults. The 5-minute default
enables automatic memory management while remaining practical for
typical workflows.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [x] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_

<!-- START COPILOT ORIGINAL PROMPT -->



<details>

<summary>Original prompt</summary>

> 
> ----
> 
> *This section details on the original issue you should resolve*
> 
> <issue_title>[enhancement]: option to unload from memory
</issue_title>
> <issue_description>### Is there an existing issue for this?
> 
> - [X] I have searched the existing issues
> 
> ### Contact Details
> 
> ### What should this feature add?
> 
> a command line option to unload model from RAM after a defined period
of time
> 
> ### Alternatives
> 
> running as a container and using Sablier to shutdown the container
after some time, this has the downside of if traffic isn't see through
the web interface it will be shut even if jobs are running.
> 
> ### Additional Content
> 
> _No response_</issue_description>
> 
> ## Comments on the Issue (you are @copilot in this section)
> 
> <comments>
> <comment_new><author>@lstein</author><body>
> I am reopening this issue. I'm running ollama and invoke on the same
server and I find their memory requirements are frequently clashing. It
would be helpful to offer users the option to have the model cache
automatically cleared after a fixed amount of inactivity. I would
suggest the following:
> 
> 1. Introduce a new config file option `model_cache_keep_alive` which
specifies, in minutes, how long to keep a model in cache between
generations. The default is 0, which means to keep the model in cache
indefinitely, as is currently the case.
> 2. If no model generations occur within the timeout period, the model
cache is cleared using the same backend code as the "Clear Model Cache"
button in the queue tab.
> 
> I'm going to assign this to GitHub copilot, partly to test how well it
can manage the Invoke code base. </body></comment_new>
> </comments>
> 


</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes #6856

<!-- START COPILOT CODING AGENT TIPS -->
---

✨ Let Copilot coding agent [set things up for
you](https://github.com/invoke-ai/InvokeAI/issues/new?title=✨+Set+up+Copilot+instructions&body=Configure%20instructions%20for%20this%20repository%20as%20documented%20in%20%5BBest%20practices%20for%20Copilot%20coding%20agent%20in%20your%20repository%5D%28https://gh.io/copilot-coding-agent-tips%29%2E%0A%0A%3COnboard%20this%20repo%3E&assignees=copilot)
— coding agent works faster and does higher quality work when set up for
your repo.
## Summary

Add a new "Denoise - Z-Image + Metadata" node
(`ZImageDenoiseMetaInvocation`) that extends the Z-Image denoise node
with metadata output for image recall functionality.

This follows the same pattern as existing `denoise_latents_meta`
(SD1.5/SDXL) and `flux_denoise_meta` (FLUX) nodes.

**Captured metadata:**
- `width` / `height`
- `steps`
- `guidance` (guidance_scale)
- `denoising_start` / `denoising_end`
- `scheduler`
- `model` (transformer)
- `seed`
- `loras` (if applied)

## Related Issues / Discussions

Enables metadata recall for Z-Image generated images, similar to
existing support for SD1.5, SDXL, and FLUX models.

## QA Instructions

1. Create a workflow using the new "Denoise - Z-Image + Metadata" node
2. Connect the metadata output to a "Save Image" node
3. Generate an image
4. Check that metadata is saved with the image (visible in image info
panel)
5. Verify all generation parameters are captured correctly

## Merge Plan

Requires `feature/zimage-scheduler-support` #8705 branch to be merged
first (base branch).

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a
changelog_
- [ ] _Tests added / updated (if applicable)_
- [ ] _❗Changes to a redux slice have a corresponding migration_
- [ ] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
@lstein lstein changed the base branch from main to lstein-master January 5, 2026 02:01
@lstein lstein merged commit 9ca69d8 into lstein:lstein-master Jan 5, 2026
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants