Enable fast switching among models at the invoke> command line #1066

lstein · 2022-10-12T06:39:32Z

New commands:

This PR enables two new commands in the invoke.py script

!models -- list the available models and their cache status
!switch <model> -- switch to the indicated model

Example:

invoke> !models
laion400m            not loaded  Latent Diffusion LAION400M model
stable-diffusion-1.4     active  Stable Diffusion inference model version 1.4
waifu-1.3                cached  Waifu anime model version 1.3
invoke> !switch waifu-1.3
>> Caching model stable-diffusion-1.4 in system RAM
>> Retrieving model waifu-1.3 from system RAM cache

The name and descriptions of the models are taken from config/models.yaml. A future enhancement to model_cache.py will be to enable new model stanzas to be added to the file programmatically. This will be useful for the WebGUI.

More details:

Use fast switching algorithm described in PR Feature: model switching in dream prompt #948, implemented in new module ldm/invoke/model_cache.py
Models are selected using their configuration stanza name given in models.yaml.
To avoid filling up CPU RAM with cached models, this PR implements an LRU cache that monitors available CPU RAM.
The caching code allows the minimum value of available RAM to be adjusted, but invoke.py does not currently have a command-line argument that allows you to set it. The minimum free RAM is arbitrarily set to 2 GB.
Add optional description field to configs/models.yaml

Unrelated fixes:

Added ">>" to CompViz model loading messages in order to make user experience more consistent.
When generating an image greater than defaults, will only warn about possible VRAM filling the first time.
Fixed bug that was causing help message to be printed twice. This involved moving the import line for the web backend into the section where it is called.

Co-authored by @ArDiouscuros

- This PR enables two new commands in the invoke.py script !models -- list the available models and their cache status !switch <model> -- switch to the indicated model Example: invoke> !models laion400m not loaded Latent Diffusion LAION400M model stable-diffusion-1.4 active Stable Diffusion inference model version 1.4 waifu-1.3 cached Waifu anime model version 1.3 invoke> !switch waifu-1.3 >> Caching model stable-diffusion-1.4 in system RAM >> Retrieving model waifu-1.3 from system RAM cache More details: - Use fast switching algorithm described in PR #948 - Models are selected using their configuration stanza name given in models.yaml. - To avoid filling up CPU RAM with cached models, this PR implements an LRU cache that monitors available CPU RAM. - The caching code allows the minimum value of available RAM to be adjusted, but invoke.py does not currently have a command-line argument that allows you to set it. The minimum free RAM is arbitrarily set to 2 GB. - Add optional description field to configs/models.yaml Unrelated fixes: - Added ">>" to CompViz model loading messages in order to make user experience more consistent. - When generating an image greater than defaults, will only warn about possible VRAM filling the first time. - Fixed bug that was causing help message to be printed twice. This involved moving the import line for the web backend into the section where it is called.

@ArDiouscuros

- This PR enables two new commands in the invoke.py script !models -- list the available models and their cache status !switch <model> -- switch to the indicated model Example: invoke> !models laion400m not loaded Latent Diffusion LAION400M model stable-diffusion-1.4 active Stable Diffusion inference model version 1.4 waifu-1.3 cached Waifu anime model version 1.3 invoke> !switch waifu-1.3 >> Caching model stable-diffusion-1.4 in system RAM >> Retrieving model waifu-1.3 from system RAM cache The name and descriptions of the models are taken from `config/models.yaml`. A future enhancement to `model_cache.py` will be to enable new model stanzas to be added to the file programmatically. This will be useful for the WebGUI. More details: - Use fast switching algorithm described in PR #948 - Models are selected using their configuration stanza name given in models.yaml. - To avoid filling up CPU RAM with cached models, this PR implements an LRU cache that monitors available CPU RAM. - The caching code allows the minimum value of available RAM to be adjusted, but invoke.py does not currently have a command-line argument that allows you to set it. The minimum free RAM is arbitrarily set to 2 GB. - Add optional description field to configs/models.yaml Unrelated fixes: - Added ">>" to CompViz model loading messages in order to make user experience more consistent. - When generating an image greater than defaults, will only warn about possible VRAM filling the first time. - Fixed bug that was causing help message to be printed twice. This involved moving the import line for the web backend into the section where it is called. Coauthored by: @ArDiouscuros

…model-switching

lstein · 2022-10-12T06:41:41Z

Hey folks, I've tested the fast model switching on both a CUDA system and an intel box with no GPU, so I think it will work on the Mac, but I'm asking @Any-Winter-4079 to check it out just in case.

I tried to make this useful for the WebGUI. The API is in ldm/invoke/model_cache.py, and the useful calls are:

from ldm.invoke.mode_cache import ModelCache
cache = ModelCache(omega_conf, device_type, precision, [min_avail_mem])
model_dict1 = cache.get_model('latent-diffusion-1.4')
model_dict2 = cache.get_model('waifu-1.3')
model_status = cache.list_models()
for name in model_status:
   print(f'{name} / {model_status[name]["status"]'})

The model_dict contains the following keys: model, width, height, and hash. The width and height are the defaults for the model. Hash is the sha256 hash of the model for use in writing metadata. After you switch models you should rebuild any generators that might use the old one or you will get unwelcome surprises.

ldm/generate.py

Any-Winter-4079 · 2022-10-12T13:19:03Z

I have a group (uni) project for tomorrow, so I'll be mostly unavailable today. So, if you want a quick review maybe @Vargol or @skurovec can chime in

skurovec · 2022-10-12T13:44:14Z

Tested on M1 Max 32GB.

stable diffusion 1.4
"banana sushi" -s 50 -S 1760350044 -W 512 -H 512 -C 8.0 -A ddim

image generated

!switch waifu-diffusion-1.3
"banana sushi" -s 50 -S 1760350044 -W 512 -H 512 -C 8.0 -A ddim

image generated

!switch waifu-diffusion-1.2
"banana sushi" -s 50 -S 1760350044 -W 512 -H 512 -C 8.0 -A ddim

image generated

!switch stable-diffusion-1.4
"banana sushi" -s 50 -S 1760350044 -W 512 -H 512 -C 8.0 -A ddim

crashed with:

RuntimeError: Placeholder storage has not been allocated on MPS device!

Switching on another cached model throws same error.

skurovec · 2022-10-12T13:58:37Z

It look like it is because in model_cache it is not returned to mps, only for cuda.

def _model_from_cpu(self,model):
        if self._has_cuda():
            model.to(self.device)
            model.first_stage_model.to(self.device)
            model.cond_stage_model.to(self.device)
            model.cond_stage_model.device = self.device

        return model

If I add model.to(self.device) it is workign again and generatign same image.

    def _model_from_cpu(self,model):
        if self._has_cuda():
            model.to(self.device)
            model.first_stage_model.to(self.device)
            model.cond_stage_model.to(self.device)
            model.cond_stage_model.device = self.device
        else:
            model.to(self.device)

        return model

lstein · 2022-10-12T14:14:25Z

Stupid bug! I'll fix. Lincoln

On Wed, Oct 12, 2022 at 9:58 AM Jan Skurovec ***@***.***> wrote: It look like it is because in model_cache it is not returned to mps, only for cuda. def _model_from_cpu(self,model): if self._has_cuda(): model.to(self.device) model.first_stage_model.to(self.device) model.cond_stage_model.to(self.device) model.cond_stage_model.device = self.device return model If I add model.to(self.device) it is workign again and generatign same image. def _model_from_cpu(self,model): if self._has_cuda(): model.to(self.device) model.first_stage_model.to(self.device) model.cond_stage_model.to(self.device) model.cond_stage_model.device = self.device else: model.to(self.device) return model — Reply to this email directly, view it on GitHub <#1066 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAA3EVMH4QC52NOB4RETDGLWC27ZPANCNFSM6AAAAAARC6U4ZI> . You are receiving this because you authored the thread.Message ID: ***@***.***>

-- Written on my cell phone. Anything that seems odd is the fault of auto-correct.

lstein · 2022-10-12T14:24:47Z

H'mmm. On second thought, I'm not sure whether this will fix the problem. On non-CUDA systems, the only device is "cpu", and so the caching commands are essentially intended to be no-ops on MPS systems. Unless I'm fundamentally misunderstanding something! Lincoln On Wed, Oct 12, 2022 at 10:14 AM Lincoln Stein ***@***.***> wrote:

Stupid bug! I'll fix. Lincoln On Wed, Oct 12, 2022 at 9:58 AM Jan Skurovec ***@***.***> wrote: > It look like it is because in model_cache it is not returned to mps, only > for cuda. > > def _model_from_cpu(self,model): > if self._has_cuda(): > model.to(self.device) > model.first_stage_model.to(self.device) > model.cond_stage_model.to(self.device) > model.cond_stage_model.device = self.device > > return model > > If I add model.to(self.device) it is workign again and generatign same > image. > > def _model_from_cpu(self,model): > if self._has_cuda(): > model.to(self.device) > model.first_stage_model.to(self.device) > model.cond_stage_model.to(self.device) > model.cond_stage_model.device = self.device > else: > model.to(self.device) > > return model > > — > Reply to this email directly, view it on GitHub > <#1066 (comment)>, > or unsubscribe > <https://github.com/notifications/unsubscribe-auth/AAA3EVMH4QC52NOB4RETDGLWC27ZPANCNFSM6AAAAAARC6U4ZI> > . > You are receiving this because you authored the thread.Message ID: > ***@***.***> > -- Written on my cell phone. Anything that seems odd is the fault of auto-correct.

-- Written on my cell phone. Anything that seems odd is the fault of auto-correct.

lstein · 2022-10-12T14:27:57Z

Or is there an "mps" device? On Wed, Oct 12, 2022 at 10:24 AM Lincoln Stein ***@***.***> wrote:

H'mmm. On second thought, I'm not sure whether this will fix the problem. On non-CUDA systems, the only device is "cpu", and so the caching commands are essentially intended to be no-ops on MPS systems. Unless I'm fundamentally misunderstanding something! Lincoln On Wed, Oct 12, 2022 at 10:14 AM Lincoln Stein ***@***.***> wrote: > Stupid bug! I'll fix. > > Lincoln > > On Wed, Oct 12, 2022 at 9:58 AM Jan Skurovec ***@***.***> > wrote: > >> It look like it is because in model_cache it is not returned to mps, >> only for cuda. >> >> def _model_from_cpu(self,model): >> if self._has_cuda(): >> model.to(self.device) >> model.first_stage_model.to(self.device) >> model.cond_stage_model.to(self.device) >> model.cond_stage_model.device = self.device >> >> return model >> >> If I add model.to(self.device) it is workign again and generatign same >> image. >> >> def _model_from_cpu(self,model): >> if self._has_cuda(): >> model.to(self.device) >> model.first_stage_model.to(self.device) >> model.cond_stage_model.to(self.device) >> model.cond_stage_model.device = self.device >> else: >> model.to(self.device) >> >> return model >> >> — >> Reply to this email directly, view it on GitHub >> <#1066 (comment)>, >> or unsubscribe >> <https://github.com/notifications/unsubscribe-auth/AAA3EVMH4QC52NOB4RETDGLWC27ZPANCNFSM6AAAAAARC6U4ZI> >> . >> You are receiving this because you authored the thread.Message ID: >> ***@***.***> >> > -- > Written on my cell phone. Anything that seems odd is the fault of > auto-correct. > -- Written on my cell phone. Anything that seems odd is the fault of auto-correct.

-- Written on my cell phone. Anything that seems odd is the fault of auto-correct.

skurovec · 2022-10-12T14:29:51Z

self.device.type returns 'mps'

edit: at least on m1 mac, I currently do not have intel mac

maybe create _has_mps for m1 and leave rest as is

        elif self._has_mps():
             model.to(self.device)

lstein · 2022-10-12T14:33:34Z

So the code should be: def _model_from_cpu(self,model): if self.device != 'cpu': model.to(self.device) model.first_stage_model.to(self.device) model.cond_stage_model.to(self.device) model.cond_stage_model.device = self.device return model I'm at a conference right now, but will commit this fix sometime today. L

On Wed, Oct 12, 2022 at 10:30 AM Jan Skurovec ***@***.***> wrote: self.device.type returns 'mps' — Reply to this email directly, view it on GitHub <#1066 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAA3EVLIQI5ODUDNDWE3Y2LWC3DOVANCNFSM6AAAAAARC6U4ZI> . You are receiving this because you authored the thread.Message ID: ***@***.***>

-- Written on my cell phone. Anything that seems odd is the fault of auto-correct.

skurovec · 2022-10-12T15:25:12Z

After fix switching models works. Images are consistent between models.

Vargol · 2022-10-12T16:03:13Z

Seems to work well on a 8gb M1 too.
No obvious slowdown of rendering, images are different with same seeds, return to previous image when swapping models back.

I didn't think its popping on my 8gb, but the OS is swapping the cached model out to swap, this is having no affect on me, concerned about effect 16Gb M1, with larger images though. Just running with a few debug statements to test.

Ideas:
Print when its popping a cached model
A clear cache option.
Default model option.

Vargol · 2022-10-12T16:17:20Z

Yep, macOS is so lying about the free memory

NOTE: Redirects are currently not supported in Windows or MacOs.
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
checking pop
8359503462.4
2147483648
>> Loading stable-diffusion-1.4 from models/ldm/stable-diffusion-v1/model.ckpt
   >> LatentDiffusion: Running in eps-prediction mode
   >> DiffusionWrapper has 859.52 M params.
   >> Making attention of type 'vanilla' with 512 in_channels
   >> Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
   >> Making attention of type 'vanilla' with 512 in_channels
>> Using more accurate float32 precision
>> Model loaded in 67.97s
>> Setting Sampler to k_lms

* Initialization done! Awaiting your command (-h for help, 'q' to quit)
invoke> !models
laion400m                 not loaded  Latent Diffusion LAION400M model
stable-diffusion-1.4          active  Stable Diffusion inference model version 1.4
waifu-1.3                 not loaded  Waifu 1.3
waifu-1.x                 not loaded  Waifu 1.3
invoke> !switch waifu-1.x
>> Caching model stable-diffusion-1.4 in system RAM
checking pop
3794085478.4
2147483648
>> Loading waifu-1.x from models/ldm/stable-diffusion-v1/model-waifu.ckpt
   >> LatentDiffusion: Running in eps-prediction mode
   >> DiffusionWrapper has 859.52 M params.
   >> Making attention of type 'vanilla' with 512 in_channels
   >> Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
   >> Making attention of type 'vanilla' with 512 in_channels
>> Using more accurate float32 precision
>> Model loaded in 84.83s
>> Setting Sampler to k_lms
invoke> !switch waifu-1.3
>> Caching model waifu-1.x in system RAM
checking pop
3696191078.4
2147483648
invoke> !models
laion400m                 not loaded  Latent Diffusion LAION400M model
stable-diffusion-1.4          cached  Stable Diffusion inference model version 1.4
waifu-1.3                     active  Waifu 1.3
waifu-1.x                     cached  Waifu 1.3
invoke>

Vargol · 2022-10-12T16:20:59Z

hold on

AVG_MODEL_SIZE=2.1GIGS
DEFAULT_MIN_AVAIL=2GIGS

    print(avail_memory + AVG_MODEL_SIZE)
    print(self.min_avail_mem)
    if avail_memory + AVG_MODEL_SIZE < self.min_avail_mem:

If you're using DEFAULT_MIN_AVAIL for self.min_avail_mem wouldn't avail_memory have to be negative to pop
as AVG_MODEL_SIZE > DEFAULT_MIN_AVAIL
or am I having a brain fart....

- fixed backwards calculation of minimum available memory - only execute m.padding adjustment code once upon load

lstein · 2022-10-12T19:59:03Z

hold on

AVG_MODEL_SIZE=2.1_GIGS DEFAULT_MIN_AVAIL=2_GIGS
 print(avail_memory + AVG_MODEL_SIZE)
 print(self.min_avail_mem)
 if avail_memory + AVG_MODEL_SIZE < self.min_avail_mem:
If you're using DEFAULT_MIN_AVAIL for self.min_avail_mem wouldn't avail_memory have to be negative to pop as AVG_MODEL_SIZE > DEFAULT_MIN_AVAIL or am I having a brain fart....

This is what comes from coding at 2 am, and also why it's so important to have multiple eyes on the code! It looks like I got this backward, and I'm surprised that it worked on my system. But I probably tested it backwards too.

Latest commit fixes this, and addresses other misc issues. Thanks for the help debugging on M1!

skurovec · 2022-10-12T20:15:10Z

Latest changes working on my m1.

Vargol · 2022-10-12T21:53:59Z

Same here , on my 8Gb M1
Models swapping, rendered images reflect the model, images consistent if you swap back to the model used for a previous render and render with same parameters, and its popping the models too now

lstein · 2022-10-13T00:32:22Z

So would someone provide a code review approval so that I can move on to working on the next version of outpainting?

blessedcoolant · 2022-10-13T11:03:34Z

Tested this out. Works as advertised. Just a question though: how many models are we caching? If the mem limit is hardcoded to 2GB, would this be an issue on lower end systems?

Once the amount of available memory falls below 2 GB what will happen is that each new model will replace the last one in cache memory. To retrieve the previous model the system will have to reload from disk. I think the main problem might be when the user wants to give InvokeAI more latitude to use available memory.

Any-Winter-4079 · 2022-10-13T13:36:24Z

generate.py does not have fix_func. Does the merge take care of that? (I hope)

skurovec · 2022-10-13T13:41:01Z

@Any-Winter-4079 PR #1056 was merged about same time this was created, so it is possible its from version before fix

lstein · 2022-10-13T13:54:36Z

generate.py does not have fix_func. Does the merge take care of that? (I hope)

The switch-models branch was made before fix-func came in, and when I merge to development this will be resolved.

Any-Winter-4079 · 2022-10-13T14:05:17Z

Just started testing this. I'm getting
RuntimeError: Placeholder storage has not been allocated on MPS device! after trying to load a model that does not exist (model laion400m could not be loaded: [Errno 2] No such file or directory: 'models/ldm/text2img-large/model.ckpt') and then creating an image (with the current model).
@skurovec does it work for you? I may have messed sth up mixing this with other (uncommitted) changes

Sequence:
python scripts/dream.py --outdir outputs/new
"banana sushi" -s 50 -S 1760350044 -W 512 -H 512 -C 8.0 -A ddim -> works
!models

laion400m                 not loaded  Latent Diffusion LAION400M model
stable-diffusion-1.4          active  Stable Diffusion inference model version 1.4

!switch laion400m

>> Caching model stable-diffusion-1.4 in system RAM
>> Loading laion400m from models/ldm/text2img-large/model.ckpt
** model laion400m could not be loaded: [Errno 2] No such file or directory: 'models/ldm/text2img-large/model.ckpt'
** Model switch failed **

"banana sushi" -s 50 -S 1760350044 -W 512 -H 512 -C 8.0 -A ddim
RuntimeError: Placeholder storage has not been allocated on MPS device!

Also
a
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, mps:0 and cpu!

lstein · 2022-10-13T16:21:11Z

I can confirm this bug on a CUDA system. I'd tested out the scenario of loading a missing model file earlier in development and it was working, but more recent changes apparently broke it. I'll make another commit later today, and probably include code for interactively adding and editing stanzas in the models.yaml file.

Any-Winter-4079 · 2022-10-13T19:32:54Z

Thanks for the fix. One thing is stable-diffusion-1.4 can't be renamed in models.yml or it fails.
"stable-diffusion-1.4" is not a known model name. Please check your models.yaml file
I guess it uses a hard-coded value as default to load the model on initialization.

Besides that, it works well as far as I've tested it. By the way, is there correlation between sd-1.4 and waifu-1.3 (especially SD at low step values) in terms of seeds? I find it pretty fascinating.

The prompt for these images is the same. Only seeds change.

The Waifu Diffusion 1.3 model is a Stable Diffusion model that has been finetuned from Stable Diffusion v1.4

I guess this is the answer, but I still find it pretty cool.

Any-Winter-4079 · 2022-10-13T19:42:59Z

I'm going to keep testing a bit more, with larger images especially.
Update: I've tested waifu-1.3 and stable-diffusion-1.4 only (not more models) but it seems to behave well. Especially no out of memory errors for generating large images + having several models around.

Any-Winter-4079

LGTM

- !import_model <path/to/model/weights> will import a new model, prompt the user for its name and description, write it to the models.yaml file, and load it. - !edit_model <model_name> will bring up a previously-defined model and prompt the user to edit its descriptive fields. Example of !import_model <pre> invoke> !import_model models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt >> Model import in process. Please enter the values needed to configure this model: Name for this model: waifu-diffusion Description of this model: Waifu Diffusion v1.3 Configuration file for this model: configs/stable-diffusion/v1-inference.yaml Default image width: 512 Default image height: 512 >> New configuration: waifu-diffusion: config: configs/stable-diffusion/v1-inference.yaml description: Waifu Diffusion v1.3 height: 512 weights: models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt width: 512 OK to import [n]? y >> Caching model stable-diffusion-1.4 in system RAM >> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt | LatentDiffusion: Running in eps-prediction mode | DiffusionWrapper has 859.52 M params. | Making attention of type 'vanilla' with 512 in_channels | Working with z of shape (1, 4, 32, 32) = 4096 dimensions. | Making attention of type 'vanilla' with 512 in_channels | Using faster float16 precision </pre> Example of !edit_model <pre> invoke> !edit_model waifu-diffusion >> Editing model waifu-diffusion from configuration file ./configs/models.yaml description: Waifu diffusion v1.4beta weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt config: configs/stable-diffusion/v1-inference.yaml width: 512 height: 512 >> New configuration: waifu-diffusion: config: configs/stable-diffusion/v1-inference.yaml description: Waifu diffusion v1.4beta weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt height: 512 width: 512 OK to import [n]? y >> Caching model stable-diffusion-1.4 in system RAM >> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt ... </pre>

lstein · 2022-10-14T04:00:44Z

I've added a new commit that gives ldm.invoke.model_cache an API for importing and editing model config stanzas. I'm hoping this will be useful for the WebGUI. I've also added !import_model and !edit_model commands to the CLI. I'd appreciate if you could test this a bit and try to break it by giving bad values, etc.

Here are examples of how they work:

!import_model

invoke> !import_model models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
>> Model import in process. Please enter the values needed to configure this model:
    
Name for this model: waifu-diffusion
Description of this model: Waifu Diffusion v1.3
Configuration file for this model: configs/stable-diffusion/v1-inference.yaml
Default image width: 512
Default image height: 512
>> New configuration:
waifu-diffusion:
config: configs/stable-diffusion/v1-inference.yaml
description: Waifu Diffusion v1.3
height: 512
weights: models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
width: 512
OK to import [n]? y
>> Caching model stable-diffusion-1.4 in system RAM
 >> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch08-float16.ckpt
       | LatentDiffusion: Running in eps-prediction mode
       | DiffusionWrapper has 859.52 M params.
       | Making attention of type 'vanilla' with 512 in_channels
       | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
       | Making attention of type 'vanilla' with 512 in_channels
       | Using faster float16 precision
...etc

!edit_model

invoke> !edit_model waifu-diffusion
>> Editing model waifu-diffusion from configuration file ./configs/models.yaml
description: Waifu diffusion v1.4beta
weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
config: configs/stable-diffusion/v1-inference.yaml
width: 512
height: 512
    
>> New configuration:
waifu-diffusion:
    config: configs/stable-diffusion/v1-inference.yaml
    description: Waifu diffusion v1.4beta
    weights: models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
    height: 512
    width: 512
    
OK to change [n]? y
>> Caching model stable-diffusion-1.4 in system RAM
>> Loading waifu-diffusion from models/ldm/stable-diffusion-v1/model-epoch10-float16.ckpt
...

lstein · 2022-10-14T04:07:38Z

Thanks for the fix. One thing is stable-diffusion-1.4 can't be renamed in models.yml or it fails. "stable-diffusion-1.4" is not a known model name. Please check your models.yaml file I guess it uses a hard-coded value as default to load the model on initialization.

Yeah, stable-diffusion-1.4 is the default model name in the event the --model is not specified on the command line. Perhaps we should extend the models.yaml syntax to have an entry marked default: True. What do you think?

Thanks for letting me know about the correlation between SD and Waifu. I hadn't noticed that. It's quite cool.

Any-Winter-4079 · 2022-10-14T11:48:37Z

Perhaps we should extend the models.yaml syntax to have an entry marked default: True. What do you think?

That's probably a good idea.

I've added a new commit that gives ldm.invoke.model_cache an API for importing and editing model config stanzas. I'm hoping this will be useful for the WebGUI. I've also added !import_model and !edit_model commands to the CLI. I'd appreciate if you could test this a bit and try to break it by giving bad values, etc.

Testing now (there's a lot to test).

1) Passing a wrong model

It does create a models.yaml entry but maybe we should check whether it's a .ckpt file and allow or prevent it altogether.
!import_model models/ldm/waifu-diffusion/inference/wd-v1-3-full.sha256

fd:
  weights: models/ldm/waifu-diffusion/inference/wd-v1-3-full.sha256
  description: fgdhf
  config: configs/stable-diffusion/v1-inference.yaml
  width: 512
  height: 512

Trying to use it.

invoke> !switch fd
invoke>

Silently fails and continues having the current model

invoke> !models
laion400m                 not loaded  Latent Diffusion LAION400M model
stable-diffusion-1.4          active  Stable Diffusion inference model version 1.4
stable-diffusion-1.4-full not loaded  Stable Diffusion inference model version 1.4 (full)
stable-diffusion-1.3      not loaded  Stable Diffusion inference model version 1.3
waifu-1.3                 not loaded  Waifu Diffusion inference model version 1.3
fd                        not loaded  fgdhf

2) Deleting a model

I can't remove the wrongly created model except from the yaml file, right? I think either we prevent wrong additions, or allow to remove (or ideally both)

3) Adding an existing model

I edited the fd entry via
!edit_model fd
but I tried adding an existing model and it let me.

waifu-1.3:
  config: configs/stable-diffusion/v1-inference.yaml
  weights: models/ldm/waifu-diffusion/inference/wd-v1-3-float32.ckpt
  description: Waifu Diffusion inference model version 1.3
  width: 512
  height: 512
fd:
  weights: models/ldm/waifu-diffusion/inference/wd-v1-3-float32.ckpt
  description: fgdhf
  config: configs/stable-diffusion/v1-inference.yaml
  width: 512
  height: 512

It's not super important, but it'd be nice to say: hey, reminder that this ckpt is already in models.yaml
Especially as we add more and more.

4) Using a recently added model

!import_model models/ldm/waifu-diffusion/inference/wd-v1-3-float32.ckpt
!switch fd3
silently fails and cannot use the model.
Only after switching to another model (e.g. waifu-1.3), then I can load fd3.

5) Autocompleting

The autocomplete doesn't show the new model either.

invoke> !switch fd
fd

but
!models

laion400m                 not loaded  Latent Diffusion LAION400M model
stable-diffusion-1.4          cached  Stable Diffusion inference model version 1.4
stable-diffusion-1.4-full not loaded  Stable Diffusion inference model version 1.4 (full)
stable-diffusion-1.3      not loaded  Stable Diffusion inference model version 1.3
waifu-1.3                     active  Waifu Diffusion inference model version 1.3
fd                        not loaded  fgdhf
fd1                       not loaded  w
fd3                           cached  d

lstein · 2022-10-14T17:18:13Z

Good feedback. I knew about the problem with autocomplete not picking up the new model until you restart the script, but the rest is new to me. I'll add more error reporting as well as a delete option.

I think I'll merge in now and add the error checks in a subsequent PR.

lstein added 4 commits October 11, 2022 17:24

add mostly functional model caching module

b9e910b

Merge branch 'model-switching' of github.com:invoke-ai/InvokeAI into …

7c06849

…model-switching

lstein requested review from tildebyte and psychedelicious October 12, 2022 06:40

lstein requested a review from Any-Winter-4079 October 12, 2022 06:41

move tokenizer into cpu cache as well

b537e92

ArDiouscuros reviewed Oct 12, 2022

View reviewed changes

ldm/generate.py Outdated Show resolved Hide resolved

proposed fix to work on mps systems

aa6aa68

final fixups to memory_cache

1c102c7

- fixed backwards calculation of minimum available memory - only execute m.padding adjustment code once upon load

gracefully recover from failed model load

916f5bf

Any-Winter-4079 approved these changes Oct 13, 2022

View reviewed changes

lstein requested review from ArDiouscuros and Any-Winter-4079 and removed request for ArDiouscuros October 14, 2022 03:54

minor doc fixes

1c50133

Merge branch 'development' into model-switching

fe2a2cf

lstein merged commit fe2a2cf into development Oct 14, 2022

lstein deleted the model-switching branch October 14, 2022 17:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable fast switching among models at the invoke> command line #1066

Enable fast switching among models at the invoke> command line #1066

lstein commented Oct 12, 2022

lstein commented Oct 12, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 12, 2022 •

edited

Loading

skurovec commented Oct 12, 2022

skurovec commented Oct 12, 2022

lstein commented Oct 12, 2022 via email

lstein commented Oct 12, 2022 via email

lstein commented Oct 12, 2022 via email

skurovec commented Oct 12, 2022 •

edited

Loading

lstein commented Oct 12, 2022 via email

skurovec commented Oct 12, 2022

Vargol commented Oct 12, 2022

Vargol commented Oct 12, 2022

Vargol commented Oct 12, 2022

lstein commented Oct 12, 2022

skurovec commented Oct 12, 2022

Vargol commented Oct 12, 2022

lstein commented Oct 13, 2022

blessedcoolant commented Oct 13, 2022 •

edited by lstein

Loading

Any-Winter-4079 commented Oct 13, 2022

skurovec commented Oct 13, 2022 •

edited

Loading

lstein commented Oct 13, 2022

Any-Winter-4079 commented Oct 13, 2022 •

edited

Loading

lstein commented Oct 13, 2022

Any-Winter-4079 commented Oct 13, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 13, 2022 •

edited

Loading

Any-Winter-4079 left a comment

lstein commented Oct 14, 2022

lstein commented Oct 14, 2022

Any-Winter-4079 commented Oct 14, 2022 •

edited

Loading

lstein commented Oct 14, 2022

Enable fast switching among models at the invoke> command line #1066

Enable fast switching among models at the invoke> command line #1066

Conversation

lstein commented Oct 12, 2022

New commands:

More details:

Unrelated fixes:

lstein commented Oct 12, 2022 • edited Loading

Any-Winter-4079 commented Oct 12, 2022 • edited Loading

skurovec commented Oct 12, 2022

skurovec commented Oct 12, 2022

lstein commented Oct 12, 2022 via email

lstein commented Oct 12, 2022 via email

lstein commented Oct 12, 2022 via email

skurovec commented Oct 12, 2022 • edited Loading

lstein commented Oct 12, 2022 via email

skurovec commented Oct 12, 2022

Vargol commented Oct 12, 2022

Vargol commented Oct 12, 2022

Vargol commented Oct 12, 2022

lstein commented Oct 12, 2022

skurovec commented Oct 12, 2022

Vargol commented Oct 12, 2022

lstein commented Oct 13, 2022

blessedcoolant commented Oct 13, 2022 • edited by lstein Loading

Any-Winter-4079 commented Oct 13, 2022

skurovec commented Oct 13, 2022 • edited Loading

lstein commented Oct 13, 2022

Any-Winter-4079 commented Oct 13, 2022 • edited Loading

lstein commented Oct 13, 2022

Any-Winter-4079 commented Oct 13, 2022 • edited Loading

Any-Winter-4079 commented Oct 13, 2022 • edited Loading

Any-Winter-4079 left a comment

Choose a reason for hiding this comment

lstein commented Oct 14, 2022

!import_model

!edit_model

lstein commented Oct 14, 2022

Any-Winter-4079 commented Oct 14, 2022 • edited Loading

1) Passing a wrong model

2) Deleting a model

3) Adding an existing model

4) Using a recently added model

5) Autocompleting

lstein commented Oct 14, 2022

lstein commented Oct 12, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 12, 2022 •

edited

Loading

skurovec commented Oct 12, 2022 •

edited

Loading

blessedcoolant commented Oct 13, 2022 •

edited by lstein

Loading

skurovec commented Oct 13, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 13, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 13, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 13, 2022 •

edited

Loading

Any-Winter-4079 commented Oct 14, 2022 •

edited

Loading