Stable Cascade #2785
Comments
New Stability model. Wondering if we could get native ComfyUI support for it. |
I'm waiting for support too! Thanks |
It's inevitably gonna be supported, just be patient.
It is obviously exciting enough that it will be supported soon. Just be very patient. It takes a while to analyze the new architecture, create new nodes, and figure it all out. Let's not rush or demand anything! 😉 By the way, Stable Cascade isn't even finished yet. It's still in "early development" research/training. Their codebase is changing constantly. The large model currently uses 20 GB VRAM (!) and they think they can optimize this to 1/2 or 1/3 of that when they are done. Furthermore, the diffusers library isn't even ready, and the code is far from reaching the quality needed for merging. So let's breathe. Update: One of the Stability employees commented on Reddit that Comfy will have support in a week or two. |
Well, there's a basic node which doesn't implement anything itself; it just wraps the official code in a ComfyUI node. No support for anything special like controlnets, prompt conditioning, or anything else really. It's just a basic wrapper for some prompt strings and a seed. It doesn't support model loading or unloading, so it will hog your VRAM. I also see that it has a bug (it loads one of the models as float16 instead of bfloat16). But hopefully it's still good enough for impatient people like @GaleBoro in the meantime. |
Alright. Here's another option to experiment with it locally in the meantime. It's Pinokio's tweak of the unofficial Gradio web UI, with the need for a Hugging Face token / login removed. Needs Python 3.11. The two largest models need ~15.5 GB of VRAM at 1024x1024, or ~18.0-20.0 GB of VRAM at 1536x1536 with the bfloat16 model format.

```shell
git clone https://huggingface.co/spaces/cocktailpeanut/stable-cascade pinokio
cd pinokio
python -m venv .venv
. .venv/bin/activate
pip install gradio spaces itsdangerous
pip install -r requirements.txt
python app.py
```
|
Even with ComfyUI VRAM optimizations, it seems that unless you have a GPU that supports bfloat16 or has 16 GB of VRAM, you can only use the "lite" version of the "Stage C" model, right? https://huggingface.co/stabilityai/stable-cascade/tree/main edit: Apparently, it could theoretically be possible to split the model for VRAM optimization. https://news.ycombinator.com/item?id=39360106
|
@JorgeR81 The CEO of Stability has made statements that SD 1.0 originally used more than 20 GB of VRAM too, and that they are confident they can reduce Stable Cascade to 1/2 or 1/3 of the current VRAM requirements. He didn't go into detail about what techniques they'd use to achieve that, but it sounds good to me, because the 20 GB VRAM usage right now is very painful. Edit: The statements. Take them with a big grain of salt. But their hope is to reach 8 GB of VRAM usage.
Edit: And another statement that I found really interesting, saying that RTX 40-series and newer cards will become increasingly required for future AI networks, because they support fp8 and optical flow: |
I think it's worth waiting for the official release from comfyanonymous, but there is the huggingface space, this colab from Camenduru (Gradio UI, runs on T4 - 80s for 1 image, slow): https://github.com/camenduru/stable-cascade-jupyter & another one i put together on launch day (runs on A100 - 14 seconds for 4 images, fast): https://github.com/MushroomFleet/StableCascade-text2image personally i can't wait to get this into Comfy, and there is a diffusers custom node here, if you can't wait! |
It's implemented in the main repo now, you can use this workflow until I write an examples page for it. https://gist.github.com/comfyanonymous/0f09119a342d0dd825bb2d99d19b781c |
stage, stage_b16, stage_lite, stage_lite_b16: is there a big gap between them when it comes to generating images? |
I get the following error, that's weird. Edit: Solved! stage_c wasn't loaded properly. |
Is there any workflow example for img-to-img or controlnet? |
Getting an error when running the example workflow for Stable Cascade with both bf16 and lite models:
Might be a problem on my end since I am running on AMD ROCm but wanted to leave this here if it crops up anywhere else. Setting the clip type in "Load CLIP" to |
Guys, sorry, where to get a proper CLIP G SDXL BF16? |
Rename "model.safetensors" to "clip_g_sdxl.fp16.safetensors" or simply select "model.safetensors". |
I'm using the full-size models, but it's worth trying [stage_c_lite] as well, since the results are very different, like a different model. The bf16 models look the same as the regular models. The full-size models are usable even on a GTX 1070 (8 GB). stage_c ( default workflow settings ) stage_c_lite ( same seed ) stage_c_lite ( better seed ) |
Thank you! |
This should be fixed now. |
Thanks for the insanely fast turnaround. Works like a charm now. |
The [stage c] steps can be reduced from 20 to 10, at least with my simple test prompt, for a generation time of 84 sec. stage_c ( 10 steps ) ( same seed as the first image in the post above ) stage_c ( 10 steps ) |
Yes, I also noticed that skin detail is not as good as the latest SDXL community finetunes, but I'm sure this will be improved. In the Hugging Face model card, one of the stated "Limitations" is that "Faces and people in general may not be generated properly", so this is already much better than what I was expecting. The only downside is that a single set of cascade models at full size is 20 GB! |
on amd directml mode and cpu mode i get the error unet_dtype() got an unexpected keyword argument 'supported_dtypes' File "E:\AI\ComfyUI\execution.py", line 152, in recursive_execute |
ah it was some of my custom nodes, idk which one(s) tho lol |
@CraftMaster163 Did you update Comfy? This issue is supposedly fixed with a recent commit (to this specified custom node). Edit: Seems like I indeed misread the issue I mentioned. Looks like beyond main Comfy any modules replacing the affected modules will have to be updated, too. |
i updated my comfyui. got to update nodes too tho |
When we start getting Stable Cascade models from the community, I'm going to run out of space very fast on my main drive. But by editing [ extra_model_paths.yaml ], I was able to load the models in the "unet" folder from another drive in my PC. https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example Instructions are in this discussion: The only new step is to add the "unet" folder to the example provided, like this:
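As a rough sketch, the edited extra_model_paths.yaml could look something like this. The section name and paths below are illustrative (adjust them to your own drive and folder layout), and this is based on the linked example file rather than a verified config:

```yaml
# Hypothetical extra_model_paths.yaml section; paths are illustrative.
comfyui:
    base_path: d:/ComfyUI/models/
    checkpoints: checkpoints/
    unet: unet/
```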
If it works, you should see extra lines in the cmd line for each new path when opening ComfyUI, like this: "Adding extra search path unet d:/ComfyUI/models/unet/" |
SDXL is broken now for some reason. Stable Cascade is working.
|
There was a small fix to the Stage B sampling; if you update ComfyUI and use the updated workflows, you should see a small quality increase. |
I noticed the img2img workflow uses less compression by default. This makes sense. Since the model has a picture as reference, we are less likely to get distortions from low compression when changing the aspect ratio (e.g. a long neck in portrait aspect ratio). So we can have the benefits of lower compression, like crisper images, without the drawbacks. |
Yes, in the new version, I noticed that the dark circumference around the iris is slimmer, which is more realistic. But was there also a change to the shift parameter? After this update, when I use the shift node on Stage C, the image is very different. I managed to get a more similar look with shift = 3.2, but it's still quite different. The braid is gone now. So do we just need to play with the shift value, or is it perhaps another setting you can add to the node? EDIT: It was this commit: c6b7a15. I made the code changes manually, and I can confirm it was this. With this commit, changing the shift parameter affects the image much more. The differences created by this commit when the shift parameter is changed are comparable to the differences between some of the current schedulers and samplers. These are 10 + 10 steps: |
How are the ComfyUI checkpoints different from the others? Do the ComfyUI checkpoints include the VAE? |
Is super resolution controlnet supported yet? |
Hi guys, I updated ComfyUI and downloaded both models, but when I run the default example from stable_cascade__text_to_image.png, ComfyUI shows this error:
Can you help me please? |
About the comfyui checkpoints, looking at the examples I presume they contain the following: stable_cascade_stage_c.safetensors
stable_cascade_stage_b.safetensors
|
Hey, I'm getting the same message (along with a lot of leftover keys), but everything seems to work fine. The missing CLIP is probably from loading Stage B, since it does not have a CLIP embedded. |
Got it working, I used the workflow below: |
The Inspire Pack has a custom loader for cascade models. |
It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this? I suspect it is related to some interpolation issue when Stage B takes Stage C's latent as the prior condition, since previewing Stage C's latent in pixel space shows it's too blurry to produce artifacts. I'm trying to find the rule and make a smallest reproduction workflow. |
No, but I've found significant artifacts while changing the cfg, when there is a large image variation between 2 close cfg values. This was with the unet workflow, at cfg 2.9. After the latest updates, it's a little better, and it happens only at cfg 2.95, so it's harder to find. |
|
I downloaded the [ previewer.safetensors ] from here, and put it in the "vae" folder, that's inside the "models" folder. And enabled the preview, via Comfy UI Manager. You can set Preview method to "Auto". Also, this may be working correctly as it is. |
Sometimes I get a black image result, but when I re-run it (same seed and everything) it proceeds to work. It seems to happen just before it finishes (so probably Stage A?):
I should get the preview working, so I can better see what's going on. Edit: Just noticed I was using older txt2img workflow (load VAE, some conditioning nodes etc), so maybe the black image problem was related to that? |
If I connect a VAE decoder to the Stage C sampler, I can get a small "preview" image with better colors! This may be more useful than the sampler preview for deciding if we have a good seed and it's worth enabling Stage B. The sampler preview will always be useful for debugging, but in Stable Cascade it may be less useful for artistic purposes, since the Stage C preview is too low resolution, and the Stage B preview shows an almost "finished" image from the start, with very few variations between steps. |
Those simple previews are just a matrix multiplication, so that's why they suck, but they are very cheap and better than nothing.
|
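For anyone curious what "just a matrix multiplication" means here, below is a minimal sketch: each latent "pixel" is mapped straight to RGB with a small fixed factor matrix, with no decoder network involved. The channel count and factor values are made up for illustration; they are not the real Stable Cascade preview factors.

```python
# Hypothetical per-latent-channel RGB factors (one row per latent channel).
# Real previewers use tuned factors and more channels; these are illustrative.
LATENT_RGB_FACTORS = [
    [0.30, 0.20, 0.10],
    [0.10, 0.30, 0.20],
    [-0.20, 0.10, 0.30],
    [0.10, -0.10, 0.10],
]

def latent_pixel_to_rgb(latent_px):
    """Map one latent 'pixel' (a list of channel values) to an (r, g, b) tuple
    via a single matrix multiplication -- cheap, but lossy."""
    return tuple(
        sum(c * LATENT_RGB_FACTORS[i][k] for i, c in enumerate(latent_px))
        for k in range(3)
    )
```

Applying this per latent position gives a tiny, washed-out RGB image, which is why such previews look rough compared to a real VAE decode.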
I suspect that the preview does work, but the latter steps aren't doing anything visible.
I have noticed this too. With some art styles, such as line work on a black background, the output from Stage B/A can become very noisy. As a workaround, I found that stopping stage_b diffusion early reduces the issue, for example doing only 10 steps out of 30 with a KSampler. |
it works, thanks a lot. |
Here is my reproduction: using a modified prompt of the official text2img workflow to emphasize the outline, with seed and other settings fixed. At compression ratio 42, sizes 1008~1048 all produce the same 24*24 Stage C latent, so the basic content won't change; the only difference should come from Stage B.
- size 1008*1008 (OK)
- size 1016*1016 (artifacts)
- size 1024*1024 (OK)
- size 1032*1032 (artifacts)
- size 1040*1040 (OK)
- size 1048*1048 (artifacts)
My observation: the pixel size seems to need to be a multiple of 16 to avoid artifacts, instead of 8. |
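The observation above can be checked with a tiny script. The floor division for the Stage C latent side is an assumption about how the empty-latent node computes it, and the multiple-of-16 rule is just the empirical finding from this thread, not a documented constraint:

```python
def stage_c_latent_side(pixels, compression=42):
    """Assumed Stage C latent side length for a given pixel size (floor division)."""
    return pixels // compression

def likely_artifact_free(pixels):
    """Empirical rule from this thread: pixel size should be a multiple of 16."""
    return pixels % 16 == 0

# Sizes 1008..1048 (step 8) all share the same 24x24 Stage C latent...
assert all(stage_c_latent_side(s) == 24 for s in range(1008, 1049, 8))
# ...yet only the multiples of 16 rendered cleanly in the tests above.
```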
This is the line that resizes the latent C to half of the latent B size (e.g. latent C is resized to 128x128 for a 1024x1024 image): https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/ldm/cascade/stage_b.py#L246 |
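Putting the sizes together, here is a hedged sketch of the latent sides involved for a square image. The helper name and the exact rounding are assumptions; only the 1024 -> 256 -> 128 relationship comes from the comment above:

```python
def cascade_latent_sizes(image_side, compression=42):
    """Approximate latent side lengths for a square image of image_side pixels."""
    stage_b = image_side // 4            # Stage B latent, e.g. 1024 -> 256
    stage_c = image_side // compression  # Stage C latent, e.g. 1024 -> 24
    resized_c = stage_b // 2             # latent C resized for Stage B, e.g. 128
    return stage_c, stage_b, resized_c

print(cascade_latent_sizes(1024))  # (24, 256, 128)
```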
Can someone post recommended sizes for vertical images? I always get this error: Error occurred when executing KSampler: pixel_unshuffle expects height to be divisible by downscale_factor, but input.size(-2)=297 is not divisible by 2 File "K:\ComfyUI\ComfyUI\execution.py", line 152, in recursive_execute |
What can I do to avoid this problem? |
https://huggingface.co/stabilityai/stable-cascade/tree/main/controlnet
|
I had similar findings from some tests, and I made my own latent node with multiples of 64 and the possibility to lock the aspect ratio: https://github.com/Guillaume-Fgt/ComfyUI_StableCascadeLatentRatio If I compare with your observations, I avoid the 1016, 1032 and 1048 pixel sizes, so it looks good. If anyone wants to test and give feedback, I can modify it. |
This also happens in realistic images: |
I was wondering the same; we probably have to wait. I'm sure it's being worked on. |
https://huggingface.co/stabilityai/stable-cascade