
Stable Cascade #2785

Open

DataCTE opened this issue Feb 13, 2024 · 100 comments

Comments

@DataCTE

DataCTE commented Feb 13, 2024

https://huggingface.co/stabilityai/stable-cascade
image

@DataCTE
Author

DataCTE commented Feb 13, 2024

New Stability model. Wondering if we could get native ComfyUI support for it.

@CyberTimon

I'm waiting for support too! Thanks

@Arcitec

Arcitec commented Feb 14, 2024

It's inevitably gonna be supported, just be patient.

  • The whole point of ComfyUI is AI generation.
  • Stable Cascade is a major evolution which beats the crap out of SD1.5 and SDXL. It can now do hands, feet and text, and complicated prompts. Everyone wants it. You don't need to remind them. 😉
  • Stable Cascade even comes with a new Face ID transfer controlnet, which will generate high resolution faces with zero LoRA training.
  • It is also way more efficient, only taking something like 10% of the time to train a checkpoint or LoRA compared to SDXL. Which is gonna make a lot of people go crazy!
  • ComfyUI is Stability AI's internal, preferred UI for their developers and researchers.
  • In fact, comfyanonymous is working at Stability AI.
  • Stability even made Stable Swarm, a simpler UI for ComfyUI: https://github.com/Stability-AI/StableSwarmUI

It is obviously exciting enough that it will be supported soon.

Just be very patient. It takes a while to analyze the new architecture, create new nodes, and figure it all out. Let's not rush or demand anything! 😉

By the way, Stable Cascade isn't even finished yet. It's still in "early development" research/training. Their codebase is changing constantly. The large model currently uses 20 GB VRAM (!) and they think they can optimize this to 1/2 or 1/3 the usage when they are done. Furthermore, the diffusers library isn't even ready, and the code is far from reaching the quality needed for merging. So let's breathe.

Update: One of the Stability employees commented on Reddit that Comfy will have support in a week or two.

@Arcitec

Arcitec commented Feb 14, 2024

Well. There's a basic node which doesn't implement anything and just uses the official code and wraps it in a ComfyUI node. No interesting support for anything special like controlnets, prompt conditioning or anything else really. It's just a basic wrapper for some prompt strings and a seed. It doesn't support model loading or unloading, so it will hog your VRAM. I also see that it has a bug (it loads one of the models as float16 instead of bfloat16). But hopefully still good enough for impatient people like @GaleBoro, while waiting.

https://github.com/kijai/ComfyUI-DiffusersStableCascade
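
If you want to experiment outside ComfyUI entirely, here is a hedged sketch of loading both stages in bfloat16 through the diffusers pipelines. The class and argument names below come from the in-progress diffusers support, so treat them as assumptions that may change:

import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

# Both stages loaded in bfloat16 (avoiding the float16 mixup mentioned above).
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", torch_dtype=torch.bfloat16
).to("cuda")
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", torch_dtype=torch.bfloat16
).to("cuda")

prompt = "a photo of a red fox in a snowy forest"
prior_out = prior(prompt=prompt, height=1024, width=1024, num_inference_steps=20)
image = decoder(
    image_embeddings=prior_out.image_embeddings,
    prompt=prompt,
    num_inference_steps=10,
).images[0]
image.save("cascade_bf16.png")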

@Arcitec

Arcitec commented Feb 14, 2024

Alright. Here's another option to experiment with it locally in the meantime. It's Pinokio's tweak of the unofficial Gradio web UI, which removes the need for a Hugging Face token / login.

Needs: Python 3.11. The two largest models need ~15.5 GB of VRAM at 1024x1024, or ~18.0-20.0 GB of VRAM at 1536x1536 with the bfloat16 model format.

git clone https://huggingface.co/spaces/cocktailpeanut/stable-cascade pinokio
cd pinokio

python -m venv .venv
. .venv/bin/activate

pip install gradio spaces itsdangerous
pip install -r requirements.txt

python app.py

@JorgeR81

JorgeR81 commented Feb 14, 2024

Even with ComfyUI VRAM optimizations, it seems that unless you have a GPU that supports bfloat16 or has 16 GB of VRAM, you can only use the "lite" version of the "Stage C" model, right?

https://huggingface.co/stabilityai/stable-cascade/tree/main

edit:

Apparently, it may be theoretically possible to split the model for VRAM optimization.

https://news.ycombinator.com/item?id=39360106

The large C model have fair bit of parameters tied to text-conditioning, not to the main denoising process. Similar to how we split the network for SDXL Base, I am pretty confident we can split non-trivial amount of parameters to text-conditioning hence during denoising process, loading less than 3.6B parameters.

@Arcitec

Arcitec commented Feb 15, 2024

@JorgeR81 The CEO of Stability has said that SD1.0 originally used more than 20 GB of VRAM too, and that they are confident they can reduce Stable Cascade to 1/2 or 1/3 of the current VRAM requirements. He didn't go into detail about what techniques they'd use to achieve that, but it sounds good to me, because the 20 GB VRAM usage right now is very painful.

Edit: The statements. Take them with a big grain of salt. But their hope is to reach 8 GB VRAM usage.

Edit: And another statement that I found really interesting, saying that RTX 40-series and newer cards will become increasingly required for future AI networks, because they support fp8 and optical flow.

@MushroomFleet

I think it's worth waiting for the official release from comfyanonymous,

but there is the huggingface space, this colab from Camenduru (Gradio UI, runs on T4 - 80s for 1 image, slow): https://github.com/camenduru/stable-cascade-jupyter

and another one I put together on launch day (runs on A100 - 14 seconds for 4 images, fast): https://github.com/MushroomFleet/StableCascade-text2image

Personally I can't wait to get this into Comfy, and there is a diffusers custom node here if you can't wait!
https://github.com/kijai/ComfyUI-DiffusersStableCascade

@comfyanonymous
Owner

It's implemented in the main repo now, you can use this workflow until I write an examples page for it.

https://gist.github.com/comfyanonymous/0f09119a342d0dd825bb2d99d19b781c

@hben35096

stage, stage_b16, stage_lite, stage_lite_b16: is there a big gap between them when it comes to generating images?

@Thireus

Thireus commented Feb 17, 2024

I get the following error, that's weird:

Error occurred when executing KSampler:

Given groups=1, weight of size [320, 16, 1, 1], expected input[2, 64, 12, 12] to have 16 channels, but got 64 channels instead

Edit: Solved! stage_c wasn't loaded properly.

@hzhangxyz

Is there any workflow example for img-to-img or controlnet?

@Th3Rom3

Th3Rom3 commented Feb 17, 2024

Getting an error when running the example workflow for Stable Cascade with both bf16 and lite models:

Error occurred when executing CLIPTextEncode:

output with shape [20, 77, 77] doesn't match the broadcast shape [1, 20, 77, 77]

File "/media/scratch/StableDiffusion/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/nodes.py", line 56, in encode
cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/sd.py", line 131, in encode_from_tokens
cond, pooled = self.cond_stage_model.encode_token_weights(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/sd1_clip.py", line 514, in encode_token_weights
out, pooled = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/sd1_clip.py", line 39, in encode_token_weights
out, pooled = self.encode(to_encode)
^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/sd1_clip.py", line 190, in encode
return self(tokens)
^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/sd1_clip.py", line 172, in forward
outputs = self.transformer(tokens, attention_mask, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/clip_model.py", line 131, in forward
return self.text_model(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/clip_model.py", line 109, in forward
x, i = self.encoder(x, mask=mask, intermediate_output=intermediate_output)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/clip_model.py", line 68, in forward
x = l(x, mask, optimized_attention)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/clip_model.py", line 49, in forward
x += self.self_attn(self.layer_norm1(x), mask, optimized_attention)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/Python-envs/ComfyUI/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/clip_model.py", line 20, in forward
out = optimized_attention(q, k, v, self.heads, mask)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/scratch/StableDiffusion/ComfyUI/comfy/ldm/modules/attention.py", line 117, in attention_basic
sim += mask
** ComfyUI startup time: 2024-02-17 12:52:52.409368
** Platform: Linux
** Python version: 3.11.5 (main, Sep 11 2023, 13:54:46) [GCC 11.2.0]
** Python executable: /media/scratch/Python-envs/ComfyUI/bin/python

### ComfyUI Revision: 1983 [805c36ac] | Released on '2024-02-17'

Torch version: 2.2.0+rocm5.7

Might be a problem on my end since I am running on AMD ROCm, but I wanted to leave this here in case it crops up anywhere else.

Setting the clip type in "Load CLIP" to stable_diffusion instead of stable_cascade runs the workflow but results in random outputs.

@ke1ne

ke1ne commented Feb 17, 2024

Guys, sorry, where to get a proper CLIP G SDXL BF16?

@HelloClyde

Guys, sorry, where to get a proper CLIP G SDXL BF16?

and this: https://huggingface.co/stabilityai/stable-cascade/blob/main/text_encoder/model.safetensors goes in the ComfyUI/models/clip/ folder to be loaded with the Load CLIP node in the workflow

Rename "model.safetensors" to "clip_g_sdxl.fp16.safetensors" or simply select "model.safetensors".

@JorgeR81

JorgeR81 commented Feb 17, 2024

stage, stage_b16, stage_lite, stage_lite_b16: is there a big gap between them when it comes to generating images?

I'm using the full size models, but it's worth trying the [stage_c_lite] as well, since the results are very different, like a different model.

The bf16 models look the same as the regular models.

The full size models are usable even on a GTX 1070 (8 GB).
Generation time is 125 sec with the full size models ( "loading in lowvram mode 4717.692307472229" ).
Generating for the first time takes 165 sec ( loading the full models from a SATA SSD ).
With [stage_c_lite] it's 55 and 65 sec, respectively.

stage_c ( default workflow settings )

ComfyUI_00004_

stage_c_lite ( same seed )

ComfyUI_00009_

stage_c_lite ( better seed )

ComfyUI_00015_

@ke1ne

ke1ne commented Feb 17, 2024

Guys, sorry, where to get a proper CLIP G SDXL BF16?

and this: https://huggingface.co/stabilityai/stable-cascade/blob/main/text_encoder/model.safetensors goes in the ComfyUI/models/clip/ folder to be loaded with the Load CLIP node in the workflow

Rename "model.safetensors" to "clip_g_sdxl.fp16.safetensors" or simply select "model.safetensors".

Thank you!

@comfyanonymous
Owner

#2785 (comment)

This should be fixed now.

@Th3Rom3

Th3Rom3 commented Feb 17, 2024

#2785 (comment)

This should be fixed now.

### ComfyUI Revision: 1984 [6c875d84] | Released on '2024-02-17'

Thanks for the insanely fast turnaround. Works like a charm now.

@hben35096

PixPin_2024-02-17_21-27-33
In terms of picture details, there is still room for improvement. Occasionally, there are prompts that cause a messy picture, like the one in the upper right corner.

@JorgeR81

The [stage c] steps can be reduced from 20 to 10, at least with my simple test prompt, for a generation time of 84 sec.
The [stage_b] steps can be reduced from 10 to 5, with only a slight loss in skin texture quality, for a generation time of 67 sec.
So this is faster than the 72 sec I need for SDXL ( at 20 steps ) for good image quality, on a GTX 1070.

stage_c ( 10 steps ) ( same seed as the first image in the post above )

ComfyUI_00033_

stage_c ( 10 steps )
stage_b ( 5 steps )

ComfyUI_00036_5

@JorgeR81

In terms of picture details, there is still room for improvement.

Yes, I also noticed that skin detail is not as good as the latest SDXL community finetunes, but I'm sure this will be improved.

The Hugging Face model card lists "Faces and people in general may not be generated properly" as one of the "Limitations", so this is already much better than what I was expecting.

The only downside is that a single set of cascade models at full size is 20 GB!

@CraftMaster163

On AMD DirectML mode and CPU mode I get this error:
Error occurred when executing UNETLoader:

unet_dtype() got an unexpected keyword argument 'supported_dtypes'

File "E:\AI\ComfyUI\execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
File "E:\AI\ComfyUI\execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
File "E:\AI\ComfyUI\execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
File "E:\AI\ComfyUI\nodes.py", line 850, in load_unet
model = comfy.sd.load_unet(unet_path)
File "E:\AI\ComfyUI\comfy\sd.py", line 553, in load_unet
model = load_unet_state_dict(sd)
File "E:\AI\ComfyUI\comfy\sd.py", line 540, in load_unet_state_dict
unet_dtype = model_management.unet_dtype(model_params=parameters, supported_dtypes=model_config.supported_inference_dtypes)

@CraftMaster163

Ah, it was some of my custom nodes, no idea which one(s) though, lol.

@Th3Rom3

Th3Rom3 commented Feb 17, 2024

@CraftMaster163 Did you update Comfy? This issue is supposedly fixed with a recent commit (to this specified custom node).

Edit: Seems like I indeed misread the issue I mentioned. Looks like, beyond main Comfy, any custom nodes replacing the affected modules will have to be updated too.

@CraftMaster163

I updated my ComfyUI; got to update the nodes too, though.

@JorgeR81

When we start getting Stable Cascade models from the community, I'm going to run out of space very fast on my main drive.

But by editing [ extra_model_paths.yaml ], I was able to load the models in the "unet" folder from another drive in my PC.
You can still keep models in the main "unet" folder; they will also be accessible in the UI.
And this can also be done for "checkpoints", "loras", etc.

https://github.com/comfyanonymous/ComfyUI/blob/master/extra_model_paths.yaml.example

Instructions are in this discussion:
#72
You need to rename the file to [ extra_model_paths.yaml ].
Inside the file you need to change the base path according to the folder you created for your extra models.

The only new step is to add the "unet" folder to the example provided, like this.
Here I also keep paths for other large model types, but you don't need to create those folders if you're not using them:

comfyui:
      base_path: d:/ComfyUI/
      
      checkpoints: models/checkpoints/
      clip_vision: models/clip_vision/
      controlnet: models/controlnet/
      loras: models/loras/
      unet: models/unet/
      upscale_models: models/upscale_models/
      vae: models/vae/

If it works, you should see extra lines in the cmd output, one for each new path, when opening ComfyUI, like this:

"Adding extra search path unet d:/ComfyUI/models/unet/"

@lalalabush

lalalabush commented Feb 17, 2024

#2785 (comment)

This should be fixed now.

SDXL is broken now for some reason. Stable Cascade is working.

Error occurred when executing CLIPTextEncode:

shape '[77, -1, 77, 77]' is invalid for input of size 5929

  File "/Users/tom/ComfyUI/execution.py", line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/tom/ComfyUI/execution.py", line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/tom/ComfyUI/execution.py", line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/tom/ComfyUI/nodes.py", line 56, in encode
    cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
  File "/Users/tom/ComfyUI/comfy/sd.py", line 131, in encode_from_tokens
    cond, pooled = self.cond_stage_model.encode_token_weights(tokens)
  File "/Users/tom/ComfyUI/comfy/sdxl_clip.py", line 54, in encode_token_weights
    g_out, g_pooled = self.clip_g.encode_token_weights(token_weight_pairs_g)
  File "/Users/tom/ComfyUI/comfy/sd1_clip.py", line 39, in encode_token_weights
    out, pooled = self.encode(to_encode)
  File "/Users/tom/ComfyUI/comfy/sd1_clip.py", line 190, in encode
    return self(tokens)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/tom/ComfyUI/comfy/sd1_clip.py", line 172, in forward
    outputs = self.transformer(tokens, attention_mask, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/tom/ComfyUI/comfy/clip_model.py", line 131, in forward
    return self.text_model(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/tom/ComfyUI/comfy/clip_model.py", line 109, in forward
    x, i = self.encoder(x, mask=mask, intermediate_output=intermediate_output)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/tom/ComfyUI/comfy/clip_model.py", line 68, in forward
    x = l(x, mask, optimized_attention)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/tom/ComfyUI/comfy/clip_model.py", line 49, in forward
    x += self.self_attn(self.layer_norm1(x), mask, optimized_attention)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/tom/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/tom/ComfyUI/comfy/clip_model.py", line 20, in forward
    out = optimized_attention(q, k, v, self.heads, mask)
  File "/Users/tom/ComfyUI/comfy/ldm/modules/attention.py", line 117, in attention_basic
    mask = mask.reshape(mask.shape[0], -1, mask.shape[-2], mask.shape[-1]).expand(-1, heads, -1, -1).reshape(sim.shape)

@Thireus

Thireus commented Feb 17, 2024

PixPin_2024-02-17_21-27-33 In terms of picture details, there is still room for improvement. Occasionally, there are prompts that cause a messy picture, like the one in the upper right corner.

I also get some pictures like the top right corner one from time to time.

@comfyanonymous
Owner

There was a small fix to the stage b sampling, if you update ComfyUI and use the updated workflows:
https://comfyanonymous.github.io/ComfyUI_examples/stable_cascade/

You should see a small quality increase.

@JorgeR81

I noticed the img2img workflow uses less compression by default.

This makes sense.

Since the model has a picture as a reference, we are less likely to have distortions due to low compression when changing the aspect ratio ( e.g. a long neck in portrait aspect ratio ).

So we can have the benefits of lower compression, like crisper images, without the drawbacks.

#2785 (comment)

@JorgeR81

JorgeR81 commented Feb 20, 2024

There was a small fix to the stage b sampling, if you update ComfyUI and use the updated workflows:
https://comfyanonymous.github.io/ComfyUI_examples/stable_cascade/

You should see a small quality increase.

Yes, in the new version I noticed that the dark ring around the iris is thinner, which is more realistic.
And the part where the hair makes contact with the dress ( on the left side of the neck ) is also better.

But,

Was there also a change to the shift parameter?

After this update, when I use the shift node on Stage C, the image is very different.
The quality is perhaps a little better, but I really liked the aesthetics of the old image, with shift = 3.

I managed to get a more similar look with shift = 3.2, but it's still quite different. The braid is gone now.
Also, I noticed the iris is worse ( less round ) in the new version, with this shift value.

So do we just need to play with the shift value, or is it perhaps another setting you could add to the node?

EDIT :

It was this commit: c6b7a15

I made the code changes manually, and I can confirm it was this.
I was able to create an image with the old shift look, and the improvements on Stage B.

This commit seems to affect the image much more when the shift parameter is changed.
I like both versions of these images, with and without this commit.
So an option to toggle this would be very useful.

The differences created by this commit, when the shift parameter is changed, are comparable to the differences between some of the current schedulers and samplers.
So I think that adding a new option would be justified. ( A rough sketch of why shift touches every step is after the image list below. )


These are 10 + 10 steps:

  • old version: image.jpg
  • new version: image.jpg
  • old version - shift = 3.0: image.jpg
  • new version - shift = 3.0: image.jpg
  • new version - shift = 3.2: image.jpg
  • new version - shift = 3.0 - changed code to revert the scheduling commit: image.jpg
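
For anyone wondering why a small shift change moves the image so much, here is an illustrative sketch. It applies a shift to a cosine noise schedule as a log-SNR offset; this is an assumption for illustration, not necessarily the exact formula ComfyUI or the commit above uses. The point is that every sigma in the schedule changes, so even 3.0 vs 3.2 touches all denoising steps:

import math

def shifted_alpha_cumprod(t: float, shift: float, cosine_s: float = 8e-3) -> float:
    # Plain cosine schedule alpha_cumprod at normalized timestep t in (0, 1),
    # then offset in log-SNR space by 2 * log(1 / shift). Illustrative only.
    init = math.cos(cosine_s / (1 + cosine_s) * math.pi * 0.5) ** 2
    alpha = math.cos((t + cosine_s) / (1 + cosine_s) * math.pi * 0.5) ** 2 / init
    alpha = min(max(alpha, 1e-9), 1 - 1e-9)
    log_snr = math.log(alpha / (1 - alpha)) + 2 * math.log(1.0 / shift)
    return 1.0 / (1.0 + math.exp(-log_snr))  # sigmoid back to alpha_cumprod

def sigma(t: float, shift: float) -> float:
    a = shifted_alpha_cumprod(t, shift)
    return math.sqrt((1 - a) / a)

for s in (3.0, 3.2):
    print(s, [round(sigma(i / 10, s), 3) for i in range(1, 10)])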


@zartio-com

Is super resolution controlnet supported yet?

@anthonyaquino83

Hi guys, I updated ComfyUI and downloaded both models, but when I run the default example from stable_cascade__text_to_image.png, ComfyUI shows this error:

Update ComfyUI
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
All extensions are already up-to-date.
got prompt
model_type STABLE_CASCADE
adm 0
clip missing: ['clip_g.logit_scale']

Can you help me please?

@jn-jairo
Contributor

About the ComfyUI checkpoints: looking at the examples, I presume they contain the following ( a quick way to verify this is sketched after the list ):

stable_cascade_stage_c.safetensors

  • model (stage_c_bf16)
  • clip
  • vae (effnet_encoder)
  • clip_vision

stable_cascade_stage_b.safetensors

  • model (stage_b_bf16)
  • vae (stage_a)
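
A quick way to sanity-check what one of these combined checkpoints actually contains is to list its key prefixes with safetensors. This is just a sketch; the path below is the filename from the examples page, so adjust it to your setup:

from collections import Counter
from safetensors import safe_open

# Path is just the filename from the examples page; adjust to your setup.
path = "ComfyUI/models/checkpoints/stable_cascade_stage_c.safetensors"

with safe_open(path, framework="pt", device="cpu") as f:
    prefixes = Counter(key.split(".")[0] for key in f.keys())

# Prints how many tensors fall under each top-level group, which should make it
# obvious whether a model / clip / vae / clip_vision section is present.
print(prefixes)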

@zartio-com

zartio-com commented Feb 20, 2024

Hi guys, I updated ComfyUI and downloaded both models, but when I run the default example from stable_cascade__text_to_image.png, ComfyUI shows this error:

Update ComfyUI
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
All extensions are already up-to-date.
got prompt
model_type STABLE_CASCADE
adm 0
clip missing: ['clip_g.logit_scale']

Can you help me please?

Hey, I'm getting the same message (along with a lot of leftover keys), but everything seems to work fine.

The clip missing is probably from loading Stage B, since it does not have a CLIP embedded.

@anthonyaquino83

Hi guys, I updated ComfyUI and downloaded both models, but when I run the default example from stable_cascade__text_to_image.png, ComfyUI shows this error:

Update ComfyUI
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
All extensions are already up-to-date.
got prompt
model_type STABLE_CASCADE
adm 0
clip missing: ['clip_g.logit_scale']

Can you help me please?

Got it working. I used the workflow below:
https://gist.github.com/comfyanonymous/0f09119a342d0dd825bb2d99d19b781c
instead of the stable_cascade__text_to_image.png from:
https://comfyanonymous.github.io/ComfyUI_examples/stable_cascade

@JorgeR81

The Inspire Pack has a custom loader for cascade models.

https://github.com/ltdrdata/ComfyUI-Inspire-Pack

insp

@SLAPaper

SLAPaper commented Feb 21, 2024

It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this?

For example, generating a 1024*1024 image will be OK, but 1024*1080 will get artifacts, and 1080*1080 will get more artifacts.

I suspect it is related to some interpolation issue when Stage B takes Stage C's latent as the prior condition, since previewing Stage C's latent in pixel space is too blurry to show artifacts.

I'm trying to find the rule and put together a minimal reproduction workflow.

@JorgeR81

JorgeR81 commented Feb 21, 2024

It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this?

No, but I've found significant artifacts while changing the cfg, when there is a large image variation between 2 close cfg values.
But it was only 1 time, at 1024 * 1024, and only 6 steps on Stage C.
Although the images without artifacts were very good quality, at 6 steps.

This was with the unet workflow, at cfg 2.9.
I only found this because I was trying to get a good image blend between 2 very different variations, at 2.8 and 3.0.
#2785 (comment)

After the latest updates, it's a little better, and it only happens at cfg 2.95, so it's harder to find.
It is still a bit creepy because it's a face with a deformed mouth, but it is hidden / minimized, so it's better than before.

show failed generation (creepy)

@JorgeR81

JorgeR81 commented Feb 21, 2024

My preview for Stage C is working for all steps, but the preview for Stage B stops updating after the first few steps.
Sometimes it gets closer to the end, but it still misses a few steps in between.
Is it just me, or is it happening to someone else?

prev2

@Datou

Datou commented Feb 21, 2024

It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this?

For example, generating a 1024*1024 image will be OK, but 1024*1080 will get artifacts, and 1080*1080 will get more artifacts.

I suspect it is related to some interpolation issue when Stage B takes Stage C's latent as the prior condition, since previewing Stage C's latent in pixel space is too blurry to show artifacts.

I'm trying to find the rule and put together a minimal reproduction workflow.

Yes, this is a 2048x2048 image.
ComfyUI_05517_

@Datou

Datou commented Feb 21, 2024

My preview for Stage C is working for all steps, but the preview for Stage B stops updating after the first few steps. Sometimes it gets closer to the end, but it still misses a few steps in between. Is it just me, or is it happening to someone else?

prev2

Could you tell me how to use preview?

@JorgeR81

Could you tell me how to use preview?

I downloaded [ previewer.safetensors ] from here and put it in the "vae" folder inside the "models" folder.
https://huggingface.co/stabilityai/stable-cascade/tree/main

And enabled the preview via ComfyUI Manager. You can set the preview method to "Auto".
https://github.com/ltdrdata/ComfyUI-Manager

Also, this may be working correctly as it is.
The differences between steps in Stage B are so subtle that they don't show well in the smaller preview.
And the preview image for Stage B still needs to go through the VAE, so I'm not sure it's supposed to look exactly like the final image.

man

@mnn

mnn commented Feb 21, 2024

Sometimes I get a black image result, but when I re-run it (same seed and everything) it proceeds to work. It seems to happen just before it finishes (so probably Stage A?):

Requested to load StageA
Loading 1 new model
/mnt/dev/ai/ComfyUI/nodes.py:1432: RuntimeWarning: invalid value encountered in cast
  img = Image.fromarray(np.clip(i, 0, 255).astype(np.uint8))
Prompt executed in 55.35 seconds

I should get the preview working, so I can better see what's going on.

Edit: Just noticed I was using an older txt2img workflow (load VAE, some conditioning nodes, etc.), so maybe the black image problem was related to that?

@JorgeR81

JorgeR81 commented Feb 21, 2024

If I connect a VAE decoder to the Stage C sampler, I can get a small "preview" image with better colors!

This may be more useful than the sampler preview for deciding whether we have a good seed and whether it's worth enabling Stage B.
( If I mute the main Preview Image node, the Stage B sampler won't run. )

The sampler preview will always be useful for debugging, but in Stable Cascade it may be less useful for artistic purposes, since the Stage C preview is too low resolution, and the Stage B preview shows an almost "finished" image from the start, with very few variations between steps.

prev3

@comfyanonymous
Owner

Those simple previews are just a matrix multiplication, so that's why they suck, but they are very cheap and better than nothing.

--preview-method auto is how you enable them.
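
For anyone curious what "just a matrix multiplication" looks like in practice, here is a minimal sketch. The 16-to-3 factor matrix below is a random placeholder, not the real per-channel factors ComfyUI ships, so only the mechanism is meaningful:

import torch

def cheap_latent_preview(latent: torch.Tensor, rgb_factors: torch.Tensor) -> torch.Tensor:
    # latent: [B, C, H, W], rgb_factors: [C, 3]  ->  uint8 preview [B, H, W, 3]
    img = torch.einsum("bchw,cr->bhwr", latent, rgb_factors)
    img = ((img + 1.0) / 2.0).clamp(0, 1)  # rough normalization, no VAE involved
    return (img * 255).to(torch.uint8)

# Fake 16-channel Stage C latent (about 24x24 for a ~1024px image) and placeholder factors.
latent = torch.randn(1, 16, 24, 24)
factors = torch.randn(16, 3) * 0.3
print(cheap_latent_preview(latent, factors).shape)  # torch.Size([1, 24, 24, 3])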

@Piezoid

Piezoid commented Feb 21, 2024

My preview for Stage C is working for all steps, but the preview for Stage B stops updating after the first few steps. Sometimes it gets closer to the end, but it still misses a few steps in between. Is it just me, or is it happening to someone else?

I suspect that the preview does work, but the latter steps aren't doing anything visible.

It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this?

I have noticed this too. With some art styles, such as line work on a black background, the output from Stage B/A can become very noisy. As a workaround, I found that stopping stage_b diffusion early reduces the issue, for example doing only 10 steps out of 30 with a KSampler.

Samples: first is Stage B denoising 10 out of 10 steps, second is Stage B denoising 10 out of 30 steps.

ComfyUI_0147
ComfyUI_0148

@Datou

Datou commented Feb 21, 2024

Could you tell me how to use preview?

I downloaded [ previewer.safetensors ] from here and put it in the "vae" folder inside the "models" folder. https://huggingface.co/stabilityai/stable-cascade/tree/main

And enabled the preview via ComfyUI Manager. You can set the preview method to "Auto". https://github.com/ltdrdata/ComfyUI-Manager

Also, this may be working correctly as it is. The differences between steps in Stage B are so subtle that they don't show well in the smaller preview. And the preview image for Stage B still needs to go through the VAE, so I'm not sure it's supposed to look exactly like the final image.

man

It works, thanks a lot.

@SLAPaper

SLAPaper commented Feb 21, 2024

It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this?

For example, generating a 1024*1024 image will be OK, but 1024*1080 will get artifacts, and 1080*1080 will get more artifacts.

I suspect it is related to some interpolation issue when Stage B takes Stage C's latent as the prior condition, since previewing Stage C's latent in pixel space is too blurry to show artifacts.

I'm trying to find the rule and put together a minimal reproduction workflow.

Here is my reproduction: using a modified prompt of the official text2img workflow to emphasize the outline, with seed and other settings fixed.

At compression ratio 42, sizes 1008~1048 will have the same Stage C latent of 24*24, so the basic content won't change; the only difference should come from Stage B.

  • size 1008*1008 (OK): SC_1008_00001_
  • size 1016*1016 (artifacts): SC_1018_00001_
  • size 1024*1024 (OK): SC_1024_00001_
  • size 1032*1032 (artifacts): SC_1032_00001_
  • size 1040*1040 (OK): SC_1040_00001_
  • size 1048*1048 (artifacts): SC_1048_00001_

My observation: the pixel size seems like it should be a multiple of 16 to avoid artifacts, instead of 8.

@Piezoid

Piezoid commented Feb 21, 2024

My observation: the pixel size seems like it should be a multiple of 16 to avoid artifacts, instead of 8.

This is the line that resizes latent C to half of the latent B size (e.g. latent C is resized to 128x128 for a 1024x1024 image): https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/ldm/cascade/stage_b.py#L246
I guess that could cause some aliasing artifacts when the latent C size doesn't divide 128 evenly.
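
Putting the two observations together, here is a small helper based only on what's described in this thread (assumptions, not ComfyUI internals): it snaps a requested resolution to a multiple of 16 and shows the approximate latent sizes involved. The exact rounding ComfyUI uses for the Stage C latent may differ:

def cascade_sizes(width: int, height: int, compression: int = 42) -> dict:
    # Snap to a multiple of 16, per the observation above that multiples of 8 are not enough.
    def snap16(v: int) -> int:
        return max(16, round(v / 16) * 16)

    w, h = snap16(width), snap16(height)
    latent_b = (w // 4, h // 4)                                  # Stage B latent
    latent_c = (round(w / compression), round(h / compression))  # Stage C latent (approximate)
    return {"pixels": (w, h), "latent_b": latent_b, "latent_c": latent_c}

for size in (1008, 1016, 1032, 1048, 1080):
    print(size, "->", cascade_sizes(size, size))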

@kakachiex2

Can someone post recommended sizes for vertical images? I always get this error:

Error occurred when executing KSampler:

pixel_unshuffle expects height to be divisible by downscale_factor, but input.size(-2)=297 is not divisible by 2

File "K:\ComfyUI\ComfyUI\execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\nodes.py", line 1368, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\nodes.py", line 1338, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 22, in informative_sample
raise e
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-Impact-Pack\modules\impact\sample_error_enhancer.py", line 9, in informative_sample
return original_sample(*args, **kwargs) # This code helps interpret error messages that occur within exceptions but does not have any impact on other operations.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-AnimateLCM\animatediff\sampling.py", line 241, in motion_sample
return orig_comfy_sample(model, noise, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\animatediff\sampling.py", line 248, in motion_sample
return orig_comfy_sample(model, noise, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\sample.py", line 100, in sample
samples = sampler.sample(noise, positive_copy, negative_copy, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes_init_.py", line 130, in KSampler_sample
return KSampler_sample(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 713, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes_init
.py", line 149, in sample
return _sample(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 618, in sample
samples = sampler.sample(model_wrap, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 557, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\utils_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\k_diffusion\sampling.py", line 154, in sample_euler_ancestral
denoised = model(x, sigmas[i] * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 281, in forward
out = self.inner_model(x, sigma, cond=cond, uncond=uncond, cond_scale=cond_scale, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 271, in forward
return self.apply_model(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\custom_nodes\ComfyUI_smZNodes\smZNodes.py", line 1012, in apply_model
out = super().apply_model(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 268, in apply_model
out = sampling_function(self.inner_model, x, timestep, uncond, cond, cond_scale, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 248, in sampling_function
cond_pred, uncond_pred = calc_cond_uncond_batch(model, cond, uncond, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\samplers.py", line 222, in calc_cond_uncond_batch
output = model.apply_model(input_x, timestep, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\model_base.py", line 91, in apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\comfy\ldm\cascade\stage_b.py", line 244, in forward
x = self.embedding(x)
^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\container.py", line 215, in forward
input = module(input)
^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "K:\ComfyUI\ComfyUI\venv\Lib\site-packages\torch\nn\modules\pixelshuffle.py", line 104, in forward
return F.pixel_unshuffle(input, self.downscale_factor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

@Datou

Datou commented Feb 22, 2024

My observation: the pixel size seems like it should be a multiple of 16 to avoid artifacts, instead of 8.

This is the line that resizes latent C to half of the latent B size (e.g. latent C is resized to 128x128 for a 1024x1024 image): https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/ldm/cascade/stage_b.py#L246 I guess that could cause some aliasing artifacts when the latent C size doesn't divide 128 evenly.

What can I do to avoid this problem?

@MoonMoon82

https://huggingface.co/stabilityai/stable-cascade/tree/main/controlnet
Are there any nodes I can use to make these controlnet models run? The built-in and the advanced controlnet node are just showing this error

Error occurred when executing ControlNetApply:

'NoneType' object has no attribute 'copy'

@Guillaume-Fgt
Contributor

My observation: the pixel size seems like it should be a multiple of 16 to avoid artifacts, instead of 8.

I had similar findings from doing some tests, so I made my own latent node that uses multiples of 64, with the option to lock the aspect ratio: https://github.com/Guillaume-Fgt/ComfyUI_StableCascadeLatentRatio

Comparing with your observations, I avoid the 1016, 1032 and 1048 pixel sizes, so it looks good. If anyone wants to test and give feedback, I can modify it.

@JorgeR81

#2785 (comment)

It seems that at some pixel sizes, the generated image will have strange artifacts, especially in non-realistic art styles like digital art or anime. Has anyone else encountered this?

For example, generating a 1024 x 1024 image will be OK, but 1024 x 1080 will get artifacts, and 1080 x 1080 will get more artifacts.

This also happens in realistic images:

  • At 1080 x 1080 it's very noticeable: image.jpg
  • At 1040 x 1040 it's mostly OK, but still a bit blurry: image.jpg
  • At 1024 x 1024 it's much crisper: image.jpg

@Kwisss

Kwisss commented Feb 23, 2024

https://huggingface.co/stabilityai/stable-cascade/tree/main/controlnet Are there any nodes I can use to make these controlnet models run? The built-in and the advanced controlnet node are just showing this error

Error occurred when executing ControlNetApply:

'NoneType' object has no attribute 'copy'

I was wondering the same; we probably have to wait. I'm sure it's being worked on.
