Error in sampling stage- possibly related to MPS backend #12

Closed
jwooldridge234 opened this issue Apr 19, 2024 · 17 comments
Labels
bug Something isn't working

Comments

@jwooldridge234

Getting this error:

File "/Users/jackwooldridge/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/jackwooldridge/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/jackwooldridge/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/jackwooldridge/ComfyUI/nodes.py", line 1344, in sample
    return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
  File "/Users/jackwooldridge/ComfyUI/nodes.py", line 1314, in common_ksampler
    samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
  File "/Users/jackwooldridge/ComfyUI/comfy/sample.py", line 37, in sample
    samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 755, in sample
    return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 657, in sample
    return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 644, in sample
    output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 623, in inner_sample
    samples = sampler.sample(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 534, in sample
    samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/comfy/k_diffusion/sampling.py", line 137, in sample_euler
    denoised = model(x, sigma_hat * s_in, **extra_args)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 272, in __call__
    out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 610, in __call__
    return self.predict_noise(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 613, in predict_noise
    return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 258, in sampling_function
    out = calc_cond_batch(model, conds, x, timestep, model_options)
  File "/Users/jackwooldridge/ComfyUI/comfy/samplers.py", line 216, in calc_cond_batch
    output = model_options['model_function_wrapper'](model.apply_model, {"input": input_x, "timestep": timestep_, "c": c, "cond_or_uncond": cond_or_uncond}).chunk(batch_chunks)
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/ella.py", line 67, in __call__
    self.model_sampling.timestep(timestep_[i]),
IndexError: index 1 is out of bounds for dimension 0 with size 1

Seems to be related to this function:

def __call__(self, apply_model, kwargs: dict):
        input_x = kwargs["input"]
        timestep_ = kwargs["timestep"]
        c = kwargs["c"]
        cond_or_uncond = kwargs["cond_or_uncond"]  # [0|1]

        time_aware_encoder_hidden_states = []
        self.ella.to(device=self.load_device)
        for i in cond_or_uncond:
            h = self.ella(
                self.model_sampling.timestep(timestep_[i]),
                **self.embeds[i],
            )
            time_aware_encoder_hidden_states.append(h)
        self.ella.to(self.offload_device)

        c["c_crossattn"] = torch.cat(time_aware_encoder_hidden_states, dim=0)

        return apply_model(input_x, timestep_, **c)

This happens while running the latest ComfyUI on macOS, started with "python main.py --preview-method taesd". It also fails in force-fp16 mode, and it fails whether the encoder is set to fp32 or fp16. Let me know if I can provide any other information. Thanks!
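
One possible reading of the traceback above, offered as a sketch rather than a confirmed diagnosis: the loop uses each cond_or_uncond flag (0 or 1) as an index into timestep_, which only works when both conds are batched into the same call. If a call carries a single cond whose flag is 1, timestep_ has one element and index 1 is out of bounds. The example below reproduces that mismatch with plain Python lists standing in for tensors. The names pick_timesteps_buggy and pick_timesteps_by_position are hypothetical helpers, and the equal-chunk assumption in the second one is mine, not taken from the plugin.

```python
# Plain lists stand in for tensors; this is an illustration, not plugin code.

def pick_timesteps_buggy(timesteps, cond_or_uncond):
    """Index the timestep batch with the flag value, like timestep_[i]."""
    return [timesteps[flag] for flag in cond_or_uncond]


def pick_timesteps_by_position(timesteps, cond_or_uncond):
    """Index the timestep batch by position, keeping the flag for the embeds.

    Assumption: the batch is split into equal chunks, one per flag.
    """
    chunk = len(timesteps) // len(cond_or_uncond)
    return [(flag, timesteps[pos * chunk]) for pos, flag in enumerate(cond_or_uncond)]


# Both conds in one call: flags happen to coincide with batch positions.
batched = pick_timesteps_buggy([999.0, 999.0], [0, 1])

# Only the cond flagged 1 in this call: the flag points past the single timestep.
try:
    pick_timesteps_buggy([999.0], [1])
    failed = False
except IndexError:
    failed = True

# Position-based lookup still works for the single-cond call.
fixed = pick_timesteps_by_position([999.0], [1])
print(batched, failed, fixed)
```

If this reading is right, it would also explain why the error does not show up on machines where both conds fit into one batched call.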

@JettHu
Collaborator

JettHu commented Apr 19, 2024

Does the error also occur without --preview-method taesd? And would you mind sharing your workflow?

@jwooldridge234
Author

Yes, just checked. And I'm using the default text to image workflow in the repo, just loaded Dreamshaper as the checkpoint rather than the aw_painting checkpoint.

@JettHu
Collaborator

JettHu commented Apr 20, 2024

I'll try to reproduce it on my mac.

@jwooldridge234
Author

@JettHu Wait, I just remembered (forgot because I've had to do this so many times in different ComfyUI plugins)- I modified the plugin code in model.py (line 28):

def forward(self, x: torch.Tensor, timestep_embedding: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        emb = self.linear(self.silu(timestep_embedding))
        shift, scale = emb.view(len(x), 1, -1).chunk(2, dim=-1)
        return self.norm(x.to(torch.float32)).to(torch.float16) * (1 + scale) + shift

If I don't cast x to full and then back to half, I get a fatal error in the Apply_Ella stage:

Traceback (most recent call last):
  File "/Users/jackwooldridge/ComfyUI/execution.py", line 151, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
  File "/Users/jackwooldridge/ComfyUI/execution.py", line 81, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
  File "/Users/jackwooldridge/ComfyUI/execution.py", line 74, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/ella.py", line 119, in apply
    _cond, _uncond = ella_proxy.prepare_conds()
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/ella.py", line 52, in prepare_conds
    cond = self.ella(torch.Tensor([999]).to(torch.int64), **self.embeds[0])
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/model.py", line 296, in forward
    return self.connector(t5_embeds, timestep_embedding=time_embedding)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/model.py", line 106, in forward
    latents = p_block(x, latents, timestep_embedding=timestep_embedding)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/model.py", line 65, in forward
    normed_latents = self.ln_1(latents, timestep_embedding)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/custom_nodes/ComfyUI-ELLA/model.py", line 28, in forward
    return self.norm(x) * (1 + scale) + shift
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/modules/normalization.py", line 201, in forward
    return F.layer_norm(
  File "/Users/jackwooldridge/ComfyUI/venv/lib/python3.9/site-packages/torch/nn/functional.py", line 2546, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'

Sorry I didn't mention that before, since it might be relevant. Other plugins have worked fine with that change.
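
The workaround above hard-codes torch.float16 on the way back, which would silently downcast if the module ever ran in fp32. A minimal sketch of a variant that upcasts for the normalization and then returns whatever dtype the caller passed in is shown below. UpcastLayerNorm is a hypothetical helper name, not part of ComfyUI-ELLA, and whether the plugin's norm layer has affine parameters is not stated in this thread, so the sketch handles both cases.

```python
import torch
import torch.nn.functional as F
from torch import nn


class UpcastLayerNorm(nn.LayerNorm):
    """Hypothetical helper: compute LayerNorm in float32, return the input dtype."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Upcast optional affine parameters so all operands share one dtype.
        weight = self.weight.float() if self.weight is not None else None
        bias = self.bias.float() if self.bias is not None else None
        out = F.layer_norm(x.float(), self.normalized_shape, weight, bias, self.eps)
        # Cast back to the caller's dtype instead of a hard-coded float16.
        return out.to(x.dtype)


norm = UpcastLayerNorm(8, elementwise_affine=False)
half_in = torch.zeros(2, 4, 8, dtype=torch.float16)
full_in = torch.zeros(2, 4, 8, dtype=torch.float32)
half_out = norm(half_in)
full_out = norm(full_in)
print(half_out.dtype, full_out.dtype)
```

This only addresses the dtype handling. It does not change the extra cost of the casts, which may be part of why the patched path was reported as slower later in the thread.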

@jwooldridge234
Author

Also, here's my full workflow in case it helps:

{
  "last_node_id": 21,
  "last_link_id": 26,
  "nodes": [
    {
      "id": 10,
      "type": "ELLALoader",
      "pos": [
        -3,
        626
      ],
      "size": {
        "0": 341.86419677734375,
        "1": 58
      },
      "flags": {},
      "order": 0,
      "mode": 0,
      "outputs": [
        {
          "name": "ELLA",
          "type": "ELLA",
          "links": [
            10
          ],
          "shape": 3,
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "ELLALoader"
      },
      "widgets_values": [
        "ella-sd1.5-tsc-t5xl.safetensors"
      ]
    },
    {
      "id": 17,
      "type": "T5TextEncode #ELLA",
      "pos": [
        117,
        742
      ],
      "size": {
        "0": 210,
        "1": 90.64571380615234
      },
      "flags": {},
      "order": 4,
      "mode": 0,
      "inputs": [
        {
          "name": "text_encoder",
          "type": "T5_TEXT_ENCODER",
          "link": 21
        }
      ],
      "outputs": [
        {
          "name": "ELLA_EMBEDS",
          "type": "ELLA_EMBEDS",
          "links": [
            23
          ],
          "shape": 3,
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "T5TextEncode #ELLA"
      },
      "widgets_values": [
        "a large, textured green crocodile lying comfortably on a patch of grass with a cute, knitted orange sweater enveloping its scaly body. Around its neck, the sweater features a whimsical pattern of blue and yellow stripes. In the background, a smooth, grey rock partially obscures the view of a small pond with lily pads floating on the surface."
      ]
    },
    {
      "id": 12,
      "type": "EllaApply",
      "pos": [
        427,
        478
      ],
      "size": {
        "0": 210,
        "1": 86
      },
      "flags": {},
      "order": 6,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 11
        },
        {
          "name": "ella",
          "type": "ELLA",
          "link": 10
        },
        {
          "name": "positive",
          "type": "ELLA_EMBEDS",
          "link": 23
        },
        {
          "name": "negative",
          "type": "ELLA_EMBEDS",
          "link": 24
        }
      ],
      "outputs": [
        {
          "name": "model",
          "type": "MODEL",
          "links": [
            20
          ],
          "shape": 3,
          "slot_index": 0
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "links": [
            25
          ],
          "shape": 3,
          "slot_index": 1
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "links": [
            26
          ],
          "shape": 3,
          "slot_index": 2
        }
      ],
      "properties": {
        "Node name for S&R": "EllaApply"
      }
    },
    {
      "id": 5,
      "type": "EmptyLatentImage",
      "pos": [
        420,
        643
      ],
      "size": {
        "0": 210,
        "1": 106
      },
      "flags": {},
      "order": 1,
      "mode": 0,
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            2
          ],
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "EmptyLatentImage"
      },
      "widgets_values": [
        512,
        512,
        1
      ]
    },
    {
      "id": 8,
      "type": "VAEDecode",
      "pos": [
        986,
        484
      ],
      "size": {
        "0": 210,
        "1": 46
      },
      "flags": {},
      "order": 8,
      "mode": 0,
      "inputs": [
        {
          "name": "samples",
          "type": "LATENT",
          "link": 7
        },
        {
          "name": "vae",
          "type": "VAE",
          "link": 8
        }
      ],
      "outputs": [
        {
          "name": "IMAGE",
          "type": "IMAGE",
          "links": [
            19
          ],
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "VAEDecode"
      }
    },
    {
      "id": 20,
      "type": "PreviewImage",
      "pos": [
        989,
        573
      ],
      "size": {
        "0": 210,
        "1": 246
      },
      "flags": {},
      "order": 9,
      "mode": 0,
      "inputs": [
        {
          "name": "images",
          "type": "IMAGE",
          "link": 19
        }
      ],
      "properties": {
        "Node name for S&R": "PreviewImage"
      }
    },
    {
      "id": 21,
      "type": "T5TextEncode #ELLA",
      "pos": [
        118,
        878
      ],
      "size": {
        "0": 210,
        "1": 90.64571380615234
      },
      "flags": {},
      "order": 5,
      "mode": 0,
      "inputs": [
        {
          "name": "text_encoder",
          "type": "T5_TEXT_ENCODER",
          "link": 22
        }
      ],
      "outputs": [
        {
          "name": "ELLA_EMBEDS",
          "type": "ELLA_EMBEDS",
          "links": [
            24
          ],
          "shape": 3,
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "T5TextEncode #ELLA"
      },
      "widgets_values": [
        "nsfw, text, low quaility "
      ]
    },
    {
      "id": 4,
      "type": "CheckpointLoaderSimple",
      "pos": [
        26,
        474
      ],
      "size": {
        "0": 315,
        "1": 98
      },
      "flags": {},
      "order": 2,
      "mode": 0,
      "outputs": [
        {
          "name": "MODEL",
          "type": "MODEL",
          "links": [
            11
          ],
          "slot_index": 0
        },
        {
          "name": "CLIP",
          "type": "CLIP",
          "links": [],
          "slot_index": 1
        },
        {
          "name": "VAE",
          "type": "VAE",
          "links": [
            8
          ],
          "slot_index": 2
        }
      ],
      "properties": {
        "Node name for S&R": "CheckpointLoaderSimple"
      },
      "widgets_values": [
        "dreamshaper_331BakedVae.safetensors"
      ]
    },
    {
      "id": 3,
      "type": "KSampler",
      "pos": [
        657,
        477
      ],
      "size": {
        "0": 315,
        "1": 262
      },
      "flags": {},
      "order": 7,
      "mode": 0,
      "inputs": [
        {
          "name": "model",
          "type": "MODEL",
          "link": 20
        },
        {
          "name": "positive",
          "type": "CONDITIONING",
          "link": 25
        },
        {
          "name": "negative",
          "type": "CONDITIONING",
          "link": 26
        },
        {
          "name": "latent_image",
          "type": "LATENT",
          "link": 2
        }
      ],
      "outputs": [
        {
          "name": "LATENT",
          "type": "LATENT",
          "links": [
            7
          ],
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "KSampler"
      },
      "widgets_values": [
        16785396861587,
        "randomize",
        20,
        8,
        "euler",
        "normal",
        1
      ]
    },
    {
      "id": 18,
      "type": "T5TextEncoderLoader #ELLA",
      "pos": [
        -246,
        785
      ],
      "size": {
        "0": 339.4064025878906,
        "1": 106
      },
      "flags": {},
      "order": 3,
      "mode": 0,
      "outputs": [
        {
          "name": "T5_TEXT_ENCODER",
          "type": "T5_TEXT_ENCODER",
          "links": [
            21,
            22
          ],
          "shape": 3,
          "slot_index": 0
        }
      ],
      "properties": {
        "Node name for S&R": "T5TextEncoderLoader #ELLA"
      },
      "widgets_values": [
        "models--google--flan-t5-xl--text_encoder",
        0,
        "auto"
      ]
    }
  ],
  "links": [
    [
      2,
      5,
      0,
      3,
      3,
      "LATENT"
    ],
    [
      7,
      3,
      0,
      8,
      0,
      "LATENT"
    ],
    [
      8,
      4,
      2,
      8,
      1,
      "VAE"
    ],
    [
      10,
      10,
      0,
      12,
      1,
      "ELLA"
    ],
    [
      11,
      4,
      0,
      12,
      0,
      "MODEL"
    ],
    [
      19,
      8,
      0,
      20,
      0,
      "IMAGE"
    ],
    [
      20,
      12,
      0,
      3,
      0,
      "MODEL"
    ],
    [
      21,
      18,
      0,
      17,
      0,
      "T5_TEXT_ENCODER"
    ],
    [
      22,
      18,
      0,
      21,
      0,
      "T5_TEXT_ENCODER"
    ],
    [
      23,
      17,
      0,
      12,
      2,
      "ELLA_EMBEDS"
    ],
    [
      24,
      21,
      0,
      12,
      3,
      "ELLA_EMBEDS"
    ],
    [
      25,
      12,
      1,
      3,
      1,
      "CONDITIONING"
    ],
    [
      26,
      12,
      2,
      3,
      2,
      "CONDITIONING"
    ]
  ],
  "groups": [],
  "config": {},
  "extra": {},
  "version": 0.4
}

@MythicalChu

MythicalChu commented Apr 20, 2024

I am having the same problem on DirectML
Also, the RuntimeError: "LayerNormKernelImpl" not implemented for 'Half' happens on DirectML too.

@JettHu
Collaborator

JettHu commented Apr 21, 2024

I am having the same problem on DirectML Also, the RuntimeError: "LayerNormKernelImpl" not implemented for 'Half' happens on DirectML too.

This may be solved by --fp32-text-enc

@JettHu
Collaborator

JettHu commented Apr 21, 2024

@jwooldridge234 Thank you for sharing the environment and workflow. I couldn't reproduce it yesterday, and I only have time to look at it on working days. Besides, have you tried --fp32-text-enc?

@JettHu
Collaborator

JettHu commented Apr 21, 2024

@JettHu Wait, I just remembered (forgot because I've had to do this so many times in different ComfyUI plugins)- I modified the plugin code in model.py (line 28):

def forward(self, x: torch.Tensor, timestep_embedding: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        emb = self.linear(self.silu(timestep_embedding))
        shift, scale = emb.view(len(x), 1, -1).chunk(2, dim=-1)
        return self.norm(x.to(torch.float32)).to(torch.float16) * (1 + scale) + shift

@jwooldridge234 On my Mac (Apple M1 Pro, 32GB RAM), the error "LayerNormKernelImpl" not implemented for 'Half' also appears in the default mode. However, adding the parameter --fp32-text-enc makes it run correctly.

Additionally I found that --fp32-text-enc mode performs better than changing self.norm precision to fp32 in model.py code.

The screenshots below show:

1.15 it/s with --fp32-text-enc
[image]

5.31 s/it (0.188 it/s) without --fp32-text-enc, with self.norm precision changed to fp32 in the model.py code
[image]
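
The two figures above are in different units (it/s and s/it), so a quick conversion helps when comparing them. The snippet below only redoes that arithmetic with the numbers quoted in this comment. The function name to_it_per_s is made up for illustration.

```python
def to_it_per_s(seconds_per_it: float) -> float:
    """Convert seconds per iteration into iterations per second."""
    return 1.0 / seconds_per_it


with_flag = 1.15                 # it/s, reported with --fp32-text-enc
patched = to_it_per_s(5.31)      # s/it reported for the model.py workaround
speedup = with_flag / patched    # how many times faster the flag is

print(f"{patched:.3f} it/s vs {with_flag} it/s, ratio {speedup:.1f}")
```

On these two measurements the flag comes out roughly six times faster than the in-code cast.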

Unfortunately, I could not reproduce the issue you originally reported.

@jwooldridge234
Author

@JettHu That solves the issue with the text encoder, and makes it a ton faster, thank you. I'm still getting that error related to the ksampler ("index 1 is out of bounds for dimension 0 with size 1"), unfortunately. It happens with all models and samplers I've tested.

@jwooldridge234
Author

@MythicalChu Are you still seeing this issue when you add --fp32-text-enc?

@JettHu
Collaborator

JettHu commented Apr 22, 2024

@JettHu That solves the issue with the text encoder, and makes it a ton faster, thank you. I'm still getting that error related to the ksampler ("index 1 is out of bounds for dimension 0 with size 1"), unfortunately. It happens with all models and samplers I've tested.

I think I may have found the problem you are running into. Do you have a long prompt, and is the conditioning input to KSampler coming from the CLIP Text Encode node?

@jwooldridge234
Author

Sadly, no- I'm using the T5 Text Encode into the Apply Ella node, and then taking the conditioning from there into the KSampler. Shortening the prompt (or deleting it altogether) makes no difference.
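
The wiring described here can be cross-checked against the workflow JSON posted earlier. The sketch below copies four entries from its "links" array and the matching node types, then lists which node types feed a given input. The field order [link_id, from_node, from_slot, to_node, to_slot, type] is inferred from how the link ids line up with the node inputs and outputs in that JSON, and upstream_types is a hypothetical helper.

```python
# Subset of the "links" array from the workflow above.
# Assumed field order: [link_id, from_node, from_slot, to_node, to_slot, type]
links = [
    [23, 17, 0, 12, 2, "ELLA_EMBEDS"],
    [24, 21, 0, 12, 3, "ELLA_EMBEDS"],
    [25, 12, 1, 3, 1, "CONDITIONING"],
    [26, 12, 2, 3, 2, "CONDITIONING"],
]

# Node ids and types copied from the "nodes" array of the same workflow.
node_types = {
    3: "KSampler",
    12: "EllaApply",
    17: "T5TextEncode #ELLA",
    21: "T5TextEncode #ELLA",
}


def upstream_types(node_id, link_type):
    """Return the types of nodes that feed `node_id` through links of `link_type`."""
    return sorted(
        {
            node_types[src]
            for _, src, _, dst, _, kind in links
            if dst == node_id and kind == link_type
        }
    )


ksampler_sources = upstream_types(3, "CONDITIONING")
ella_sources = upstream_types(12, "ELLA_EMBEDS")
print(ksampler_sources, ella_sources)
```

In this workflow both KSampler conditioning inputs trace back to EllaApply, and EllaApply in turn is fed by the two T5TextEncode #ELLA nodes, which matches the description in the comment.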

@JettHu
Collaborator

JettHu commented Apr 22, 2024

@jwooldridge234 I have released a new version, you can try it and see if you still have this error.

@jwooldridge234
Author

Perfect, solved my issue! Thanks!

@MythicalChu

@MythicalChu Are you still seeing this issue when you add --fp32-text-enc?

Sorry for the late answer. No problem anymore, using --fp32-text-enc. Haven't tried the latest updates though, but I bet it's great :)

@ranfengqaq

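[image: 微信图片_20240429103738]
[image: 微信图片_20240429103742]
I'd like to ask about an error I get when running T5TextEncoderLoader: Error occurred when executing T5TextEncoderLoader #ELLA:

'added_tokens'
I asked GPT about this and was told that the code tries to read a key from the tokenizer file, but that key appears to be missing, which causes the error. How can I resolve this?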


4 participants