Inpaint doesnt work #401

Closed
Someonetoldme584 opened this issue Feb 10, 2024 · 15 comments

Comments

@Someonetoldme584

image

Also, how can we expand the image? Is there a zoom? The images are so small.

@aureliencta

Did you find a solution? I get the same last 3 lines when I try to inpaint.

@Acly
Owner

Acly commented Feb 17, 2024

They're normal and do not indicate an error

@aureliencta

Thanks, I'm new to ComfyUI, so I was getting a bit confused here.
I enabled the option to dump the workflow in the plugin, tried to inpaint again, and loaded that dumped workflow in the ComfyUI interface. When I try to run it from there, it hangs on the "Blur Masked Area" node until the ComfyUI server eventually dies.
I'm using a pre-existing ComfyUI, so maybe there's something wrong with my installation.
I'm not well versed in Python, but I'll try to check what's wrong with this node on my machine.

@tobbez

tobbez commented Feb 18, 2024

The root cause is not immediately clear, but what happens is:

The gaussian_blur function gets stuck in its first torch.nn.functional.conv2d call for several minutes (while pegging a CPU core), with the process eventually crashing due to an access violation in torch_cpu.dll.

This is the second time gaussian_blur is called by MaskedBlur.fill (the first call a few lines earlier apparently succeeds).

@Acly
Owner

Acly commented Feb 18, 2024

That's interesting. Does it work fine if you lower the blur radius?
Which CPU and Torch version you're using might also be relevant.

If you could share the workflow.json that would be great

@aureliencta

aureliencta commented Feb 18, 2024

Yes, it hangs around the same point for me, I think. More precisely, at this line in the gaussian_blur function (on the second call; the first call succeeds for me as well):
image = F.conv2d(image, kernel_x, groups=c)

I'm on Linux. CPU is a Ryzen 9 7900X, GPU 4090. Torch version is 2.2.0+cu121.

Here's the workflow file. The workflow succeeds when I bypass the blur node, and it still fails when I enter different values for this node.

Maybe I should try installing an older Torch version?

workflow.json

Edit: corrected wrong line

@aureliencta

So I tried reverting to an older torch version (2.1.0), and inpainting now works as intended. The blur node finishes almost instantly.

@Acly
Owner

Acly commented Feb 18, 2024

That sounds like it was some kind of bug in Torch. I have a similar setup, torch 2.2.0+cu121 installed on WSL/Ubuntu, but it works fine (AMD Ryzen 5). Not sure what else might have an impact...

@tobbez

tobbez commented Feb 18, 2024

It works for blur values <= 37. Values of 39 and higher reproduce the issue (the fill function adds 1 to radius if even, so 38 is not possible).

This is using the internal native Windows server (Python 3.10.11, torch==2.2.0+cu121) on Windows 10, with a Zen 4 Ryzen CPU.

Here's a self-contained reproducer:

from __future__ import annotations
import torch
import torch.nn.functional as F
from torch import Tensor

def to_comfy(image: Tensor):
    return image.permute(0, 2, 3, 1)  # BCHW -> BHWC

def make_odd(x):
    if x > 0 and x % 2 == 0:
        return x + 1
    return x

def binary_erosion(mask: Tensor, radius: int):
    kernel = torch.ones(1, 1, radius * 2 + 1, radius * 2 + 1, device=mask.device)
    mask = F.pad(mask, (radius, radius, radius, radius), mode="constant", value=1)
    mask = F.conv2d(mask, kernel, groups=1)
    mask = (mask == kernel.numel()).to(mask.dtype)
    return mask

def to_torch(image: Tensor, mask: Tensor | None = None):
    if len(image.shape) == 3:
        image = image.unsqueeze(0)
    image = image.permute(0, 3, 1, 2)  # BHWC -> BCHW
    if mask is not None:
        if len(mask.shape) < 4:
            mask = mask.reshape(1, 1, mask.shape[-2], mask.shape[-1])
        # Only compare sizes when a mask was actually passed
        if image.shape[2:] != mask.shape[2:]:
            raise ValueError(
                f"Image and mask must be the same size. {image.shape[2:]} != {mask.shape[2:]}"
            )
    return image, mask

def _gaussian_kernel(radius: int, sigma: float):
    x = torch.linspace(-radius, radius, steps=radius * 2 + 1)
    pdf = torch.exp(-0.5 * (x / sigma).pow(2))
    return pdf / pdf.sum()


def gaussian_blur(image: Tensor, radius: int, sigma: float = 0):
    c = image.shape[-3]
    if sigma <= 0:
        sigma = 0.3 * (radius - 1) + 0.8

    kernel = _gaussian_kernel(radius, sigma).to(image.device)
    kernel_x = kernel[..., None, :].repeat(c, 1, 1).unsqueeze(1)
    kernel_y = kernel[..., None].repeat(c, 1, 1).unsqueeze(1)

    image = F.pad(image, (radius, radius, radius, radius), mode="reflect")
    image = F.conv2d(image, kernel_x, groups=c)
    image = F.conv2d(image, kernel_y, groups=c)
    return image


def fill(image: Tensor, mask: Tensor, blur: int, falloff: int):
    blur = make_odd(blur)
    falloff = min(make_odd(falloff), blur - 2)
    image, mask = to_torch(image, mask)

    original = image.clone()
    alpha = mask.floor()
    if falloff > 0:
        erosion = binary_erosion(alpha, falloff)
        alpha = alpha * gaussian_blur(erosion, falloff)
    alpha = alpha.repeat(1, 3, 1, 1)

    image = gaussian_blur(image, blur)
    image = original + (image - original) * alpha
    return (to_comfy(image),)

mask = torch.rand([1, 1024, 1024], dtype=torch.float32, device=torch.device('cpu'))
image = torch.rand([1, 1024, 1024, 3], dtype=torch.float32, device=torch.device('cpu'))

# blur <= 37 works
ret = fill(image, mask, blur=39, falloff=9)
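
To put the reported threshold in terms of kernel size, here is a small arithmetic sketch that reuses `make_odd` and the default-sigma formula from the reproducer. The helper name `kernel_params` is made up for illustration. Whether kernel width is what actually triggers the hang is an assumption; the thread only establishes that blur 37 works and blur 39 does not.

```python
def make_odd(x: int) -> int:
    # Same rounding as the reproducer: positive even values are bumped by one.
    if x > 0 and x % 2 == 0:
        return x + 1
    return x


def kernel_params(blur: int) -> tuple[int, int, float]:
    """Return (effective radius, 1D kernel taps, default sigma) for a blur value.

    Hypothetical helper, not part of the plugin.
    """
    radius = make_odd(blur)
    taps = radius * 2 + 1             # length of linspace(-radius, radius, 2*radius + 1)
    sigma = 0.3 * (radius - 1) + 0.8  # default used by gaussian_blur when sigma <= 0
    return radius, taps, sigma


for blur in (37, 38, 39):
    radius, taps, sigma = kernel_params(blur)
    print(f"blur={blur}: radius={radius}, taps={taps}, sigma={sigma:.1f}")
```

So the last working case uses a 75-tap 1D kernel and the first failing case a 79-tap kernel, since 38 is rounded up to 39.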

@Acly
Owner

Acly commented Feb 19, 2024

Thanks for the reproducer. I ran it in a loop with blur radius up to 511 just to be sure, exact same Python + torch version, but still can't reproduce it.
I'd try reducing it further, to figure out which operation triggers the issue (it's probably not the conv2d itself)
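
Since the failure is a multi-minute hang followed by a crash, one way to reduce the reproducer step by step is to run each candidate snippet in a child process with a timeout. The following is only a sketch using the standard library; `run_isolated` is a hypothetical helper name, and the snippet passed to it is a stand-in for whatever part of the reproducer is being tested.

```python
import subprocess
import sys


def run_isolated(code: str, timeout_s: float) -> str:
    """Run a Python snippet in a child process and classify the outcome."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return "timeout"
    # A non-zero exit code covers both Python exceptions and native crashes
    return "ok" if proc.returncode == 0 else "crashed"


if __name__ == "__main__":
    # Stand-in snippet; replace with one reduced step of the reproducer.
    print(run_isolated("x = sum(range(10))", timeout_s=30))
```

A hang then shows up as "timeout" and an access violation as "crashed", without taking down the process that drives the search.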

@Acly
Owner

Acly commented Feb 27, 2024

In version 1.15.0 I tried disabling oneDNN AVX512 via an environment variable for the case where the Krita plugin starts the ComfyUI process. It might take a while for a fixed torch version to arrive and replace existing installs, and I doubt the workaround will impact performance.

@JasonS09

I'm working with a Google Colab server. I can't inpaint with either a T4 or an A100 GPU. It takes a while to load something and then the cell just stops. RAM usage peaks at that point. I tried running the generated workflow, and it gets stuck at the blur node, so I think it's related to this issue.

@ShadwDrgn

Has anyone reported this bug to torch? It'd be nice to attach that issue to this one so we can track it. In the meantime I was having this as well and putting ONEDNN_MAX_CPU_ISA=AVX2 in my launch.sh did fix it.
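
For setups that start the server from Python rather than a shell script, the same workaround can be sketched as below. `ONEDNN_MAX_CPU_ISA=AVX2` is the setting quoted above; the helper name `with_avx2_cap` and the `main.py` entry point in the commented-out launch line are assumptions for illustration. The variable presumably needs to be in the environment before torch is loaded, which is why it is passed to the child process here instead of being set after import.

```python
import os
from collections.abc import Mapping


def with_avx2_cap(base_env: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of base_env with oneDNN capped at AVX2 (hypothetical helper)."""
    env = dict(base_env)
    env["ONEDNN_MAX_CPU_ISA"] = "AVX2"
    return env


# Assumed launch command; adjust the entry point to your installation.
# import subprocess, sys
# subprocess.run([sys.executable, "main.py"], env=with_avx2_cap(os.environ))

env = with_avx2_cap(os.environ)
print(env["ONEDNN_MAX_CPU_ISA"])
```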

@Acly
Owner

Acly commented Feb 29, 2024

It's already linked above. Sounds like it will be fixed in torch 2.3.0 (end of April).

@Acly
Owner

Acly commented Apr 29, 2024

Stable version of torch 2.3.0 is out now and will be used for new installs.
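
For existing installs, a quick check of the installed version string against what was reported in this thread might look like the sketch below. It only encodes the reports here: 2.2.0+cu121 hangs, 2.1.0 works, 2.3.0 carries the fix. Treating every 2.2.x release as affected is an assumption, and `is_reported_affected` is a hypothetical helper name. On a real install the string would come from `torch.__version__`.

```python
def is_reported_affected(torch_version: str) -> bool:
    """Return True if the version falls in the 2.2.x series discussed in this thread."""
    base = torch_version.split("+", 1)[0]          # drop local tags such as "+cu121"
    major, minor = (int(part) for part in base.split(".")[:2])
    return (major, minor) == (2, 2)                # assumption: all of 2.2.x


for version in ("2.1.0", "2.2.0+cu121", "2.3.0"):
    print(version, is_reported_affected(version))
```

If the check returns True and upgrading is not an option, the AVX2 cap mentioned earlier in the thread is the workaround users reported as effective.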

@Acly closed this as completed Apr 29, 2024