Validation images not matching ComfyUI #409

Closed
komninoschatzipapas opened this issue May 17, 2024 · 13 comments

@komninoschatzipapas
Contributor

I'm curious why the validation images being output don't match the ones I'm getting from ComfyUI when using the same parameters.

[Screenshot: CleanShot 2024-05-17 at 19 51 33]

Then, running train_sdxl.py with the following arguments (with the line I mentioned in the other issue added so that validation runs at step 0):

  --validation_seed=42 \
  --validation_guidance=5 \
  --validation_guidance_rescale=0.0 \
  --validation_num_inference_steps=35 \
  --validation_negative_prompt="" \
  --validation_resolution=832x1216 \
  --validation_noise_scheduler="ddim"

Yields:
[Image: step_0_car_(832, 1216)]

We perform inference with ComfyUI on our production service, so ideally the two would be in sync, but I have not found a way to do this yet.
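For reference, the validation flags above map roughly onto a diffusers call like the one below. This is only a sketch, not the trainer's actual validation code; loading the DDIM config from the base model's scheduler subfolder is an assumption here.

# Rough diffusers equivalent of the validation settings above (sketch only).
import torch
from diffusers import DiffusionPipeline, DDIMScheduler

model_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipeline = DiffusionPipeline.from_pretrained(model_id)
# --validation_noise_scheduler="ddim"
pipeline.scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipeline.to("cuda")

image = pipeline(
    prompt="car",
    negative_prompt="",         # --validation_negative_prompt
    num_inference_steps=35,     # --validation_num_inference_steps
    guidance_scale=5,           # --validation_guidance
    guidance_rescale=0.0,       # --validation_guidance_rescale
    width=832,                  # --validation_resolution
    height=1216,
    generator=torch.Generator(device="cuda").manual_seed(42),  # --validation_seed
).images[0]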

@bghira
Owner

bghira commented May 17, 2024

i've never used the validation pipeline that way, so i'm not sure that it would work at all. you can run the inference scripts in the toolkit to try it directly via diffusers.

@bghira
Owner

bghira commented May 17, 2024

you could also try setting a negative prompt on both tools.

@komninoschatzipapas
Contributor Author

I used DDPM since there wasn't a DDIM script:

# Use Pytorch 2!
import torch
from diffusers import (
    StableDiffusionPipeline,
    DiffusionPipeline,
    AutoencoderKL,
    UNet2DConditionModel,
    DDPMScheduler,
)
from transformers import CLIPTextModel

# Any model currently on Huggingface Hub.
# model_id = 'junglerally/digital-diffusion'
# model_id = 'ptx0/realism-engine'
# model_id = 'ptx0/artius_v21'
# model_id = 'ptx0/pseudo-journey'
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipeline = DiffusionPipeline.from_pretrained(model_id)

# Optimize!
pipeline.unet = torch.compile(pipeline.unet)
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")
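# NOTE: this scheduler is loaded but never assigned to pipeline.scheduler,
# so the pipeline still samples with its default scheduler.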

# Remove this if you get an error.
torch.set_float32_matmul_precision("high")

pipeline.to("cuda")
prompts = {
    "car": "car",
}
for shortname, prompt in prompts.items():
    # old prompt: ''
    image = pipeline(
        prompt=prompt,
        negative_prompt="",
        num_inference_steps=35,
        generator=torch.Generator(device="cuda").manual_seed(42),
        width=832,
        height=1216,
        guidance_scale=5,
    ).images[0]
    image.save(f"./{shortname}_nobetas.png", format="PNG")

Yields:
[Image: car_nobetas]

ComfyUI:
[Screenshot: CleanShot 2024-05-17 at 21 25 06]

I'm not sure what I'm missing, but it's crazy how different these outputs are with seemingly the same model & parameters. My concern is that we're using ComfyUI for inference but judging our models based on the validation images generated by diffusers, which don't seem to be generated in the same manner.

I'm going to close this issue since it's related to diffusers & ComfyUI, but if you're able to give some input on this situation it would be greatly appreciated!

@bghira
Owner

bghira commented May 17, 2024

i haven't really played around with the base SDXL model in almost a year; it is possibly something in their upstream model config. i had to open a few issue reports so far for some of these problems, and a couple of them just remain perpetually unresolved.

edit: i will try and reproduce this problem locally.

@bghira
Owner

bghira commented May 17, 2024

[Image]

here's my result, though I had to modify the script slightly for MPS: replacing cuda -> mps and removing the torch.compile() call, because MPS does not support it.

@bghira
Owner

bghira commented May 17, 2024

# Use Pytorch 2!
import torch
from diffusers import (
    StableDiffusionPipeline,
    DiffusionPipeline,
    AutoencoderKL,
    UNet2DConditionModel,
    DDPMScheduler,
)
from transformers import CLIPTextModel

# Any model currently on Huggingface Hub.
# model_id = 'junglerally/digital-diffusion'
# model_id = 'ptx0/realism-engine'
# model_id = 'ptx0/artius_v21'
# model_id = 'ptx0/pseudo-journey'
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
pipeline = DiffusionPipeline.from_pretrained(model_id)

# Optimize!
# pipeline.unet = torch.compile(pipeline.unet)
scheduler = DDPMScheduler.from_pretrained(model_id, subfolder="scheduler")

# Remove this if you get an error.
torch.set_float32_matmul_precision("high")

pipeline.to(dtype=torch.float16, device="mps")
prompts = {
    "car": "car",
}
for shortname, prompt in prompts.items():
    # old prompt: ''
    image = pipeline(
        prompt=prompt,
        negative_prompt="",
        num_inference_steps=35,
        generator=torch.Generator(device="mps").manual_seed(42),
        width=832,
        height=1216,
        guidance_scale=5,
    ).images[0]
    image.save(f"./{shortname}_nobetas.png", format="PNG")

@komninoschatzipapas
Contributor Author

Check comfyanonymous/ComfyUI#3507

If you were to change torch.Generator(device="mps").manual_seed(42) to torch.Generator(device="cpu").manual_seed(42), you should get the image I got from my ComfyUI flow above.
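For illustration, a minimal self-contained sketch of that change against the script above; only the generator device differs, and the output filename is just an illustrative name:

import torch
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
pipeline.to("cuda")

# Seeding the generator on the CPU matches ComfyUI's default noise source
# (see comfyanonymous/ComfyUI#3507), independent of the device the model runs on.
image = pipeline(
    prompt="car",
    negative_prompt="",
    num_inference_steps=35,
    generator=torch.Generator(device="cpu").manual_seed(42),
    width=832,
    height=1216,
    guidance_scale=5,
).images[0]
image.save("./car_cpu_seed.png", format="PNG")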

Is there a possibility we could integrate this into the codebase for generating the validation images?

@bghira
Owner

bghira commented May 17, 2024

no, because a generator on the CPU means all tensors have to be copied from the CPU during generation time. it even tells you this when you try it.

@bghira
Owner

bghira commented May 17, 2024

[Image]

using a CPU seed on the mac still doesn't seem to reproduce the original images you gave, for what it's worth.

@komninoschatzipapas
Contributor Author

I actually just tried it on my Studio M1 Ultra and I'm getting the same results as you.

So, in order to be in sync with ComfyUI, you need to generate the noise from a CPU-seeded generator and be on a Linux system.

I'm relieved it's the seed that was different and not some other parameter that would be harder to adjust.

@bghira
Owner

bghira commented May 18, 2024

so basically no reason to switch since it's just a different seed being tested :-) you can probably sample using GPU seeds on ComfyUI as well.

@komninoschatzipapas
Contributor Author

> i haven't really played around with the base SDXL model in almost a year; it is possibly something in their upstream model config. i had to open a few issue reports so far for some of these problems, and a couple of them just remain perpetually unresolved.
>
> edit: i will try and reproduce this problem locally.

I'm curious what you're fine-tuning from. I've found checkpoints like Juggernaut overcook pretty easily, so I generally go off the base model.

@bghira
Owner

bghira commented May 20, 2024

my own model series, trained on a different noise schedule with the same u-net layout as SDXL and the same text encoder, but with madebyollin's fp16 VAE.
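For anyone wanting to swap that VAE in with diffusers, a minimal sketch (assuming the VAE referred to is the madebyollin/sdxl-vae-fp16-fix repo on the Hub):

import torch
from diffusers import AutoencoderKL, DiffusionPipeline

# Assumed Hub id for the fp16-safe SDXL VAE; substitute your own fine-tuned checkpoint as needed.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    vae=vae,
    torch_dtype=torch.float16,
)
pipeline.to("cuda")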
