A1111 <> Diffusers Scheduler mapping #4167

Closed
patrickvonplaten opened this issue Jul 19, 2023 · 28 comments
@patrickvonplaten
Contributor

patrickvonplaten commented Jul 19, 2023

A1111 <> Diffusers Mapping

| A1111 / K-Diffusion | Diffusers | Notes |
|---|---|---|
| DPM++ 2M | Multistep DPM-Solver | |
| DPM++ 2M Karras | Multistep DPM-Solver | init with use_karras_sigmas=True |
| DPM++ 2M SDE | Multistep DPM-Solver | init with algorithm_type="sde-dpmsolver++" |
| DPM++ 2M SDE Karras | Multistep DPM-Solver | init with algorithm_type="sde-dpmsolver++" and use_karras_sigmas=True |
| DPM++ 2S a | N/A | Very similar to DPM++ SDE |
| DPM++ 2S a Karras | N/A | Very similar to DPM++ SDE Karras |
| DPM++ SDE | Singlestep DPM-Solver | |
| DPM++ SDE Karras | Singlestep DPM-Solver | init with use_karras_sigmas=True |
| DPM2 | DPM Discrete Scheduler | |
| DPM2 Karras | DPM Discrete Scheduler | init with use_karras_sigmas=True |
| DPM2 a | DPM Discrete Scheduler with ancestral sampling | |
| DPM2 a Karras | DPM Discrete Scheduler with ancestral sampling | init with use_karras_sigmas=True |
| DPM adaptive | N/A | |
| DPM fast | N/A | Not really used no? |
| Euler | Euler scheduler | |
| Euler a | Euler Ancestral Scheduler | |
| Heun | Heun Scheduler | |
| LMS | Linear Multistep | |
| LMS Karras | Linear Multistep | Init with use_karras_sigmas=True |
| N/A | DEIS | |
| N/A | UniPCMultistepScheduler | |

Diffusers samplers have a couple more ways to be customized as can be seen from the inits of the respective samplers.
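
As a rough illustration of how the table could be used in code, here is a minimal sketch of a lookup from A1111 sampler names to Diffusers scheduler classes plus init overrides. The dict name `A1111_TO_DIFFUSERS` and the helper `make_scheduler` are hypothetical, and the class names are my reading of the display names in the table (for example "Multistep DPM-Solver" as `DPMSolverMultistepScheduler`). Whether a given class accepts `use_karras_sigmas` depends on the installed Diffusers version, so treat the overrides as a starting point rather than a guarantee.

```python
# Hypothetical lookup table transcribed from the mapping above.
# Values are (Diffusers scheduler class name, init overrides).
# Rows marked N/A in the table are intentionally left out.
A1111_TO_DIFFUSERS = {
    "DPM++ 2M": ("DPMSolverMultistepScheduler", {}),
    "DPM++ 2M Karras": ("DPMSolverMultistepScheduler", {"use_karras_sigmas": True}),
    "DPM++ 2M SDE": ("DPMSolverMultistepScheduler", {"algorithm_type": "sde-dpmsolver++"}),
    "DPM++ 2M SDE Karras": (
        "DPMSolverMultistepScheduler",
        {"algorithm_type": "sde-dpmsolver++", "use_karras_sigmas": True},
    ),
    "DPM++ SDE": ("DPMSolverSinglestepScheduler", {}),
    "DPM++ SDE Karras": ("DPMSolverSinglestepScheduler", {"use_karras_sigmas": True}),
    "DPM2": ("KDPM2DiscreteScheduler", {}),
    "DPM2 Karras": ("KDPM2DiscreteScheduler", {"use_karras_sigmas": True}),
    "DPM2 a": ("KDPM2AncestralDiscreteScheduler", {}),
    "DPM2 a Karras": ("KDPM2AncestralDiscreteScheduler", {"use_karras_sigmas": True}),
    "Euler": ("EulerDiscreteScheduler", {}),
    "Euler a": ("EulerAncestralDiscreteScheduler", {}),
    "Heun": ("HeunDiscreteScheduler", {}),
    "LMS": ("LMSDiscreteScheduler", {}),
    "LMS Karras": ("LMSDiscreteScheduler", {"use_karras_sigmas": True}),
}


def make_scheduler(a1111_name, base_config):
    """Build the Diffusers scheduler mapped to an A1111 sampler name (sketch)."""
    import diffusers  # imported lazily so the table is usable without it

    class_name, overrides = A1111_TO_DIFFUSERS[a1111_name]
    scheduler_cls = getattr(diffusers, class_name)
    # from_config takes keyword overrides on top of the existing config.
    return scheduler_cls.from_config(base_config, **overrides)
```

With a loaded pipeline, usage would look something like `pipe.scheduler = make_scheduler("DPM++ 2M Karras", pipe.scheduler.config)`.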

@patrickvonplaten
Contributor Author

@stevhliu could we maybe put this in a doc as discussed a bit during our meeting? Maybe in the overview doc page?

@patrickvonplaten patrickvonplaten pinned this issue Jul 20, 2023
@stevhliu
Member

Nice, I'll roll this into a PR I'm working on to update the scheduler APIs!

@rbertus2000

@patrickvonplaten: I'm confused: Shouldn't DPM++ SDE be DPMSolverSDEScheduler? And in this issue you just said that "DPM++ 2s a" = DPMSolverSinglestepScheduler.

@patrickvonplaten
Contributor Author

  • https://huggingface.co/docs/diffusers/v0.18.2/en/api/schedulers/dpm_sde is another implementation of DPMSolverSDE, but it's not the official one (the author contributed DPMSolverSDE himself), so that's the default one.
    In my opinion, we could deprecate DPMSolverSDE at some point.
  • RE: DPM++ 2s a: yes, indeed it's very similar to DPMSolverSinglestepScheduler initialized with algorithm_type="sde-dpmsolver++" and use_karras_sigmas=True

@cmdr2
Contributor

cmdr2 commented Jul 24, 2023

Hi @patrickvonplaten Thanks for the clarification. For DPM++ 2s a - can you please confirm whether the algorithm_type should be sde-dpmsolver++ or dpmsolver++?

Your recommendation in #3953 is dpmsolver++:

DPM++ 2s a => should be the following: DPMSolverSinglestepScheduler.from_config(..., algorithm_type="dpmsolver++")

And your most-recent message here is sde-dpmsolver++:

RE: DPM++ 2s a: yes, indeed it's very similar to DPMSolverSinglestepScheduler initialized with algorithm_type="sde-dpmsolver++" and use_karras_sigmas=True

Thanks!

@cmdr2
Contributor

cmdr2 commented Jul 24, 2023

Hmm, I don't think sde-dpmsolver++ is implemented for DPMSolverSinglestepScheduler

NotImplementedError: sde-dpmsolver++ does is not implemented for <class 'diffusers.schedulers.scheduling_dpmsolver_singlestep.DPMSolverSinglestepScheduler'>
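
Until a scheduler supports a given algorithm type, one possible workaround is to try the preferred value and fall back when `NotImplementedError` is raised. The sketch below is only an illustration: `build_with_fallback` is a hypothetical helper, and the choice of `"dpmsolver++"` as the fallback is an assumption taken from the recommendation quoted from #3953 above.

```python
def build_with_fallback(factory, preferred="sde-dpmsolver++", fallback="dpmsolver++"):
    """Try the preferred algorithm_type first; fall back if it is not implemented.

    `factory` is any callable that accepts algorithm_type=... and returns a
    scheduler. Returns a (scheduler, algorithm_type_used) tuple.
    """
    try:
        return factory(algorithm_type=preferred), preferred
    except NotImplementedError:
        # The scheduler class rejected the preferred solver variant.
        return factory(algorithm_type=fallback), fallback
```

With Diffusers installed, `factory` could be something like `functools.partial(DPMSolverSinglestepScheduler.from_config, pipe.scheduler.config)`; the second element of the returned tuple tells you which variant was actually used.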

@patrickvonplaten
Contributor Author

Ah good catch! We need to add this algorithm type. Here: https://github.com/huggingface/diffusers/pull/3344/files was the PR to add it to the Multi-Step scheduler. I think it can be more or less copied 1-to-1 to DPMSolverSinglestepScheduler - would you like to open a PR?

@cmdr2
Contributor

cmdr2 commented Jul 25, 2023

@patrickvonplaten Thanks! I've opened a PR at #4251

There's a discrepancy I'm not sure about; I've raised it in the PR.

@CapsAdmin

CapsAdmin commented Jul 25, 2023

I tried implementing a1111's kdiffusion samplers in diffusers, along with the ability to pass user-changeable settings from a1111 to kdiffusion.

Do note that I don't have much experience in this field, it's just something I got into for fun the last month or two.

But my findings are:

I was aiming for a 1-to-1 match, mostly for fun, but I think ideally some sacrifices should be made. For example, I'm not sure people intentionally use the extra sampler settings; as far as I know they're all left at their defaults.

My implementation can be found here. I've compared various samplers between a1111 and my implementation and it seems to work, though I've probably missed some details.
https://github.com/CapsAdmin/diffusers-a1111/

@CapsAdmin

I've refactored and fixed some issues. The samplers now reproduce the same results as a1111, except for DPM2 a Karras (which is marked as a second-order sampler). I also explain a bit better in code what's going on in kdiffusion.py

I can observe that samplers like Heun, DPM2 a, DPM fast and DPM++ SDE Karras produce very similar results but have a subtle variation from a1111.

I suspect this is because the sigma values are slightly different. For example:

diffusers: [14.6172, 11.7266,  9.3438,  7.3867,  5.7891,  4.5000,  3.4648,  2.6406,
         1.9912,  1.4834,  1.0908,  0.7910,  0.5649,  0.3965,  0.2732,  0.1842,
         0.1214,  0.0779,  0.0485,  0.0292,  0.0000]

a1111: [14.6116, 11.7413,  9.3685,  7.4192,  5.8284,  4.5394,  3.5029,  2.6763,
         2.0228,  1.5112,  1.1147,  0.8110,  0.5812,  0.4096,  0.2834,  0.1921,
         0.1273,  0.0822,  0.0515,  0.0313,  0.0000]

I used the same method as the k-diffusion pipeline in diffusers to get the min and max sigma, and I'm not sure how else to do it.
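
For readers who want to see how such a list arises from a min and max sigma, here is a minimal pure-Python sketch of the Karras-style schedule (rho = 7) used by the "Karras" sampler variants. The values quoted above look consistent with this schedule, but that is my reading rather than something stated in the comment. The function name `karras_sigmas` is hypothetical; `sigma_min=0.0292` is taken from the last non-zero entry above, and `sigma_max=14.6146` is an assumed value for illustration.

```python
def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Karras-style noise schedule: n sigmas from sigma_max down to sigma_min, then 0."""
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    # Interpolate linearly in sigma^(1/rho) space, then raise back to the power rho.
    sigmas = [
        (max_inv_rho + i / (n - 1) * (min_inv_rho - max_inv_rho)) ** rho
        for i in range(n)
    ]
    return sigmas + [0.0]


# Assumed endpoints, for illustration only.
sigmas = karras_sigmas(20, sigma_min=0.0292, sigma_max=14.6146)
print([round(s, 4) for s in sigmas])
```

Because every entry depends only on the two endpoints, a small difference in the computed sigma_min or sigma_max shifts the entire list.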

@CapsAdmin

I also discovered a potential bug where a1111 will pass eta=1 coming from sampler settings to DPM Fast and DPM Adaptive. The default eta in kdiffusion is 0.

When they are used with eta 0, DPM fast actually produces good results and DPM adaptive finishes a lot faster.

[Image: old man, 25 steps, DPM fast, eta 0]

[Image: old man, 25 steps, DPM fast, eta 1]

I think this setting was meant for the ancestral samplers but was accidentally applied to these samplers too.

So I guess this is why DPM fast is infamous for producing bad results.

@xhinker
Contributor

xhinker commented Aug 4, 2023

Wow, this is great, thanks @patrickvonplaten , I have been spending lots of nights testing out the correct mapping relationship

@alexisrolland
Contributor

alexisrolland commented Aug 15, 2023

Thank you @patrickvonplaten for writing up this mapping! <3

@stevhliu did it eventually make it into the Diffusers documentation? I just want to make sure whether this mapping above is still the point of truth or if there is another one somewhere I should use...

Side note, I noticed there are a couple of schedulers in diffusers that are not included in the mapping above. It seems the list is not entirely exhaustive, maybe because they do not exist in A1111?

Thank you both! 👐

@stevhliu
Member

Hi @alexisrolland, the mapping is live on the main version of the docs 🤗

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 12, 2023
@stevhliu
Member

Ok to close this @patrickvonplaten now that it is in the docs here?

@CodeCorrupt

Hey HF team, I'm trying to mimic the A1111 samplers using the table listed here and in the docs but it seems the KDPM2DiscreteScheduler and KDPM2AncestralDiscreteScheduler schedulers don't actually support the use_karras_sigmas=True argument.

Please let me know if I'm just missing something 🤗
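
One way to check up front whether a scheduler class declares a given init argument is to inspect its `__init__` signature. This is a minimal sketch; `accepts_init_arg` is a hypothetical helper, not a Diffusers API, and it assumes the class exposes its real parameters through `inspect.signature` (decorated initializers that preserve their wrapped signature should work, but that is worth verifying for your version).

```python
import inspect


def accepts_init_arg(cls, name):
    """Return True if cls.__init__ declares a parameter called `name`."""
    return name in inspect.signature(cls.__init__).parameters
```

For example, `accepts_init_arg(KDPM2DiscreteScheduler, "use_karras_sigmas")` would report whether the installed version of that class lists the argument.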

@patrickvonplaten
Contributor Author

Hey @CodeCorrupt,

Yes we could add them in a PR I think :-)

@patrickvonplaten
Contributor Author

@CodeCorrupt do you mind opening a new issue with this?

@CodeCorrupt

New issue was opened #5002 👍

@JLLLinn

JLLLinn commented Dec 9, 2023

Hi! Wondering if there's a plan for a "DPM++ 3M SDE" implementation (or if that's already implemented?)

@patrickvonplaten
Contributor Author

There is no plan to add it yet, feel free to open a PR @JLLLinn :-)

@sayakpaul sayakpaul unpinned this issue Jan 4, 2024
@caiqi

caiqi commented Feb 19, 2024

I've refactored and fixed some issues. The samplers now reproduce the same results as a1111, except for DPM2 a Karras (which is marked as a second-order sampler). I also explain a bit better in code what's going on in kdiffusion.py

I can observe that samplers like Heun, DPM2 a, DPM fast and DPM++ SDE Karras produce very similar results but have a subtle variation from a1111.

I suspect this is because the sigma values are slightly different. For example:

diffusers: [14.6172, 11.7266,  9.3438,  7.3867,  5.7891,  4.5000,  3.4648,  2.6406,
         1.9912,  1.4834,  1.0908,  0.7910,  0.5649,  0.3965,  0.2732,  0.1842,
         0.1214,  0.0779,  0.0485,  0.0292,  0.0000]

a1111: [14.6116, 11.7413,  9.3685,  7.4192,  5.8284,  4.5394,  3.5029,  2.6763,
         2.0228,  1.5112,  1.1147,  0.8110,  0.5812,  0.4096,  0.2834,  0.1921,
         0.1273,  0.0822,  0.0515,  0.0313,  0.0000]

I used the same method as the k-diffusion pipeline in diffusers to get the min and max sigma, and I'm not sure how else to do it.

The discrepancy arises from numerical precision. In A1111, the betas are generated in float64, and alphas_cumprod, which is used to create the sigmas, is in float16. Diffusers, by contrast, uses float32. torch.cumprod is sensitive to precision, especially over 1000 items. If the precision is matched, the sigmas should be identical. This discrepancy also explains some of the differences between A1111 and diffusers results.
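
To get a feel for how sensitive a 1000-term cumulative product is to precision, here is a minimal standard-library sketch that emulates float16 and float32 rounding with `struct` and compares the resulting sigma at the last timestep. It does not reproduce the exact code path of either A1111 or diffusers. The helper names are hypothetical, and the scaled-linear beta schedule with `beta_start=0.00085` and `beta_end=0.012` is an assumption (the usual Stable Diffusion v1 setting).

```python
import math
import struct


def round_to(x, fmt):
    """Round a Python float to IEEE half ('e') or single ('f') precision."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def final_sigma(fmt=None, n=1000, beta_start=0.00085, beta_end=0.012):
    """Sigma at the last timestep, keeping the running product in precision `fmt`.

    fmt=None keeps everything in float64.
    """
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    prod = 1.0
    for i in range(n):
        beta = (lo + (hi - lo) * i / (n - 1)) ** 2  # scaled-linear schedule
        alpha = 1.0 - beta
        if fmt is not None:
            alpha = round_to(alpha, fmt)
        prod *= alpha
        if fmt is not None:
            prod = round_to(prod, fmt)  # round after every multiplication
    return math.sqrt((1.0 - prod) / prod)


for label, fmt in [("float64", None), ("float32", "f"), ("float16", "e")]:
    print(label, final_sigma(fmt))
```

Since the Karras schedule is built from the smallest and largest sigma, a shift in either endpoint propagates to every step of the schedule.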

@alexisrolland
Contributor

Ok to close this @patrickvonplaten now that it is in the docs here?

@stevhliu it seems the link is broken. Would you mind resharing a link to where this is in the doc please?

Thank you

@stevhliu
Member

You can find it in the Schedulers overview :)

@abc2cba

abc2cba commented Mar 4, 2024

How about the LCM scheduler? How should the init params be set to match the a1111 result?

@dfl

dfl commented Mar 24, 2024

@caiqi I am debugging my comfyui implementation of TCD scheduler, and want to print the sigmas array like you did above. Can you please share how you did that? Sorry if it's a basic question, I am still learning my way around diffusers pipelines... I only see timesteps inside the scheduler mixin code and am not sure where to get the sigmas.

UPDATE: I figured it out. I added this to the scheduler's set_timesteps method:

        self.timesteps = torch.from_numpy(timesteps).to(device=device, dtype=torch.long)
        # Debug print: the timesteps selected for this run.
        print(f"timesteps: {repr(self.timesteps)}")
        # Debug print: sigma_t = sqrt(1 - alphas_cumprod[t]) at those timesteps.
        sigmas = np.sqrt(1 - self.alphas_cumprod)
        print(f"sigmas: {sigmas.to(device=device)[self.timesteps]}")

@andypotato

Hi! Wondering if there's a plan for a "DPM++ 3M SDE" implementation (or if that's already implemented?)

+1 for this. I'm getting consistently good results with this sampler in A1111, so I'd really love to see it in Diffusers

@yiyixuxu yiyixuxu removed the stale Issues that haven't received updates label Mar 27, 2024
@yiyixuxu yiyixuxu self-assigned this Mar 27, 2024