add CFG Rescale (thank you, anonymous user from 4chan) #10555

Closed
wants to merge 1 commit into from

AUTOMATIC1111
Owner

@AUTOMATIC1111 AUTOMATIC1111 commented May 19, 2023

Proposed by anon based on this: https://arxiv.org/abs/2305.08891

xyz_grid-0002-3340097554

Adds the setting, xyz plot support and infotext. The setting is at the bottom of the Sampler Parameters page.
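
For readers who have not opened the paper, the rescale it proposes can be sketched roughly as follows. This is a minimal pure-Python illustration on flat lists of floats, not the code in this PR: the function name `cfg_rescale` and its argument names are made up for the example, and a real implementation would work on tensors and might take the standard deviation over different dimensions.

```python
import statistics


def cfg_rescale(cond, uncond, cfg_scale, phi):
    """Classifier-free guidance followed by std rescaling (illustrative sketch).

    cond / uncond are flat lists of floats standing in for the conditional
    and unconditional model outputs. phi = 0 leaves plain CFG untouched;
    phi = 1 fully matches the std of the conditional output.
    """
    # Plain classifier-free guidance.
    guided = [u + cfg_scale * (c - u) for c, u in zip(cond, uncond)]

    std_cond = statistics.pstdev(cond)
    std_guided = statistics.pstdev(guided)
    if std_guided == 0.0:
        # Nothing to rescale; avoid dividing by zero.
        return guided

    # Rescale the guided output so its std matches the conditional output.
    factor = std_cond / std_guided
    rescaled = [g * factor for g in guided]

    # Blend rescaled and plain results with weight phi.
    return [phi * r + (1.0 - phi) * g for r, g in zip(rescaled, guided)]


if __name__ == "__main__":
    cond = [1.0, -1.0, 0.5, -0.5]
    uncond = [0.2, -0.1, 0.0, 0.1]
    for phi in (0.0, 0.7, 1.0):
        out = cfg_rescale(cond, uncond, cfg_scale=7.5, phi=phi)
        print(phi, round(statistics.pstdev(out), 4))
```

Running the script prints how the spread of the output shrinks towards that of the conditional prediction as phi goes from 0 to 1.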

@AUTOMATIC1111
Owner Author

To put this in context, the sd-dynamic-thresholding extension (https://github.com/mcmonkeyprojects/sd-dynamic-thresholding) presumably does the same thing, but better.

@AUTOMATIC1111
Owner Author

Although, from a usability perspective, having one slider that you can just set and forget may be preferable to what dynamic thresholding requires.

@ljleb
Contributor

ljleb commented May 19, 2023

I would like there to be an extension point for combine_denoised so that extensions can add alternative prompt keywords similar to AND. Is that already possible without stepping on other extensions' code?

If that is not the case, I can take a stab at the code. If it is simple to implement, it would be nice to add it to this PR.

@AUTOMATIC1111
Owner Author

Generally, you can just replace combine_denoised with your own function and have it call the original, editing the inputs/outputs as needed. If the other extension does the same, and you both properly put the original function back in on_unload, the scripts should work just fine together. If you do something that does not involve calling the original function, I don't think any extension point will prevent scripts from stepping on each other's toes.
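
The wrap-and-restore pattern described above might look roughly like this. It is a self-contained sketch, not webui code: the `CFGDenoiser` class here is a stand-in with a simplified, made-up signature, and `install` / `on_unload` are hypothetical hooks that an extension would wire into whatever load and unload callbacks the host application provides.

```python
class CFGDenoiser:
    """Stand-in for the real class; the actual signature may differ."""

    def combine_denoised(self, x_out, cond_scale):
        return [x * cond_scale for x in x_out]


_original = None  # filled in by install(), so we wrap whatever is current


def _patched_combine_denoised(self, x_out, cond_scale):
    # Edit the inputs (placeholder tweak for the example).
    x_out = [x + 1.0 for x in x_out]
    # Always delegate to the function we replaced, so that other
    # extensions wrapping the same method keep working.
    result = _original(self, x_out, cond_scale)
    # Edit the outputs (placeholder tweak for the example).
    return [r * 0.5 for r in result]


def install():
    """Hypothetical load hook: wrap the current combine_denoised."""
    global _original
    _original = CFGDenoiser.combine_denoised
    CFGDenoiser.combine_denoised = _patched_combine_denoised


def on_unload():
    """Hypothetical unload hook: put the wrapped function back."""
    global _original
    if _original is not None:
        CFGDenoiser.combine_denoised = _original
        _original = None
```

Capturing the original at install time rather than at import time means the wrapper chains onto any patch another extension applied earlier. Restoring is only clean if extensions unload in the reverse order they loaded.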

@ljleb
Contributor

ljleb commented May 20, 2023

I found a good enough solution in the end, which involves recomputing the composable result without cfg std rescale. Thanks anyways!

It's possible the extension code goes out of sync with the repo if other stuff ends up being added. For now, I should not have any trouble when this is merged at least.

@drhead
Contributor

drhead commented Jun 20, 2023

I think I have gotten it to work much closer to what the paper authors demonstrated:
image
image

The desaturation is actually far less of an issue if you are using a model trained with zero terminal SNR and v-loss, as the paper authors described. This grid was generated on an SD1.5 model trained on zero terminal SNR for quite a while and on v-loss for a very short time, so it is currently somewhat underbaked. For other prompts I've tested, overexposed images look much more reasonable with a value of 0.7, which, as demonstrated, also keeps the saturation consistent at any CFG scale.
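
One way to see why a fixed value damps the dependence on CFG scale, assuming the rescale from the paper with a single standard deviation taken over the whole prediction (the symbols below are introduced here for illustration): let $\sigma_{pos}$ be the std of the conditional prediction and $\sigma_{cfg}$ the std of the guided result $x_{cfg}$. The blended output is a scalar multiple of $x_{cfg}$, so its std follows directly:

$$
x_{final} = \phi \,\frac{\sigma_{pos}}{\sigma_{cfg}}\, x_{cfg} + (1 - \phi)\, x_{cfg}
\quad\Longrightarrow\quad
\operatorname{std}(x_{final}) = \phi\, \sigma_{pos} + (1 - \phi)\, \sigma_{cfg}
$$

With $\phi = 0.7$ this gives $0.7\,\sigma_{pos} + 0.3\,\sigma_{cfg}$, so only 30% of the growth in $\sigma_{cfg}$ at higher CFG scales reaches the output.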

I have not had the opportunity to test dynamic thresholding on this model yet to see if it actually does the same thing, but based on my past experience with that plugin, CFG rescale is far easier to use in any case.

@ljleb
Contributor

ljleb commented Jun 27, 2023

I just want to bring to your attention that the Greek letter is causing problems with extensions like civitai. A user reported it against my implementation: ljleb/sd-webui-neutral-prompt#18

Replacing the Greek letter with "phi" would prevent metadata issues. It would also let old generations that used my extension for the rescaling work with future webui generations when using paste params.
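
As a rough illustration of the suggestion, a key could be normalized to ASCII before it is written to the infotext. The key string "CFG Rescale φ" and the helper name `ascii_safe_key` are assumptions made for this example; the exact key used by the PR is not quoted in this thread.

```python
# Hypothetical symbol-to-name table; extend it if other symbols are used.
ASCII_NAMES = {"φ": "phi"}


def ascii_safe_key(key: str) -> str:
    """Replace known non-ASCII symbols in an infotext key with ASCII names."""
    for symbol, name in ASCII_NAMES.items():
        key = key.replace(symbol, name)
    return key


if __name__ == "__main__":
    key = ascii_safe_key("CFG Rescale φ")
    # str.isascii() confirms that no non-ASCII characters remain.
    print(key, key.isascii())
```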

@AUTOMATIC1111 AUTOMATIC1111 deleted the cfg-rescale branch August 5, 2023 06:29
@6DammK9

6DammK9 commented Aug 15, 2023

I deeply regret not trying to preserve this feature, or at least not writing a viable migration guide *sob*
Here is my research: https://github.com/6DammK9/nai-anime-pure-negative-prompt/blob/main/ch01/dynamic_cfg.md
Long story short (I have only just started playing with sd-dynamic-thresholding), here are the closest parameters for replicating the original feature:

  • Mimic CFG Scale: 1 (ONE, not the CFG value). Hence the awful green-ish effect at high phi.
  • Top percentile of latents to clamp: 100
  • Interpolate Phi: Same phi
  • Mimic Scale Scheduler: Constant
  • CFG Scale Scheduler: Constant
  • Separate Feature Channels: OFF (better than ON; it removes the green-ish effect in some areas)
  • Scaling Startpoint: MEAN
  • Variability Measure: AD (I was expecting STD, but AD looks closer)

Image previews:

  • Original:
parameters

(aesthetic:0), (quality:0), (solo:0), (1boy:0), [astolfo], [[ducati]]
Negative prompt: (worst:0), (low:0), (bad:0), (exceptional:0), (masterpiece:0), (comic:0), (extra:0), (lowres:0), (breasts:0.5)
Steps: 48, Sampler: Euler, CFG scale: 4.5, Seed: 4226915900, Size: 768x768, Model hash: ba7e3203e2, Model: marble_v141.fp16, Denoising strength: 0.7, Clip skip: 2, Hires upscale: 1.5, Hires upscaler: Latent, Version: v1.5.1-5-g819dedb7

231031-4226915900-1152-1152-4 5-48-20230816011511

  • Phi 0.3 (cfg-rescale)
    231033-4226915900-1152-1152-4 5-48-20230816011739
  • mimic 1, phi 0.3, ZERO, STD
    231034-4226915900-1152-1152-4 5-48-20230816011923
  • mimic 1, phi 0.3, MEAN, AD
    231035-4226915900-1152-1152-4 5-48-20230816012050
  • mimic 1, phi 0.3, MEAN, AD, Separate Feature Channels OFF
    231037-4226915900-1152-1152-4 5-48-20230816012337
