Configuration

s1dlx edited this page Mar 27, 2023 · 24 revisions

We use the hydra library to handle config files; see its docs for a comprehensive description of how it works. Here we explain the bare minimum setup to get you going with the extension.

Your extensions/sd-webui-bayesian-merger/ folder is organised as follows:

├── README.md
├── bayesian_merger.py
├── conf/...
├── install.py
├── logs/...
├── models/...
├── requirements.txt
├── sd_webui_bayesian_merger/...
├── tests/...
└── wildcards/...

On this page we focus on the conf/ folder and its contents:

├── conf
│   ├── config.tmpl.yaml
│   └── payloads
│       ├── cargo
│       │   └── payload.tmpl.yaml
│       └── cargo.tmpl.yaml

As you can see, there are three .tmpl.yaml files in a nested folder structure. You need to copy all three, keeping each copy next to its template and renaming it as follows:

  • config.tmpl.yaml -> config.yaml
  • cargo.tmpl.yaml -> cargo.yaml
  • payload.tmpl.yaml -> payload.yaml

resulting in

├── conf
│   ├── config.tmpl.yaml
│   ├── config.yaml
│   └── payloads
│       ├── cargo
│       │   ├── payload.tmpl.yaml
│       │   └── payload.yaml
│       ├── cargo.tmpl.yaml
│       └── cargo.yaml
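
If you prefer to script the copy step, the following is a minimal sketch using only the Python standard library. The function name copy_templates is made up for this example and is not part of the extension; it just copies every .tmpl.yaml file to a sibling .yaml file and leaves existing files alone.

```python
import shutil
from pathlib import Path


def copy_templates(conf_dir):
    """Copy every *.tmpl.yaml under conf_dir to a sibling *.yaml file.

    Hypothetical helper, not shipped with the extension.
    """
    created = []
    for template in sorted(Path(conf_dir).rglob("*.tmpl.yaml")):
        # e.g. config.tmpl.yaml -> config.yaml, in the same folder as the template
        target = template.with_name(template.name.replace(".tmpl.yaml", ".yaml"))
        if not target.exists():  # never overwrite a config you already edited
            shutil.copyfile(template, target)
            created.append(target)
    return created
```

Calling copy_templates("extensions/sd-webui-bayesian-merger/conf") from your webui folder should produce the three .yaml files shown above; calling it a second time does nothing.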

Let's have a look at each of them

config.yaml

The file begins with a defaults section

defaults:
  - _self_
  - payloads: cargo

These are the same for all users, so there is no need to change anything.


run_name: ${optimiser}_${scorer_method}
hydra:
  run:
    dir: logs/${now:%Y-%m-%d_%H-%M-%S}_${run_name}

run_name can be anything you want, even empty. By default it concatenates the optimiser name and the scorer method. Together with a timestamp, it names the sub-folder of the logs directory that contains all your results. Change it as you like (see the OmegaConf documentation on variable interpolation for an explanation of ${...} templating).
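
To get a feeling for the folder name this produces, here is a small sketch that mimics the two interpolations in plain Python. It is only an illustration: hydra resolves the real values for you, and the function name run_dir is invented for this example.

```python
from datetime import datetime


def run_dir(optimiser, scorer_method, now):
    """Mimic logs/${now:%Y-%m-%d_%H-%M-%S}_${run_name} (illustration only)."""
    run_name = f"{optimiser}_{scorer_method}"  # ${optimiser}_${scorer_method}
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")  # ${now:%Y-%m-%d_%H-%M-%S}
    return f"logs/{stamp}_{run_name}"


print(run_dir("bayes", "cafe_style", datetime(2023, 3, 27, 14, 5, 9)))
# logs/2023-03-27_14-05-09_bayes_cafe_style
```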


url: http://127.0.0.1:7860

This is the URL used to connect to the webui API. The one above is the default when launching webui with the --api flag. If you use --nowebui, change it to http://127.0.0.1:7861.


device: cpu

This is where the script performs the merge and the scoring. We suggest leaving it set to cpu so that VRAM stays free for image generation. You can still set it to gpu and use your GPU VRAM for merging and scoring too.


wildcards_dir: path/to/wildcards/folder

This extension re-implements the wildcard extension for various reasons you do not need to care about. As a result, if you want to use wildcards in your prompts, you need to tell the extension where to find them.


scorer_model_dir: path/where/to/save/scorer/models

This is where you want the aesthetic scorer models to be downloaded and stored.


model_a: path/to/model_a
model_b: path/to/model_b
model_c: path/to/model_c
merge_mode: weighted_sum
skip_position_ids: 0

Where to find the models to merge and what clip_skip value to use. Whether model_c is used depends on the merge mode; read more about merge_modes on the Merging page.


batch_size: 1

How many images to generate per prompt.


optimiser: bayes # tpe
bounds_transformer: False # works only with the bayes optimiser, experimental feature
init_points: 1
n_iters: 1

Here you can select an optimiser, the number of warmup/exploration points (init_points) and the number of optimisation/exploitation points (n_iters).
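
As a rough way to budget a run, the sketch below lists the phases implied by these two settings. It assumes that every point, warmup or not, costs one merge plus one round of generation and scoring; the function name optimisation_plan is made up for this example.

```python
def optimisation_plan(init_points, n_iters):
    """Label each evaluation as warmup (exploration) or optimisation (exploitation)."""
    return ["warmup"] * init_points + ["optimisation"] * n_iters


plan = optimisation_plan(init_points=5, n_iters=20)
print(len(plan), "points in total")  # 25 points in total
```

With the template values (init_points: 1, n_iters: 1) the plan has only two points, which is enough for a smoke test but not for a meaningful search.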


save_imgs: False

Whether to save the generated images or not.


scorer_method: cafe_style # chad, laion, aes, cafe_aesthetic, cafe_waifu
scorer_model_name: sac+logos+ava1-l14-linearMSE.pth # ava+logos-l14-linearMSE.pth, ava+logos-l14-reluMSE.pth

Pick a scoring method and (in case of chad) a scoring model. For any method other than chad, only one model is available and it will be picked automatically for you, so there is no need to specify a scorer_model_name (I mean, you can, but it will be ignored).


save_best: False
best_format: safetensors # ckpt
best_precision: 16 # 32

Whether to save the best merged model (at the end of the optimisation run) or not. If this is set to True, you can also pick the format and precision the model is saved in.


draw_unet_weights: False
draw_unet_base_alpha: False

These can be used to skip optimisation and draw only the UNET visualisation.
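
Before launching a long run, it can help to sanity-check the values discussed in this section. The sketch below is not part of the extension: check_config and the ALLOWED table are invented for this example, and the allowed values are simply the ones listed in the snippets and text above.

```python
# Allowed values as listed on this page (hypothetical table, not read from the extension)
ALLOWED = {
    "optimiser": {"bayes", "tpe"},
    "scorer_method": {"chad", "laion", "aes", "cafe_aesthetic", "cafe_style", "cafe_waifu"},
    "best_format": {"safetensors", "ckpt"},
    "best_precision": {16, 32},
}


def check_config(cfg):
    """Return a list of human-readable problems found in a config dict."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in cfg and cfg[key] not in allowed:
            problems.append(f"{key}: {cfg[key]!r} is not one of {sorted(allowed)}")
    # bounds_transformer is documented as working only with the bayes optimiser
    if cfg.get("bounds_transformer") and cfg.get("optimiser") != "bayes":
        problems.append("bounds_transformer only works with the bayes optimiser")
    return problems


print(check_config({"optimiser": "tpe", "bounds_transformer": True}))
```

An empty list means none of these simple checks failed; it does not guarantee that the paths or the url are valid.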

cargo.yaml

This file defines the default image generation options. Again, the file begins with a defaults section (the naming is quite confusing, I know):

defaults:
  - cargo:
    - payload

Here you can tell the extension which payloads you want to generate images with. In this example we have only one, but you can add as many as you want. For example:

defaults:
  - cargo:
    - dog
    - cat
    - horse

In this case we'll ask webui to generate image(s) for each of the dog, cat and horse payloads (more in the payload.yaml section).


The following are all the webui options. The values you set here are used by all the payloads, unless a payload overrides them with its own values. For example, there's no point in having a global prompt, but you may set neg_prompt below to avoid retyping it several times.

prompt: ""
neg_prompt: ""

n_iter: 1
batch_size: 1
steps: 20
cfg_scale: 7
width: 512
height: 512
sampler_name: Euler
sampler_index: Euler

seed: -1
subseed: -1
subseed_strength: 0
seed_resize_from_h: -1
seed_resize_from_w: -1

enable_hr: false
denoising_strength: 0
firstphase_width: 0
firstphase_height: 0
hr_scale: 2
hr_upscaler: ""
hr_second_pass_steps: 0
hr_resize_x: 0
hr_resize_y: 0

styles: []

restore_faces: false
tiling: false

eta: 0
s_churn: 0
s_tmax: 0
s_tmin: 0
s_noise: 1

payload.yaml

As mentioned before, you can have as many payloads as you want in the conf/payloads/cargo/ folder. These are structured as follows:

payloadname:
  parameter1: value
  parameter2: value
  ...

where the parameters can be any of those in the cargo.yaml file. The one thing you need to do is change payloadname to something different. For example, for our dog payload, we'll have a conf/payloads/cargo/dog.yaml file reading:

dog:
  prompt: "a drawing of a dog"
  neg_prompt: "3d"
  steps: 30
  cfg_scale: 6
  width: 512
  height: 768
  sampler_name: "Euler a"

Note how only a few parameters are explicitly set; all the others take their default values from the cargo.yaml file.
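
The override rule can be pictured as a plain dictionary merge. This is only a sketch of the behaviour described above, not the extension's actual code: resolve_payload is a made-up name, and cargo_defaults holds just a subset of the cargo.yaml options.

```python
# A subset of the cargo.yaml defaults shown earlier
cargo_defaults = {
    "prompt": "",
    "neg_prompt": "",
    "steps": 20,
    "width": 512,
    "height": 512,
    "sampler_name": "Euler",
    "batch_size": 1,
}

# Values set explicitly in the dog payload
dog = {
    "prompt": "a drawing of a dog",
    "neg_prompt": "3d",
    "steps": 30,
    "height": 768,
    "sampler_name": "Euler a",
}


def resolve_payload(defaults, overrides):
    """Payload-specific values win; everything else falls back to the defaults."""
    return {**defaults, **overrides}


resolved = resolve_payload(cargo_defaults, dog)
print(resolved["steps"], resolved["width"], resolved["height"])  # 30 512 768
```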

Let's talk about batch_size

You may have noticed that batch_size is defined twice in our configs: in config.yaml and in cargo.yaml (or, if you want, in each payload .yaml file). This is not a mistake, but a quirk of how the extension works. Put simply:

  • batch_size in config.yaml (let's call it bayesian-batch_size here) rules how many times your prompt is rendered. If you use wildcards, they are randomised bayesian-batch_size times. When not using wildcards, the prompt is always the same. Note that the images are generated by separate API calls, not all at once as when setting batch_size>1 in the webui (webui-batch_size). Thus, bayesian-batch_size = 100 will not crash your GPU as webui-batch_size = 100 may do.
  • batch_size in cargo.yaml is the actual webui batch_size (webui-batch_size). Note that this will not render different prompts when using wildcards (this is the quirk I was referring to), but it will generate multiple images in one call. This may be faster if you have enough VRAM.
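
To see how the two settings combine, here is a back-of-the-envelope sketch. It assumes one API call per prompt rendering per payload and n_iter: 1 in cargo.yaml; images_per_step is a made-up name for this example.

```python
def images_per_step(n_payloads, bayesian_batch_size, webui_batch_size):
    """Return (API calls, images) needed to score one merge, under the assumptions above."""
    api_calls = n_payloads * bayesian_batch_size  # one call per rendered prompt
    images = api_calls * webui_batch_size         # each call returns a whole webui batch
    return api_calls, images


# 3 payloads (dog, cat, horse), bayesian-batch_size = 4, webui-batch_size = 2
print(images_per_step(3, 4, 2))  # (12, 24)
```

Only webui-batch_size affects how many images are held in VRAM at once; bayesian-batch_size only adds more calls.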
