[Feature Request] Option for blank prompts in SD3's triple encoder to create the same conditioning as when text encoder is absent, the same as the model was trained
#3785
Closed
CodeExplode opened this issue
Jun 19, 2024
· 2 comments
When SD3 is missing a text encoder, zeroes are passed in its place. This seems to be how dropout was done during training, judging by the zeroing node in SD3's workflow. If T5 is absent, the prompt is not padded out with 77 zero tokens; only the first 77 CLIP tokens are passed (and vice versa).
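To make the sequence-length point concrete, here is a minimal sketch of how the conditioning could be assembled when an encoder is absent. It is not Comfy's actual code: the names `build_conditioning`, `pad_features`, and `zeros` are hypothetical, plain lists stand in for tensors, and the widths (2048 for the concatenated CLIP-L/CLIP-G features, 4096 for the joint width) and the all-zero fallback when no encoder is present are assumptions on my part.

```python
from typing import List, Optional

Matrix = List[List[float]]  # rows are tokens, columns are features

CLIP_TOKENS = 77     # token count mentioned in this issue
CLIP_LG_DIM = 2048   # assumed: CLIP-L and CLIP-G features concatenated
JOINT_DIM = 4096     # assumed: joint feature width shared with T5


def zeros(tokens: int, dim: int) -> Matrix:
    """Return an all-zero block of shape (tokens, dim)."""
    return [[0.0] * dim for _ in range(tokens)]


def pad_features(rows: Matrix, dim: int) -> Matrix:
    """Right-pad every token's feature vector with zeros up to `dim`."""
    return [row + [0.0] * (dim - len(row)) for row in rows]


def build_conditioning(clip_lg: Optional[Matrix], t5: Optional[Matrix]) -> Matrix:
    """Concatenate along the token axis, skipping any absent (None) encoder."""
    parts: Matrix = []
    if clip_lg is not None:
        parts += pad_features(clip_lg, JOINT_DIM)
    if t5 is not None:
        parts += t5
    # Assumed fallback: with no encoder at all, pass a single zero block.
    return parts if parts else zeros(CLIP_TOKENS, JOINT_DIM)


clip_lg = [[1.0] * CLIP_LG_DIM for _ in range(CLIP_TOKENS)]
t5 = [[1.0] * JOINT_DIM for _ in range(CLIP_TOKENS)]
print(len(build_conditioning(clip_lg, t5)))    # 154 tokens: CLIP + T5
print(len(build_conditioning(clip_lg, None)))  # 77 tokens: T5 absent, no padding
```

The difference that matters for this request is the second case: an absent T5 shortens the sequence, whereas an encoded blank T5 prompt would still contribute its own tokens with non-zero values.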
I have also found that when finetuning with a particular conditioning (e.g. encoding blank prompts for the unconditional dropout, instead of zeroes), the model quickly adjusts to this, and then doesn't work with existing comfy workflows (e.g. zeroing). By the same logic, finetuning with only the CLIP models and no T5 would create a model which doesn't expect blank encoded T5 prompts as part of the conditioning, but rather requires the T5 to be handled as if it were completely absent.
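For reference, the zero-style unconditional dropout I mean looks roughly like the sketch below. This is an illustration only, not code from any trainer or from Comfy: `apply_uncond_dropout` is a hypothetical helper, plain lists stand in for tensors, and zeroing the pooled vector alongside the token embeddings is an assumption.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]  # rows are tokens, columns are features


def apply_uncond_dropout(
    cond: Matrix,
    pooled: List[float],
    drop_prob: float,
    rng: random.Random,
) -> Tuple[Matrix, List[float]]:
    """With probability `drop_prob`, replace the conditioning with zeros.

    The zeroed output keeps the same shape as the input, so the model sees
    an all-zero conditioning rather than an encoded blank prompt.
    """
    if rng.random() < drop_prob:
        zero_cond = [[0.0] * len(row) for row in cond]
        zero_pooled = [0.0] * len(pooled)  # assumption: pooled is zeroed too
        return zero_cond, zero_pooled
    return cond, pooled


rng = random.Random(0)
cond = [[0.5, -0.2, 0.1], [0.3, 0.0, 0.9]]
pooled = [0.7, 0.4]
dropped_cond, dropped_pooled = apply_uncond_dropout(cond, pooled, 1.0, rng)
print(dropped_cond, dropped_pooled)
```

Swapping the zero branch for an encoded blank prompt yields non-zero vectors, which is the mismatch with zeroing-based workflows described above.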
Currently I load text encoders from the base SD3 checkpoint, since there's no need to save them in my finetuning checkpoints if they are frozen, but that means there's no way to act as if T5 is missing, which is how I train. Having the option to zero each text encoder input (or rather, to use the logic which comfy implements when the encoder is absent altogether) would be ideal.
edit: Sorry for the title changes, I somehow submitted the request while typing the description.
CodeExplode
changed the title
[Feature Request] Option for blank prompts in SD3' triple encoder to create the same zero conditioning as when text encoder is absent
[Feature Request] Option for blank prompts in SD3's triple encoder to create the same zero conditioning as when text encoder is absent
Jun 19, 2024
CodeExplode
changed the title
[Feature Request] Option for blank prompts in SD3's triple encoder to create the same zero conditioning as when text encoder is absent
[Feature Request] Option for blank prompts in SD3's triple encoder to create the same conditioning as when text encoder is absent, the same as the model was trained
Jun 19, 2024