At the moment, we have a bit of an inconsistent design with the `_encode_prompt` / `encode_prompt` functions. We essentially have three different designs:
1.) The original Stable Diffusion 1.5 design: a private `_encode_prompt` function that is not wrapped in `torch.no_grad()` by default and returns a single tensor (`def _encode_prompt(...)`).
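For reference, a minimal sketch of this first shape (class name and signature simplified; the placeholder body stands in for the real tokenizer/text-encoder logic):

```python
import torch

class StableDiffusionPipelineSketch:
    # Design 1: private method, no `torch.no_grad()` decorator, single return value.
    def _encode_prompt(self, prompt, device, num_images_per_prompt,
                       do_classifier_free_guidance, negative_prompt=None):
        # ... tokenize `prompt`, run the text encoder, and (for classifier-free
        # guidance) concatenate unconditional and conditional embeddings ...
        prompt_embeds = torch.randn(2, 77, 768)  # placeholder for the encoder output
        return prompt_embeds  # single tensor; callers cannot access cond/uncond separately
```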
2.) For IF, we made the method public and wrapped it in `torch.no_grad()`, mainly because one more or less has to run this function independently, given the size of the text encoder and the multi-stage inference process:
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/deepfloyd_if/pipeline_if.py#L246
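A sketch of the IF-style variant, again with a simplified signature and placeholder outputs (the decorator is the salient part):

```python
import torch

class IFPipelineSketch:
    # Design 2: public method, always wrapped in `torch.no_grad()`, so the large
    # T5 text encoder can be run once up front and its outputs reused across stages.
    @torch.no_grad()
    def encode_prompt(self, prompt, do_classifier_free_guidance=True,
                      negative_prompt=None, device=None):
        # ... run the text encoder; placeholders stand in for the real outputs ...
        prompt_embeds = torch.randn(1, 77, 4096)
        negative_embeds = torch.randn(1, 77, 4096) if do_classifier_free_guidance else None
        return prompt_embeds, negative_embeds
```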
3.) For SD-XL, we also made the method public for now, but did not add a `torch.no_grad()` decorator. The reasoning is that there are use cases where the user actually wants to compute gradients when calling `encode_prompt` - e.g., when training LoRA with the text encoder, one can just call `encode_prompt` directly:
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L212
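To make the trade-off concrete, here is how the two call sites would look under design 3. `pipe`, `prompt`, and `compute_loss` are stand-ins for illustration, not diffusers API, and SD-XL's `encode_prompt` actually returns several tensors, simplified here to one:

```python
import torch

# Pure inference: the caller opts out of gradients at the call site.
with torch.no_grad():
    prompt_embeds = pipe.encode_prompt(prompt)

# Text-encoder LoRA training: call the same method without the guard,
# so gradients flow back into the text-encoder (LoRA) parameters.
prompt_embeds = pipe.encode_prompt(prompt)
loss = compute_loss(prompt_embeds)  # hypothetical training objective
loss.backward()
```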
What are our options to improve and unify the API here? cc @yiyixuxu @sayakpaul @williamberman @pcuenca