Custom pipeline for Stable Diffusion that returns generation seeds #240
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Hmm @pcuenca, I think I'm not a big fan of having another pipeline just for this in the official pypi package -> what do you think about adding it as an example instead? Any downsides with this?
That's certainly another option. But I think there are two discussions here:
I understand, though, that having the generator is messy and we should maybe remove it if we go the seeds route (or ignore it if the user provides the latents). In addition, but less importantly, this extension would be useful for other pipelines, not just Stable Diffusion. For example, I tested it with regular Latent Diffusion. Is there a good way to generalize it? I'll write the alternative implementation to see what it looks like.
@pcuenca a couple of higher level thoughts:

```python
generator = torch.manual_seed(0)
# some random generation
pipe(generator=generator)
```

is IMO better than:

```python
torch.manual_seed(0)
# some random generation
pipe(seed=0)
```

=> So to me that means we should not have both `seed` and `generator`. E.g.:

```python
latents = torch.randn((4, 64, 64, 3))
images = pipe(prompt="astronaut", latents=latents)
# analyse image
next_latents = latents[best_image_idx]
images = pipe(prompt="astronaut", latents=next_latents)
```

This saves us a lot of pain with having both `seed` and `generator`. @patil-suraj what's your opinion here?
Hi @patrickvonplaten, thanks a lot for the thoughtful message! Very much agree that readability is a crucial goal to pursue, and that we should include as little code as we can get away with inside the pipelines. Also agree that it doesn't make sense to pass both seeds and a generator.

My question was motivated by the consideration that we might need to create (or accept from the community) a few pipelines in these initial stages, and they might become harder to update the more we have (for example, removing the deprecated kwargs for the torch device will require visiting all the pipelines). But we'll cross that bridge when we get there; I'm now satisfied with these design philosophy points, and I do share them :) I'm really happy that we care about readability, as it tends to be forgotten quickly.

It also makes total sense to have one pipeline per task, but I wasn't sure whether this feature should be in the core pipeline or be provided as an example. If everybody agrees, I think we can close this PR and I'll try the other approach instead. Thanks!
Hey @pcuenca, thanks a lot for the PR, I think this is a very interesting use-case. @patrickvonplaten explained most of the things here and I echo those.

This will be a very impactful example, looking forward to it!
Oh, very interesting, I didn't know that, thanks! Thank you, @patil-suraj, I think I get it now. These conversations are very helpful for absorbing the design goals you've adopted :)
Replaced by #247. |
TODO:
The idea here is simply to iterate when generating the latents and take note of the seeds that were used. If the user provides any seeds, use them, otherwise generate new ones. Either way, return them.
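To make that concrete, here is a minimal sketch of such a loop; the `generate_latents` helper and its signature are hypothetical, not code from this PR:

```python
import torch

def generate_latents(batch_size, shape, seeds=None, device="cpu"):
    """Hypothetical helper: one latent per image, each from its own seed."""
    if seeds is None:
        # No seeds provided: draw fresh ones so they can be reported back.
        seeds = [torch.randint(0, 2**32 - 1, (1,)).item() for _ in range(batch_size)]
    latents = []
    for seed in seeds:
        # Seed a dedicated generator per image so each latent is reproducible.
        generator = torch.Generator(device=device).manual_seed(seed)
        latents.append(torch.randn(shape, generator=generator, device=device))
    # Return both the stacked latents and the seeds that produced them.
    return torch.stack(latents), seeds
```

Calling `generate_latents(4, (4, 64, 64))` (with `(4, 64, 64)` being a typical Stable Diffusion latent shape) would then return four latents plus the four seeds that produced them, which the pipeline could pass through to the caller.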
I think this is a good opportunity to discuss how we want to incorporate these extensions into the library, depending on how popular we think they could become. @patil-suraj wrote image-to-image just yesterday, for example. It feels like the natural way to provide an extension is via the pipeline interface. It was very easy to do, but most of the code is boilerplate: essentially, I only had to extend `__call__` to accept an optional list of seeds and to return an additional output (the previous seeds).

I wonder if we should design Pipelines in a way that they are meant for subclassing, or if we could use mixins or something else. Having to repeat the same steps everywhere (even the deprecated `torch_device` snippet) feels fragile.

Another limitation I found is that this mechanism could be useful for other diffusion pipelines, not just Stable Diffusion. But if you use `DiffusionPipeline` as a superclass then you are forced to write `__init__`
to register the appropriate modules, tying your implementation to some specific combination anyway; the sketch below illustrates that coupling.

One alternative is to just do nothing and keep things as they are right now. This makes perfect sense if we don't expect many contributions in the form of new pipelines. The code is currently easy to read if you focus on a specific pipeline and don't care what others look like. But it might become a problem if we end up having lots of pipelines with similar blocks of code.
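Purely as an illustration of that coupling, a rough sketch (the class name, module list, and `seeds` argument are hypothetical; `register_modules` is the existing `DiffusionPipeline` mechanism for tracking components):

```python
import torch
from diffusers import DiffusionPipeline

class SeedTrackingPipeline(DiffusionPipeline):
    # Hypothetical subclass: the module combination below is fixed at
    # __init__ time, which is the coupling described above.
    def __init__(self, vae, text_encoder, tokenizer, unet, scheduler):
        super().__init__()
        self.register_modules(
            vae=vae,
            text_encoder=text_encoder,
            tokenizer=tokenizer,
            unet=unet,
            scheduler=scheduler,
        )

    @torch.no_grad()
    def __call__(self, prompt, seeds=None, **kwargs):
        # Generate per-image latents from `seeds` (as in the earlier sketch),
        # run the usual denoising loop, and return the images plus the seeds.
        ...
```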
Thoughts @patrickvonplaten @anton-l @patil-suraj?