
Any plans for a diffusers version? #3

Closed
tonyf opened this issue Jun 6, 2023 · 8 comments

Comments

@tonyf

tonyf commented Jun 6, 2023

Hey guys, this paper looks great. Really excited to see the full training code. Was curious -- do you have any plans to make a diffusers port?

@haoosz
Owner

haoosz commented Jun 7, 2023

Yeah. We will make a diffusers version after all the work is done. Thank you.

@haoosz haoosz closed this as completed Jun 7, 2023
@tonyf
Author

tonyf commented Jun 7, 2023

Amazing! Looking forward to seeing it. Just curious -- is there an expected timeline for the diffusers version? I'm debating whether to implement it myself.

@haoosz
Owner

haoosz commented Jun 8, 2023

Sorry, but I am occupied with follow-up work and might not get to the diffusers version right now. I will work on the diffusers version in August. If that is too late for you, I would be very glad if you implemented it yourself. Thank you!

@okaris

okaris commented Jun 21, 2023

I am working on this.

@garychan22

I have finished the diffusers version, but simply feeding the reference image to the frozen UNet and running the Otsu step is slow, which is weird. hahaha
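
For context, the Otsu step here binarizes a cross-attention map into a foreground mask. Below is a minimal sketch of that operation, assuming a 2-D attention map already averaged over heads and subject tokens, and using scikit-image's threshold_otsu; the function name and shapes are illustrative, not this repo's actual code.

```python
# Illustrative sketch only -- not this repo's actual implementation.
import numpy as np
from skimage.filters import threshold_otsu

def attention_to_mask(attn_map: np.ndarray) -> np.ndarray:
    """Binarize a 2-D cross-attention map with Otsu's method.

    attn_map: (H, W) scores, e.g. averaged over heads and over the
    tokens of the subject word.
    """
    t = threshold_otsu(attn_map)  # threshold that maximizes inter-class variance
    return (attn_map > t).astype(np.float32)  # 1 = subject region, 0 = background
```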

@okaris

okaris commented Jul 13, 2023

@garychan22 I've also recently finished it and have been working on getting the hyperparams to fit my needs. Otsu itself is the bottleneck; the point of having it is to avoid the need for preprocessing, but if you are already preprocessing, a manually supplied mask could help and speed things up. Other than that, this repo is not taking advantage of higher-performance attention processors. You can't use them for the attention calculations where you need to extract the scores, but everywhere else it's possible to use xformers or PyTorch's scaled_dot_product_attention for faster calculation.

Were you able to replicate the results exactly like the samples here?
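
A rough sketch of the mixed strategy described above: use the fused kernel whenever scores are not needed, and fall back to a manual softmax(QK^T)V only in the layers whose score matrix must be read out, since the fused kernel never materializes it. The helper below is illustrative, assuming (..., seq_len, head_dim) tensors; it is not this repo's or diffusers' actual code.

```python
# Illustrative sketch -- the fused kernel never materializes the score
# matrix, so layers that must expose scores take the slow manual path.
import torch
import torch.nn.functional as F

def attention(q, k, v, need_scores: bool = False):
    if not need_scores:
        # PyTorch 2.x fused kernel: fast and memory-efficient, scores not exposed
        return F.scaled_dot_product_attention(q, k, v), None
    scale = q.shape[-1] ** -0.5
    scores = torch.softmax((q @ k.transpose(-2, -1)) * scale, dim=-1)
    return scores @ v, scores  # scores available for mask extraction
```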

@okaris

okaris commented Jul 13, 2023

Also if you would like to submit a PR, here is my issue: huggingface/diffusers#3719

@garychan22

> @garychan22 I've also recently finished it and have been working on getting the hyperparams to fit my needs. […] Were you able to replicate the results exactly like the samples here?

Thanks for the useful tips! For now, I have not replicated results comparable to this repo's, and I will keep working on it.

Moreover, I have been training my own BLIP-Diffusion and finding that results better than DreamBooth's can be achieved within one minute of fine-tuning, which is awesome. I hope to replicate the results shown in the paper and release the pre-trained model to the hub soon.
