
Snapfusion seems to get better results? #25

Closed
jianyuheng opened this issue Aug 29, 2023 · 4 comments
@jianyuheng

Thanks for generously open-sourcing your work. There is a prior work similar to yours, called SnapFusion, which also aims at speeding up Stable Diffusion.

According to their paper, they achieved better results through an efficient U-Net and step distillation, but unfortunately that work is not open source.

Do you have any opinion on this work? https://snap-research.github.io/SnapFusion/

@bokyeong1015
Member

Hi, thanks for your interest :)
SnapFusion has attained impressive results and is concurrent with our work. We sincerely appreciate their research efforts.

Below are potential points of comparison. In short, we've highlighted the potential of classical architectural compression, which remains powerful even under limited resources; meanwhile, SnapFusion has nicely approached both architectural reduction and step distillation.

| | BK-SDM (Ours) | SnapFusion |
| --- | --- | --- |
| U-Net: architecture reduction | O (Block Removal + KD) | O (Architecture Evolving) |
| U-Net: # sampling steps reduction | X | O (Step Distillation) |
| Image Decoder: architecture reduction | X | O (Ch Reduction + KD) |
| Training Data | 0.22M LAION pairs | unclear (from LAION-5B + COYO-700M + internal dataset) |
| Training GPUs | 1 A100 GPU | 16 or 32 nodes (8 A100 GPUs each) for most of the training |

The following directions could be promising:

  • Applying step distillation in conjunction with architectural compression.
  • Extending compression beyond the U-Net to the other components (Image Decoder, Text Encoder).
  • Investigating the impact of training data volume and computational resources.
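For reference, the "Block Removal + KD" entry in the table pairs a pruned student U-Net with output-level and feature-level distillation from the original model. Below is a minimal, hedged sketch of that loss structure in plain PyTorch; the toy modules, shapes, and loss weights are illustrative stand-ins, not the actual BK-SDM training code.

```python
# Sketch of output- and feature-level knowledge distillation (KD),
# as paired with block removal. Toy modules stand in for the U-Nets.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    """Stand-in for a (compressed) denoising U-Net."""
    def __init__(self, width: int):
        super().__init__()
        self.mid = nn.Conv2d(4, width, 3, padding=1)   # "feature" layer
        self.out = nn.Conv2d(width, 4, 3, padding=1)   # noise prediction

    def forward(self, x):
        feat = torch.relu(self.mid(x))
        return self.out(feat), feat

teacher = TinyUNet(width=32)   # pretrained model, kept frozen
student = TinyUNet(width=32)   # block-removed model being trained
for p in teacher.parameters():
    p.requires_grad_(False)

mse = nn.MSELoss()
x = torch.randn(2, 4, 8, 8)           # noisy latents
target_noise = torch.randn_like(x)    # ground-truth noise

t_pred, t_feat = teacher(x)
s_pred, s_feat = student(x)

# Total loss = task loss + output KD + feature KD (weights are placeholders).
loss = (mse(s_pred, target_noise)
        + 1.0 * mse(s_pred, t_pred.detach())
        + 1.0 * mse(s_feat, t_feat.detach()))
loss.backward()
```

In the actual method, the feature loss is applied at the outputs of each block, which works without projection layers because block removal preserves the channel widths.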

@jianyuheng
Author

Very detailed comparison, thanks.

@Bikesuffer

Bikesuffer commented Sep 1, 2023

That's a really interesting topic.
I actually tried both approaches for inpainting.
Since SnapFusion is not open source and the authors are not responding, I could only write the training code based on the description in their paper. After 300k training steps, the model still couldn't generate acceptable inpainting results.
Later I tried the BK-SDM approach for inpainting.
I tried SD_small_64, SD_base_64, SD_base_256, SD_small_256, and SD_tiny_64.
All of them could generate acceptable inpainting results after 50K steps.
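For context, adapting a text-to-image Stable Diffusion U-Net (such as the BK-SDM variants) for inpainting typically means widening its first convolution from 4 to 9 input channels, so it can take the noisy latents, the mask, and the masked-image latents concatenated together. A hedged sketch of that surgery on a toy layer (not Bikesuffer's actual code; the 320-channel width mirrors SD's `conv_in` but is otherwise illustrative):

```python
# Sketch: widening a U-Net input conv from 4 to 9 channels for inpainting
# (4-ch noisy latents + 1-ch mask + 4-ch masked-image latents).
import torch
import torch.nn as nn

conv_in = nn.Conv2d(4, 320, kernel_size=3, padding=1)  # pretrained t2i conv_in

new_conv = nn.Conv2d(9, 320, kernel_size=3, padding=1)
with torch.no_grad():
    new_conv.weight.zero_()                  # extra input channels start at zero
    new_conv.weight[:, :4] = conv_in.weight  # reuse pretrained weights
    new_conv.bias.copy_(conv_in.bias)

latents = torch.randn(1, 4, 8, 8)
mask = torch.ones(1, 1, 8, 8)
masked_latents = torch.randn(1, 4, 8, 8)
x = torch.cat([latents, mask, masked_latents], dim=1)  # shape (1, 9, 8, 8)
out = new_conv(x)                                      # shape (1, 320, 8, 8)
```

Zero-initializing the five new input channels means the widened layer initially reproduces the pretrained text-to-image behavior, which tends to make the subsequent inpainting fine-tuning stable.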

@abhigoku10

@Bikesuffer can you share the source for inpainting so that we can check it from our end? Thanks in advance.
