Scale of KD-feature loss for SD inpainting 1.5 #21
Comments
Hi, thanks for utilizing our work; glad to hear that 😊 Here is a loss curve from our code for text-to-image synthesis, with SD-v1.4 and batch size 64 (= gradient accumulation 4 x mini-batch size 16), plotted with a 500-point moving average:
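For reference, a 500-point moving average like the one described above can be computed from a raw per-step loss log with a few lines of NumPy (a minimal sketch; the function name and logging format are assumptions, not part of the BK-SDM code):

```python
import numpy as np

def moving_average(values, window=500):
    """Smooth a noisy per-step loss log with a simple moving average.

    values: 1-D sequence of loss values (one per optimizer step).
    window: number of points to average (500 in the curve above).
    """
    values = np.asarray(values, dtype=np.float64)
    if len(values) < window:
        raise ValueError("need at least `window` points to smooth")
    kernel = np.ones(window) / window
    # 'valid' mode keeps only fully-covered windows, so the smoothed
    # curve is shorter than the raw log by (window - 1) points.
    return np.convolve(values, kernel, mode="valid")

# Example: a constant loss stays constant after smoothing.
smoothed = moving_average(np.full(1000, 2.0), window=500)
```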
Please understand that we've changed the name of this issue, 'Batch Size' -> 'Scale of KD-feature loss for SD inpainting 1.5', to clarify the topic and make it easier for people to find in the future.
Thanks a lot for the information.
Hello, does this method work for SD inpainting 1.5?
Hi, @yajieC Our models are compressed from SD-v1.4, and SD-v1.x models share the same architecture (with different training recipes); SD-inpainting is based on the SD-v1 backbone.
Yes, it worked for me. |
Thanks for sharing the above and this good news! Happy to know you are okay with the inpainting results using our approach :) Could we ask if you have plans to release your models and/or code? Edit: sorry for the initial misunderstanding; you've clarified that you "distill the unet in sd inpainting 1.5 to a smaller Unet", which means (Teacher, Student) = (SD-inpainting 1.5, BK-SDM modified with additional input channels).
Hi, actually the student is a modified version of BK-SDM, since the input of the UNet in the inpainting pipeline has 9 channels. But all the anchor points for calculating the loss are the same as in BK-SDM.
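The 9-channel modification described here can be sketched in plain PyTorch (an illustrative sketch, not the commenter's actual code; the helper name and the zero-initialization of the extra channels are assumptions):

```python
import torch
import torch.nn as nn

def expand_conv_in(conv_in: nn.Conv2d, new_in_channels: int = 9) -> nn.Conv2d:
    """Widen a UNet's first conv from 4 to 9 input channels for inpainting.

    The SD inpainting UNet takes [noisy latents (4) | mask (1) |
    masked-image latents (4)] = 9 channels. Pretrained 4-channel weights
    are copied into the first 4 input channels; the extra channels start
    at zero, so the expanded conv initially behaves like the original on
    the latent part of the input.
    """
    new_conv = nn.Conv2d(
        new_in_channels,
        conv_in.out_channels,
        kernel_size=conv_in.kernel_size,
        stride=conv_in.stride,
        padding=conv_in.padding,
    )
    with torch.no_grad():
        new_conv.weight.zero_()
        new_conv.weight[:, : conv_in.in_channels] = conv_in.weight
        new_conv.bias.copy_(conv_in.bias)
    return new_conv

# Toy check with shapes matching SD's conv_in (4 -> 320 channels, 3x3).
old = nn.Conv2d(4, 320, kernel_size=3, padding=1)
new = expand_conv_in(old, 9)
out = new(torch.randn(1, 9, 64, 64))  # accepts 9-channel inpainting input
```

With zero-initialized extra channels, the new conv reproduces the old conv's output whenever the mask/masked-image channels are zero, which gives the student a sane starting point before distillation.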
Thanks for the clarification, and we've updated the student description in the above :)
Hi, I tried this method but found that the performance was very poor. My experimental configuration was to train on the laion_11k data for 10k steps, and the UNet is bk_tiny. I also replaced the pipeline with the inpainting one and changed the input data accordingly. Could you offer any suggestions? Thanks.
@yajieC Thanks for your inquiry. We would like to address this in a separate discussion to make it easier for future readers to find, since it seems to be a different topic. Please kindly refer to our response at that link.
Hi there,
I am trying to distill the UNet in SD inpainting 1.5 to a smaller UNet using your code. (I replaced the pipeline with the inpainting one and changed the input data.)
I have trained for 130K steps with batch size 64.
Right now the kd_feat_loss is around 20.
I am wondering what kd_feat_loss you had when you finished distilling the UNet in your experiment?
Thank you.
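For context on what a feature-level KD loss like kd_feat_loss measures, here is a minimal sketch of summed per-anchor MSE between teacher and student feature maps (illustrative only; the function name and reduction are assumptions based on this thread, not the repository's code, and absolute loss values depend on feature scales and the number of anchors):

```python
import torch
import torch.nn.functional as F

def kd_feat_loss(teacher_feats, student_feats):
    """Sum of per-anchor MSE between teacher and student feature maps.

    teacher_feats / student_feats: lists of tensors captured at matched
    anchor points in the two UNets. Teacher features are detached so
    gradients flow only into the student.
    """
    assert len(teacher_feats) == len(student_feats)
    loss = torch.zeros(())
    for t, s in zip(teacher_feats, student_feats):
        loss = loss + F.mse_loss(s, t.detach())
    return loss

# Toy example with two "anchor" feature maps.
t_feats = [torch.ones(1, 8, 4, 4), torch.ones(1, 16, 2, 2)]
s_feats = [torch.zeros(1, 8, 4, 4), torch.zeros(1, 16, 2, 2)]
loss = kd_feat_loss(t_feats, s_feats)  # per-anchor MSE is 1.0 -> total 2.0
```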