
Scale of KD-feature loss for SD inpainting 1.5 #21

Closed
Bikesuffer opened this issue Aug 21, 2023 · 11 comments

@Bikesuffer

Bikesuffer commented Aug 21, 2023

Hi there,

I am trying to distill the UNet in SD-inpainting 1.5 into a smaller UNet using your code (I replaced the pipeline with the inpainting one and changed the input data accordingly).
I have trained for 130K steps with batch size 64.
Right now the kd_feat_loss is around 20.

I am wondering what kd_feat_loss you had when you finished distilling the UNet in your experiments?

Thank you.

@bokyeong1015
Member

bokyeong1015 commented Aug 22, 2023

Hi, thanks for utilizing our work, glad to know that 😊
Although we haven't attempted inpainting experiments, we hope the following information can be helpful.


Here is a loss curve from our code for text-to-image synthesis, with SD-v1.4 and batch size 64 (= gradient accumulation 4 x mini batch size 16), plotted with 500-point moving average:

*[Image: loss_curve_batchsz64_230822 — loss curves for the KD feature loss, KD output loss, and SD task loss]*

  • The scale of KD feature loss ≫ The scale of KD output loss and SD task loss
    • As we described in our paper, we didn’t try hyperparameter tuning for loss weights, but it empirically worked well in our experiments.
  • Losses are not directly correlated with the final generation scores (FID/IS/CLIP score), especially in later iterations. In other words, lower losses did not necessarily result in better generation scores.
  • If you want to verify the learning process, we suggest examining the final metrics and/or visual examples. Nevertheless, the losses should decrease during initial iterations.
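For readers unfamiliar with the three terms discussed above, here is a minimal sketch of how a BK-SDM-style distillation objective combines them. The function name, the loss weights, and the feature-list interface are illustrative assumptions, not the repository's actual API; the key point is that the feature term is summed over several matched intermediate activations, which is why its scale dominates the other two:

```python
import torch
import torch.nn.functional as F

def distillation_loss(noise_pred_s, noise_pred_t, noise_gt,
                      feats_s, feats_t,
                      w_task=1.0, w_out=1.0, w_feat=1.0):
    """Illustrative BK-SDM-style objective (names/weights are assumptions).

    noise_pred_s / noise_pred_t: student / teacher UNet noise predictions
    noise_gt: ground-truth noise added during diffusion training
    feats_s / feats_t: matched intermediate features at the KD anchor points
    """
    # SD task loss: student prediction vs. ground-truth noise
    task = F.mse_loss(noise_pred_s, noise_gt)
    # KD output loss: student prediction vs. teacher prediction
    kd_out = F.mse_loss(noise_pred_s, noise_pred_t)
    # KD feature loss: summed over all matched anchor-point features,
    # so its magnitude naturally exceeds the single-term losses above
    kd_feat = sum(F.mse_loss(fs, ft) for fs, ft in zip(feats_s, feats_t))
    return w_task * task + w_out * kd_out + w_feat * kd_feat
```

Because the feature term aggregates many per-layer MSEs, an absolute value like "around 20" is expected to be much larger than the other terms and is not directly comparable across architectures or tasks.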

@bokyeong1015 changed the title from "Batch Size" to "Scale of KD-feature loss for SD inpainting 1.5" on Aug 22, 2023
@bokyeong1015
Member

Please understand that we've changed the name of this issue, 'Batch Size' -> 'Scale of KD-feature loss for SD inpainting 1.5', to clarify the topic and make it easier for people to find in the future.

@Bikesuffer
Author

Thanks a lot for the information.

@yajieC

yajieC commented Aug 31, 2023

Hello, does this method work for SD-inpainting 1.5?

@bokyeong1015
Member

Hi, @yajieC
We haven't tried it, but we believe our models can be used after finetuning for SD-inpainting.

Our models are compressed from SD-v1.4, and SD-v1.x models share the same architecture (with different training recipes); SD-inpainting is based on the SD-v1 backbone.

@Bikesuffer
Author

Bikesuffer commented Sep 1, 2023

> hello, does this method work for SD inpainting 1.5?

Yes, it worked for me.
I have successfully distilled the UNet in SD-inpainting 1.5 into a smaller UNet.
I would say the SD base model distilled with batch size 256 (I call it IP_Base_256) generates the best results for me.

@bokyeong1015
Member

bokyeong1015 commented Sep 1, 2023

Thanks for sharing the above and this good news! Happy to know you are okay with the inpainting results using our approach :) Could we ask if you have plans to release your models and/or code?


Edit: sorry for the initial misunderstanding; you've clarified that you "distill the unet in sd inpainting 1.5 to a smaller Unet", which means (Teacher, Student) = (SD-inpainting 1.5, BK-SDM modified with additional input channels) <- please let us know if this is incorrect. Thanks again for sharing! @Bikesuffer

@Bikesuffer
Author

Bikesuffer commented Sep 5, 2023

> Thanks for sharing the above and this good news! Happy to know you are okay with the inpainting results using our approach :) Could we ask if you have plans to release your models and/or code?
>
> Edit: sorry for initial misunderstanding, you've clarified that "distill the unet in sd inpainting 1.5 to a smaller Unet", which means (Teacher, Student) = (SD-inpainting 1.5, BK-SDM) <- please let us know if this is incorrect. Thanks again for sharing! @Bikesuffer

Hi, actually the student is a modified version of BK-SDM, since the input of the UNet in the inpainting pipeline has 9 channels. But all the anchor points for calculating the loss are the same as in BK-SDM.
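The 9-channel modification described above (the SD-inpainting UNet concatenates 4 latent, 1 mask, and 4 masked-image-latent channels) can be sketched as widening the UNet's first convolution. This is a hypothetical illustration, not the commenter's actual code; `expand_conv_in` is an assumed helper name, and zero-initializing the extra channels is one common choice so the widened layer initially behaves like the original:

```python
import torch
import torch.nn as nn

def expand_conv_in(conv_in: nn.Conv2d, new_in_channels: int = 9) -> nn.Conv2d:
    """Widen a UNet's input conv from 4 to 9 channels for inpainting.

    Extra input channels are zero-initialized, so with zeroed mask/masked-image
    inputs the new layer reproduces the original layer's output exactly.
    """
    new_conv = nn.Conv2d(new_in_channels, conv_in.out_channels,
                         kernel_size=conv_in.kernel_size,
                         padding=conv_in.padding)
    with torch.no_grad():
        new_conv.weight.zero_()                               # extra channels start at zero
        new_conv.weight[:, :conv_in.in_channels] = conv_in.weight  # copy original weights
        new_conv.bias.copy_(conv_in.bias)                     # keep the original bias
    return new_conv
```

After this swap, the downstream blocks (and hence the KD anchor points) are untouched, which matches the statement that the loss anchors remain the same as in BK-SDM.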

@bokyeong1015
Member

Thanks for the clarification; we've updated the student description above :)

@yajieC

yajieC commented Sep 8, 2023

Hi, I tried this method, but found that the performance was very poor. My experimental configuration was training on laion_11k data for 10k steps, with a bk_tiny UNet. I also replaced the pipeline with the inpainting one and changed the input data. Could you offer any suggestions? Thanks.

@bokyeong1015
Member

@yajieC Thanks for your inquiry. Since this seems to be a different topic, we would like to address it in a separate discussion to make it easier for future readers to find. Please kindly refer to our response at that link.
