Skip to content

Conversation

@damian0815
Copy link
Contributor

@damian0815 damian0815 commented Feb 19, 2023

Some enhancements for the checkpoint_merger community pipeline cc @Abhinay1997

  • Different modules can be merged with different alphas via the module_override_alphas kwarg (eg {'unet': 0.2, 'text_encoder': 0.6})
  • Block-weighted merges, whereby a different weight can be used for different layers within the unet via blocks kwarg (eg [0,0,0,0,0,0,0,0,0,0,0,0,0.5,1,1,1,1,1,1,1,1,1,1,1,1]). Specify 12 weights for the down blocks (counting from the input layer), 1 weight for the middle block, and 12 weights for the up blocks (counting from the middle layer). You can find an explanation of block-weighted merging here (cw: waifus), and some weight presets to use here (illustrated here).

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@damian0815
Copy link
Contributor Author

damian0815 commented Feb 19, 2023

@Abhinay1997 re: your idea to make the block weights target class dynamic, i'm not sure that will work - the layer → block weighting index logic is very brittle and as-is only works for the unet (and it may stop working for SD3.0). eg :

DIFFUSERS_KEY_PREFIX_TO_WEIGHT_INDEX = {

@Abhinay1997
Copy link
Contributor

Abhinay1997 commented Feb 19, 2023

@damian0815, Yes what you said makes sense. For now, I think this is a good feature to be added to Checkpoint Merger. We'll have to figure out fixes to the block_weight dict lookup in case of mismatch with later diffusion models.

Gently pining @patrickvonplaten to review and add your thoughts.

@Abhinay1997
Copy link
Contributor

@pcuenca gently pinging you to add your thoughts on this

@patrickvonplaten
Copy link
Contributor

This looks good to me! Happy to go ahead with the PR if you like (cc @damian0815 and @Abhinay1997 )

@Abhinay1997
Copy link
Contributor

@damian0815 I'm good with the changes. Can you move this from draft to a final PR ?

@patrickvonplaten, can we have multiple contributors in the README.md for a community pipeline ? This is a big change and I think it's only fair that damian0815 is listed as a contributor to CheckpointMergerPipeline.

@damian0815 damian0815 marked this pull request as ready for review March 4, 2023 08:49
@damian0815 damian0815 marked this pull request as draft March 4, 2023 08:49
@patrickvonplaten
Copy link
Contributor

@damian0815 I'm good with the changes. Can you move this from draft to a final PR ?

@patrickvonplaten, can we have multiple contributors in the README.md for a community pipeline ? This is a big change and I think it's only fair that damian0815 is listed as a contributor to CheckpointMergerPipeline.

Sure, feel free to change :-)

@patrickvonplaten patrickvonplaten marked this pull request as ready for review March 7, 2023 11:12
@patrickvonplaten
Copy link
Contributor

@damian0815,

Let me know if good to merge for you

@damian0815
Copy link
Contributor Author

ahh @patrickvonplaten no not yet, there's some bug fixes in my other lib grate (https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py) that i need to find some time to copy over and test

@github-actions
Copy link
Contributor

github-actions bot commented Apr 2, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Apr 2, 2023
@Abhinay1997
Copy link
Contributor

@damian0815, do let me know if there’s anything I can help on incase you have time constraints :)

@pcuenca pcuenca removed the stale Issues that haven't received updates label Apr 10, 2023
@damian0815
Copy link
Contributor Author

damian0815 commented Apr 11, 2023

@damian0815, do let me know if there’s anything I can help on incase you have time constraints :)

uhh, yes actually - you could port the changes from https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py here (actually it should be enough to just copy/paste the whole file)

@damian0815
Copy link
Contributor Author

damian0815 commented Apr 11, 2023

however FYI snapshot_download is an ill-advised way of downloading models, because it will download eeeeeverything - if the model has .safetensors and .bin files you'll end up downloading 9GB of data for a 4.5GB model. and depending on if/how there's an fp16 revision of the model that might get downloaded too - 11GB of data for a 4.5GB model, potentially more. The snapshot_download should really be replaced with a call to StableDiffusionPipeline.from_pretrained(...) and then read off the individual components eg pipeline.unet, pipeline.text_encoder etc rather than attempting to load the files by hand from disk.

@patrickvonplaten
Copy link
Contributor

@damian0815 BTW we have a download function now as well:
https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.download

@Abhinay1997
Copy link
Contributor

Abhinay1997 commented Apr 12, 2023

Hi @damian0815 ! Thanks for the inputs. I'll add your changes over the weekend and test. As for the snapshot_download, it was a tradeoff at that time to not load all the pipelines to memory at once.

However given that we have DiffusionPipeline.download now, things are a lot easier. Let me make the changes after I port your changes here. Thanks for suggesting this @patrickvonplaten :)

Can't do it till the weekend though. 🥲

@github-actions
Copy link
Contributor

github-actions bot commented May 6, 2023

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label May 6, 2023
@github-actions github-actions bot closed this May 14, 2023
@bghira
Copy link
Contributor

bghira commented May 18, 2023

=[

@HamnaAkram
Copy link

Hi! is there an update to this issue? I would like to use the checkpoint merger pipeline for a project but can't seem to find if it has been completed or not?

@Abhinay1997
Copy link
Contributor

Abhinay1997 commented Sep 8, 2023

@HamnaAkram you can find the checkpoint merger pipeline in the community pipelines. This issue was raised for an enhancement to the original.

@HamnaAkram
Copy link

Thanks for pointing me in the right direction but I have noticed an issue with the existing pipeline. When I initialize alpha=0, the merged model and the original first model should be identical and if alpha=1 merged model should be equivalent to second model. However, this is not the case. After merging the models at alpha=0 or 1 they become nonidentical. I would love to hear any feedback regarding that.

@Abhinay1997
Copy link
Contributor

@HamnaAkram that's weird. Can you open a new issue (helps others to find it in the future.) with a code sample ? Please include any output logs and the checkpoint names or model_index.json files if possible.

I'll try to reproduce it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale Issues that haven't received updates

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants