checkpoint_merger community pipeline: add block weighted and module-specific alpha options #2422
Conversation
@Abhinay1997 re: your idea to make the block weights target class dynamic, i'm not sure that will work - the layer → block weighting index logic is very brittle and as-is only works for the unet (and it may stop working for SD3.0), e.g.:
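A rough sketch of the kind of layer-name → block-index lookup involved, purely illustrative and assuming SD 1.x diffusers UNet parameter naming; the function name, regex, and index boundaries are assumptions, not the pipeline's actual code:

```python
import re
from typing import Optional

def block_weight_index(param_name: str) -> Optional[int]:
    """Map a diffusers UNet parameter name to an index into a 25-entry
    block-weight list (12 down-block weights, 1 mid-block weight,
    12 up-block weights).

    Illustrative only: it keys off SD 1.x UNet naming (down_blocks.N,
    mid_block, up_blocks.N), which is exactly why this kind of lookup
    does not transfer to the text encoder, the VAE, or future UNet layouts.
    """
    if param_name.startswith("mid_block."):
        return 12  # the single middle-block weight
    m = re.match(r"(down|up)_blocks\.(\d+)\.[a-z_]+\.(\d+)\.", param_name)
    if m is None:
        return None  # conv_in / conv_out / time embedding: no block weight applies
    direction, block, sub = m.group(1), int(m.group(2)), int(m.group(3))
    # Roughly three weighted sub-layers per block in the 12-weight layout.
    idx = block * 3 + min(sub, 2)
    return idx if direction == "down" else 13 + idx
```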
@damian0815, Yes, what you said makes sense. For now, I think this is a good feature to be added to Checkpoint Merger. We'll have to figure out fixes to the block_weight dict lookup in case of mismatch with later diffusion models. Gently pinging @patrickvonplaten to review and add your thoughts.
@pcuenca gently pinging you to add your thoughts on this
This looks good to me! Happy to go ahead with the PR if you like (cc @damian0815 and @Abhinay1997)
@damian0815 I'm good with the changes. Can you move this from draft to a final PR? @patrickvonplaten, can we have multiple contributors in the README.md for a community pipeline? This is a big change and I think it's only fair that damian0815 is listed as a contributor to CheckpointMergerPipeline.
Sure, feel free to change :-)
Let me know if it's good to merge for you
ahh @patrickvonplaten no not yet, there's some bug fixes in my other lib |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@damian0815, do let me know if there’s anything I can help on in case you have time constraints :)
uhh, yes actually - you could port the changes from https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py here (actually it should be enough to just copy/paste the whole file) |
however FYI |
@damian0815 BTW we have a |
Hi @damian0815 ! Thanks for the inputs. I'll add your changes over the weekend and test. As for the However given that we have Can't do it till the weekend though. 🥲 |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
=[
Hi! Is there an update on this issue? I would like to use the checkpoint merger pipeline for a project but can't seem to find whether this has been completed or not.
@HamnaAkram you can find the checkpoint merger pipeline in the community pipelines. This issue was raised for an enhancement to the original.
Thanks for pointing me in the right direction, but I have noticed an issue with the existing pipeline. When I set alpha=0, the merged model should be identical to the original first model, and at alpha=1 the merged model should be equivalent to the second model. However, this is not the case: after merging at alpha=0 or alpha=1, the models are not identical. I would love to hear any feedback regarding that.
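For reference, under a plain weighted-sum merge each parameter is expected to follow merged = (1 - alpha) * theta_A + alpha * theta_B, so alpha=0 should reproduce the first model bit-for-bit. A minimal sketch of such a check (placeholder model ids; assuming merge() takes a list of model ids plus an alpha kwarg and returns the merged pipeline):

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder model ids; any two diffusers-format SD checkpoints should work.
model_a = "CompVis/stable-diffusion-v1-4"
model_b = "runwayml/stable-diffusion-v1-5"

# Load the community checkpoint_merger pipeline and merge at alpha=0,
# which under a plain weighted-sum merge should reproduce model_a exactly.
pipe = DiffusionPipeline.from_pretrained(model_a, custom_pipeline="checkpoint_merger")
merged = pipe.merge([model_a, model_b], alpha=0.0)

# Compare the merged UNet's parameters against the original model_a UNet.
reference = DiffusionPipeline.from_pretrained(model_a)
for (name, p_merged), (_, p_ref) in zip(
    merged.unet.named_parameters(), reference.unet.named_parameters()
):
    if not torch.allclose(p_merged, p_ref):
        diff = (p_merged - p_ref).abs().max().item()
        print(f"mismatch in {name}: max abs diff {diff}")
```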
@HamnaAkram that's weird. Can you open a new issue with a code sample? (It helps others find it in the future.) Please include any output logs and the checkpoint names or model_index.json files if possible. I'll try to reproduce it.
Some enhancements for the checkpoint_merger community pipeline (cc @Abhinay1997):
- module_override_alphas kwarg (e.g. {'unet': 0.2, 'text_encoder': 0.6})
- blocks kwarg (e.g. [0,0,0,0,0,0,0,0,0,0,0,0,0.5,1,1,1,1,1,1,1,1,1,1,1,1]). Specify 12 weights for the down blocks (counting from the input layer), 1 weight for the middle block, and 12 weights for the up blocks (counting from the middle layer). You can find an explanation of block-weighted merging here (cw: waifus), and some weight presets to use here (illustrated here).
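A rough usage sketch, assuming the kwarg names proposed in this PR sit on top of the existing merge() call (model ids and the output path are placeholders):

```python
from diffusers import DiffusionPipeline

# Placeholder model ids; any two diffusers-format SD 1.x checkpoints should work.
model_a = "CompVis/stable-diffusion-v1-4"
model_b = "runwayml/stable-diffusion-v1-5"

pipe = DiffusionPipeline.from_pretrained(model_a, custom_pipeline="checkpoint_merger")

# Merge with a per-module alpha override plus block-weighted unet alphas:
# 12 down-block weights, 1 mid-block weight, 12 up-block weights.
merged = pipe.merge(
    [model_a, model_b],
    alpha=0.5,
    module_override_alphas={"unet": 0.2, "text_encoder": 0.6},
    blocks=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.5, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
)
merged.save_pretrained("./merged-model")
```

Per the discussion above, the blocks list targets the unet only, while module_override_alphas sets a single alpha for each named top-level module.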