Add "TIES w\ DARE", ModelStock and Geometric Median (Original!) #28

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

6DammK9 (Contributor) commented May 4, 2024

Self-explanatory. I decided to express the condition as vote_sgn > 0.0 to bypass type conversion; incidentally, it looks like a ReLU.
I may use it in the next version of AstolfoMix for comparison.
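For illustration, a minimal sketch of the described condition (hypothetical, not the PR's exact code): the float hyperparameter is compared with > 0.0 so it can act as a boolean switch for the sign-election step without an explicit type conversion.

```python
import torch

def elect_sign(deltas: torch.Tensor, vote_sgn: float = 0.0) -> torch.Tensor:
    """deltas: [n_models, ...] stacked task vectors; returns the elected per-parameter sign.

    Illustrative sketch only: when vote_sgn > 0.0, each model votes with its sign
    alone (+1/-1); otherwise the vote is magnitude-weighted, as in original TIES.
    """
    if vote_sgn > 0.0:
        # Count raw sign votes, ignoring magnitudes.
        vote = torch.sign(deltas).sum(dim=0)
    else:
        # Original TIES behaviour: magnitude-weighted vote.
        vote = deltas.sum(dim=0)
    return torch.sign(vote)

print(elect_sign(torch.randn(4, 8), vote_sgn=1.0))
```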

6DammK9 (Contributor, Author) commented May 8, 2024

Found that the branch syncs to the PR automatically. TIES w/ DARE has been added. Hyperparameter search is in progress. Hope it will work.

6DammK9 changed the title from "add TIES_SOUP support" to "Add TIES w\ DARE support (with variants and degraded versions)" on May 11, 2024
6DammK9 changed the title from "Add TIES w\ DARE support (with variants and degraded versions)" to "Add "TIES w\ DARE", ModelStock and Geometric Median (Original!)" on Jun 6, 2024
ljleb (Owner) commented Jun 12, 2024

is this ready to be reviewed/merged?

6DammK9 (Contributor, Author) commented Jun 12, 2024

> is this ready to be reviewed/merged?

Sure. The algorithm implementation is done on my side.

Comment on lines +157 to +163
vote_sgn: Hyper = 0.0,
apply_stock: Hyper = 0.0,
cos_eps: Hyper = 1e-6,
apply_median: Hyper = 0.0,
eps: Hyper = 1e-6,
maxiter: Hyper = 100,
ftol: Hyper = 1e-20,
ljleb (Owner) commented Jun 17, 2024

One thing to consider here is recipes that are already serialized, and existing user code. If any ComfyUI workflows use ties_sum from comfy-mecha, for example, the existing nodes will not see these new inputs until they are deleted and created again manually. This is not too much of a problem as long as the default values preserve the original behavior when the parameters are absent. Can you confirm this is the case?

As a side note, I am generally in favor of writing new merge methods instead of adding parameters to existing ones. Do you think it is possible to put the new code in a separate fundamental merge method? I wonder how practical that would be. All of the new code, except for the vote_sgn parameter, seems to be located at the end of the method, so maybe it is? Let me know if you think it is possible.
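For illustration, a minimal sketch of the compatibility point (hypothetical names and stand-in math, not the actual sd-mecha method): as long as each new flag defaults to 0.0 and gates its stage off, a recipe serialized before this PR, which never passes the new parameters, reproduces the original output.

```python
import torch

def ties_sum(*deltas: torch.Tensor,
             apply_stock: float = 0.0,
             apply_median: float = 0.0) -> torch.Tensor:
    """Illustrative stand-in: the original computation runs unconditionally,
    while the new stages at the end are gated by flags that default to 0.0."""
    stacked = torch.stack(deltas)
    merged = stacked.sum(dim=0)                # stand-in for the original TIES body

    if apply_stock > 0.0:
        merged = 0.5 * merged                  # stand-in for the Model Stock stage
    if apply_median > 0.0:
        merged = stacked.median(dim=0).values  # stand-in for the geometric median stage
    return merged

a, b = torch.randn(2, 2), torch.randn(2, 2)
# Old recipes (no new parameters) match the explicit defaults bit-for-bit.
assert torch.equal(ties_sum(a, b), ties_sum(a, b, apply_stock=0.0, apply_median=0.0))
```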

6DammK9 (Contributor, Author) replied:

> in favor of writing new merge methods

I have written a research log for this kind of SDXL merge. The following suggestion (split into 3 algorithms with no stacking) is up to you:

  • First, make a set of the original TIES merge, then "TIES w/ DARE" as DARE, and try to figure out how to implement DARE on top of a known recipe (the drop-and-rescale step is sketched after this comment). It may work for some (or most?) cases as implemented in other repos, but it is not enough for my Pony x SDXL merge. It can just be a monolithic method with the union of parameters.
  • Then, obviously, the Geometric Median merge can be implemented alone; I expect it to behave like weighted_sum but closer to the base model.
  • Finally, implement this TGMD merge exclusively. Since I have already overridden more than half of the original algorithm, I think it is suitable to trim all the parameters out, leaving only p=0.1 for the dropout part. Even maxiter / eps / ftol are safe to hide from the user's sight. Even k=1.0 / alpha=1.0 are in the same situation: they define the TGMD merge as an algorithm.

Since I'm in the minority who use a dedicated Python notebook and a ridiculous workstation build, I am always able to find a way to keep my TGMD merge after breaking changes from the ComfyUI adaptation, so feel free to make your decision. To keep users well informed, we can specify that such a memory-intensive algorithm (O(N) instead of O(1), which is still harsh for me) will be forced to use device='cpu' to prevent OOM.
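For reference, a minimal sketch of the DARE drop-and-rescale step mentioned above (illustrative, not this PR's implementation; p corresponds to the p=0.1 left exposed for the dropout part):

```python
import torch

def dare_dropout(delta: torch.Tensor, p: float = 0.1, generator=None) -> torch.Tensor:
    """DARE drop-and-rescale: zero each element of a task vector with probability p,
    then rescale the survivors by 1/(1-p) so the expected delta is preserved."""
    keep = torch.bernoulli(torch.full_like(delta, 1.0 - p), generator=generator)
    return delta * keep / (1.0 - p)

delta = torch.randn(1000)
print(delta.mean(), dare_dropout(delta).mean())  # means agree in expectation
```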

ljleb (Owner) replied:

> Since I'm in the minority who use a dedicated Python notebook and a ridiculous workstation build, I am always able to find a way to keep my TGMD merge after breaking changes from the ComfyUI adaptation, so feel free to make your decision.

Got it. I appreciate the cordial and constructive approach to the conversation. I'll try to break the code into smaller recipes and then we can merge.

ljleb (Owner) left a comment

Good work overall! Sorry for the delay in checking out the code.

If we could remove some of the code comments, or reformat them to be less temporal, then once the review comments I have left above are resolved, I think I would consider this good enough for merging.

6DammK9 (Contributor, Author) commented Jun 21, 2024

> code comments or maybe reformat them

Maybe linting, with LaTeX-style variable names. However, I still favour list comprehensions, because the lambda expressions map closely onto the algorithm's own mathematical expression and stay Pythonic; meanwhile, torch is already optimized at the C layer, even faster than numpy / scipy / sklearn / networkx, etc.
If the code is short and clean enough, dropping the comments entirely is feasible, for example for Weiszfeld's algorithm alone.
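For reference, a minimal sketch of Weiszfeld's algorithm for the geometric median, reusing the eps / maxiter / ftol defaults from the diff above (illustrative, not the PR's implementation):

```python
import torch

def geometric_median(points: torch.Tensor, eps: float = 1e-6,
                     maxiter: int = 100, ftol: float = 1e-20) -> torch.Tensor:
    """Weiszfeld iteration: points is [n, d]; returns the point minimizing the
    sum of Euclidean distances to all rows."""
    median = points.mean(dim=0)
    prev_obj = float("inf")
    for _ in range(maxiter):
        dist = torch.norm(points - median, dim=1).clamp_min(eps)  # avoid divide-by-zero
        weights = 1.0 / dist
        median = (weights.unsqueeze(1) * points).sum(dim=0) / weights.sum()
        obj = torch.norm(points - median, dim=1).sum().item()
        if abs(prev_obj - obj) <= ftol * max(abs(obj), 1.0):      # relative-tolerance stop
            break
        prev_obj = obj
    return median

print(geometric_median(torch.randn(5, 3)))
```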
