Add "TIES w/ DARE", ModelStock and Geometric Median (Original!) #28
Conversation
Found that the branch syncs automatically. TIES w/ DARE has been added. Hyperparameter search is still in progress. Hope that it will work.
Is this ready to be reviewed/merged?
Sure. The algorithm implementation is done on my end.
```python
    vote_sgn: Hyper = 0.0,
    apply_stock: Hyper = 0.0,
    cos_eps: Hyper = 1e-6,
    apply_median: Hyper = 0.0,
    eps: Hyper = 1e-6,
    maxiter: Hyper = 100,
    ftol: Hyper = 1e-20,
```
One thing to consider here is recipes that are already serialized, and existing user code. If there are any ComfyUI workflows that use `ties_sum` from comfy-mecha, for example, then the existing nodes will not see these new inputs until the node is deleted and created again manually. This is not too much of a problem as long as the default values preserve the original behavior when the parameters are absent. Can you confirm this is the case?

As a side note, I am generally in favor of writing new merge methods instead of adding parameters to existing ones. Do you think it is possible to put the new code in a separate fundamental merge method? I wonder how practical that would be. All of the new code, except for the `vote_sgn` parameter, seems to be located at the end of the method, so maybe it is? Let me know if you think it is possible.
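To illustrate the backward-compatibility point: if every new hyperparameter defaults to `0.0` and only activates when raised above zero, old serialized recipes that omit them reproduce the original behavior. A minimal, hypothetical sketch (the names mirror the diff above, but this is not the PR's actual implementation):

```python
import numpy as np

def ties_sum(deltas, vote_sgn=0.0, apply_stock=0.0, apply_median=0.0):
    """Toy merge: new flag hyperparameters default to 0.0 (inactive),
    so callers that never pass them get the legacy result unchanged."""
    stacked = np.stack(deltas)
    merged = stacked.sum(axis=0)       # legacy behavior
    if apply_median > 0.0:             # new opt-in path only
        merged = np.median(stacked, axis=0)
    return merged
```

A recipe serialized before the change calls `ties_sum(deltas)` and never reaches the new branch, which is the compatibility guarantee being asked about.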
> In favor of writing new merge methods
I have kept a research log for this kind of SDXL merge. The following suggestion (splitting into 3 algorithms with no stacking) is up to you:

- First, make a set of the original TIES merge, then "TIES w/ DARE" as DARE, and try to figure out how to implement DARE on top of a known recipe. It may work for some (or most?) cases as implemented in other repos, but it is not enough for my Pony x SDXL merge. It can be just a monolithic method with a union of parameters.
- Then, obviously, Geometric Median can be implemented alone, and I expect it behaves like `weighted_sum` but closer to the base model.
- Finally, implement this TGMD merge exclusively. Since I have already overridden more than half of the original algorithm, I think it is suitable to trim all the parameters out, leaving only `p=0.1` for the dropout part. Even `maxiter` / `eps` / `ftol` are safe to hide from the user's sight. Even `k=1.0` / `alpha=1.0` are in the same case: they define the TGMD merge as an algorithm.
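For reference, the geometric median is commonly computed with the Weiszfeld iteration, which is where hyperparameters like `eps` / `maxiter` / `ftol` naturally arise. A minimal NumPy sketch under that assumption (not the PR's actual implementation):

```python
import numpy as np

def geometric_median(points, eps=1e-6, maxiter=100, ftol=1e-20):
    """Weiszfeld iteration: iteratively re-weighted average that
    converges toward the point minimizing the sum of distances."""
    points = np.asarray(points, dtype=np.float64)
    guess = points.mean(axis=0)            # start from the arithmetic mean
    prev_obj = np.inf
    for _ in range(maxiter):
        dists = np.linalg.norm(points - guess, axis=1)
        weights = 1.0 / np.maximum(dists, eps)   # eps guards the division
        guess = (points * weights[:, None]).sum(axis=0) / weights.sum()
        obj = dists.sum()
        if abs(prev_obj - obj) < ftol:           # objective stalled: stop
            break
        prev_obj = obj
    return guess
```

Because the iteration is an internal numerical detail, hiding `maxiter` / `eps` / `ftol` from the user, as suggested above, leaves the merge semantics unchanged.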
Since I'm in the minority who uses a dedicated Python notebook and a ridiculous workstation build, I am always able to find a way to keep my TGMD merge after breaking changes from ComfyUI adoption, so feel free to make your decision. To keep users well informed, we can specify that such a memory-intensive algorithm (O(N) instead of O(1), which is still harsh for me) will be forced to use `device='cpu'` to prevent OOM.
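The O(N)-vs-O(1) memory point above can be made concrete: a weighted sum can stream over models with one accumulator, while a median needs all N parameter tensors resident at once, which is why pinning the computation to CPU memory is proposed. A hedged sketch with illustrative names:

```python
import numpy as np

def weighted_sum_streaming(models, weights):
    """O(1) extra memory: only one accumulator buffer, models streamed."""
    acc = np.zeros_like(models[0])
    for m, w in zip(models, weights):
        acc += w * m
    return acc

def median_stacked(models):
    """O(N) extra memory: all N tensors must coexist in one stack --
    the reason forcing device='cpu' is suggested to avoid GPU OOM."""
    stacked = np.stack(models)        # shape (N, *param_shape)
    return np.median(stacked, axis=0)
```

With real SDXL checkpoints each tensor is large, so the N-way stack is the step that would overflow a GPU but usually fits in system RAM.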
> Since I'm in the minority who uses a dedicated Python notebook and a ridiculous workstation build, I am always able to find a way to keep my TGMD merge after breaking changes from ComfyUI adoption, so feel free to make your decision.
Got it. I appreciate the cordial and constructive approach to the conversation. I'll try to break the code into smaller recipes and then we can merge.
Good work overall! Sorry for the delay in checking out the code.
If we could remove some of the code comments, or reformat them to be less temporal, then once we are done with the comments I have left above, I think I would consider this good enough for merging.
Maybe linting will complain about the LaTeX-style variable names. However, I still favour list comprehensions, because the lambda expressions correlate closely with the algorithm's own mathematical expression while staying pythonic.
Self-explanatory. I decided to write the condition as `vote_sgn > 0.0` to bypass type conversion. Incidentally, it looks like a ReLU. I may use it in the next version of AstolfoMix for comparison.
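The float-as-flag pattern behind `vote_sgn > 0.0` can be sketched in a TIES-style sign election; the function and behavior here are illustrative assumptions, not the PR's exact code:

```python
import numpy as np

def elect_sign(deltas, vote_sgn=0.0):
    """Elementwise sign election across task vectors (TIES-style).
    vote_sgn is a float hyperparameter used as a flag: > 0.0 switches
    from magnitude-weighted sums to counting raw signs, with no
    explicit bool conversion needed."""
    stacked = np.stack(deltas)
    if vote_sgn > 0.0:
        # Vote on the sign of each delta, ignoring magnitudes.
        return np.sign(np.sign(stacked).sum(axis=0))
    # Default: sign of the magnitude-weighted sum (original behavior).
    return np.sign(stacked.sum(axis=0))
```

Since any serialized `Hyper` value flows straight into the comparison, the default `0.0` keeps the legacy branch, matching the compatibility discussion above.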