Merging models with different structures in linear #324

Open
HawkClaws opened this issue May 20, 2024 · 2 comments

@HawkClaws

When merging models with different structures using the linear method, the following error occurred.
I understand that errors like this can occur, but is there a way to skip the specific layers where the error occurs?

Traceback (most recent call last):
  File "/usr/local/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/content/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/content/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/content/mergekit/mergekit/merge.py", line 92, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "/content/mergekit/mergekit/graph.py", line 197, in run
    res = task.execute(**arguments)
  File "/content/mergekit/mergekit/merge_methods/linear.py", line 52, in execute
    raise RuntimeError(
RuntimeError: Tensor size mismatch for model.layers.0.mlp.down_proj.weight, sizes: [torch.Size([4096, 14336]), torch.Size([4096, 11008])]

@HawkClaws HawkClaws changed the title Is there a feature to skip the merging of certain layers? Merging models with different structures in linear May 20, 2024
@cg123
Collaborator

cg123 commented May 26, 2024

This isn't currently supported. Models must have the same interior dimensions (hidden size and intermediate size) to be merged.

Merging models with different sizes like this is an active area of research though. There are a few things we're trying internally and I'm hopeful one will pan out.
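
To see up front whether two models fall under this restriction, a minimal sketch along the following lines can compare the relevant dimensions before running a merge. It is not part of mergekit. It assumes both models are Llama-style checkpoints whose config.json exposes `hidden_size` and `intermediate_size`; the helper names `dimension_mismatches` and `load_config` are hypothetical, and the example values are simply read off the tensor shapes in the error message above.

```python
import json
from pathlib import Path

# Config keys that set the interior tensor shapes in Llama-style models.
# Assumption: both models expose these keys in their config.json.
DIMENSION_KEYS = ("hidden_size", "intermediate_size")


def dimension_mismatches(config_a: dict, config_b: dict) -> dict:
    """Return {key: (value_a, value_b)} for every dimension key that differs."""
    return {
        key: (config_a.get(key), config_b.get(key))
        for key in DIMENSION_KEYS
        if config_a.get(key) != config_b.get(key)
    }


def load_config(model_dir: str) -> dict:
    """Read config.json from a local model directory (hypothetical helper)."""
    return json.loads((Path(model_dir) / "config.json").read_text())


# Values taken from the error message: down_proj.weight has shape
# [hidden_size, intermediate_size], i.e. [4096, 14336] vs [4096, 11008].
model_a = {"hidden_size": 4096, "intermediate_size": 14336}
model_b = {"hidden_size": 4096, "intermediate_size": 11008}

print(dimension_mismatches(model_a, model_b))
# → {'intermediate_size': (14336, 11008)}
```

An empty result means the two configs agree on these dimensions. Any entry in the result points at the kind of mismatch that the linear method rejects with the RuntimeError shown in the traceback.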

@HawkClaws
Author

I see. I understand.
Thank you.
I look forward to seeing this repository evolve in the future!
