When merging models with different structures using the linear merge method, the following error occurred.
I understand that errors like this can occur, but is there a way to skip the specific layers where the error occurs?
Traceback (most recent call last):
  File "/usr/local/bin/mergekit-yaml", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/content/mergekit/mergekit/options.py", line 82, in wrapper
    f(*args, **kwargs)
  File "/content/mergekit/mergekit/scripts/run_yaml.py", line 47, in main
    run_merge(
  File "/content/mergekit/mergekit/merge.py", line 92, in run_merge
    for _task, value in exec.run(quiet=options.quiet):
  File "/content/mergekit/mergekit/graph.py", line 197, in run
    res = task.execute(**arguments)
  File "/content/mergekit/mergekit/merge_methods/linear.py", line 52, in execute
    raise RuntimeError(
RuntimeError: Tensor size mismatch for model.layers.0.mlp.down_proj.weight, sizes: [torch.Size([4096, 14336]), torch.Size([4096, 11008])]
HawkClaws changed the title from "Is there a feature to skip the merging of certain layers?" to "Merging models with different structures in linear" on May 20, 2024
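This currently isn't supported. Models have to have the same interior dimensions (hidden size and intermediate size) to be merged. In the error above, the two down_proj weights have intermediate sizes of 14336 and 11008, so they cannot be combined.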
Merging models with different sizes like this is an active area of research though. There are a few things we're trying internally and I'm hopeful one will pan out.
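In the meantime, it can help to find out which tensors disagree before starting a merge. The following is only a minimal sketch in plain Python, not part of mergekit: the function name `find_shape_mismatches` and the shape dictionaries are hypothetical, and the `model.norm.weight` entry is made up for illustration. Only the `down_proj` shapes come from the error message above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    shapes_a: Dict[str, Shape], shapes_b: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, shape_a, shape_b) for each tensor found in both
    models whose shapes differ. Hypothetical helper, not a mergekit API."""
    mismatches = []
    # Only compare tensors that exist in both models.
    for name in sorted(shapes_a.keys() & shapes_b.keys()):
        shape_a = tuple(shapes_a[name])
        shape_b = tuple(shapes_b[name])
        if shape_a != shape_b:
            mismatches.append((name, shape_a, shape_b))
    return mismatches


# down_proj shapes are taken from the error message;
# the norm entry is an illustrative assumption.
model_a = {
    "model.layers.0.mlp.down_proj.weight": (4096, 14336),
    "model.norm.weight": (4096,),
}
model_b = {
    "model.layers.0.mlp.down_proj.weight": (4096, 11008),
    "model.norm.weight": (4096,),
}

for name, shape_a, shape_b in find_shape_mismatches(model_a, model_b):
    print(f"{name}: {shape_a} vs {shape_b}")
```

With real checkpoints, the shape dictionaries could be built from a loaded PyTorch model with something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}`. This only reports the mismatches; it does not make mergekit skip those layers.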