
Beginner Question: Merge 2 7B models to make 13B model #16

Closed
fakerybakery opened this issue Nov 30, 2023 · 2 comments

Comments

@fakerybakery

Hi,
This is a beginner question, as I'm completely new to model merging. Is it possible to merge 2 7B models to create a larger 13B model?
Thank you

@cg123 (Collaborator) commented Nov 30, 2023

Yes, it is possible - people have had some pretty cool successes with this sort of model stacking lately.

For example, here's the config Undi95 used to make Mistral-11B-OpenOrcaPlatypus (one of the steps in his Mistral-11B-OmniMix):

slices:
  - sources:
    - model: Open-Orca/Mistral-7B-OpenOrca
      layer_range: [0, 24]
  - sources:
    - model: akjindal53244/Mistral-7B-v0.1-Open-Platypus
      layer_range: [8, 32]
merge_method: passthrough
dtype: bfloat16
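
For a rough sense of the resulting size: each slice keeps 24 of Mistral-7B's 32 decoder layers, so the stacked model ends up with 48 layers. Here's a back-of-the-envelope sketch of the parameter count; the ~7.24B total, 32000-token vocabulary, and 4096 hidden size are approximate Mistral-7B figures I'm assuming, not numbers taken from this config:

# Rough parameter estimate for a passthrough stack of two Mistral-7B models.
# Assumed figures (approximate): Mistral-7B-v0.1 has 32 decoder layers and
# ~7.24B parameters; input embeddings + LM head are about 2 * 32000 * 4096.
total_params = 7.24e9                 # assumed total for one Mistral-7B model
embed_params = 2 * 32_000 * 4096      # assumed embeddings + LM head (untied)
per_layer = (total_params - embed_params) / 32

# The config above keeps layers 0-23 of one model and 8-31 of the other.
stacked_layers = (24 - 0) + (32 - 8)  # 48 layers in the merged model
merged_params = embed_params + stacked_layers * per_layer
print(f"{stacked_layers} layers, ~{merged_params / 1e9:.1f}B parameters")
# -> 48 layers, ~10.7B parameters

So two 7B models stacked this way come out closer to 11B than 13B, which is why these merges are usually labelled "11B".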

@fakerybakery (Author)

Great, thank you!
