Merge safetensor files using the technique described in "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch"

martyn/safetensors-merge-supermario

safetensors-merge-supermario

Combine any two models using a Super Mario merge (DARE), as described in the linked whitepaper.

About

Combine capabilities from multiple models. Works with:

  • Stable Diffusion (1.5, XL/XL Turbo)
  • LLMs (Mistral, Llama, etc.)
  • LoRAs (must be the same size)
  • Any two homologous models

Example

| Model | Description | Image (same seed) |
|---|---|---|
| sd_xl_turbo | Attempting 1024 | SDXL turbo attempting to render at 1024 |
| sdxl base 1.0 | Attempting to use SDTurboScheduler | SDXL attempting to use SDTurboScheduler |
| merged | Mario merged (DARE) | Merged model successfully rendering 1024 |

Usage

python3 merge.py -p [weight drop probability] -lambda [scaling factor] [base_safetensors_model_file_or_folder] [model_to_merge] [output_path]

Example

python3 merge.py -p 0.13 -lambda 3.0 sdxl_base.safetensors sd_xl_turbo_1.0_fp16.safetensors sdxl_merged.safetensors

Note: This also works with arguments reversed.
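To show what the two flags control, here is a minimal sketch of a DARE-style merge as the paper describes it: take the difference between the two models, drop each difference with probability p, rescale the survivors by 1/(1 - p), and add the result to the base scaled by lambda. It is not the code in merge.py. The function name `dare_merge` and its `seed` parameter are hypothetical, and it uses flat Python lists where a real merge would operate on tensors.

```python
import random


def dare_merge(base, tuned, p, lam, seed=None):
    """Merge two equal-length flat weight lists with a DARE-style update.

    base, tuned: weights of the base model and the model to merge
    p:           probability of dropping each delta element (0 <= p < 1)
    lam:         scaling factor applied to the rescaled delta
    seed:        optional seed for reproducible drops (hypothetical helper arg)
    """
    if len(base) != len(tuned):
        raise ValueError("models must have the same number of weights")
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")

    rng = random.Random(seed)
    merged = []
    for b, t in zip(base, tuned):
        delta = t - b                    # what the second model changed
        if rng.random() < p:
            delta = 0.0                  # drop this change with probability p
        else:
            delta = delta / (1.0 - p)    # rescale the changes that survive
        merged.append(b + lam * delta)   # add the scaled delta back to the base
    return merged


if __name__ == "__main__":
    base = [0.10, -0.20, 0.30, 0.40]
    tuned = [0.15, -0.25, 0.30, 0.50]
    print(dare_merge(base, tuned, p=0.13, lam=3.0, seed=0))
```

With p = 0 and lambda = 1 nothing is dropped or rescaled, so the sketch returns the second model's weights. A larger p drops more of the delta and enlarges what remains.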

Models

ComfyUI workflow

Changelog

  • Dec 27 2023: Added support for using files, folders, or HF repos in the hf_merge.py merge list.
  • Dec 27 2023: Added mergekit-compatible YAML support for hf_merge.py. It always runs DARE and ignores options outside the model specification. weight is p and density is 1/λ.
  • Dec 12 2023: Added hf_merge.py for merging HF repos.
  • Dec 12 2023: Added support for folders. You can now merge LLMs (Mistral, Llama, etc.) and other large models. Folders with .bin files are supported, but the first model specified on the command line must be in .safetensors format.
  • Nov 28 2023: Initial release supporting Stable Diffusion.
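The YAML mapping in the Dec 27 entry (weight is p, density is 1/λ) can be written out as a small conversion. The helper name `mergekit_to_dare` is hypothetical and only illustrates the arithmetic. It is not a function in hf_merge.py.

```python
def mergekit_to_dare(weight, density):
    """Translate mergekit-style model parameters into DARE parameters.

    Follows the mapping stated in the changelog:
    weight is p, and density is 1/lambda (so lambda = 1/density).
    """
    if density <= 0:
        raise ValueError("density must be positive")
    return {"p": weight, "lambda": 1.0 / density}


if __name__ == "__main__":
    # A density of 0.5 corresponds to a scaling factor of 2.0.
    print(mergekit_to_dare(weight=0.13, density=0.5))
```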

References
