Sihan Yang, Kexuan Shi, Weiyang Liu
The Chinese University of Hong Kong
🌐 Homepage | 📑 Paper | 📖 arXiv | 🤗 Models
🔥 [2026-02-06]: We released our paper, models, and code.
We introduce Orthogonal Model Merging (OrthoMerge), a geometry-preserving model merging framework. For models trained with Orthogonal Finetuning (OFT), the task-specific adaptations are given by explicit orthogonal matrices. We map these orthogonal transformations into the Lie algebra, where we perform a magnitude-corrected integration that accounts for both the direction and the intensity of the adaptations, before mapping the merged result back to the manifold. We further extend this strategy to models finetuned via standard additive methods (e.g., LoRA, full finetuning), where explicit orthogonal transformations are absent: an Orthogonal-Residual Decoupling strategy solves the orthogonal Procrustes problem to extract the implicit orthogonal component of each finetuned model. This allows us to merge the orthogonal components of the adaptation on the manifold, while handling the residuals with traditional merging in Euclidean space.
An intuitive comparison of (a) existing model merging with the proposed (b) orthogonal merging and (c) orthogonal-residual decoupling merging.
Illustration of OrthoMerge. (a) To merge orthogonal transformations, we first map them to the Lie algebra, perform the magnitude-corrected integration there, and map the merged result back to the manifold.
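For intuition, the sketch below illustrates the two operations described above with plain NumPy/SciPy. It is not the repository's implementation: the uniform task weights, the simple weighted sum standing in for the magnitude-corrected integration, and the assumption that the orthogonal transform acts by left multiplication are all illustrative choices.

```python
# Minimal sketch of (a) merging orthogonal transformations through the Lie algebra
# and (b) orthogonal-residual decoupling via the orthogonal Procrustes problem.
import numpy as np
from scipy.linalg import expm, logm, svd


def merge_orthogonal(rotations, weights):
    """Merge task-specific orthogonal matrices through the Lie algebra so(n)."""
    # Map each rotation to its skew-symmetric generator with the matrix logarithm.
    generators = [np.real(logm(R)) for R in rotations]
    # A weighted sum stands in here for the paper's magnitude-corrected integration.
    merged = sum(w * A for w, A in zip(weights, generators))
    # Project out numerical noise so the generator stays skew-symmetric, then map back.
    merged = 0.5 * (merged - merged.T)
    return expm(merged)  # again an orthogonal matrix


def orthogonal_residual_decouple(W_base, W_finetuned):
    """Split a finetuned weight into an implicit orthogonal part and a residual."""
    # Orthogonal Procrustes: argmin_R ||W_ft - R @ W_base||_F over orthogonal R,
    # solved in closed form from the SVD of W_ft @ W_base^T.
    U, _, Vt = svd(W_finetuned @ W_base.T)
    R = U @ Vt
    residual = W_finetuned - R @ W_base  # left to ordinary Euclidean merging
    return R, residual


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two random rotations in SO(4); determinants forced to +1 so logm stays real.
    rotations = []
    for _ in range(2):
        Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
        Q[:, 0] *= np.sign(np.linalg.det(Q))
        rotations.append(Q)
    R_merged = merge_orthogonal(rotations, weights=[0.5, 0.5])
    print(np.allclose(R_merged @ R_merged.T, np.eye(4), atol=1e-6))  # True
```

For OFT-trained models the orthogonal matrices are available directly, so only the first routine is needed; for LoRA or fully finetuned models, something like `orthogonal_residual_decouple` would be applied per weight matrix before merging the two parts separately.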
git clone https://github.com/Sphere-AI-Lab/OrthoMerge.git
conda create -n OrthoMerge python=3.10 -y
conda activate OrthoMerge
cd OrthoMerge
pip install -r requirements.txt
We use the following base models and task-specific fine-tuned models in our experiments (see the loading sketch after the list).

Llama 3.1 Experiments:
- Base Model: meta-llama/Llama-3.1-8B
- Task-Specific Adapters: SphereLab/Llama-3.1-8B_OFT_adapters
Llama 3.2 Experiments:
- Base Model: meta-llama/Llama-3.2-3B
- Task-Specific Models: MergeBench Collection (Llama-3.2-3B)
Qwen 2.5 VL Experiments:
- Base Model: Qwen/Qwen2.5-VL-7B-Instruct
- Task-Specific Models:
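To sanity-check that the released artifacts load, here is a minimal sketch for the Llama 3.1 OFT setup. It assumes the adapters in SphereLab/Llama-3.1-8B_OFT_adapters are stored in PEFT's standard adapter format (possibly with one subfolder per task); check the model card for the actual layout.

```python
# Minimal loading sketch; the adapter repository layout is an assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-3.1-8B"
adapter_id = "SphereLab/Llama-3.1-8B_OFT_adapters"  # may require a per-task subfolder

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach one task-specific adapter
```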
# For OFT models
bash scripts/OrthoMerge_OFT_models.sh
# For non-OFT models
bash scripts/OrthoMerge_non_OFT_models.sh
For evaluation environments using lmms-eval, lm-eval-harness, bigcode-eval, and safety-eval, please follow the setup instructions provided in their respective repositories.
If you find our work and this codebase helpful, please consider starring this repo and citing:
@article{yang2026orthogonalmodelmerging,
title = {Orthogonal Model Merging},
author = {Yang, Sihan and Shi, Kexuan and Liu, Weiyang},
year = {2026},
journal = {arXiv preprint arXiv:2602.05943},
url = {https://arxiv.org/abs/2602.05943}
}

- Sihan Yang: sihany077@gmail.com