# PACER: Permutation-Aligned Consensus Expert Routing

A unified framework for base-free, interference-aware model merging in Large Language Models and Vision Transformers.
## Features

- **No Base Model Required** - Synthesizes a Consensus Barycenter from the input models
- **Interference-Aware** - Dynamically decides between merging and MoE upcycling per layer
- **Smart Routing** - Zero-shot router using Subspace Projection Affinity (no training needed)
- **Vision Support** - Native ViT support with Visual Token Merging (ToMe)
- **Minimal Parameter Growth** - Only upcycles high-conflict layers to MoE
## Installation

```bash
git clone https://github.com/Akicuo/pacer.git
cd pacer
pip install -e .
```

Or install the dependencies manually:

```bash
pip install torch transformers safetensors accelerate
pip install -r requirements.txt
```

## Quick Start

```python
from pacerkit import PACERMerger

# Initialize merger with models
merger = PACERMerger([
    "fluently/FluentlyQwen3-Coder-4B-0909",
    "SamuelBang/AesCoder-4B"
])
# Run PACER merge pipeline
merged_model = merger.merge(
    interference_threshold=0.35,
    top_k_experts=2,
    output_path="./merged_model"
)
```

## CLI Usage

```bash
# Merge models using a config file
pacerkit merge --config configs/qwen_coder_merge.yaml

# Analyze interference between models
pacerkit analyze --models model1 model2 --output report.json
```

See notebooks/pacer_quickstart.ipynb for an interactive guide.
## Configuration

PacerKit uses YAML configuration files:

```yaml
project_name: "qwen-coder-merge"

models:
  - "fluently/FluentlyQwen3-Coder-4B-0909"
  - "SamuelBang/AesCoder-4B"

output:
  path: "./merged_model"
  save_format: "safetensors"

pacer:
  interference_threshold: 0.35
  top_k_experts: 2
  dropout_rate: 0.1
  anchor_strategy: "first"
  enable_moe_upcycle: true
```

See configs/ for more examples.
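The same config can also drive the Python API by hand. A minimal sketch, assuming `merge()` accepts every key under `pacer:` as a keyword argument (only `interference_threshold`, `top_k_experts`, and `output_path` are confirmed by the Quick Start above):

```python
# Sketch: load a PacerKit YAML config and call the Python API.
# Assumes merge() accepts the keys under `pacer:` as keyword arguments;
# only the three used in the Quick Start are confirmed.
import yaml
from pacerkit import PACERMerger

with open("configs/qwen_coder_merge.yaml") as f:
    cfg = yaml.safe_load(f)

merger = PACERMerger(cfg["models"])
merged_model = merger.merge(output_path=cfg["output"]["path"], **cfg["pacer"])
```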
## How It Works

PACER operates in three phases:
**Phase 1: Permutation Alignment.** Aligns the permutation symmetries of the N input models into a shared geometric basin using weight matching solved with the Hungarian algorithm.
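A toy sketch of one Git Re-Basin-style weight-matching step using `scipy.optimize.linear_sum_assignment` (the Hungarian solver); this is an illustration, not PACER's actual code:

```python
# Toy sketch of one weight-matching step (Git Re-Basin style), not
# PACER's actual code: align the output units of `w_b` to `w_a`.
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_layer(w_a: np.ndarray, w_b: np.ndarray) -> np.ndarray:
    """Permutation of w_b's output units that best matches w_a."""
    similarity = w_a @ w_b.T                 # unit-by-unit similarity
    _, perm = linear_sum_assignment(similarity, maximize=True)
    return perm

rng = np.random.default_rng(0)
w_a = rng.normal(size=(8, 16))
w_b = w_a[rng.permutation(8)]                # same units, shuffled order
perm = match_layer(w_a, w_b)
assert np.allclose(w_b[perm], w_a)           # shuffled units recovered
# A real pipeline also applies `perm` to the next layer's input
# dimension so the aligned network stays functionally identical.
```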
**Phase 2: Consensus Barycenter.** Computes the Fréchet mean of the aligned models to create a synthetic "base model", then calculates each model's deviation vector from that barycenter.
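Because alignment happens in ordinary Euclidean weight space, the Fréchet mean reduces to an element-wise average of the aligned weights. A minimal sketch (the helper below is hypothetical, not part of the pacerkit API):

```python
# Hypothetical helper, not pacerkit API: barycenter and deviation
# vectors over a list of permutation-aligned state_dicts.
import torch

def consensus_barycenter(aligned_state_dicts):
    keys = aligned_state_dicts[0].keys()
    # Euclidean Fréchet mean = element-wise average of aligned weights.
    barycenter = {k: torch.stack([sd[k] for sd in aligned_state_dicts]).mean(0)
                  for k in keys}
    # Deviation vector of each model from the synthetic base.
    deviations = [{k: sd[k] - barycenter[k] for k in keys}
                  for sd in aligned_state_dicts]
    return barycenter, deviations
```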
**Phase 3: Interference-Aware Routing.** Each layer is scored for conflict between the deviation vectors and handled accordingly (a decision sketch follows the list):

- Low-interference layers → DARE-TIES merge (0% parameter increase)
- High-interference layers → MoE upcycling with zero-shot routing
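One plausible interference score is the TIES-style sign-conflict mass between deviation vectors. The sketch below illustrates the per-layer decision under that assumption; it is not PACER's documented metric (see Methodology for the real definition):

```python
# Hypothetical sketch of the per-layer routing decision. The
# interference score here (TIES-style sign-conflict mass between two
# deviation vectors) is an assumption, not PACER's documented metric.
import torch

def layer_interference(dev_a: torch.Tensor, dev_b: torch.Tensor) -> float:
    conflict = (torch.sign(dev_a) * torch.sign(dev_b)) < 0   # opposing updates
    mass = dev_a.abs() + dev_b.abs()
    return (mass[conflict].sum() / mass.sum().clamp_min(1e-12)).item()

def route_layer(dev_a, dev_b, interference_threshold: float = 0.35) -> str:
    # Low conflict: fold both deviations into one dense layer (DARE-TIES).
    # High conflict: keep deviations as separate experts (MoE upcycling).
    score = layer_interference(dev_a, dev_b)
    return "dare_ties" if score < interference_threshold else "moe_upcycle"
```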
## Efficiency

Parameter counts are relative to a single input model; the "4x" column assumes four models are merged:

| Metric | Dense Ensemble (4x) | Standard MoE | PACER |
|---|---|---|---|
| Total Params | 400% | 400% | ~136% |
| Active Params | 400% | 100% | ~100% |
| Interference | None | Low | None |
## Documentation

- Methodology - Full technical details
- Configuration Reference - All config options
- API Reference - Python API documentation
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## Acknowledgements

Built on research from:
- Git Re-Basin (Ainsworth et al.)
- TIES-Merging (Yadav et al.)
- Token Merging (Bolya et al.)
- MergeME (Model Merging for MoEs)