This repository contains the research code used to run all experiments presented in the paper MuLoCo: Muon is a Practical Inner Optimizer for DiLoCo. Our code bundles the following popular packages:
- torchtitan - PyTorch native platform for training generative AI models
- torchft - Fault-tolerant training utilities
- lm-evaluation-harness - Language model evaluation framework
The Muon and AdamW implementations live in TorchTitan, while the outer optimizer and communication code is split between the outer optimizer classes and TorchFT. The code supports WandB logging and in-training evaluation via LM Eval Harness.
NOTE: Our experiments used a dataset that is proprietary to Meta and not publicly available. We have therefore reverted the code to the default TorchTitan dataloader, which is not the dataloader used in our experiments.
For instructions on launching training experiments on a SLURM cluster, see the torchtitan README. It covers configuration files, quick-start commands for DDP and DiLoCo/MuLoCo training, and how to reproduce the paper experiments.
- Python 3.11+
- UV package manager
- Rust (for building torchft)
- Protocol Buffers compiler (protoc) - required for torchft
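Before installing anything, it can help to see which of the prerequisites above are already on your PATH. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper name, and `python3` stands in for "Python 3.11+" without checking the interpreter version.

```shell
# Print the name of every given command that is not on PATH, one per line.
# (missing_tools is a hypothetical helper, not shipped with this repository.)
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools taken from the prerequisites list; the Python version is not checked here.
missing=$(missing_tools python3 uv rustc cargo protoc)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found on PATH."
fi
```

Anything reported as missing can be installed with the steps that follow.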
Install UV:

curl -LsSf https://astral.sh/uv/install.sh | sh

Run the provided installation script (no sudo required):
cd $MULOCO_PATH
./install_dependencies.sh

This script will:
- Install Rust via rustup (if not already installed)
- Download and install protoc from GitHub releases to ~/.local/
- Automatically detect your OS and architecture (Linux/macOS, x86_64/arm64)
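To illustrate the OS and architecture detection step, here is a rough sketch of how a platform could be mapped to a protoc release archive name. It is not the code in install_dependencies.sh: `protoc_asset` is a hypothetical function, and the archive naming is an assumption based on the names used on the protobuf GitHub releases page.

```shell
# Map an OS name and CPU architecture (as printed by `uname -s` and `uname -m`)
# to a protoc release archive name. Returns non-zero for unknown platforms.
# Hypothetical helper; the naming scheme is an assumption, verify it against
# the protobuf releases page before relying on it.
protoc_asset() {
    version=$1 os=$2 arch=$3
    case "$os" in
        Linux)  os_tag=linux ;;
        Darwin) os_tag=osx ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac
    case "$arch" in
        x86_64|amd64)  arch_tag=x86_64 ;;
        arm64|aarch64) arch_tag=aarch_64 ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac
    printf 'protoc-%s-%s-%s.zip\n' "$version" "$os_tag" "$arch_tag"
}

# Archive name for the current machine, using version 29.3 as in the
# manual installation steps.
protoc_asset 29.3 "$(uname -s)" "$(uname -m)" || true
```

The manual installation commands hard-code the Linux x86_64 archive, so a mapping like this is mainly useful on macOS or arm64 machines.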
After running, add the following to your ~/.bashrc or ~/.zshrc:
export PATH="$HOME/.local/bin:$PATH"
source "$HOME/.cargo/env"

Manual installation (alternative)
Rust:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

Protocol Buffers (protoc):
mkdir -p $HOME/.local/bin
PROTOC_VERSION=29.3
curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-x86_64.zip
unzip protoc-${PROTOC_VERSION}-linux-x86_64.zip -d $HOME/.local/protoc
ln -sf $HOME/.local/protoc/bin/protoc $HOME/.local/bin/protoc
export PATH="$HOME/.local/bin:$PATH"
rm protoc-${PROTOC_VERSION}-linux-x86_64.zip

export MULOCO_PATH=/path/to/MuLoCo
cd $MULOCO_PATH
uv sync

This will:
- Create a .venv virtual environment
- Install all dependencies from pyproject.toml
- Install local packages (torchtitan, torchft, lm-eval) in editable mode
Editable installations
The following local packages are installed in editable mode via [tool.uv.sources] in pyproject.toml:
| Package | Path | Description |
|---|---|---|
| torchtitan | torchtitan/ | PyTorch training framework |
| torchft | torchft/ | Fault-tolerant training |
| lm-eval | lm-evaluation-harness/ | Language model evaluation |
To manually install lm-evaluation-harness in editable mode (standalone):
cd $MULOCO_PATH
uv pip install -e lm-evaluation-harness

To install with optional dependencies (e.g., math tasks):
uv pip install -e "lm-evaluation-harness[math]"

cd $MULOCO_PATH
uv pip install -e "torchft[dev]"
uv pip install -e lm-evaluation-harness

export MULOCO_PATH=/path/to/MuLoCo
source $MULOCO_PATH/setup.sh

This will:
- Activate the UV virtual environment
- Configure PYTHONPATH for all repositories
- Set up WandB and HuggingFace credentials
- Configure Rust/Cargo if available
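As an illustration of the PYTHONPATH step, the sketch below shows one way the three bundled repositories could be added to PYTHONPATH. This is an assumption about what such a step might look like, not the contents of setup.sh; `muloco_pythonpath` is a hypothetical helper and `/path/to/MuLoCo` is the same placeholder used elsewhere in this README.

```shell
# Hypothetical helper: join the three bundled repositories into a
# PYTHONPATH-style string, given the repository root.
muloco_pythonpath() {
    root=$1
    printf '%s:%s:%s\n' "$root/torchtitan" "$root/torchft" "$root/lm-evaluation-harness"
}

# Fall back to the README placeholder when MULOCO_PATH is unset.
MULOCO_PATH=${MULOCO_PATH:-/path/to/MuLoCo}

# Prepend to any existing PYTHONPATH, avoiding a trailing colon when it is unset.
PYTHONPATH="$(muloco_pythonpath "$MULOCO_PATH")${PYTHONPATH:+:$PYTHONPATH}"
export PYTHONPATH
```

Prepending rather than overwriting keeps any entries already present in PYTHONPATH.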
If you only need the virtual environment without additional setup:
source $MULOCO_PATH/.venv/bin/activate

MuLoCo/
├── pyproject.toml # UV project configuration with all dependencies
├── setup.sh # Environment setup script
├── install_dependencies.sh # Rust and protoc installer (no sudo)
├── README.md # This file
├── torchtitan/ # PyTorch training framework
├── torchft/ # Fault-tolerant training
└── lm-evaluation-harness/ # LM evaluation
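A quick way to confirm that a checkout matches the layout above is to test for each top-level entry. The following is a minimal sketch; `missing_layout_entries` is a hypothetical helper that is not part of the repository.

```shell
# Print each expected top-level entry that is missing under the given root.
# (missing_layout_entries is a hypothetical helper for illustration.)
missing_layout_entries() {
    root=$1
    for entry in pyproject.toml setup.sh install_dependencies.sh \
                 torchtitan torchft lm-evaluation-harness; do
        [ -e "$root/$entry" ] || printf '%s\n' "$entry"
    done
}

# Check the repository root, defaulting to the current directory.
missing_layout_entries "${MULOCO_PATH:-.}"
```

Empty output means every listed file and directory was found.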
Install optional dependency groups as needed:
# Development tools
uv sync --extra dev
# Math evaluation tasks
uv sync --extra math
# IFEval tasks
uv sync --extra ifeval
# vLLM support
uv sync --extra vllm
# Multilingual support
uv sync --extra multilingual

If you see "ERROR: MULOCO_PATH environment variable is not set", run:
export MULOCO_PATH=/path/to/MuLoCo

If the .venv directory doesn't exist, create it with:
cd $MULOCO_PATH && uv sync

If you see errors about missing rustc, cargo, or protoc, run the installation script:
cd $MULOCO_PATH
./install_dependencies.sh
source $HOME/.cargo/env
export PATH="$HOME/.local/bin:$PATH"

Alternatively, set the PROTOC environment variable directly if protoc is installed elsewhere:
export PROTOC=/path/to/protoc

@article{therien2025muloco,
title={MuLoCo: Muon is a Practical Inner Optimizer for DiLoCo},
author={Therien, Benjamin and Huang, Xiaolong and Defazio, Aaron
and Rish, Irina and Belilovsky, Eugene},
journal={arXiv preprint arXiv:2505.23725},
year={2025},
url={https://arxiv.org/abs/2505.23725}
}

This project is licensed under the MIT License - see the LICENSE file for details.