facebookresearch/MuLoCo

MuLoCo

Muon is a Practical Inner Optimizer for DiLoCo

| Paper | Project Page | MuLoCo-1 Optimizer Code | Full Research Code | Tweet Thread |

arXiv License Python 3.11+ PyTorch 2.7+


Description

This repository contains the research code used to run all experiments presented in the paper MuLoCo: Muon is a Practical Inner Optimizer for DiLoCo. Our code bundles the following popular packages: TorchTitan, TorchFT, and LM Eval Harness.

The Muon and AdamW implementations live in TorchTitan, while the outer optimizer and communication code is split between the outer optimizer classes themselves and TorchFT. The code supports WandB logging and evaluation during training via LM Eval Harness.

NOTE: Our experiments used a proprietary dataset internal to Meta that is not publicly available. The released code therefore falls back to the default TorchTitan dataloader, which is not the data pipeline used in the paper.

Running Jobs

For instructions on launching training experiments on a SLURM cluster, see the torchtitan README. It covers configuration files, quick-start commands for DDP and DiLoCo/MuLoCo training, and how to reproduce the paper experiments.

Prerequisites

  • Python 3.11+
  • UV package manager
  • Rust (for building torchft)
  • Protocol Buffers compiler (protoc) - required for torchft

Installation

1. Install UV (if not already installed)

curl -LsSf https://astral.sh/uv/install.sh | sh

2. Install Rust and Protocol Buffers (required for torchft)

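Run the provided installation script (no sudo required). MULOCO_PATH must point to your MuLoCo checkout, so set it first if you have not already:

export MULOCO_PATH=/path/to/MuLoCo
cd $MULOCO_PATH
./install_dependencies.sh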

This script will:

  • Install Rust via rustup (if not already installed)
  • Download and install protoc from GitHub releases to ~/.local/
  • Automatically detect your OS and architecture (Linux/macOS, x86_64/arm64)

After running, add the following to your ~/.bashrc or ~/.zshrc:

export PATH="$HOME/.local/bin:$PATH"
source "$HOME/.cargo/env"

Manual installation (alternative)

Rust:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env

Protocol Buffers (protoc):

mkdir -p $HOME/.local/bin
PROTOC_VERSION=29.3
curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-x86_64.zip
unzip protoc-${PROTOC_VERSION}-linux-x86_64.zip -d $HOME/.local/protoc
ln -sf $HOME/.local/protoc/bin/protoc $HOME/.local/bin/protoc
export PATH="$HOME/.local/bin:$PATH"
rm protoc-${PROTOC_VERSION}-linux-x86_64.zip
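The manual commands above hard-code the linux-x86_64 archive. On other machines the archive name changes. The sketch below shows one way to derive it from uname output. The function name protoc_asset is hypothetical, and the mapping is an assumption based on how protobuf release archives are usually named, not the logic in install_dependencies.sh. Check the release page for your version before relying on it.

```shell
# Hypothetical helper: map an OS/architecture pair to a protoc release
# archive name. Arguments 2 and 3 default to the current machine.
protoc_asset() {
    version="$1"
    os="${2:-$(uname -s)}"
    arch="${3:-$(uname -m)}"

    case "$os" in
        Linux)  os_tag="linux" ;;
        Darwin) os_tag="osx" ;;
        *) echo "unsupported OS: $os" >&2; return 1 ;;
    esac

    case "$arch" in
        x86_64|amd64)  arch_tag="x86_64" ;;
        arm64|aarch64) arch_tag="aarch_64" ;;
        *) echo "unsupported architecture: $arch" >&2; return 1 ;;
    esac

    echo "protoc-${version}-${os_tag}-${arch_tag}.zip"
}

PROTOC_VERSION=29.3
# Print the archive name for this machine, or a message if it is unsupported.
protoc_asset "$PROTOC_VERSION" || echo "No known protoc archive for this platform"
```

The printed name can be substituted for protoc-${PROTOC_VERSION}-linux-x86_64.zip in the curl, unzip, and rm commands above.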

3. Set required environment variables

export MULOCO_PATH=/path/to/MuLoCo

4. Create and sync the virtual environment

cd $MULOCO_PATH
uv sync

This will:

  • Create a .venv virtual environment
  • Install all dependencies from pyproject.toml
  • Install local packages (torchtitan, torchft, lm-eval) in editable mode

Editable installations

The following local packages are installed in editable mode via [tool.uv.sources] in pyproject.toml:

| Package | Path | Description |
|---|---|---|
| torchtitan | torchtitan/ | PyTorch training framework |
| torchft | torchft/ | Fault-tolerant training |
| lm-eval | lm-evaluation-harness/ | Language model evaluation |

To manually install lm-evaluation-harness in editable mode (standalone):

cd $MULOCO_PATH
uv pip install -e lm-evaluation-harness

To install with optional dependencies (e.g., math tasks):

uv pip install -e "lm-evaluation-harness[math]"

5. Install TorchFT and LM Eval Harness

cd $MULOCO_PATH
uv pip install -e "torchft[dev]"
uv pip install -e lm-evaluation-harness

Usage

Activate the environment

export MULOCO_PATH=/path/to/MuLoCo
source $MULOCO_PATH/setup.sh

This will:

  • Activate the UV virtual environment
  • Configure PYTHONPATH for all repositories
  • Set up WandB and HuggingFace credentials
  • Configure Rust/Cargo if available

Manual activation (alternative)

If you only need the virtual environment without additional setup:

source $MULOCO_PATH/.venv/bin/activate
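To confirm which environment is active after either activation route, one option is to inspect VIRTUAL_ENV, which standard venv activate scripts export. This is a minimal sketch. The helper name venv_matches is hypothetical, and it does a plain string comparison, so a symlinked path or a trailing slash in MULOCO_PATH would not match.

```shell
# Hypothetical check: succeed only if the active virtual environment is the
# .venv directory directly under the given repository root.
venv_matches() {
    root="$1"
    [ -n "${VIRTUAL_ENV:-}" ] && [ "$VIRTUAL_ENV" = "$root/.venv" ]
}

if venv_matches "${MULOCO_PATH:-}"; then
    echo "MuLoCo virtual environment is active"
else
    echo "Active environment: ${VIRTUAL_ENV:-none}"
fi
```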

Directory Structure

MuLoCo/
├── pyproject.toml          # UV project configuration with all dependencies
├── setup.sh                # Environment setup script
├── install_dependencies.sh # Rust and protoc installer (no sudo)
├── README.md               # This file
├── torchtitan/             # PyTorch training framework
├── torchft/                # Fault-tolerant training
└── lm-evaluation-harness/  # LM evaluation

Optional Dependencies

Install optional dependency groups as needed:

# Development tools
uv sync --extra dev

# Math evaluation tasks
uv sync --extra math

# IFEval tasks
uv sync --extra ifeval

# vLLM support
uv sync --extra vllm

# Multilingual support
uv sync --extra multilingual
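The commands above each request a single group. By default, uv sync makes the environment match exactly what was requested, so syncing with a different extra may remove packages that an earlier extra installed. If you need several groups at once, the --extra flag can be repeated in one call. The sketch below builds such a command from a list of group names. The helper name uv_sync_cmd is hypothetical, and it only prints the command rather than running it.

```shell
# Hypothetical helper: build one `uv sync` command that requests every
# extra passed as an argument.
uv_sync_cmd() {
    cmd="uv sync"
    for extra in "$@"; do
        cmd="$cmd --extra $extra"
    done
    echo "$cmd"
}

# Prints: uv sync --extra dev --extra math
uv_sync_cmd dev math
```

Run the printed command from $MULOCO_PATH to apply it.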

Troubleshooting

MULOCO_PATH not set

If you see "ERROR: MULOCO_PATH environment variable is not set", run:

export MULOCO_PATH=/path/to/MuLoCo
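A variable that is set but points to the wrong directory produces less obvious failures. As a rough sanity check, the sketch below tests that the path contains pyproject.toml, which sits at the repository root. The function name check_muloco_path is hypothetical and is not part of setup.sh.

```shell
# Hypothetical guard: fail unless the given path looks like a MuLoCo checkout.
check_muloco_path() {
    path="$1"
    if [ -z "$path" ]; then
        echo "MULOCO_PATH is not set" >&2
        return 1
    fi
    if [ ! -f "$path/pyproject.toml" ]; then
        echo "no pyproject.toml under $path" >&2
        return 1
    fi
    echo "MULOCO_PATH looks valid: $path"
}

check_muloco_path "${MULOCO_PATH:-}" || echo "Fix MULOCO_PATH before sourcing setup.sh"
```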

Virtual environment not found

If the .venv directory doesn't exist, create it with:

cd $MULOCO_PATH && uv sync

Rust or protoc not found (for torchft)

If you see errors about missing rustc, cargo, or protoc, run the installation script:

cd $MULOCO_PATH
./install_dependencies.sh
source $HOME/.cargo/env
export PATH="$HOME/.local/bin:$PATH"

Alternatively, set the PROTOC environment variable directly if protoc is installed elsewhere:

export PROTOC=/path/to/protoc
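To see which tools are still missing from PATH before or after applying the fixes above, a small check like the following can help. It is a sketch using the POSIX command -v builtin. The function name missing_tools is hypothetical, and it does not consult the PROTOC variable, so a protoc configured only through PROTOC will still be reported as missing.

```shell
# Hypothetical helper: print the names of any listed tools not found on PATH.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space left by the first append.
    echo "${missing# }"
}

not_found=$(missing_tools uv rustc cargo protoc)
if [ -n "$not_found" ]; then
    echo "Missing from PATH: $not_found"
else
    echo "All required tools found."
fi
```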

Citation

@article{therien2025muloco,
    title={MuLoCo: Muon is a Practical Inner Optimizer for DiLoCo},
    author={Therien, Benjamin and Huang, Xiaolong and Defazio, Aaron
            and Rish, Irina and Belilovsky, Eugene},
    journal={arXiv preprint arXiv:2505.23725},
    year={2025},
    url={https://arxiv.org/abs/2505.23725}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
