PACER is a causal discovery method that jointly learns a causal ordering (via the Plackett-Luce distribution) and edge probabilities (via Bernoulli gates) from large-scale interventional data.
Figure: PACER models a topological ordering of variables using a Plackett-Luce distribution. Nodes with higher weight are more likely to precede nodes with lower weight in downstream DAGs. Samples from this distribution induce complete DAGs, which are further filtered via samples from independent, edge-specific Bernoulli distributions. This defines our Bernoulli-Plackett-Luce distribution over DAGs. At train time, we sample multiple candidate graphs and score them based on a likelihood-based objective function. We then optimize the parameters of the Bernoulli-Plackett-Luce model using either an analytic estimator or REINFORCE gradient updates (see paper for more details).
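To make the sampling procedure concrete, here is a minimal NumPy sketch of drawing one DAG from a Bernoulli-Plackett-Luce distribution (illustrative only; `sample_dag`, `pl_logits`, and `edge_probs` are hypothetical names, not the PACER API). It uses the Gumbel-max trick: sorting Plackett-Luce scores perturbed by Gumbel noise yields an ordering distributed according to the Plackett-Luce model.

```python
import numpy as np

def sample_dag(pl_logits, edge_probs, rng):
    """Draw one DAG: Plackett-Luce ordering, then Bernoulli edge filtering."""
    d = pl_logits.shape[0]
    # Gumbel-max trick: argsort of perturbed scores ~ Plackett-Luce(exp(pl_logits))
    order = np.argsort(-(pl_logits + rng.gumbel(size=d)))
    rank = np.empty(d, dtype=int)
    rank[order] = np.arange(d)              # rank[i] = position of node i
    # Complete DAG consistent with the ordering: i -> j iff i precedes j
    complete = rank[:, None] < rank[None, :]
    # Filter edges with independent, edge-specific Bernoulli gates
    gates = rng.uniform(size=(d, d)) < edge_probs
    return (complete & gates).astype(int)

rng = np.random.default_rng(0)
adj = sample_dag(pl_logits=np.zeros(5), edge_probs=np.full((5, 5), 0.3), rng=rng)
```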
```bash
conda create -n pacer python=3.11
conda activate pacer

# JAX with GPU support (adjust the CUDA version as needed)
pip install "jax[cuda12]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
# Install PACER
pip install -e .
```

```python
from pacer import PACER
# x : (N, d) float — expression matrix
# masks : (N, d) float — 1 = not intervened, 0 = intervened on
# regimes : (N,) int — 0 = observational, k>0 = k-th interventional regime
pacer = PACER(
    n_vars=d,              # number of genes / variables
    n_layers=2,            # MLP depth
    hdim=4,                # MLP hidden dimension
    n_steps=5000,          # optimisation steps
    lr=1e-2,               # learning rate
    batch_size=64,         # batch size
    n_mc_samples=200,      # REINFORCE MC samples
    lambd=1.0,             # sparsity regularisation
    seed=0,                # random seed
    fit_analytic=False,    # analytic estimator (linear-Gaussian mechanisms only)
)
pacer.fit(x_train, masks_train, regimes_train,
          x_val=x_val, masks_val=masks_val, regimes_val=regimes_val)
edge_probs = pacer.predict_proba() # (d, d) — P(i → j)
pred_dag = pacer.predict(threshold=0.5)   # (d, d) binary adjacency
```
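Rather than committing to `threshold=0.5`, it can be informative to sweep the threshold over the returned probabilities; a minimal sketch using only the `predict_proba` output above:

```python
edge_probs = pacer.predict_proba()        # (d, d) edge probabilities
for t in (0.1, 0.3, 0.5, 0.7, 0.9):
    dag = (edge_probs > t).astype(int)    # binarise at threshold t
    print(f"threshold={t:.1f}: {int(dag.sum())} predicted edges")
```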
## Key hyperparameters

| Parameter | Default | Description |
|---|---|---|
| `n_layers` | 2 | MLP depth (0 = linear) |
| `hdim` | 4 | Hidden dimension |
| `n_steps` | 5000 | Gradient steps |
| `lr` | 1e-2 | Adam learning rate |
| `batch_size` | 64 | Mini-batch size |
| `n_mc_samples` | 200 | REINFORCE MC samples |
| `lambd` | 1.0 | Sparsity weight (larger → sparser graph) |
| `fit_analytic` | False | Use the analytic estimator instead of REINFORCE (currently supports linear-Gaussian mechanisms only) |
| `mask_TF_path` | None | Path to a TF-list TSV used to restrict parent candidates |
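If the data are (approximately) linear-Gaussian, the analytic estimator can replace REINFORCE. A minimal sketch reusing the constructor and `fit` call from the usage example above (the specific hyperparameter values are illustrative):

```python
pacer_lin = PACER(
    n_vars=d, n_layers=0,      # n_layers=0 => linear mechanisms
    hdim=4, n_steps=5000, lr=1e-2, batch_size=64,
    n_mc_samples=200, lambd=1.0, seed=0,
    fit_analytic=True,         # analytic estimator (linear-Gaussian only)
)
pacer_lin.fit(x_train, masks_train, regimes_train)
```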
Open `examples/demo_synthetic.ipynb` for a self-contained walkthrough that:
- generates a random DAG and linear-Gaussian interventional data
- fits PACER
- evaluates SHD / precision / recall (see the metrics sketch below)
- visualises the inferred vs. ground-truth network
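For reference, a minimal NumPy sketch of how these metrics can be computed from binary adjacency matrices (`shd` and `precision_recall` are illustrative helpers, not part of the PACER API):

```python
import numpy as np

def shd(pred, true):
    """Structural Hamming distance: insertions + deletions + reversals."""
    diff = np.abs(pred - true).sum()
    # A reversed edge shows up as one extra plus one missing entry; count it once
    reversed_edges = ((pred == 1) & (pred.T == 0) & (true == 0) & (true.T == 1)).sum()
    return int(diff - reversed_edges)

def precision_recall(pred, true):
    """Directed-edge precision and recall."""
    tp = ((pred == 1) & (true == 1)).sum()
    return tp / max(pred.sum(), 1), tp / max(true.sum(), 1)
```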
New synthetic datasets can be generated with the `generate_data.py` script from the DCDI repo:

```bash
cd dcdi/data/generation
python generate_data.py \
    --mechanism linear \
    --intervention-type structural \
    --initial-cause gaussian \
    --noise gaussian \
    --nb-nodes 20 \
    --expected-degree 4 \
    --nb-dag 3 \
    --nb-points 5000 \
    --rescale \
    --suffix "my_experiment" \
    --intervention \
    --obs-data
```

We welcome contributions. Here are a few extensions that would be great to implement:
- Imperfect interventions.
- Unknown intervention targets.
- Multivariate normal models (analytic estimator).
- General models (analytic estimator).
Please see the manuscript Appendix for an outline of these extensions.
```bibtex
@inproceedings{vinas2026pacer,
  title     = {PACER: Acyclic Causal Discovery from Large-scale Interventional Data},
  author    = {Vi{\~n}as Torn{\'e}, Ramon and F{\`a}bregas Salazar, S{\'i}lvia and Park, Soyon and Ban, Ivo Alexander and Gadetsky, Artyom and Doikov, Nikita and Brbi{\'c}, Maria},
  booktitle = {Proceedings of the 43rd International Conference on Machine Learning (ICML)},
  year      = {2026},
}
```