CoLiDE is a framework for learning linear directed acyclic graphs (DAGs) from observational data. Since DAG learning from observational data is NP-hard in general, recent efforts have advocated a continuous relaxation approach that offers an efficient means of exploring the space of DAGs. We propose a new convex score function for sparsity-aware learning of linear DAGs, which incorporates concomitant estimation of scale parameters to enhance DAG topology inference using continuous first-order optimization. We augment this least-squares score function with a smooth, nonconvex acyclicity penalty term to arrive at CoLiDE (Concomitant Linear DAG Estimation), a simple regression-based criterion that facilitates efficient computation of gradients and estimation of exogenous noise levels via closed-form expressions.
This is an official implementation of the following paper:
S. S. Saboksayr, G. Mateos, and M. Tepper, CoLiDE: Concomitant linear DAG estimation, Proc. Int. Conf. Learn. Representations (ICLR), Vienna, Austria, May 7-11, 2024.
If you find this code useful, please consider citing:
@inproceedings{saboksayr2024colide,
  title={{CoLiDE: Concomitant Linear DAG Estimation}},
  author={Saboksayr, Seyed Saman and Mateos, Gonzalo and Tepper, Mariano},
  booktitle={International Conference on Learning Representations},
  year={2024}
}
We recommend using a virtual environment via `virtualenv` or `conda`, and using `pip` to install the requirements:
$ pip install -r requirements.txt
- Python 3.7+
- `numpy`
- `scipy`
- `tqdm`
- `networkx`
The simplest way to try out CoLiDE is to run the included example:
$ python main.py --nodes 10 --edges 20 --samples 1000 --graph er --vartype ev --seed 0
This example first generates a random Erdős–Rényi DAG with 10 nodes and 20 edges. Then, using a linear SEM with equal-variance Gaussian noise, 1000 i.i.d. samples are drawn. Finally, both CoLiDE-EV and CoLiDE-NV are applied to the data, and the graph recovery performance is displayed.
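For readers who want to see what this pipeline does under the hood, here is a minimal, self-contained sketch of the data-generation step. This is illustrative code with our own naming and parameter choices (e.g., edge weights drawn from ±[0.5, 2] and unit noise variance), not the repository's utility code:

```python
import numpy as np
import networkx as nx

rng = np.random.default_rng(0)
d, m, n = 10, 20, 1000  # nodes, edges, samples

# Erdos-Renyi graph with expected edge count m, oriented into a DAG by
# pointing each edge from its lower- to its higher-indexed node.
p = m / (d * (d - 1) / 2)
G = nx.gnp_random_graph(d, p, seed=0)
W = np.zeros((d, d))
for i, j in G.edges():
    i, j = min(i, j), max(i, j)
    W[i, j] = rng.choice([-1.0, 1.0]) * rng.uniform(0.5, 2.0)  # edge weight

# Linear SEM x = W^T x + z with equal-variance Gaussian noise. Node j
# depends only on lower-indexed parents, so columns of X can be filled
# in index order.
X = np.zeros((n, d))
for j in range(d):
    X[:, j] = X @ W[:, j] + rng.normal(scale=1.0, size=n)
```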
Given the data matrix $\mathbf{X} \in \mathbb{R}^{n \times d}$, whose rows are i.i.d. samples from a linear SEM with weighted adjacency matrix $\mathbf{W} \in \mathbb{R}^{d \times d}$, the prototypical score-based formulation of linear DAG learning is

$$\min_{\mathbf{W}} \; \frac{1}{2n} \|\mathbf{X} - \mathbf{X}\mathbf{W}\|_F^2 + \lambda \|\mathbf{W}\|_1 \quad \text{subject to} \quad \mathbf{W} \in \mathbb{D},$$

where $\mathbb{D}$ is the (nonconvex, combinatorial) set of adjacency matrices of DAGs with $d$ nodes, and $\lambda \geq 0$ controls edge sparsity. Noteworthy methods advocate an exact acyclicity characterization using nonconvex, smooth functions $h: \mathbb{R}^{d \times d} \mapsto \mathbb{R}$ with $h(\mathbf{W}) = 0$ if and only if $\mathbf{W} \in \mathbb{D}$, replacing the combinatorial constraint with a continuous one amenable to first-order optimization. Minimizing the resulting penalized LS score has a known drawback: the optimal choice of $\lambda$ scales with the (unknown) noise level, so the regularization must be retuned whenever the noise variance changes.
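For concreteness, below is a minimal numpy sketch of two such acyclicity functions from the literature: the trace-exponential function popularized by NOTEARS and the log-determinant function of DAGMA (the one CoLiDE adopts, as discussed next). Both vanish exactly when `W` represents a DAG.

```python
import numpy as np
from scipy.linalg import expm

def h_expm(W):
    """Trace-exponential characterization (NOTEARS): tr(exp(W o W)) - d."""
    return np.trace(expm(W * W)) - W.shape[0]

def h_ldet(W, s=1.0):
    """Log-determinant characterization (DAGMA):
    d*log(s) - log det(s*I - W o W), valid when s exceeds the
    spectral radius of W o W."""
    d = W.shape[0]
    sign, logabsdet = np.linalg.slogdet(s * np.eye(d) - W * W)
    assert sign > 0, "W lies outside the feasible region for this s"
    return d * np.log(s) - logabsdet
```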
We start our exposition with a simple scenario whereby all exogenous variables share a common variance $\sigma^2$, the equal-variance (EV) setting. Inspired by concomitant scale estimation in sparse linear regression, CoLiDE-EV relies on the score

$$\mathcal{S}(\mathbf{W}, \sigma) = \frac{1}{2\sigma n} \|\mathbf{X} - \mathbf{X}\mathbf{W}\|_F^2 + \frac{d\sigma}{2} + \lambda \|\mathbf{W}\|_1.$$

Notably, the weighted, regularized LS score function $\mathcal{S}(\mathbf{W}, \sigma)$ is jointly convex in $\mathbf{W}$ and $\sigma$, and it decouples the sparsity parameter $\lambda$ from the unknown noise level. With regards to the choice of the acyclicity function, we select the log-determinant characterization

$$h_{\text{ldet}}(\mathbf{W}; s) = d \log s - \log \det(s\mathbf{I} - \mathbf{W} \circ \mathbf{W}),$$

where $\circ$ denotes the Hadamard (element-wise) product, and solve a sequence of subproblems

$$\min_{\mathbf{W}, \sigma} \; \mu_k \, \mathcal{S}(\mathbf{W}, \sigma) + h_{\text{ldet}}(\mathbf{W}; s_k), \qquad k = 0, 1, 2, \ldots,$$

where the schedule of hyperparameters $\mu_k \downarrow 0$ and $s_k > 0$ follows the homotopy scheme of DAGMA. CoLiDE-EV jointly estimates the noise level and the DAG topology: for fixed $\mathbf{W}$, the optimal scale is available in closed form as

$$\hat{\sigma} = \sqrt{\frac{1}{nd} \|\mathbf{X} - \mathbf{X}\mathbf{W}\|_F^2},$$

where $n$ is the number of samples and $d$ the number of nodes.
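To make the EV updates concrete, the following numpy sketch evaluates the score and the closed-form scale estimate (function names are ours for illustration, not the repository's API):

```python
import numpy as np

def colide_ev_score(X, W, sigma, lam):
    """CoLiDE-EV score S(W, sigma); lam is the sparsity weight."""
    n, d = X.shape
    resid = X - X @ W
    return (resid ** 2).sum() / (2 * sigma * n) + d * sigma / 2 \
        + lam * np.abs(W).sum()

def sigma_hat(X, W):
    """Closed-form minimizer of S(W, .) over sigma for fixed W:
    sigma* = sqrt(||X - XW||_F^2 / (n*d))."""
    n, d = X.shape
    return np.sqrt(((X - X @ W) ** 2).sum() / (n * d))
```

Within each subproblem, one can thus alternate cheap closed-form updates of $\sigma$ with first-order updates of $\mathbf{W}$.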
We also address the more challenging endeavor of learning DAGs in heteroscedastic scenarios, where noise variables have non-equal variances (NV). Collecting the per-node scales in $\mathbf{\Sigma} = \text{diag}(\sigma_1, \ldots, \sigma_d)$, CoLiDE-NV solves

$$\min_{\mathbf{W}, \mathbf{\Sigma}} \; \mu_k \left[ \frac{1}{2n} \text{Tr}\left( (\mathbf{X} - \mathbf{X}\mathbf{W}) \mathbf{\Sigma}^{-1} (\mathbf{X} - \mathbf{X}\mathbf{W})^\top \right) + \frac{1}{2} \text{Tr}(\mathbf{\Sigma}) + \lambda \|\mathbf{W}\|_1 \right] + h_{\text{ldet}}(\mathbf{W}; s_k).$$

Note that the bracketed score is still jointly convex, and the noise levels again admit a closed-form update for fixed $\mathbf{W}$, namely

$$\hat{\sigma}_i = \frac{1}{\sqrt{n}} \left\| \left[ \mathbf{X} - \mathbf{X}\mathbf{W} \right]_{:,i} \right\|_2, \qquad i = 1, \ldots, d,$$

where $[\cdot]_{:,i}$ denotes the $i$-th column of its matrix argument.
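A corresponding sketch for the NV closed-form scale updates (again, illustrative naming rather than the repository's API):

```python
import numpy as np

def sigma_hat_nv(X, W):
    """Closed-form per-node scales for fixed W:
    sigma_i* = ||i-th column of (X - XW)||_2 / sqrt(n)."""
    n = X.shape[0]
    resid = X - X @ W
    return np.sqrt((resid ** 2).sum(axis=0) / n)  # length-d vector
```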
We express our gratitude to the authors of the DAGMA repository for making their code available. A portion of our code is derived from their implementation, in particular the acyclicity function and the optimization scheme.