Skip to content

xmlou/linear-polytree-SEM

 
 

Repository files navigation

Linear polytree structure equation modeling

This repository contains the inference and simulation codes accompying the paper: X. Lou, Y. Hu, X. Li, Linear Polytree Structural Equation Models: Structural Learning and Inverse Correlation Estimation, arXiv, 2021.

Installation

The majority of the code is written in Python 3 along with several assisting R scripts.

For Python, it requires netwowrkx and graphviz, pydot (for visualizing graphs only) packages in addition to standard packages such as numpy, scipy.

For R, we use the package bnlearn for the implementation of the comparative hill-climbing algorithm and benchmark data.

The code also requires a \data and \figure sub-folders to store results.

Main files

infer_polytree.py contains the functions for learning the CPDAG from data based on a linear polytree model, as well as functions for randomly generate polytree models and visualizing the graphs.

example_infer_vs_hc.py gives a simple example of applying the polytree learning to a synthetic data, and compare the result with the hill climbing algorithm.

polytree_simulation_vs_hc_PC.py test the performance of the polytree learning algorithm on randomly generated polytree models under various parameters p, n, r_min ($\rho_\min$), din_max (see the paper for details). The code produces Fig.1 of the paper (using pre-computed simulation data run_id=3,4). Note, the code may take a while to run with a large number of n and ntrial.

DAG_simulation_vs_hc_PC.py test the performance of the polytree learning algorithm on randomly generated DAG models under various parameters p, n_edge. The code produces Fig.2 of the paper (using pre-computed simulation data run_id=2). Note, the code may take a while to run with a large number of n and ntrial.

asia_data.py applies the polytree learning to the benchmark data Asia, Lauritzen and Spiegelhalter (1988). It produces Fig.3 of the paper.

alarm_data.py applies the polytree learning to the benchmark data ALARM Beinlich et al. (1989). It produces Fig.4 of the paper.

Acknowledgement

The DAG benchmark ALARM data is downloaded from the online resources of the paper The Max-Min Hill-Climbing Bayesian Network Structure Learning Algorithm I. Tsamardinos, L. E. Brown, C. F. Aliferis, Machine Learning, 2006.

We use a simple Pyhton implementation of the Kruskal MST algorithm from Pedro Lobato@GitHub, kruskal.py. We found it is faster than the buildt-in function from networkx.

Citation

Please give citations to the paper: X. Lou, Y. Hu, X. Li, Linear Polytree Structural Equation Models: Structural Learning and Inverse Correlation Estimation, arXiv, 2021.

About

Efficient inference of directed graphical structures based on linear polytree models

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 93.6%
  • R 6.4%