Skip to content

jirehhuang/phsl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

phsl: Partitioned Hybrid Structure Learning

Jireh Huang (jirehhuang@ucla.edu)

This package implements the Bayesian network structure learning algorithms developed in Huang and Zhou (2022), interfacing with the R package bnlearn. In particular, this package features the partitioned PC (pPC), p-value adjacency thresholding (PATH), and hybrid greedy initialization (HGI) algorithms, which culminate in the partitioned hybrid greedy search (pHGS) algorithm.

Installation

Run the following code to install the package from GitHub.

devtools::install_github("jirehhuang/phsl")

Usage

The pPC, PATH, and HGI algorithms are integrated into the bnsl() function, which is a mix-and-match Bayesian network structure learning function modeled after the rsmax2() function from the bnlearn package. The pPC algorithm is available as a constraint-based algorithm in the restrict argument, PATH may be specified by the path argument when restrict = "ppc", and HGI may be activated by the hgi argument for constraint-based and hybrid approaches. See help(bnsl) for more details and examples. Wrapper functions ppc() and phgs() contain presets to implement pPC with PATH and pHGS, respectively.

Scaling

Up to 1200 copies of the CANCER network were tiled to obtain the 12 networks with numbers of nodes and edges shown in the following table.

p |E|
5 4
25 27
50 56
125 140
250 273
500 558
1000 1088
2000 2249
3000 3342
4000 4467
5000 5638
6000 6653

Five datasets were generated for each network configuration with n = 1000 samples, and the following algorithms were executed on these datasets with alpha = 0.05 and max.sx = 3. bnlearn::pc.stable(...) is the bnlearn implementation of PC(-stable), phsl::ppc(..., max_groups = 1) is the phsl version by executing pPC with κ = 1, and phsl::ppc(...) is the pPC algorithm. The timing results are shown in the following figure.

While pPC and PC in phsl executed on the datasets with p = 6000 variables in approximately 80 minutes and 13 hours, respectively, the bnlearn implementation of PC struggled to execute on even 2000 variables, requiring over 50 hours, over 850 times slower than pPC.

References

Please cite the following paper when using any part of this package, modified or as is.

Huang, J., & Zhou, Q. (2022). Partitioned hybrid learning of Bayesian network structures. Machine Learning. https://doi.org/10.1007/s10994-022-06145-4

This package relies heavily on and borrows functionalities from the bnlearn package.

Scutari, M. (2010). Learning Bayesian Networks with the bnlearn R Package. Journal of Statistical Software, 35(3):1––22. http://www.jstatsoft.org/v35/i03/

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published