Distributed Score Computation (DiSC) is a scalable approach for fast, approximate score computation to learn multinomial Bayesian networks over distributed data

UMKC-BigDataLab/DiSC

DiSC

DiSC is a system for fast, approximate score computation for learning multinomial Bayesian networks on large-scale distributed data. It combines decentralized computation based on gossip algorithms, hashing techniques for load balancing, and a probabilistic approach that lowers resource consumption during score computation. DiSC significantly outperforms MapReduce-style approaches to score computation, which is a fundamental task in Bayesian network structure learning.
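
To give a feel for the gossip idea mentioned above, here is a minimal sketch of pairwise-averaging gossip in Python. It is not DiSC's implementation: the function name `gossip_average`, the node counts, and the choice of plain pairwise averaging are all assumptions made for illustration. Each simulated node starts with a local count (for example, how often one variable configuration appears in its data shard), repeatedly averages its value with a random peer, and ends up with an estimate of the global mean without any central coordinator.

```python
import random


def gossip_average(local_counts, rounds=50, seed=0):
    """Simulate pairwise-averaging gossip over len(local_counts) nodes.

    Hypothetical helper for illustration only; requires at least two nodes.
    Returns each node's final estimate of the global mean count.
    """
    rng = random.Random(seed)
    values = [float(c) for c in local_counts]
    n = len(values)
    for _ in range(rounds):
        for i in range(n):
            # Pick a random peer j different from i.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            # Both nodes replace their value with the pair's average,
            # so the sum over all nodes is preserved.
            avg = (values[i] + values[j]) / 2.0
            values[i] = values[j] = avg
    return values


# Made-up local counts, one per node.
local_counts = [120, 45, 300, 15, 220, 80, 10, 410]
estimates = gossip_average(local_counts)

# Each node can recover an approximate global count as mean * number of nodes.
approx_totals = [v * len(local_counts) for v in estimates]
true_total = sum(local_counts)
```

Under these assumptions every entry of `approx_totals` approaches `true_total` as the number of rounds grows, which is the kind of aggregate a score function would consume. A real deployment would also need to handle message loss, node churn, and the fact that nodes may not know the total node count.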

Publications

Praveen Rao, Anas Katib, Kobus Barnard, Charles Kamhoua, Kevin Kwiat, Laurent Njilla - Scalable Score Computation for Learning Multinomial Bayesian Networks Over Distributed Data. In the AAAI 2017 Workshop on Distributed Machine Learning (DML 2017), pages 498-504, San Francisco, CA, 2017. PDF

Anas Katib, Praveen Rao, Kobus Barnard, Charles Kamhoua - Fast Approximate Score Computation on Large-Scale Distributed Data for Learning Multinomial Bayesian Networks. In the ACM Transactions on Knowledge Discovery from Data (TKDD), 13(2):14:1-14:40, 2019. PDF

Arun Zachariah, Praveen Rao, Anas Katib, Monica Senapati, Kobus Barnard - A Gossip-Based System for Fast Approximate Score Computation in Multinomial Bayesian Networks. In the 35th IEEE International Conference on Data Engineering (ICDE), pages 1968-1971, Macau, China, 2019. PDF

Contributors

Faculty PI: Praveen Rao

PhD Students: Anas Katib, Arun Zachariah, Monica Senapati

Others: Kobus Barnard, Charles Kamhoua, Laurent Njilla, Kevin Kwiat

Acknowledgments

We would like to acknowledge the partial support of NSF Grant No. 1747751.
