Bayesian inference for Gaussian mixture models that reduces over-clustering via the powered Chinese restaurant process (pCRP). Posterior inference uses collapsed Gibbs sampling.
|-- GMM # base class for Gaussian mixture models
|---- IGMM # base class for infinite Gaussian mixture models
|------ CRPMM # traditional Chinese restaurant process (CRP) mixture model
|------ PCRPMM # powered Chinese restaurant process (pCRP) mixture model
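
For intuition, the difference between the two priors can be sketched as below. This is a minimal illustration, not the repo's code: the function name `pcrp_seating_probs` and its parameters are hypothetical, and it assumes the pCRP form described in Lu, Li, and Dunson (2018), where an existing cluster of size `n_k` is weighted by `n_k ** r` and a new cluster by `alpha`, so that `r = 1` recovers the traditional CRP.

```python
import numpy as np


def pcrp_seating_probs(counts, alpha=1.0, r=1.0):
    """Prior probabilities of joining each existing cluster or opening a new one.

    counts: sizes of the existing clusters (excluding the point being assigned).
    alpha:  concentration parameter (weight of a new cluster).
    r:      power applied to cluster sizes; r = 1 is the traditional CRP
            (assumed pCRP form, see the lead-in).
    Returns an array of length len(counts) + 1; the last entry is the new cluster.
    """
    counts = np.asarray(counts, dtype=float)
    weights = np.append(counts ** r, alpha)
    return weights / weights.sum()


crp = pcrp_seating_probs([8, 2], alpha=1.0, r=1.0)   # [8/11, 2/11, 1/11]
pcrp = pcrp_seating_probs([8, 2], alpha=1.0, r=2.0)  # [64/69, 4/69, 1/69]
```

With `r > 1` the small cluster and the new cluster both receive less prior mass than under the CRP, which is the mechanism that discourages over-clustering.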
What is included:
- Chinese restaurant process mixture model (CRPMM)
- Powered Chinese restaurant process (pCRP) mixture model
Code | Description |
---|---|
CRPMM 1d | Chinese restaurant process mixture model for 1d data |
CRPMM 2d | Chinese restaurant process mixture model for 2d data |
pCRPMM 1d | powered Chinese restaurant process mixture model for 1d data |
pCRPMM 2d | powered Chinese restaurant process mixture model for 2d data |
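
As a rough picture of what a collapsed Gibbs sweep for 1d data involves, here is a simplified sketch. It is not the implementation in this repo: the names `log_predictive` and `collapsed_gibbs_1d` are hypothetical, the observation variance is assumed known with a normal prior on each cluster mean (a full model would also place a prior on the variance), and the prior weights assume the pCRP form `n_k ** r` for existing clusters and `alpha` for a new one.

```python
import numpy as np


def log_predictive(x, n, s, m0=0.0, s0_sq=10.0, sigma_sq=1.0):
    """Log posterior predictive density of x for a cluster of n points with sum s.

    Assumes known variance sigma_sq and a N(m0, s0_sq) prior on the cluster mean.
    """
    prec = 1.0 / s0_sq + n / sigma_sq            # posterior precision of the mean
    mean = (m0 / s0_sq + s / sigma_sq) / prec    # posterior mean of the mean
    var = 1.0 / prec + sigma_sq                  # predictive variance
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)


def collapsed_gibbs_1d(x, alpha=1.0, r=1.0, n_iter=50, seed=0):
    """Return cluster labels after n_iter collapsed Gibbs sweeps (r = 1 is the CRP)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    z = np.zeros(len(x), dtype=int)              # start with a single cluster
    counts, sums = [len(x)], [x.sum()]
    for _ in range(n_iter):
        for i in range(len(x)):
            # Remove point i from its current cluster.
            k = z[i]
            counts[k] -= 1
            sums[k] -= x[i]
            if counts[k] == 0:
                # Drop the empty cluster and shift higher labels down by one.
                counts.pop(k)
                sums.pop(k)
                z[z > k] -= 1
            # Log weight = log prior (n_k ** r or alpha) + log predictive.
            logw = [r * np.log(n) + log_predictive(x[i], n, s)
                    for n, s in zip(counts, sums)]
            logw.append(np.log(alpha) + log_predictive(x[i], 0, 0.0))
            logw = np.array(logw)
            p = np.exp(logw - logw.max())
            p /= p.sum()
            k_new = rng.choice(len(p), p=p)
            if k_new == len(counts):             # open a new cluster
                counts.append(0)
                sums.append(0.0)
            z[i] = k_new
            counts[k_new] += 1
            sums[k_new] += x[i]
    return z
```

Calling `collapsed_gibbs_1d(x, r=1.0)` corresponds to the CRP prior, while a value such as `r=1.1` applies the powered prior; the cluster parameters never appear explicitly because they are integrated out, which is what makes the sampler "collapsed".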
Requirements: see requirements.txt
License: MIT
The repo is based on the following research articles:
- Lu, Jun, Meng Li, and David Dunson. "Reducing over-clustering via the powered Chinese restaurant process." arXiv preprint arXiv:1802.05392 (2018).
- H. Kamper, A. Jansen, S. King, and S. Goldwater, "Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings", in Proceedings of the IEEE Spoken Language Technology Workshop (SLT), 2014.
- Murphy, Kevin P. "Conjugate Bayesian analysis of the Gaussian distribution." Technical report, University of British Columbia (2007).
- Murphy, Kevin P. Machine learning: a probabilistic perspective. MIT press, 2012.
- Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of Machine Learning Research 12.Oct (2011): 2825-2830.
- Rasmussen, Carl Edward. "The infinite Gaussian mixture model." Advances in neural information processing systems. 2000.
- Tadesse, Mahlet G., Naijun Sha, and Marina Vannucci. "Bayesian variable selection in clustering high-dimensional data." Journal of the American Statistical Association 100.470 (2005): 602-617.