Bayesian Classification with Regularized Gaussian Models
Bayesian classifiers with regularized estimators for the class priors, mean vectors, and covariance matrices
This work presents a novel approach to reducing the effects of violations of the attribute independence assumption on which the Gaussian naive Bayes classifier is based. It introduces a Regularized Gaussian Bayes (RGB) algorithm that takes the correlation structure among variables into account when learning the class posterior probabilities. The proposed RGB classifier avoids overfitting by replacing the sample covariance estimate with well-conditioned regularized estimates. RGB thus aims to find the best trade-off between non-naivety and prediction accuracy.
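The idea in the paragraph above can be illustrated with a minimal sketch, which is not the project's implementation. It assumes a fixed, hand-picked shrinkage coefficient that pulls each class covariance toward a scaled identity matrix, whereas data-driven shrinkage estimators choose that coefficient automatically. The class name `RegularizedGaussianBayes` and the `shrinkage` parameter are hypothetical names introduced for this example.

```python
import numpy as np


class RegularizedGaussianBayes:
    """Gaussian Bayes classifier with one shrunk full covariance per class.

    Assumption: `shrinkage` is a fixed coefficient in (0, 1] chosen by hand.
    """

    def __init__(self, shrinkage=0.1):
        self.shrinkage = shrinkage

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        p = X.shape[1]
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            # Maximum-likelihood sample covariance of the class.
            cov = np.cov(Xc, rowvar=False, bias=True).reshape(p, p)
            # Convex combination with a scaled identity keeps it well-conditioned.
            target = np.trace(cov) / p * np.eye(p)
            cov = (1.0 - self.shrinkage) * cov + self.shrinkage * target
            log_det = np.linalg.slogdet(cov)[1]
            log_prior = np.log(len(Xc) / len(X))
            self.params_.append((log_prior, mean, np.linalg.inv(cov), log_det))
        return self

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for log_prior, mean, precision, log_det in self.params_:
            d = X - mean
            maha = np.einsum("ij,jk,ik->i", d, precision, d)
            # Log joint up to a constant shared by all classes.
            scores.append(log_prior - 0.5 * (log_det + maha))
        scores = np.column_stack(scores)
        scores -= scores.max(axis=1, keepdims=True)  # numerically stable softmax
        probs = np.exp(scores)
        return probs / probs.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


# Toy data with strongly correlated features, where independence is violated.
rng = np.random.default_rng(0)
cov_true = [[1.0, 0.9], [0.9, 1.0]]
X = np.vstack([
    rng.multivariate_normal([0.0, 0.0], cov_true, 200),
    rng.multivariate_normal([1.5, -1.5], cov_true, 200),
])
y = np.array([0] * 200 + [1] * 200)

model = RegularizedGaussianBayes(shrinkage=0.1).fit(X, y)
probs = model.predict_proba(X)
accuracy = np.mean(model.predict(X) == y)
```

Setting `shrinkage=1.0` reduces each class covariance to a scaled identity, while values near zero approach the unregularized sample covariance, so the coefficient controls how far the model moves away from the naive assumption.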
Moreover, Adaptive Boosting (AdaBoost) improves the accuracy and stability of RGB. The proposed Boosted RGB (BRGB) classifier sequentially fits RGB base classifiers to reweighted training data and combines them, by weighted vote, into a robust classifier. Classification experiments have shown that BRGB achieves prediction performance comparable to the best off-the-shelf ensemble-based architectures, such as Random Forests, Extremely Randomized Trees (ExtraTrees) and Gradient Boosting Machines (GBMs), while using only a few (10 to 20) base classifiers.
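The boosting scheme described above might look roughly like the following sketch. It assumes the multi-class SAMME variant of discrete AdaBoost and a weighted Gaussian base learner with a fixed shrinkage toward a scaled identity; the project may use a different AdaBoost variant or covariance estimator. All function names (`fit_gaussian`, `predict_gaussian`, `boost`, `boosted_predict`) are hypothetical.

```python
import numpy as np


def fit_gaussian(X, y, w, classes, shrinkage=0.1):
    """Weighted Gaussian Bayes fit; covariances are shrunk toward a scaled identity."""
    p = X.shape[1]
    params = []
    for c in classes:
        Xc, wc = X[y == c], w[y == c]
        log_prior = np.log(wc.sum() / w.sum())
        wc = wc / wc.sum()
        mean = wc @ Xc
        d = Xc - mean
        cov = (d * wc[:, None]).T @ d  # weighted sample covariance
        cov = (1 - shrinkage) * cov + shrinkage * np.trace(cov) / p * np.eye(p)
        cov += 1e-8 * np.eye(p)  # small numerical floor (an added safeguard)
        params.append((log_prior, mean, np.linalg.inv(cov), np.linalg.slogdet(cov)[1]))
    return params


def predict_gaussian(params, X, classes):
    scores = []
    for log_prior, mean, precision, log_det in params:
        d = X - mean
        maha = np.einsum("ij,jk,ik->i", d, precision, d)
        scores.append(log_prior - 0.5 * (log_det + maha))
    return classes[np.argmax(np.column_stack(scores), axis=1)]


def boost(X, y, n_rounds=10, shrinkage=0.1):
    """SAMME-style AdaBoost over weighted Gaussian base classifiers."""
    classes = np.unique(y)
    K = len(classes)
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        params = fit_gaussian(X, y, w, classes, shrinkage)
        miss = predict_gaussian(params, X, classes) != y
        err = w[miss].sum()
        if err >= 1 - 1 / K:  # no better than chance: stop adding learners
            break
        err_c = max(err, 1e-10)  # avoid division by zero for a perfect learner
        alpha = np.log((1 - err_c) / err_c) + np.log(K - 1)
        ensemble.append((alpha, params))
        if err == 0:
            break
        w = w * np.exp(alpha * miss)  # up-weight misclassified samples
        w /= w.sum()
    return classes, ensemble


def boosted_predict(classes, ensemble, X):
    votes = np.zeros((len(X), len(classes)))
    for alpha, params in ensemble:
        pred = predict_gaussian(params, X, classes)
        votes += alpha * (pred[:, None] == classes[None, :])
    return classes[np.argmax(votes, axis=1)]


# Toy data: class 0 is bimodal, so one Gaussian per class is only a rough fit.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal([-3.0, 0.0], 0.7, (100, 2)),
    rng.normal([3.0, 0.0], 0.7, (100, 2)),
    rng.normal([0.0, 0.0], 0.7, (200, 2)),
])
y = np.array([0] * 200 + [1] * 200)

classes, ensemble = boost(X, y, n_rounds=10)
accuracy = np.mean(boosted_predict(classes, ensemble, X) == y)
```

Each round refits the Gaussian model on reweighted samples, so later base classifiers concentrate on the points earlier ones misclassified, and the final prediction is a vote weighted by each classifier's `alpha`.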
Figure: BRGB decision boundary as boosting iterations proceed.
References
 Ledoit, Olivier, and Michael Wolf. "A well-conditioned estimator for large-dimensional covariance matrices." Journal of multivariate analysis 88.2 (2004): 365-411.
 Chen, Yilun, et al. "Shrinkage algorithms for MMSE covariance estimation." Signal Processing, IEEE Transactions on 58.10 (2010): 5016-5029.
 Schäfer, Juliane, and Korbinian Strimmer. "A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics." Statistical applications in genetics and molecular biology 4.1 (2005).
 Opgen-Rhein, Rainer, and Korbinian Strimmer. "Accurate ranking of differentially expressed genes by a distribution-free shrinkage approach." Statistical Applications in Genetics and Molecular Biology 6.1 (2007).
 Tipping, Michael E., and Christopher M. Bishop. "Probabilistic principal component analysis." Journal of the Royal Statistical Society: Series B (Statistical Methodology) 61.3 (1999): 611-622.
 Minka, Thomas P. "Automatic choice of dimensionality for PCA." NIPS. Vol. 13. 2000.
 Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis." Biostatistics 10.3 (2009): 515-534.
 Friedman, Jerome, Trevor Hastie, and Robert Tibshirani. "Sparse inverse covariance estimation with the graphical lasso." Biostatistics 9.3 (2008): 432-441.
 Hsieh, Cho-Jui, et al. "Sparse inverse covariance matrix estimation using quadratic approximation." Advances in Neural Information Processing Systems. 2011.
 Freund, Yoav, Robert Schapire, and N. Abe. "A short introduction to boosting." Journal of Japanese Society for Artificial Intelligence 14.5 (1999): 771-780.
 Schapire, Robert E., and Yoav Freund. "Boosting: Foundations and algorithms." MIT press, 2012.
 Niculescu-Mizil, Alexandru, and Rich Caruana. "Predicting good probabilities with supervised learning." Proceedings of the 22nd International Conference on Machine Learning. ACM, 2005: 625-632.