Bayesian Classification with Regularized Gaussian Models

Bayesian classifiers with regularized estimators for the class priors, mean vectors, and covariance matrices

This work presents a novel approach to reducing the effect of violations of the attribute independence assumption on which the Gaussian naive Bayes classifier is based. It introduces a Regularized Gaussian Bayes (RGB) algorithm that takes the correlation structure among variables into account when learning the class posterior probabilities. To avoid overfitting, the RGB classifier replaces the sample covariance estimate with well-conditioned regularized estimates. RGB thus aims to find the best trade-off between non-naivety and prediction accuracy.
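The idea can be illustrated with a minimal NumPy sketch. It is not the implementation from the thesis: the class name `ShrunkGaussianBayes` and the fixed `shrinkage` parameter are assumptions made for illustration, and the covariance is simply shrunk toward a scaled identity matrix. Data-driven estimators such as Ledoit-Wolf would choose the shrinkage intensity automatically instead.

```python
import numpy as np


class ShrunkGaussianBayes:
    """Illustrative Gaussian Bayes classifier with a full, shrunk covariance per class."""

    def __init__(self, shrinkage=0.1):
        # Hypothetical fixed shrinkage intensity in [0, 1]; 1.0 gives a spherical model.
        self.shrinkage = shrinkage

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        d = X.shape[1]
        self.classes_ = np.unique(y)
        self.params_ = []
        for c in self.classes_:
            Xc = X[y == c]
            mean = Xc.mean(axis=0)
            # Maximum-likelihood sample covariance of the class.
            cov = np.atleast_2d(np.cov(Xc, rowvar=False, bias=True))
            # Convex combination with a scaled identity keeps the estimate well conditioned.
            target = (np.trace(cov) / d) * np.eye(d)
            cov = (1.0 - self.shrinkage) * cov + self.shrinkage * target
            _, log_det = np.linalg.slogdet(cov)
            log_prior = np.log(len(Xc) / len(X))
            self.params_.append((log_prior, mean, np.linalg.inv(cov), log_det))
        return self

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        scores = []
        for log_prior, mean, precision, log_det in self.params_:
            diff = X - mean
            maha = np.einsum("ij,jk,ik->i", diff, precision, diff)
            # Log joint density up to a constant shared by all classes.
            scores.append(log_prior - 0.5 * (log_det + maha))
        scores = np.column_stack(scores)
        scores -= scores.max(axis=1, keepdims=True)  # stabilize the softmax
        proba = np.exp(scores)
        return proba / proba.sum(axis=1, keepdims=True)

    def predict(self, X):
        return self.classes_[np.argmax(self.predict_proba(X), axis=1)]


# Synthetic two-class data with strongly correlated features.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]


def sample(n):
    X = np.vstack([rng.multivariate_normal([0, 0], cov, n),
                   rng.multivariate_normal([2, -2], cov, n)])
    return X, np.repeat([0, 1], n)


X_train, y_train = sample(200)
X_test, y_test = sample(200)
model = ShrunkGaussianBayes(shrinkage=0.1).fit(X_train, y_train)
proba = model.predict_proba(X_test)
accuracy = np.mean(model.predict(X_test) == y_test)
```

Setting `shrinkage=0.0` recovers the plain sample covariance, while larger values trade correlation information for a better-conditioned estimate.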

Adaptive Boosting (AdaBoost) further improves the accuracy and stability of RGB. The proposed Boosted RGB (BRGB) classifier builds a sequence of weighted RGB base classifiers and combines them into a single robust classifier. In classification experiments, BRGB achieved prediction performance comparable to the best off-the-shelf ensemble-based architectures, such as Random Forests, Extremely Randomized Trees (ExtraTrees), and Gradient Boosting Machines (GBMs), while using only a few (10 to 20) base classifiers.
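The boosting step can be sketched as follows. This is an illustration under stated assumptions rather than the BRGB algorithm from the thesis: it uses the discrete SAMME variant of AdaBoost, and the helper names (`fit_gaussian`, `predict_gaussian`, `boost`, `predict_boosted`) and the fixed `shrinkage` value are hypothetical. The base learner is a weighted Gaussian classifier whose covariance is shrunk toward a scaled identity.

```python
import numpy as np


def fit_gaussian(X, y_idx, w, n_classes, shrinkage=0.1):
    """Weighted Gaussian fit per class; w must be non-negative and sum to 1."""
    d = X.shape[1]
    params = []
    for c in range(n_classes):
        mask = y_idx == c
        prior = w[mask].sum()
        wc = w[mask] / prior
        mean = wc @ X[mask]
        centered = X[mask] - mean
        cov = (centered * wc[:, None]).T @ centered  # weighted sample covariance
        cov = (1.0 - shrinkage) * cov + shrinkage * (np.trace(cov) / d) * np.eye(d)
        _, log_det = np.linalg.slogdet(cov)
        params.append((np.log(prior), mean, np.linalg.inv(cov), log_det))
    return params


def predict_gaussian(params, X):
    """Index of the class with the highest log joint density."""
    scores = []
    for log_prior, mean, precision, log_det in params:
        diff = X - mean
        maha = np.einsum("ij,jk,ik->i", diff, precision, diff)
        scores.append(log_prior - 0.5 * (log_det + maha))
    return np.argmax(np.column_stack(scores), axis=1)


def boost(X, y, n_rounds=10, shrinkage=0.1):
    """Discrete SAMME AdaBoost over the weighted Gaussian base learner."""
    classes = np.unique(y)
    k = len(classes)
    y_idx = np.searchsorted(classes, y)
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        params = fit_gaussian(X, y_idx, w, k, shrinkage)
        miss = predict_gaussian(params, X) != y_idx
        err = max(w[miss].sum(), 1e-10)  # avoid log(0) for a perfect learner
        if err >= 1.0 - 1.0 / k:  # no better than chance: stop adding learners
            break
        alpha = np.log((1.0 - err) / err) + np.log(k - 1.0)
        ensemble.append((alpha, params))
        w = w * np.exp(alpha * miss)  # up-weight misclassified samples
        w /= w.sum()
    return classes, ensemble


def predict_boosted(classes, ensemble, X):
    """Weighted vote of all base learners."""
    votes = np.zeros((len(X), len(classes)))
    rows = np.arange(len(X))
    for alpha, params in ensemble:
        votes[rows, predict_gaussian(params, X)] += alpha
    return classes[np.argmax(votes, axis=1)]


# Synthetic two-class data with correlated features.
rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]
X = np.vstack([rng.multivariate_normal([0, 0], cov, 200),
               rng.multivariate_normal([2, -2], cov, 200)])
y = np.repeat([0, 1], 200)

classes, ensemble = boost(X, y, n_rounds=10)
train_accuracy = np.mean(predict_boosted(classes, ensemble, X) == y)
```

Because each base learner is refit on reweighted samples, later learners concentrate on the points that earlier ones misclassified, and the final prediction is a vote weighted by each learner's `alpha`.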

Figure: BRGB decision boundary as boosting iterations proceed (animation in cover_boostedRGB.gif).


davpinto/master-thesis

Folders and files

Latest commit

History

Repository files navigation

Bayesian Classification with Regularized Gaussian Models

References

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages