
HolyBayes/ard-em


ARD EM

ARD (Automatic Relevance Determination) EM implementation in Python. The classical EM algorithm for fitting a mixture of normal distributions cannot determine the number of mixture components; that number has to be fixed in advance. ARD EM is an algorithm that determines the number of components automatically, based on the relevance vector method. The idea is to start with a deliberately excessive number of mixture components and then identify the relevant ones by maximizing the evidence. Experiments on model problems show that the number of clusters found either coincides with the true number or slightly exceeds it. In addition, the clustering produced by ARD EM is closer to the true one than that of analogs based on cross-validation and on the minimum description length criterion. ARD EM is a powerful and fast algorithm for Gaussian mixture learning and clustering when the number of components is unknown.
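The "start with too many components, then keep only the relevant ones" control flow can be illustrated with a toy sketch. This is not the ARD EM update from the paper: it swaps evidence maximization for a plain weight threshold inside ordinary 1-D EM, and the names `normal_pdf`, `em_with_pruning`, and `var_floor` are hypothetical. Plain EM may keep more components than an evidence-based criterion would, so treat it only as an illustration of the pruning loop.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_with_pruning(xs, n_init, n_iter=50, weight_bound=1e-3, var_floor=1e-3):
    """Toy 1-D EM: start with n_init components, drop those with tiny weight.

    Simplified stand-in (weight thresholding), not the ARD EM evidence update.
    """
    lo, hi = min(xs), max(xs)
    step = (hi - lo) / n_init
    # Deliberately excessive start: means spread evenly over the data range.
    means = [lo + step * (k + 0.5) for k in range(n_init)]
    variances = [max(step ** 2, var_floor)] * n_init
    weights = [1.0 / n_init] * n_init

    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
            total = sum(p)
            if total == 0.0:  # numerical underflow: fall back to uniform
                resp.append([1.0 / len(p)] * len(p))
            else:
                resp.append([pk / total for pk in p])

        # M-step, skipping components whose weight fell below weight_bound.
        kept = []
        for k in range(len(weights)):
            nk = sum(r[k] for r in resp)
            if nk == 0.0 or nk / len(xs) < weight_bound:
                continue  # component is pruned
            mean = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, xs)) / nk
            kept.append((nk, mean, max(var, var_floor)))

        # Renormalize the weights over the surviving components.
        total_nk = sum(nk for nk, _, _ in kept)
        weights = [nk / total_nk for nk, _, _ in kept]
        means = [m for _, m, _ in kept]
        variances = [v for _, _, v in kept]

    return weights, means, variances


# Two well-separated synthetic clusters, fitted with 8 initial components.
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(100)]
        + [rng.gauss(10.0, 1.0) for _ in range(100)])
weights, means, variances = em_with_pruning(data, n_init=8)
print(len(weights), [round(m, 2) for m in means])
```

The sketch only shows where pruning sits in the EM loop; the package itself decides relevance through the regularization parameter alpha rather than through the weight alone.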

Implementation

The implemented GaussianMixtureARD class has the same interface as scikit-learn's GaussianMixture, but with 3 additional parameters:

init_components="sqrt" # Initial number of components. sqrt(N) if "sqrt"
alpha_bound=1e3 # Drop all components with weight_reg (alpha) > alpha_bound
weight_bound=1e-3 # Drop all components with weight < weight_bound

and without the n_components parameter, since the number of components is determined automatically.
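One way to read the three parameters is sketched below. The helpers `initial_components` and `keep_mask` are hypothetical and are not part of the package; truncating sqrt(N) to an integer is an assumption, as is applying both drop rules together.

```python
import math


def initial_components(init_components, n_samples):
    """Resolve the starting number of components (hypothetical helper).

    Assumes "sqrt" means sqrt(N) truncated to an integer, with at least 1.
    """
    if init_components == "sqrt":
        return max(1, int(math.sqrt(n_samples)))
    return int(init_components)


def keep_mask(weights, alphas, alpha_bound=1e3, weight_bound=1e-3):
    """Flag components that survive both drop rules (hypothetical helper).

    A component is dropped if alpha > alpha_bound or weight < weight_bound.
    """
    return [a <= alpha_bound and w >= weight_bound
            for w, a in zip(weights, alphas)]


print(initial_components("sqrt", 400))  # 20
# Second component has alpha above the bound, third has weight below the bound.
print(keep_mask([0.5, 0.4995, 0.0005], [1.0, 2000.0, 5.0]))  # [True, False, False]
```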

Installation

Stable release

pip install ard-em

Development version

pip install git+https://github.com/Leensman/ard-em.git

Example

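import numpy as np
from ard_em import GaussianMixtureARD

# X must be an array of shape (n_samples, n_features); toy data for illustration
X = np.random.RandomState(0).normal(size=(500, 2))

gmm = GaussianMixtureARD()
gmm = gmm.fit(X)
print('Bayesian information criterion: ', gmm.bic(X))
best_n_components = gmm.n_components  # number of components kept after fitting
print('Best number of components: ', best_n_components)
labels = gmm.predict(X)  # component index for each sample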

For more examples, see GaussianMixture.ipynb.

Authors

Original paper

Contacts

Artem Ryzhikov, LAMBDA laboratory, Higher School of Economics, Yandex School of Data Analysis

E-mail: artemryzhikoff@yandex.ru

Linkedin: https://www.linkedin.com/in/artem-ryzhikov-2b6308103/

Link: https://www.hse.ru/org/persons/190912317

