GitHub - mnonnenm/vonMisesFisher_negativeEntropy: Python code for Bregman Clustering on the Hypersphere (i.e. mean-parameter expectation maximization for von Mises-Fisher mixtures)

Code repository for the study

"A solution for the mean parametrization of the von Mises-Fisher distribution"

This code repository contains an implementation of (gradient / Hessian of) the log-partition function and negative entropy function for the D-dimensional von Mises-Fisher distribution, alongside code for fitting mixtures of vMF distribution in both natural and mean parametrization, all in Python. Sparse matrix support allows fast execution in high-dimensional feature spaces when features are sparse, as is common in document embeddings such as TF-IDF.

We make use of the mean-parameterized form in Bregman clustering for document clustering in high-dimensional (>25k dim) text embedding spaces, for the classical document datasets '20-newsgroups' and 'classic3'.

The example notebook contains a small toy example of Bregman clustering that fits a mixture of von Mises-Fisher distribution in mean paramtrization to several hundred data points in $1000$ dimensions within a fraction of a second. The experiments notebook can be used to reproduce all figures and numerical results of the manuscript.

Background

The von Mises-Fisher distribution vMF($m, \kappa)$ in D dimensions is a natural exponential family

$\log p(x|\eta) = \eta^\top{}x - \Phi(\eta) + \log h(x) $

with base measure $h(x)$ uniform over the hypersphere $S^{D-1}$, natural parameter $\eta = \kappa m$ and log-partition function

$\Phi(\eta) = \Phi(||\eta||) = \log I_{D/2−1}(||\eta||) − (D/2 − 1) \log ||\eta|| + const.$

As a natural exponential family with strictly convex log-partition function, the vMF family of distributions can equivalently be written as

$\log p(x|\mu) = \nabla{}\Psi(\mu)^\top{}(x - \mu) + \Psi(\mu) + \log h(x)$

with 'negative entropy' function

$\Psi(\mu) = \max_\eta \ \mu^\top\eta - \Phi(\eta) = - H[X|\mu] + const.$

and mean parameter $\mu = \nabla\Phi(\eta) = \mathbb{E}[X]$. The mean-parameter form of exponential families is convenient for maximum likelihood or variants thereof such as expectation maximization, but the required function $\Psi(\mu)$ defined for all $||\mu|| < 1$ and its gradient are not known in closed form.

We derived a second-order ODE on $\Psi(||\mu||)$, the radial profile of $\Psi(\mu)$, which can be used to compute quantities $\Psi(\mu)$, $\nabla\Psi(\mu)$, $\nabla^2\Psi(\mu)$, and hence work with the von Mises-Fisher distribution in mean-parameterized exponential family form. We also provide fast closed-form approximations to the solutions of the ODE for large dimensions $D$.

Name		Name	Last commit message	Last commit date
Latest commit History 103 Commits
data		data
results		results
vMFne		vMFne
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
example_bregman_clustering.ipynb		example_bregman_clustering.ipynb
experiments.ipynb		experiments.ipynb
exps.py		exps.py
run_exps.py		run_exps.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Contents

Background

About

Releases

Packages

Languages

License

mnonnenm/vonMisesFisher_negativeEntropy

Folders and files

Latest commit

History

Repository files navigation

Contents

Background

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages