KDE-Based Ensemble Divergence Estimators (EnDive)

DOI: 10.3390/e20080560

Source code for the EnDive estimator of $f$-divergence functional integrals, based on the paper "Ensemble estimation of information divergence" (Moon et al., Entropy, 2018; DOI: 10.3390/e20080560). The code is currently only available in Matlab.

$f$-Divergence Functionals

Divergence functionals are integral functionals of two probability distributions. They play an important role in machine learning, information theory, statistics, and signal processing. In particular, divergence functionals can be related to the optimal (Bayes) probability of error of a classification problem, and divergences are often used to measure the dissimilarity between probability distributions.

While the paper referenced above covers general divergence functionals, this code is written for $f$-divergence functionals, which include the Kullback-Leibler (KL) divergence, the Rényi-$\alpha$ divergence integral, the Hellinger distance, the total variation distance, and the Henze-Penrose or $D_p$ divergence. $f$-divergence functionals have the form

$D_f(q,p)=\int f\left(\frac{q(x)}{p(x)}\right) p(x)dx,$

where $q$ and $p$ are probability densities and $f$ is some function. For $D_f$ to be a true divergence, $f$ must satisfy certain properties (e.g., $f$ is convex with $f(1)=0$).
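
For instance, choosing $f(t)=t\log t$ recovers the KL divergence:

$D_f(q,p)=\int \frac{q(x)}{p(x)}\log\left(\frac{q(x)}{p(x)}\right) p(x)dx=\int q(x)\log\left(\frac{q(x)}{p(x)}\right) dx=KL(q\,\|\,p).$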

Ensemble Divergence Estimation

EnDive computes an ensemble of kernel density estimates (KDEs) of the densities $p$ and $q$, using a different bandwidth value for each estimator. Plugging the KDEs into the divergence functional gives an ensemble of divergence estimators. The EnDive estimator then takes a weighted sum of the ensemble, where the weights are chosen to minimize the mean squared error (MSE). This results in an estimator that achieves an MSE rate of $O(1/N)$, where $N$ is the number of samples from each density.

The bandwidths can be provided by the user. Otherwise, the default is to compute the set of bandwidths based on the $k$-nearest neighbor distances.
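
The following Matlab snippet is a minimal illustrative sketch of the plug-in idea, not the EnDive code itself: it forms Gaussian KDE plug-in estimates of the KL divergence for a few assumed bandwidths and combines them with uniform weights, whereas EnDive computes the MSE-minimizing weights and selects the bandwidth set as described above.

```matlab
% Illustrative sketch only: a small ensemble of Gaussian-KDE plug-in
% estimators of the KL divergence, combined here with uniform weights.
% The actual EnDive estimator instead solves an optimization problem for
% the weights (to minimize the MSE) and builds the bandwidth set from
% k-nearest-neighbor distances; see the Matlab source in this repository.
rng(1);
N  = 2000;                         % number of samples from each density
Xp = randn(N,1);                   % samples from p = N(0,1)
Xq = 0.5 + randn(N,1);             % samples from q = N(0.5,1)

f  = @(t) t.*log(t);               % f(t) = t*log(t) corresponds to the KL divergence
hs = 0.25:0.25:1.5;                % assumed set of KDE bandwidths (the ensemble)
est = zeros(numel(hs),1);

for k = 1:numel(hs)
    h = hs(k);
    % Gaussian KDEs of p and q, evaluated at the samples drawn from p
    phat = mean(exp(-(Xp - Xp').^2 / (2*h^2)), 2) / (sqrt(2*pi)*h);
    qhat = mean(exp(-(Xp - Xq').^2 / (2*h^2)), 2) / (sqrt(2*pi)*h);
    % Plug-in estimate: D_f(q,p) ~ (1/N) * sum_i f( qhat(X_i)/phat(X_i) ), X_i ~ p
    est(k) = mean(f(qhat./phat));
end

w = ones(numel(hs),1) / numel(hs); % uniform weights (EnDive optimizes these)
fprintf('Ensemble KL estimate: %.3f (true value: 0.125)\n', w'*est);
```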

References

If you find this work useful, please cite:

@article{moon2018endive,
  title={Ensemble estimation of information divergence},
  author={Moon, Kevin R and Sricharan, Kumar and Greenewald, Kristjan and Hero, Alfred O},
  journal={Entropy},
  year={2018},
  volume={20},
  number={8},
  pages={560}
}

Other related papers that may be of interest:

[1] K. R. Moon, K. Sricharan, K. Greenewald, and A. O. Hero III, "Improving convergence of divergence functional ensemble estimators," IEEE International Symposium on Information Theory (ISIT), pp. 1133-1137, July 2016.

[2] K. R. Moon, V. Delouille, and A. O. Hero III, "Meta learning of bounds on the Bayes classifier error," IEEE Signal Processing and Signal Processing Education Workshop (SP/SPE), pp. 13-18, Aug. 2015.