
Bregman Approximate Nearest Neighbours for Python

This is a Python wrapper for the (Decomposable) Bregman Approximate Nearest Neighbour (Bregman ANN, or BANN) package, adapted by Hubert Wagner and Tuyen Pham. The original ANN package was written by David Mount and Sunil Arya; it uses Kd-trees to search for nearest neighbours in Euclidean space equipped with an $L^{p}$-norm. See David Mount's page for documentation for the original ANN package.

For a Python wrapped version of the original ANN, see:

Bregman Divergences

Bregman divergences are generalized measures of distance on a space. Unlike metrics, they are typically asymmetric and do not globally satisfy the triangle inequality. These divergences have proven useful in machine learning, with the most prominent example being the Kullback--Leibler divergence.
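As a minimal illustration (a NumPy sketch, independent of the BANN package itself): a decomposable Bregman divergence is built coordinate-wise from a convex generator $F$ via $D_F(p\|q) = F(p) - F(q) - \langle\nabla F(q), p-q\rangle$. Taking $F(x)=\sum_i x_i\log x_i$ yields the Kullback--Leibler divergence, whose asymmetry is easy to see numerically:

```python
import numpy as np

def kl_divergence(p, q):
    """Generalized Kullback--Leibler divergence D_KL(p || q) for
    positive vectors; the -p + q terms vanish on the simplex."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q) - p + q))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.25, 0.25, 0.5])
# Asymmetry: D_KL(p || q) != D_KL(q || p) in general.
print(kl_divergence(p, q), kl_divergence(q, p))
```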

About

The BANN package currently uses Kd-trees for two primary computations:

  • (approximate) $k$-nearest neighbour searches with decomposable Bregman divergences
  • Bregman--Hausdorff divergence for decomposable Bregman divergences

Currently, this package supports the following divergences:

  • Kullback--Leibler divergence (primal and dual)
  • Itakura--Saito divergence (primal and dual)
  • Squared Euclidean divergence
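For concreteness, the listed divergences can be sketched coordinate-wise with NumPy (illustrative formulas only, not the package's implementation; the dual divergences arise from the Legendre transform of the generator):

```python
import numpy as np

def itakura_saito(p, q):
    """Itakura--Saito divergence, generator F(x) = -sum_i log x_i."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p / q - np.log(p / q) - 1.0))

def squared_euclidean(p, q):
    """Squared Euclidean divergence, generator F(x) = sum_i x_i^2
    (up to a constant factor); this one happens to be symmetric."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum((p - q) ** 2))

# The dual divergence swaps arguments under the Legendre transform F* of F:
# D_{F*}(grad F(p) || grad F(q)) = D_F(q || p).
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.25, 0.25, 0.5])
print(itakura_saito(p, q), squared_euclidean(p, q))
```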

Additional decomposable divergences can easily be added to the source code, and support for passing a custom divergence from Python is planned.

Details

Let $D_F:\Omega\times\Omega\to [0,\infty]$ be a decomposable Bregman divergence and let $P = \{p_n\}^N_{n=1}$, $Q = \{q_m\}^M_{m=1}$ be subsets of $\Omega$.

Bregman $k$-nn search

For $q\in Q$, the Bregman $k$-nearest neighbour search returns the ordered list of indices $(x_1,x_2,\dots,x_k)$, such that $D_F(q\|p_{x_1})\leq D_F(q\|p_{x_2})\leq\cdots\leq D_F(q\|p_{x_k})$ and $D_F(q\|p_{x_k})\leq D_F(q\|p_{\ell})$ for all $\ell\notin\{x_{1},x_{2},\dots,x_{k}\}$. As Bregman divergences are rarely symmetric, we can reverse the arguments as necessary.

This package also supports $\epsilon$-approximate nearest neighbour searches, where the divergence to the reported nearest neighbour is at most $(1+\epsilon)$-times the divergence to the true nearest neighbour.
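For reference, the exact search above can be sketched as a brute-force scan (a NumPy illustration, not the package's Kd-tree implementation; `divergence` stands in for any supported decomposable Bregman divergence, here KL):

```python
import numpy as np

def kl(q, p):
    """Generalized Kullback--Leibler divergence D_F(q || p)."""
    return float(np.sum(q * np.log(q / p) - q + p))

def bregman_knn(q, points, k, divergence=kl):
    """Return indices of the k points minimising divergence(q, p),
    ordered by increasing divergence; note the argument order matches
    the search direction D_F(q || p) in the text."""
    divs = np.array([divergence(q, p) for p in points])
    return np.argsort(divs)[:k]

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=100)  # sample points on the simplex
q = rng.dirichlet(np.ones(4))            # query point
print(bregman_knn(q, P, k=3))
```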

Further details of using Kd-trees with Bregman Divergences are discussed here.

Bregman--Hausdorff divergence

The Bregman--Hausdorff divergence generalizes the Bregman divergence between two vectors to the Bregman divergence between two sets of vectors. It was introduced by Pham, Dal Poz Kouřimská, and Wagner, who also provide algorithms for its computation. Specifically, we compute $$H_{D_F}(P\|Q) = \inf \{r\geq0 : P\subseteq\bigcup_{q\in Q}B_F(q;r)\}$$ and $$H'_{D_F}(P\|Q) = \inf \{r\geq0 : P\subseteq\bigcup_{q\in Q}B'_F(q;r)\}$$ via the shell algorithm, where $B_F(q;r)=\{x\in \Omega : D_F(q\|x)\leq r\}$ and $B'_F(q;r) = \{x\in \Omega : D_F(x\|q)\leq r\}$. Note that the directions of computation for the Bregman--Hausdorff divergences are reversed compared to the directions for the nearest neighbour searches.

The Bregman--Hausdorff divergence and the shell algorithm for its computation are introduced here.
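Unwinding the definition above, $p\in B_F(q;r)$ exactly when $D_F(q\|p)\leq r$, so $H_{D_F}(P\|Q)$ reduces to a max-min. A quadratic brute-force sketch (for illustration only; the package uses the shell algorithm, and `kl` is one choice of divergence):

```python
import numpy as np

def kl(a, b):
    """Generalized Kullback--Leibler divergence D_F(a || b)."""
    return float(np.sum(a * np.log(a / b) - a + b))

def bregman_hausdorff(P, Q, divergence=kl):
    """Brute-force H_{D_F}(P || Q): the smallest r such that every p in P
    lies in some ball B_F(q; r) = {x : D_F(q || x) <= r}, i.e.
    max over p in P of (min over q in Q of D_F(q || p))."""
    return max(min(divergence(q, p) for q in Q) for p in P)

rng = np.random.default_rng(1)
P = rng.dirichlet(np.ones(3), size=20)
Q = rng.dirichlet(np.ones(3), size=20)
print(bregman_hausdorff(P, Q))
```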

Further details

For further details and example uses, see the documentation.

Requirements

Python Version

BANN requires Python >=3.11.

Dependencies

Installation

Feedback

Bug reports, pull requests after forking, and other questions may be sent to the maintainer: tuyen.pham@ufl.edu

Copyright and License

See Copyright and License for copyright and license information.
