Simple Neighbors

Simple Neighbors is a clean and easy interface for performing nearest-neighbor lookups on items from a corpus. For example, here's how to find the most similar color to a color in the xkcd colors list:

>>> from simpleneighbors import SimpleNeighbors
>>> import json
>>> color_data = json.load(open('xkcd.json'))['colors']
>>> hex2int = lambda s: [int(s[n:n+2], 16) for n in range(1,7,2)]
>>> colors = [(item['color'], hex2int(item['hex'])) for item in color_data]
>>> sim = SimpleNeighbors(3)
>>> sim.feed(colors)
>>> sim.build()
>>> list(sim.neighbors('pink', 5))
['pink', 'bubblegum pink', 'pale magenta', 'dark mauve', 'light plum']

Read the documentation here: https://simpleneighbors.readthedocs.org.

Approximate nearest-neighbor lookups are a quick way to find the items in your data set that are closest to (or most similar to) any other item in your data, or to an arbitrary point in the space that your data defines. Your data items might be colors in an (R, G, B) space, sprites in an (X, Y) space, or word vectors in a 300-dimensional space.

You could always perform pairwise distance calculations to find nearest neighbors in your data, but for data of any appreciable size and complexity, this kind of calculation is unbearably slow, because every lookup has to compare the query against every item. This library uses Annoy behind the scenes for approximate nearest-neighbor lookups, which are ultimately a little less accurate than pairwise calculations but much, much faster.
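
For comparison, here is a minimal sketch of the exact, pairwise approach that approximate lookups replace. It is not part of Simple Neighbors: the function name `brute_force_neighbors`, the choice of Euclidean distance, and the handful of sample colors are all assumptions made for illustration (the sample values are not the xkcd data). It needs Python 3.8 or later for `math.dist`.

```python
import math

def brute_force_neighbors(items, query, n=5):
    """Return the labels of the n items closest to query (exact, Euclidean).

    items is a list of (label, vector) pairs. Every call measures the
    distance from query to every item, which is why this gets slow as
    the data grows.
    """
    ranked = sorted(items, key=lambda pair: math.dist(pair[1], query))
    return [label for label, _ in ranked[:n]]

# Illustrative sample data only, not the xkcd color list.
colors = [
    ("red", (255, 0, 0)),
    ("green", (0, 255, 0)),
    ("blue", (0, 0, 255)),
    ("pink", (255, 192, 203)),
    ("white", (255, 255, 255)),
]

# Look up the two colors nearest to an arbitrary point in RGB space.
nearest = brute_force_neighbors(colors, (250, 200, 210), n=2)
print(nearest)
```

With five items this is instant. With hundreds of thousands of high-dimensional vectors, the full scan on every query is the cost that an approximate index avoids.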

The library also keeps track of your data, sparing you the extra step of mapping each item in your data to its integer index in Annoy (at the potential cost of some redundancy in data storage, depending on your application).
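
To give a sense of the bookkeeping involved: Annoy identifies each item by an integer index, so without a wrapper you have to maintain a two-way mapping between your own items and those integers yourself. The sketch below shows roughly what that looks like. The `LabelIndex` class is hypothetical and is not how Simple Neighbors is necessarily implemented; it only illustrates the kind of code the library saves you from rewriting.

```python
class LabelIndex:
    """Hypothetical helper: maps hashable items to integer ids and back."""

    def __init__(self):
        self.id_to_item = []   # position in the list is the integer id
        self.item_to_id = {}   # reverse lookup from item to id

    def add(self, item):
        """Register item if it is new, and return its integer id."""
        if item not in self.item_to_id:
            self.item_to_id[item] = len(self.id_to_item)
            self.id_to_item.append(item)
        return self.item_to_id[item]

    def lookup(self, ids):
        """Translate integer ids (e.g. from an index query) back to items."""
        return [self.id_to_item[i] for i in ids]

labels = LabelIndex()
ids = [labels.add(name) for name in ["pink", "bubblegum pink", "pale magenta"]]

# Pretend an index query returned ids 2 and 0; map them back to names.
names = labels.lookup([2, 0])
print(ids, names)
```

Keeping both the list and the dictionary is the redundancy in data storage mentioned above: each item is held once for each direction of the mapping, in addition to whatever copy your application already has.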

I made Simple Neighbors because I use Annoy all the time and found myself writing and rewriting the same bits of wrapper code over and over again. I wanted to hide a little bit of the complexity of using Annoy to make it easier to build small prototypes and teach workshops using nearest-neighbor lookups.

Installation

Install with pip like so:

pip install simpleneighbors

You can also download the source code and install manually:

python setup.py install