vpsearch - Fast Vantage-Point Tree Search for Sequence Databases

This package indexes a sequence database and queries it for nearest neighbors using vantage-point trees. For reasonably large databases, such as RDP, sequence lookups are typically 5-10 times faster than with other alignment-based lookup methods.

Vantage-point tree search compares sequences using global-to-global alignment, rather than the approximate seed-and-extend methods used by, for example, BLAST.
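To illustrate the general idea, here is a minimal sketch of a vantage-point tree in pure Python. It is not vpsearch's implementation: the names `edit_distance`, `VPNode`, `build` and `nearest` are hypothetical, and plain Levenshtein distance is assumed as a stand-in for the alignment-based distance the package uses. The sketch only shows how the triangle inequality lets an exact nearest-neighbor search skip parts of the database.

```python
import random


def edit_distance(a, b):
    """Levenshtein distance, a true metric (stand-in for an alignment-based distance)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # match or substitution
        prev = curr
    return prev[-1]


class VPNode:
    def __init__(self, point, radius, inside, outside):
        self.point = point      # the vantage point stored at this node
        self.radius = radius    # median distance from the vantage point
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build(points, rng):
    """Recursively build a vantage-point tree from a list of sequences."""
    points = list(points)
    if not points:
        return None
    vp = points.pop(rng.randrange(len(points)))  # pick a random vantage point
    if not points:
        return VPNode(vp, 0, None, None)
    dists = [edit_distance(vp, p) for p in points]
    radius = sorted(dists)[len(dists) // 2]
    inside = [p for p, d in zip(points, dists) if d <= radius]
    outside = [p for p, d in zip(points, dists) if d > radius]
    return VPNode(vp, radius, build(inside, rng), build(outside, rng))


def nearest(node, query, best=None):
    """Return (distance, sequence) for the sequence closest to the query."""
    if node is None:
        return best
    d = edit_distance(query, node.point)
    if best is None or d < best[0]:
        best = (d, node.point)
    # Descend first into the side of the boundary the query falls on.
    if d <= node.radius:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, best)
    # By the triangle inequality, every point on the other side is at least
    # |d - radius| away from the query, so skip it unless that could beat best.
    if abs(d - node.radius) < best[0]:
        best = nearest(second, query, best)
    return best


db = ["ACGTACGT", "ACGTTCGT", "TTGACCAA", "GGGTACGA", "ACGAACGT"]
tree = build(db, random.Random(0))
dist, hit = nearest(tree, "ACGTACGA")
print(dist, hit)
```

Because subtrees are skipped only when the triangle inequality rules them out, the search returns the same best distance as a brute-force scan over the whole database, usually with fewer distance computations.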

Installation and usage

VPsearch can be installed and updated through pip:

    pip install -U vpsearch

This will install a standalone command-line utility vpsearch into your environment, which can be used to build and query a sequence database. For more information on how to do so, see the documentation.

Citing vpsearch

If you use vpsearch, please cite our paper:

  • Joris Vankerschaver, Steven J. Kern, Robert Kern. VPsearch: fast exact sequence similarity search for genomic sequences. Journal of Open Source Software, 7(78), 4236, 2022. https://doi.org/10.21105/joss.04236

License

This package is licensed under the 3-clause BSD license.