spatialtree: Python module for spatial trees

Author: Brian McFee <email@example.com>
CREATED: 2011-11-13 16:12:29

This code is distributed under the GNU GPL license. See LICENSE for details,
or http://www.gnu.org/licenses/gpl-3.0.txt .

If you use this code for academic research, please cite the following
publication:

    McFee, B. and Lanckriet, G.R.G. Large-scale music similarity search with
    spatial trees. 12th International Society for Music Information Retrieval
    (ISMIR) conference, 2011.

INTRODUCTION
------------
This module provides a unified interface to constructing various flavors of
spatial tree data structures for accelerating approximate nearest-neighbor
retrieval in high-dimensional data.

The supported methods for generating spatial trees include:

    * KD-trees (maximum-variance)
    * PCA-trees
    * 2-means trees
    * Random projection trees

The methods listed above provide different rules for generating a recursive
partitioning of high-dimensional vector data.

The spatialtree package also provides support for spill trees, which use
redundant mappings to improve the accuracy of nearest-neighbor retrieval.
The spill tree functionality may be combined with any of the above rules.

Spatialtree supports indexing of raw vector/matrix data (in the form of numpy
arrays), or structured key-value stores.

Spatialtree is semi-dynamic, in that the tree may be pruned to a fixed height,
and data may be removed (and added, if using key-value stores), but the tree
does not re-balance.

For static data sets, we provide an efficient and light-weight inverted map
data structure for answering (approximate) nearest neighbor queries of items
within the set.

Several example programs are provided, demonstrating the various use-cases.
Class and method documentation is provided in doc-strings (pydoc spatialtree).

INSTALLATION
------------
From the command-line (as root/sudo):

    # python setup.py install

REQUIREMENTS
------------
This module depends on numpy and scipy.
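
To make the introduction's description of recursive partitioning and spill
trees more concrete, here is a minimal numpy-only sketch of a maximum-variance
KD split with an optional spill band. It does not use or reproduce the
spatialtree API: the names build_tree, retrieval_set and k_nearest, the dict
node layout, and the quantile-based definition of the spill band are all
assumptions made for illustration.

```python
import numpy as np


def build_tree(X, indices=None, height=3, spill=0.0, min_items=2):
    """Recursively partition the rows of X (hypothetical helper).

    Splits along the coordinate of largest variance. `spill` in [0, 1)
    is the fraction of points around the median sent to both children.
    """
    if indices is None:
        indices = np.arange(X.shape[0])
    node = {"indices": indices, "children": None}
    if height == 0 or len(indices) < min_items:
        return node

    data = X[indices]
    # KD rule (maximum variance): pick the most spread-out coordinate.
    axis = int(np.argmax(data.var(axis=0)))
    values = data[:, axis]

    # With spill == 0 both bounds equal the median and the children are
    # disjoint; with spill > 0 the children overlap in the band (lo, hi].
    lo = np.quantile(values, 0.5 - spill / 2.0)
    hi = np.quantile(values, 0.5 + spill / 2.0)
    left = indices[values <= hi]
    right = indices[values > lo]

    # Keep this node as a leaf if the split did not shrink both sides.
    if min(len(left), len(right)) == 0 or max(len(left), len(right)) == len(indices):
        return node

    node["axis"] = axis
    node["threshold"] = float(np.median(values))
    node["children"] = (
        build_tree(X, left, height - 1, spill, min_items),
        build_tree(X, right, height - 1, spill, min_items),
    )
    return node


def retrieval_set(node, x):
    """Descend to a single leaf (no backtracking) and return its indices."""
    while node["children"] is not None:
        left, right = node["children"]
        node = left if x[node["axis"]] <= node["threshold"] else right
    return node["indices"]


def k_nearest(X, tree, x, k=5):
    """Approximate k-NN: brute-force search restricted to one leaf."""
    candidates = retrieval_set(tree, x)
    distances = np.linalg.norm(X[candidates] - x, axis=1)
    return candidates[np.argsort(distances)[:k]]


# Usage on synthetic data (values are random, so no output is shown).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
tree = build_tree(X, height=3, spill=0.2)
neighbors = k_nearest(X, tree, X[0], k=3)
```

In this sketch a larger spill value makes the children overlap more, so a
query that falls near a split is less likely to miss its true neighbors, at
the cost of larger leaves to search. The other rules listed in the
introduction would differ only in how the split direction is chosen.
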

REFERENCES
----------
J.L. Bentley. Multidimensional binary search trees used for associative
searching. Commun. ACM, 18:509–517, Sep. 1975.

Nakul Verma, Samory Kpotufe, and Sanjoy Dasgupta. Which spatial partition
trees are adaptive to intrinsic dimension? In Uncertainty in Artificial
Intelligence, pages 565–574, 2009.

Sanjoy Dasgupta and Yoav Freund. Random projection trees and low dimensional
manifolds. In ACM Symposium on Theory of Computing, pages 537–546, 2008.

Ting Liu, Andrew W. Moore, Alexander Gray, and Ke Yang. An investigation of
practical approximate nearest neighbor algorithms. In NIPS, pages 825–832,
2005.