pyrandomprojection
------------------
by Joseph Turian

Random projection library for Python. Converts a dictionary to a
low-dimensional numpy vector.

"Random projection is a simple geometric technique for reducing the
dimensionality of a set of points in Euclidean space while preserving
pairwise distances approximately"
    -Santosh Vempala <http://www-math.mit.edu/~vempala/rp.html>

See also:
    http://github.com/turian/random-indexing-wordrepresentations
    A more end-to-end package (rather than a drop-in function)
    specifically for inducing word representations over a corpus, using
    random indexing (a specific application of random projection).

WARNING:
    * Read the warnings in common.gaussian and common.deterministicrandom:
        http://github.com/turian/common/blob/master/gaussian.py
        http://github.com/turian/common/blob/master/deterministicrandom.py
      They indicate potential issues with the RNG and hash function.

NOTE:
    * The runtime is typically bounded by the number of calls to
      randomrow. If you have a single instance, simply call project()
      on it. If you have a list of instances, you shouldn't just call
      project() on each instance, because that costs one randomrow call
      per nonzero entry. Instead, iterate over the columns (i.e. iterate
      in feature-major order, not instance-major order), so you only do
      one randomrow op per feature type. There is example code that
      does this here:
        http://github.com/glorotxa/DeepANN/blob/master/exp_scripts/randomprojection.py

#import math
#import murmur

TODO:
    Need a test suite to make sure that values are consistent across
    architectures.

REQUIREMENTS:
    * numpy
        Output is put in a (dense) numpy vector.
    * My python common library:
        http://github.com/turian/common
      which in turn will require:
        * Murmur:
            easy_install murmur
            http://pypi.python.org/pypi/Murmur/
            Provides fast murmur hashes for strings, files, and zipped
            files.
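
EXAMPLE (sketch):
    The feature-major iteration described in the NOTE above can be
    sketched roughly as follows. This is not the library's code:
    random_row() is a hypothetical stand-in for randomrow that seeds
    numpy's generator from a SHA-256 hash of the feature name (the
    library itself relies on Murmur and its own Gaussian code), and
    both function names and the call counters are illustrative
    assumptions only.

```python
import hashlib

import numpy as np


def random_row(feature, dim):
    """Hypothetical stand-in for randomrow: a deterministic Gaussian
    row derived from the feature name (assumption: SHA-256 seeding)."""
    digest = hashlib.sha256(feature.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "little")
    return np.random.default_rng(seed).standard_normal(dim)


def project_instance_major(instances, dim):
    """Naive approach: one random_row call per nonzero entry."""
    out = np.zeros((len(instances), dim))
    calls = 0
    for i, instance in enumerate(instances):
        for feature, value in instance.items():
            out[i] += value * random_row(feature, dim)
            calls += 1
    return out, calls


def project_feature_major(instances, dim):
    """Feature-major approach: one random_row call per feature type."""
    # Group the nonzero entries by feature (i.e. by column).
    columns = {}
    for i, instance in enumerate(instances):
        for feature, value in instance.items():
            columns.setdefault(feature, []).append((i, value))

    out = np.zeros((len(instances), dim))
    calls = 0
    for feature, entries in columns.items():
        row = random_row(feature, dim)  # generated once, reused below
        calls += 1
        for i, value in entries:
            out[i] += value * row
    return out, calls
```

    Both functions produce the same projections. With instances that
    share features, the feature-major version makes fewer random_row
    calls (one per distinct feature instead of one per nonzero entry).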