C implementation of the SpeedyFx algorithm
.gitignore don't use rand() anymore Dec 31, 2012
README.md don't use rand() anymore Dec 31, 2012
speedyfx.c don't use rand() anymore Dec 31, 2012


SpeedyFx algorithm

Tokenize and hash large numbers of strings efficiently.

Original Java implementation ported to C by Stanislaw Pusep.

Compile with:

clang -O3 -o speedyfx speedyfx.c -lm

or:

gcc -O3 -o speedyfx speedyfx.c -lm

Then use as:

./speedyfx enwik9 > fv.bin

This generates a 128 KB feature vector for the enwik9 text file.
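For readers who want a feel for what "tokenize and hash into a feature vector" can look like, here is a minimal sketch. It is not taken from speedyfx.c, and it rests on several assumptions: the 128 KB vector is treated as a bitmap of 2^20 bits, tokens are runs of ASCII letters and digits, case is folded, and the per-character codes come from a simple linear congruential generator with a fixed seed. The names `code_table`, `init_code_table`, `build_feature_vector`, and `count_set_bits` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 128 KB feature vector = 2^20 bits, one bit per hashed token. */
#define FV_BITS  (1u << 20)
#define FV_BYTES (FV_BITS / 8)

/* Hypothetical lookup table: non-zero code for "word" bytes, 0 for separators. */
static uint32_t code_table[256];

/* Fill the table with reproducible pseudo-random codes (fixed-seed LCG).
 * Upper-case letters share the code of their lower-case form (case folding). */
static void init_code_table(void)
{
    static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz0123456789";
    uint32_t state = 0x12345678u;

    memset(code_table, 0, sizeof code_table);
    for (size_t i = 0; alphabet[i] != '\0'; i++) {
        unsigned char c = (unsigned char)alphabet[i];
        state = state * 1664525u + 1013904223u;
        code_table[c] = state | 1u;               /* force a non-zero code */
        if (c >= 'a' && c <= 'z')
            code_table[c - 'a' + 'A'] = code_table[c];
    }
}

/* Scan the text once: accumulate a rolling hash per token and set one bit
 * in fv (FV_BYTES long) whenever a token ends. */
static void build_feature_vector(const unsigned char *text, size_t len, uint8_t *fv)
{
    uint32_t hash = 0;
    int in_word = 0;

    assert(text != NULL && fv != NULL);
    for (size_t i = 0; i <= len; i++) {
        /* Treat the end of input as a separator so the last token is flushed. */
        uint32_t code = (i < len) ? code_table[text[i]] : 0;
        if (code != 0) {
            hash = (hash >> 1) + code;
            in_word = 1;
        } else if (in_word) {
            uint32_t bit = hash % FV_BITS;
            fv[bit >> 3] |= (uint8_t)(1u << (bit & 7));
            hash = 0;
            in_word = 0;
        }
    }
}

/* Count the bits set in the vector (handy for sanity checks). */
static size_t count_set_bits(const uint8_t *fv)
{
    size_t n = 0;
    for (size_t i = 0; i < FV_BYTES; i++)
        for (uint8_t b = fv[i]; b != 0; b &= (uint8_t)(b - 1))
            n++;
    return n;
}
```

Under these assumptions, a caller would run `init_code_table()` once, zero an `FV_BYTES`-sized buffer, pass the file contents to `build_feature_vector`, and write the buffer to standard output with `fwrite`. The table lookup keeps the inner loop to one memory read and a few arithmetic operations per input byte, which is the kind of design that makes high throughput plausible.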

Benchmark

Test data: https://cs.fit.edu/~mmahoney/compression/enwik9.bz2

Hardware: Intel(R) Xeon(R) CPU E5620 @ 2.40GHz

Average feature vector build speed: 213.83 MB/s
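
To put the figure in perspective, the uncompressed enwik9 file is 10^9 bytes. The small helper below converts a throughput into a run time; it is only an illustration, the function name `seconds_for` is hypothetical, and whether "MB" here means 10^6 or 2^20 bytes is not stated, so both are allowed for.

```c
#include <assert.h>

/* Seconds needed to process `bytes` at `mb_per_s`.
 * `bytes_per_mb` is 1e6 for decimal MB or 1048576 for binary MiB;
 * which one the benchmark used is an assumption the caller must make. */
static double seconds_for(double bytes, double mb_per_s, double bytes_per_mb)
{
    assert(mb_per_s > 0.0 && bytes_per_mb > 0.0);
    return bytes / (mb_per_s * bytes_per_mb);
}
```

Calling `seconds_for(1e9, 213.83, 1e6)` gives roughly 4.7 seconds, and `seconds_for(1e9, 213.83, 1048576.0)` gives roughly 4.5 seconds, so at the reported rate the whole file is processed in under five seconds either way.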