Implementation of "An Optimal Suffix Array Construction Algorithm" described in a Technical Report by Ge Nong
Go
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
.gitignore Initial commit Feb 23, 2013
LICENSE
README.md
common.go When possible, replace for loops with ranges. Feb 25, 2013
gosaca_test.go
level0.go When possible, replace for loops with ranges. Feb 25, 2013
level1.go

README.md

gosaca

Description

Pure Go implementation of An Optimal Suffix Array Construction Algorithm, a paper by Ge Nong.

Benchmarks

More extensive tests and benchmarks run on large copora are available as gosaca-bigtests so as not to balloon the size of this repo.

The following table compares sa-is with gosaca running on a 2012 Macbook Pro. The comparision is not really fair in many senses (SA-IS is an earlier, slightly less efficient algorithm, but it's implemented in optimized C code); here it is nonetheless. Times are in seconds.

File Size sa-is gosaca
chr22dna 34553758 5.893 10.807
etext99 105277340 21.507 41.862
gcc30tar 86630400 10.252 23.149
howto 39422105 6.050 12.123
jdk13c 69728899 6.567 19.807
linux245tar 116254720 14.635 32.696
rctail96 114711151 15.143 40.225
rfc 116421901 16.191 36.505
sprot34dat 109617186 17.485 39.164
w3c2 104201579 10.395 30.164
abac 200000 0.005 0.016
abba 10500600 0.646 2.326
book1x20 15375420 1.620 5.126
fib_s14930352 14930352 0.981 4.579
fss10 12078908 0.739 3.539
fss9 2851443 0.145 0.518
houston 3840000 0.102 0.288
paper5x80 981924 0.041 0.099
test1 2097152 0.096 0.236
test2 2097152 0.103 0.251
test3 2097152 0.094 0.206

Copyright

Copyright © John Gallagher. MIT License; see LICENSE for further details.