Trope is a golang package for dealing with very large immutable arrays at decent performance with respect to random access edits.
It uses a rope-like tree structure (which is neither balanced by default, nor is it binary as with regular ropes).
Most rope-like structures perform poorly at small sizes. So, the package provides a Hybrid implementation which uses regular slices for small sizes and switches to the rope-like structure at configured high threshold to get the best of both worlds.
Trope is mainly focused on good performance under random edits (not just appends).
The benchmark is run via something like:
$> go test --bench=. -strlen=1000000
The following table shows the time (in microseconds) taken for each of the implementations for 100 random splice operations (on my machine). Trope performs at a relatively fixed cost without much dependency on the size of the array but this fixed cost only outweighs the raw slice performance at around 15k size.
String Size | 5k | 10k | 100k | 1M |
---|---|---|---|---|
Trope | 219 | 188 | 181 | 188 |
String | 70 | 134 | 1040 | 10195 |
Skiprope | 123 | 193 | 1240 | 12159 |
Rope | 233 | 256 | 452 | 2501 |
The memory usage is also a bit of a concern. Here too, the trope implementation has a much more steady usage that does not depend that much on the size of the string.
$ go test --bench=. --strlen=100000 -benchmem
goos: darwin
goarch: amd64
pkg: github.com/perdata/trope
BenchmarkTrope-4 10000 180376 ns/op 172648 B/op 1519 allocs/op
BenchmarkString-4 2000 1079424 ns/op 10084772 B/op 200 allocs/op
BenchmarkSkiprope-4 1000 1235641 ns/op 339567 B/op 3231 allocs/op
BenchmarkRope-4 3000 451715 ns/op 671616 B/op 4398 allocs/op
For string sizes larger than 100k, trope does about the same while other implementations get worse (either in B/op or in allocs/op):
$ go test --bench=. --strlen=1000000 -benchmem
goos: darwin
goarch: amd64
pkg: github.com/perdata/trope
BenchmarkTrope-4 10000 186289 ns/op 188648 B/op 1577 allocs/op
BenchmarkString-4 100 10645611 ns/op 100033109 B/op 200 allocs/op
BenchmarkSkiprope-4 100 11908668 ns/op 3331194 B/op 31355 allocs/op
BenchmarkRope-4 500 2436868 ns/op 4276162 B/op 4399 allocs/op
For relatively smaller string sizes, trope stays at roughly the same usage and only Skiprope is better.
$ go test --bench=. --strlen=10000 -benchmem
goos: darwin
goarch: amd64
pkg: github.com/perdata/trope
BenchmarkTrope-4 10000 193261 ns/op 182008 B/op 1602 allocs/op
BenchmarkString-4 10000 136390 ns/op 796433 B/op 200 allocs/op
BenchmarkSkiprope-4 10000 188041 ns/op 33977 B/op 415 allocs/op
BenchmarkRope-4 5000 267254 ns/op 323072 B/op 4584 allocs/op
All these combinations point to a custom "hybrid" method that uses flat strings for small strings and the actual tree structure only for large strings. Hence the Hybrid type which implements this idea
A Hybrid implementation which switches from string to trope.Node at 15k size and switches back at 10k size has following characterestics at 5k string size:
BenchmarkTrope-4 10000 220341 ns/op 211384 B/op 1871 allocs/op
BenchmarkHybrid-4 20000 88510 ns/op 292664 B/op 413 allocs/op
BenchmarkString-4 20000 71310 ns/op 283440 B/op 200 allocs/op
This basically tracks the string slice costs at low sizes. At 200k string sizes, the Hybrid performs similar to Trope basically getting the best of both worlds
BenchmarkTrope-4 10000 186928 ns/op 188264 B/op 1583 allocs/op
BenchmarkHybrid-4 10000 193612 ns/op 188264 B/op 1583 allocs/op
BenchmarkString-4 1000 2243631 ns/op 20226499 B/op 200 allocs/op