
investigate switch to bigcache #3

Merged
merged 1 commit into from
Dec 11, 2018
Conversation


@mroth (Owner) commented Dec 11, 2018

Investigation of bigcache performance, which should in theory speed up concurrent reads. I also like that it has a configurable hard maximum cache size, which could prevent the OOM killer from triggering in certain edge-case scenarios.
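For reference, the hard max size mentioned above is set via bigcache's `Config` struct. A configuration sketch, assuming the field names from bigcache's README (the values here are illustrative, not necessarily what this PR uses):

```go
package geominder

import (
	"time"

	"github.com/allegro/bigcache"
)

// newCache is a hypothetical constructor illustrating the config knobs
// discussed above; it is not part of this PR's diff.
func newCache() (*bigcache.BigCache, error) {
	// DefaultConfig takes the LifeWindow: how long entries live before eviction.
	cfg := bigcache.DefaultConfig(10 * time.Minute)
	// HardMaxCacheSize is a hard cap in MB. Once reached, bigcache overwrites
	// the oldest entries rather than allocating more memory, which is the
	// property that guards against the OOM-killer scenario.
	cfg.HardMaxCacheSize = 64
	return bigcache.NewBigCache(cfg)
}
```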

Initial results look fairly good:

benchmark                               old ns/op     new ns/op     delta
BenchmarkHTTPRequest-16                 3460          3424          -1.04%
BenchmarkHTTPRequestWithCache-16        611           597           -2.29%
BenchmarkHTTPRequestPar-16              459           442           -3.70%
BenchmarkHTTPRequestParWithCache-16     176           151           -14.20%

benchmark                               old allocs     new allocs     delta
BenchmarkHTTPRequest-16                 9              9              +0.00%
BenchmarkHTTPRequestWithCache-16        5              7              +40.00%
BenchmarkHTTPRequestPar-16              9              9              +0.00%
BenchmarkHTTPRequestParWithCache-16     5              7              +40.00%

benchmark                               old bytes     new bytes     delta
BenchmarkHTTPRequest-16                 840           839           -0.12%
BenchmarkHTTPRequestWithCache-16        661           789           +19.36%
BenchmarkHTTPRequestPar-16              911           1017          +11.64%
BenchmarkHTTPRequestParWithCache-16     790           982           +24.30%

This does add two memory allocations to the cache-hit case (consistent with the benchmarks in bigcache's README versus a map-based solution), but we see an overall 14% improvement in throughput in the parallel test, which seems good!

However, things get more interesting if I increase the benchtime significantly; testing here with it bumped up to 1m:

benchmark                               old ns/op     new ns/op     delta
BenchmarkHTTPRequest-16                 3457          3323          -3.88%
BenchmarkHTTPRequestWithCache-16        722           802           +11.08%
BenchmarkHTTPRequestPar-16              627           684           +9.09%
BenchmarkHTTPRequestParWithCache-16     896           973           +8.59%

benchmark                               old allocs     new allocs     delta
BenchmarkHTTPRequest-16                 9              9              +0.00%
BenchmarkHTTPRequestWithCache-16        5              7              +40.00%
BenchmarkHTTPRequestPar-16              9              9              +0.00%
BenchmarkHTTPRequestParWithCache-16     5              7              +40.00%

benchmark                               old bytes     new bytes     delta
BenchmarkHTTPRequest-16                 854           854           +0.00%
BenchmarkHTTPRequestWithCache-16        721           849           +17.75%
BenchmarkHTTPRequestPar-16              899           903           +0.44%
BenchmarkHTTPRequestParWithCache-16     721           853           +18.31%

It's interesting to note that in both cases (go-cache and bigcache), the overall parallel benchmark performance on this machine (8-core Xeon) becomes significantly worse at the longer benchtime. I suspected the cause might be garbage collection, but the cached version being disproportionately affected is confusing.


@mroth (Owner, Author) commented Dec 11, 2018

Trying again with GOMAXPROCS reduced so we still see parallelism but not to the same degree. (Hyperthreading issues?)

$ go test -bench=HTTPRequestPar -cpu 1,2,4,8,16,32 -benchtime 1s
goos: darwin
goarch: amd64
pkg: github.com/mroth/geominder
BenchmarkHTTPRequestPar                	  500000	      3309 ns/op
BenchmarkHTTPRequestPar-2              	 1000000	      1733 ns/op
BenchmarkHTTPRequestPar-4              	 2000000	       947 ns/op
BenchmarkHTTPRequestPar-8              	 2000000	       538 ns/op
BenchmarkHTTPRequestPar-16             	 3000000	       437 ns/op
BenchmarkHTTPRequestPar-32             	 3000000	       436 ns/op
BenchmarkHTTPRequestParWithCache       	 2000000	       602 ns/op
BenchmarkHTTPRequestParWithCache-2     	 3000000	       408 ns/op
BenchmarkHTTPRequestParWithCache-4     	 5000000	       269 ns/op
BenchmarkHTTPRequestParWithCache-8     	10000000	       164 ns/op
BenchmarkHTTPRequestParWithCache-16    	10000000	       131 ns/op
BenchmarkHTTPRequestParWithCache-32    	10000000	       117 ns/op
PASS
ok  	github.com/mroth/geominder	26.977s

$ go test -bench=HTTPRequestPar -cpu 1,2,4,8,16,32 -benchtime 10s
goos: darwin
goarch: amd64
pkg: github.com/mroth/geominder
BenchmarkHTTPRequestPar                	 5000000	      3340 ns/op
BenchmarkHTTPRequestPar-2              	10000000	      1779 ns/op
BenchmarkHTTPRequestPar-4              	20000000	       972 ns/op
BenchmarkHTTPRequestPar-8              	30000000	       528 ns/op
BenchmarkHTTPRequestPar-16             	30000000	       426 ns/op
BenchmarkHTTPRequestPar-32             	30000000	       418 ns/op
BenchmarkHTTPRequestParWithCache       	20000000	       624 ns/op
BenchmarkHTTPRequestParWithCache-2     	30000000	       360 ns/op
BenchmarkHTTPRequestParWithCache-4     	50000000	       248 ns/op
BenchmarkHTTPRequestParWithCache-8     	100000000	       292 ns/op
BenchmarkHTTPRequestParWithCache-16    	50000000	       332 ns/op
BenchmarkHTTPRequestParWithCache-32    	100000000	       365 ns/op
PASS
ok  	github.com/mroth/geominder	239.573s


@mroth (Owner, Author) commented Dec 11, 2018

Since this is at least an improvement, I'm going to go ahead and merge this for now, and make a mental TODO to investigate whether it's worth creating my own cache based around sync.Map that may be even more performant.

(Notes for that: I think the way to do it would essentially be to use an fnv hash of the key to avoid pointers in the GC sweep [see httpcache], and construct a simple FIFO queue to keep track of entries for sweeping out periodically. However, needing an RWMutex lock around the queue would create a potential bottleneck that undermines the favorable performance characteristics of sync.Map versus a plain mutex-coordinated map.)
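A minimal sketch of what that sync.Map idea might look like (hypothetical names; the FIFO eviction here is deliberately naive to make the queue-lock bottleneck visible):

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// fnvCache reduces string keys to uint64 FNV-1a hashes so the map holds
// no string pointers for the GC to chase. A mutex-guarded FIFO slice
// records insertion order so old entries can be swept out.
type fnvCache struct {
	m     sync.Map // uint64 -> []byte
	mu    sync.Mutex
	fifo  []uint64 // insertion order, guarded by mu
	limit int      // max entries before evicting the oldest
}

func newFNVCache(limit int) *fnvCache {
	return &fnvCache{limit: limit}
}

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func (c *fnvCache) Get(key string) ([]byte, bool) {
	v, ok := c.m.Load(hashKey(key))
	if !ok {
		return nil, false
	}
	return v.([]byte), true
}

func (c *fnvCache) Set(key string, val []byte) {
	k := hashKey(key)
	c.m.Store(k, val)

	// This is the potential bottleneck noted above: every Set serializes
	// on the queue lock, even though sync.Map itself would not.
	c.mu.Lock()
	c.fifo = append(c.fifo, k)
	for len(c.fifo) > c.limit {
		oldest := c.fifo[0]
		c.fifo = c.fifo[1:]
		c.m.Delete(oldest)
	}
	c.mu.Unlock()
}

func main() {
	c := newFNVCache(2)
	c.Set("a", []byte("1"))
	c.Set("b", []byte("2"))
	c.Set("c", []byte("3")) // exceeds limit: "a" is evicted FIFO-style
	_, ok := c.Get("a")
	fmt.Println(ok) // prints "false"
}
```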

@mroth mroth merged commit dc1b1d9 into master Dec 11, 2018
@mroth mroth deleted the bigcache branch December 17, 2018 21:46