TL;DR: Prometheus counters use atomic primitives, which are slow. Unless I did something stupid, a buffered counter implementation gives a potential speedup of 200x in hot loops across multiple goroutines, and up to 6x for "local" use. The buffered version approaches the "you can't do it faster" version.
This may also apply to other things like gauges.
Full version plus benchmarks: https://github.com/ppanyukov/go-bench/tree/master/counter