1brc-go

Parse and aggregate 1B rows of text very quickly using Go for the 1BRC challenge.

Mainly done as an excuse to play around with pprof.

_{NOTE: Originally developed at elh/1brc-go and copied into gunnarmorling/1brc.}

Performance

UPDATE: with hyperfine parameter testing, got a 20% latency reduction by changing the read buffer size via PARSE_CHUNK_SIZE_MB.

make evaluate
# Benchmark 1: ./bin/1brc-go 2>&1
#   Time (mean ± σ):      4.665 s ±  0.028 s    [User: 39.196 s, System: 1.816 s]
#   Range (min … max):    4.634 s …  4.695 s    5 runs

Original Jan 14 testing [Outdated]

On Jan 14, as measured on my 2023 Apple M2 Pro Macbook with 1brc's evaluate.sh script, this performs very competitively against the top Java and Go submissions. At the very least this means, I have put in some effort optimizing this for the machine I have in front of me. There is a pretty funny lesson to be learned here that everyone's solution is the fastest on their own machine :)

Props to AlexanderYastrebov who is pushing on enabling language agnostic solutions in the leaderboard. Very curious now to see where this stacks up if I can run it on the test environment. I haven't looked at other Go solution code yet, but the discussion his started would motivate me to write my own.

#	Result (m:s.ms)	Implementation	JDK	Submitter	Notes
1	00:06.876	link	🔷 Go 1.21.5	Eugene Huang	Mine! 👈
2	00:07.602	link	🔷 Go 1.21.5	Corlin Palmer	Go implementation
3	00:13.765	link	21.0.1-graal	Roy van Rijn	GraalVM native binary
	00:13.989	link	21.0.1-graal	Artsiom Korzun
	00:14.044	link	21.0.1-graal	Thomas Wuerthinger	GraalVM native binary
	00:14.464	link	🔷 Go 1.21.5	Alexander Yastrebov	Go implementation
	00:14.839	link	21.0.1-graal	Marko Topolnik
	00:14.949	link	21.0.1-open	merykittyunsafe
	00:16.075	link	🔷 Go 1.21.5	Jan-Otto Kröpke	Go implementation

Usage

# builds the go executable and runs the following test
make run

# Result:
# ./bin/1brc-go  37.90s user 3.98s system 590% cpu 7.089 total
diff measurements.out <(time ./bin/1brc-go)

# takes an optional single command line arg for the measurements file path. If not provided, defaults to `measurements.txt`
# PROFILE is optional and enables profiling if "true"
go run main.go

Add a gitignore-d measurements.txt file in the project root or use command line arg to override the path.

Log

Times measured on 2023 Apple M2 Pro Macbook with 10 cores and 16GB RAM.

140s - Naive implementation. Reading lines with Scanner.Text.
83s - Using Scanner.ReadSlice to reduce a massive string allocation.
54s - Using file.Read directly and avoided some bytes helper functions to iterate bytes in a single scan.
45s - Using a far larger buffer to reduce IO latency. Removed a redundant map lookup. Using unsafe.String to avoid additional string allocs.
35s - Implemented a custom, simplified replacement for strconv.ParseFloat.
- Most of the time just goes into scanning byte slices. Biggest other bottleneck is map[string]*Stats lookups. Decently pleased with single threaded optimizations so now making it concurrent.
7s - Concurrently read and parse the file as byte index addressed chunks, then merge results in a single final goroutine. See tunable numParsers and parseChunkSize parameters.
5.3s - Select concurrency and read buffer size parameters for my machine using hyperfine parameter testing.
5.3s - Profile-guided optimization makes some compilation changes but does not affect runtime performance.
4.7s - Change concurrency and read buffer size parameters again. 14 goroutines (for 10 CPU cores) and 16MB parser chunk size.

Ideas

Optimize concurrency.
- Pipeline more work.
- ~~Tune concurrency and read buffer size parameters. I picked current values for my machine very quickly and unscientifically.~~ Picked w/ hyperfine
Custom map/set implementation for string->float
- try dolthub/swiss?
Memory arena (or other optimizations of locality)?
Faster way to iterate bytes?

Reading

https://github.com/DataDog/go-profiler-notes/blob/main/guide/README.md#cpu-profiler ⭐️
https://pkg.go.dev/bufio
https://pkg.go.dev/unsafe#String
https://go.dev/doc/pgo
https://go.dev/src/arena/arena.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

1brc-go

Performance

Original Jan 14 testing [Outdated]

Usage

Log

Ideas

Reading

Files

README.md

Latest commit

History

README.md

File metadata and controls

1brc-go

Performance

Original Jan 14 testing [Outdated]

Usage

Log

Ideas

Reading