go version devel +214be5b302 Sun Mar 26 04:40:20 2017 +0000 linux/amd64
I was looking at profiles and noticed that strings.FieldsFunc was surprisingly hot.
In the following gist, I measure variations on splitting functions, including a proposed improvement of FieldsFunc that's 40% faster.
https://gist.github.com/gaal/497361737dd55125cf2de998b49d948b
Summarizing the results there:
BenchmarkFields/Fields-12                          500000    2327 ns/op    73.04 MB/s
BenchmarkFields/FieldsFuncUnicodeIsSpace-12       1000000    2294 ns/op    74.10 MB/s
BenchmarkFields/FieldsFuncAltUnicodeIsSpace-12    1000000    1394 ns/op   121.94 MB/s
BenchmarkFields/Split-12                          3000000     543 ns/op   312.93 MB/s
strings.Split does less work than Fields etc., so it's not surprising that it's faster; I include it as a bound of sorts on potential savings.
FieldsFunc's slowness comes in part from calling its predicate twice as often as it needs to: once to count the length of the output slice, and again to extract the fields. The alternate implementation is the same length in lines of code, but it records the offsets of the field spans in one pass and extracts the fields in a second pass that never calls the predicate. It does the span bookkeeping in a preallocated slice that does not escape the function (the capacity of 16 was chosen arbitrarily; presumably most clients will not have many inputs with more fields than that).