cmd/compile: optimize len([]rune(string)) to just count runes #24923
If it is used often and we decide to optimize it, a possible improvement would be to detect the pattern in the compiler's walk pass and replace it with a call to a new runecount runtime builtin function that only counts the runes.
For even better speed on non-ASCII input, the decoderune function could be inlined into runeCount and tuned so that it does not need to store the runes. Then utf8.RuneCountInString could be made to return len([]rune(s)) to use the same code. Also see: https://go-review.googlesource.com/c/go/+/33637
changed the title
cmd/compile: len([]rune(string)) performs worst than utf8.RuneCountInString
Apr 18, 2018
Here is the benchmark result
I find the result surprising, since, after b9a59d9, the correct method to count runes in a string now results in worse performance compared to the incorrect method.
Using utf8.RuneCountInString on a constant string is probably rare, so I'm not sure that optimizing this case is worth the effort. But since b9a59d9 the code is already there.
len([]rune()) being faster on constant strings predates b9a59d9: it has been the case since at least go1.4 (see benchmarks) and was not changed by cl/108985.
It is due to the compiler optimizing []rune(constantstring) at compile time.
Special-casing utf8.RuneCountInString in the same manner, by detecting the function name in the compiler, would mean that copying the utf8 code to another package results in a performance loss, which would also be surprising.
@martisch Thanks for the clarification. Special cases are no good, but the fact that the performance of len([]rune(s)) and utf8.RuneCountInString differs so much between constant and non-constant strings is probably no good either.
By the way, it seems there is a typo in the comment for the isRuneCount function.