Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: replace []rune(s)[0] to decoderune(s, 0) to reduce allocations #45260

Closed
tdakkota opened this issue Mar 27, 2021 · 7 comments
Closed
Labels

Comments

@tdakkota
Copy link

@tdakkota tdakkota commented Mar 27, 2021

What version of Go are you using (go version)?

$ go version
go version devel +359f44910f Fri Mar 26 19:40:37 2021 +0000 windows/amd64

Does this issue reproduce with the latest release?

Yes.

What did you do?

https://play.golang.org/p/Y81VmhfOpXV
and got

BenchmarkRune/DecodeRune-4         	272125996	         4.135 ns/op	       0 B/op	       0 allocs/op
BenchmarkRune/[]rune(s)[0]-4       	    9229	    110860 ns/op	  163840 B/op	       1 allocs/op

What did you expect to see?

Equal performance of DecodeRuneInString(s) and []rune(s)[0] without []rune allocation.

What did you see instead?

[]rune(s)[0] causes unnecessary allocation of rune slice using runtime.stringtoslicerune (https://go.godbolt.org/z/nb9538EvP). This expression just can be replaced by runtime.decoderune and bounds check.

@martisch
Copy link
Contributor

@martisch martisch commented Mar 27, 2021

I dont think this pattern is common enough to warrant special handling. Generally the unicode packages should be preferred over []rune. Otherwise we end up in a state were some []rune patterns are optimised and can be used and others arent or dont have equivalent []rune expressions.

All the optimisations on strings also create an asymmetry because equivalent byte slice expressions are not optimized.

Specific to this case I think its also not very practical to use []rune(s)[0] since it would need a check for an empty string to not cause index failures. Since to me its seems that check can be easily forgotten I think its better to use DecodeRuneInString. []rune(s)[0] also doesnt allow to differentiate the case of invalid encoding vs decoding a RuneError rune which is needed for string processing loops which try to decode a sequence of runes from a string.

As []rune(s)[0] has some new pitfalls and less utility than DecodeRuneInString I think we should not further encourage its usage by making that pattern faster.

Encouraging []rune(s)[0] over DecodeRuneInString also increases the performance differences between Go compilers as other Go compilers would also need to implement the optimisation. It seems better to me to just to keep on using the already fast version of DecodeRuneInString that doesnt need any special pattern but only generic code optimisations.

As a side note: If I will be allowed to remove compiler special handling of len([]rune) I would remove it. Which is I think the only special handling of this case.

Loading

@inliquid
Copy link

@inliquid inliquid commented Mar 27, 2021

Maybe offtopic, but let me ask.

As a side note: If I will be allowed to remove compiler special handling of len([]rune) I would remove it. Which is I think the only special handling of this case.

Why you think it's better to remove? I'm using this construct often and it has very good bench results (near the same as utf8.*) and readability.

Loading

@networkimprov
Copy link

@networkimprov networkimprov commented Mar 27, 2021

Possible vet warning for the []rune(s)[0] form?

cc @mdempsky

Loading

@guodongli-google
Copy link

@guodongli-google guodongli-google commented Mar 29, 2021

I did some code search. The number of occurrences of []rune(s)[?] is not super high, e.g. a couple hundreds in 5m lines of code, so it may merit a static checker in StaticCheck, but it is not a good candidate as a vet checker because:

  • It does not occur very often.
  • It is about performance rather than correctness.
  • s may be a short string such that the cost of rune(s) is very low. This issue is serious only if s is a very long string.

Loading

@mdempsky
Copy link
Member

@mdempsky mdempsky commented Mar 29, 2021

I agree that the benefits here for adding an optimization or vet check seem to small to justify the added maintenance cost of that code. It should be easy enough for someone to write their own linter for this using x/tools/go/analysis.

Loading

@mdempsky mdempsky closed this Mar 29, 2021
@timothy-king
Copy link
Contributor

@timothy-king timothy-king commented Mar 29, 2021

@dominikh staticcheck might be a possible alternative home.

(For more on criteria for vet checkers: https://golang.org/src/cmd/vet/README)

Loading

@dominikh
Copy link
Member

@dominikh dominikh commented Mar 30, 2021

Feel free to file an issue on Staticcheck at https://github.com/dominikh/go-tools/issues

Loading

quasilyte pushed a commit to go-critic/go-critic that referenced this issue Apr 29, 2021
Added a simple checker to find []rune("abc")[0] and suggest utf8.DecodeRuneInString("abc").
Expressions like

```
[]rune("abc")[0]
```

that cause unnecessary rune slice allocation.

See golang/go#45260
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
8 participants