-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor toNormalisedLower
: shorter and slightly faster.
#14299
Refactor toNormalisedLower
: shorter and slightly faster.
#14299
Conversation
toNormalisedLower
: shorter and slightly faster.toNormalisedLower
: shorter and ~slightly faster~.
toNormalisedLower
: shorter and ~slightly faster~.toNormalisedLower
: shorter and slightly faster.
1654677
to
f0439f0
Compare
This is ready for review. |
@@ -215,3 +216,7 @@ func contains(s []Label, n string) bool { | |||
} | |||
return false | |||
} | |||
|
|||
func yoloString(b []byte) string { | |||
return *((*string)(unsafe.Pointer(&b))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would using string(buf) be more efficient and safer, or are we opting for *((*string)(unsafe.Pointer(&b))) solely to avoid an allocation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're using the trick to avoid an allocation (and also avoid copying the contents, which would happen if we did string(buf)
)
"FOO": "foo", | ||
"Foo": "foo", | ||
"foO": "foo", | ||
"fOo": "foo", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to incorporate a test case into both TestToNormalisedLower and BenchmarkToNormalizedLower? The test case should include all uppercase ASCII characters, with the final character being non-ASCII.
Something like "AAAAſ".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I benchmarked this:
b.Run("ascii uppercase then utf8", func(b *testing.B) {
inputs := make([]string, 10)
for i := range inputs {
inputs[i] = benchCase(9, "all", true, i) + "Ж"
}
b.ResetTimer()
for n := 0; n < b.N; n++ {
toNormalisedLower(inputs[n%len(inputs)])
}
})
And the results were:
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
│ main_asciithenunicode │ new_asciithenunicode │
│ sec/op │ sec/op vs base │
ToNormalizedLower/ascii_uppercase_then_utf8-12 85.16n ± 0% 84.91n ± 1% ~ (p=0.221 n=6)
│ main_asciithenunicode │ new_asciithenunicode │
│ B/op │ B/op vs base │
ToNormalizedLower/ascii_uppercase_then_utf8-12 32.00 ± 0% 32.00 ± 0% ~ (p=1.000 n=6) ¹
¹ all samples are equal
│ main_asciithenunicode │ new_asciithenunicode │
│ allocs/op │ allocs/op vs base │
ToNormalizedLower/ascii_uppercase_then_utf8-12 2.000 ± 0% 2.000 ± 0% ~ (p=1.000 n=6) ¹
¹ all samples are equal
Given that fitting this benchmark into current benchmarks structure would require a refactor, and it's not measuring any difference, I'd prefer to leave the benchmarks unmodified in this PR for the sake of clarity.
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
There's now in common code without buildtags. Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
818419d
to
4340269
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Two learnings from this PR:
- utf8.RuneSelf lets you know if a character is single byte or multibyte
- our yoloString method to avoid copying the string. Wondering how different the performance would be if we used strings.Builder instead.
I agree method is more readable now, used split view to compare 👍
|
This is a follow up on #14170
TL;DR, this version of code is much shorter, easier to understand (IMO) and it's also faster!
Old description, in case you want to understand why there are so many commits here
In the series of "optimising methods nobody cares about", I spent some optimising this one. I started from removing the redundant `isASCII` check, but I felt I couldn't just sent a PR on that so I tried to make it shorter and hopefully faster.I tried several approaches, some with trade-off, finally ended in a version that optimizes the performance for an already-lowercase string, while not penalizing the "uppercase" in more than 20%.
Expand to see the benchmarks results
All in all, I think that the new version is not just shorter, but also easier to read because of that: it fits in one screen, also some things I changed:
written
topos
: it's more descriptive, IMO.written
to the scope of the loop: outside of the loop we can just useb.Len()
, which might be slightly slower to call, but benchmarks show it's not making overall results slower (it's inlined)I tried doing:
But that's way slower on all uppercase chars.
Interesting things I took from this: replacing
strings.Builder
with abytes.Buffer
with a stack-backed array is much slower: I thought that the reason is that unfortunatelybytes.Buffer.WriteString
is not inlined, as opposed tostrings.Builder.WriteString()
strings.Builder
does thebytealg.MakeNoZero
call to grow, which is something that simple mortals like me can't do..., or maybe I can?I tried doing this ugly hack (I wasn't going to propose it as a production solution, but since we're already in this rabbithole...)
And then using that as a slice, with usual appends, and then yolo-ing the string: this way we could skip all the
copyChecks
that strings builder does. It turned out to be ~20% faster in the case when all chars are ASCII uppercase.Ugly hack benchmarks