Refactor toNormalisedLower: shorter and slightly faster. #14299

Merged
merged 7 commits into from
Jun 18, 2024

@colega colega commented Jun 14, 2024

This is a follow-up to #14170.

TL;DR: this version of the code is much shorter, easier to understand (IMO), and it's also faster!

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
                                                             │     main      │                 new                 │
                                                             │    sec/op     │   sec/op     vs base                │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12       19.840n ±  0%   5.546n ± 0%  -72.05% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12       138.2n ±  0%   120.7n ± 1%  -12.63% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12       20.66n ±  1%   18.52n ± 5%  -10.36% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=false-12      142.6n ±  1%   145.0n ± 1%   +1.72% (p=0.004 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=true-12        21.60n ±  0%   19.54n ± 1%   -9.52% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=false-12       150.3n ±  4%   132.6n ± 1%  -11.81% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12         26.14n ±  2%   20.84n ± 1%  -20.31% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=false-12        143.8n ±  1%   130.8n ± 1%   -9.01% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12       67.97n ±  0%   48.21n ± 1%  -29.08% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12      1.743µ ±  2%   1.719µ ± 1%   -1.38% (p=0.017 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12      69.95n ±  2%   67.11n ± 1%   -4.07% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=false-12     1.721µ ±  0%   1.797µ ± 0%   +4.42% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=true-12       71.70n ± 14%   68.12n ± 1%   -4.99% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=false-12      1.779µ ±  2%   1.728µ ± 1%   -2.87% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12       147.70n ±  1%   99.92n ± 0%  -32.35% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=false-12       1.632µ ±  2%   1.652µ ± 0%   +1.19% (p=0.022 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12      550.6n ±  4%   407.6n ± 0%  -25.96% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12     14.79µ ±  1%   14.50µ ± 1%   -1.95% (p=0.002 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12     554.6n ±  1%   554.8n ± 1%        ~ (p=0.927 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12    14.67µ ±  0%   15.14µ ± 0%   +3.21% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12      542.8n ±  2%   551.5n ± 0%   +1.60% (p=0.027 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12     14.74µ ±  0%   14.55µ ± 0%   -1.30% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1205.5n ±  1%   814.5n ± 1%  -32.43% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12      13.76µ ±  1%   13.66µ ± 0%   -0.72% (p=0.035 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12      2.173µ ±  0%   1.606µ ± 0%  -26.09% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12     58.72µ ±  1%   58.03µ ± 2%   -1.17% (p=0.035 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12     2.196µ ±  1%   2.177µ ± 0%   -0.87% (p=0.001 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12    58.27µ ±  0%   59.97µ ± 0%   +2.93% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12      2.147µ ±  1%   2.095µ ± 5%        ~ (p=0.102 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12     58.73µ ±  1%   58.20µ ± 1%   -0.90% (p=0.001 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12       4.800µ ±  0%   3.265µ ± 1%  -31.97% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12      54.47µ ±  0%   54.53µ ± 1%        ~ (p=0.853 n=10)
geomean                                                         951.6n         832.9n       -12.47%

                                                             │     main     │                    new                    │
                                                             │     B/op     │     B/op      vs base                     │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12         16.00 ± 0%      0.00 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12       25.000 ± 0%     9.000 ± 0%   -64.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12        16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=first/ascii=false-12       20.00 ± 0%     17.00 ± 0%   -15.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=true-12         16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=false-12        27.00 ± 0%     11.00 ± 0%   -59.26% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12          16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=all/ascii=false-12         32.00 ± 0%     22.00 ± 0%   -31.25% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12        112.0 ± 0%       0.0 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12     1.159Ki ± 0%   1.050Ki ± 0%    -9.44% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12       112.0 ± 0%     112.0 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=first/ascii=false-12    1.126Ki ± 0%   1.104Ki ± 0%    -1.91% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=true-12        112.0 ± 0%     112.0 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=false-12     1.170Ki ± 0%   1.061Ki ± 0%    -9.35% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12         112.0 ± 0%     112.0 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=all/ascii=false-12      1.203Ki ± 0%   1.137Ki ± 0%    -5.52% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12     1.000Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12    5.862Ki ± 0%   4.862Ki ± 0%   -17.06% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12    1.000Ki ± 0%   1.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12   5.524Ki ± 0%   5.375Ki ± 0%    -2.70% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12     1.000Ki ± 0%   1.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12    5.975Ki ± 0%   4.975Ki ± 0%   -16.74% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1.000Ki ± 0%   1.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12     6.312Ki ± 0%   5.712Ki ± 0%    -9.51% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12     4.000Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12    21.41Ki ± 0%   17.41Ki ± 0%   -18.68% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12    4.000Ki ± 0%   4.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12   19.99Ki ± 0%   19.49Ki ± 0%    -2.50% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12     4.000Ki ± 0%   4.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12    21.89Ki ± 0%   17.89Ki ± 0%   -18.28% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12      4.000Ki ± 0%   4.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12     23.31Ki ± 0%   20.91Ki ± 0%   -10.30% (p=0.000 n=10)
geomean                                                          647.2                      ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

                                                             │    main    │                   new                   │
                                                             │ allocs/op  │ allocs/op   vs base                     │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12       1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12      1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=first/ascii=false-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=true-12       1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=false-12      1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12        1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=all/ascii=false-12       2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12      1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12     5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=first/ascii=false-12    5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=false-12     5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12       1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=all/ascii=false-12      6.000 ± 0%   5.000 ± 0%   -16.67% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12     1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12    1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12   5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12     6.000 ± 0%   5.000 ± 0%   -16.67% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12     1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12    1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12   5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12     6.000 ± 0%   5.000 ± 0%   -16.67% (p=0.000 n=10)
geomean                                                        1.901                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean
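
As a reading aid for the "vs base" column above, here is a minimal sketch of the arithmetic, assuming the delta is the relative change between the two reported values. The helper name `percentChange` is hypothetical and is not part of benchstat or of this PR.

```go
package main

import "fmt"

// percentChange returns the relative change from base to cur, in percent.
// Hypothetical helper, used only to illustrate how to read the table.
func percentChange(base, cur float64) float64 {
	return (cur - base) / base * 100
}

func main() {
	// Values taken from the length=10/uppercase=none/ascii=true row (ns/op).
	fmt.Printf("%+.2f%%\n", percentChange(19.840, 5.546)) // -72.05%

	// Values taken from the sec/op geomean row (ns/op).
	fmt.Printf("%+.2f%%\n", percentChange(951.6, 832.9)) // -12.47%
}
```

The B/op and allocs/op tables report "?" for the geomean because some of the new values are zero, as the footnotes say.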
Old description, in case you want to understand why there are so many commits here:

In the series of "optimising methods nobody cares about", I spent some time optimising this one. I started by removing the redundant `isASCII` check, but I felt I couldn't just send a PR for that, so I tried to make it shorter and hopefully faster.

I tried several approaches, some with trade-offs, and finally ended up with a version that optimizes the performance for an already-lowercase string, while not penalizing the "uppercase" case by more than 20%.

Benchmark results:
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
                                                             │     main      │                 new                 │
                                                             │    sec/op     │   sec/op     vs base                │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12       19.840n ±  0%   5.356n ± 2%  -73.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12       138.2n ±  0%   119.0n ± 1%  -13.90% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12       20.66n ±  1%   20.48n ± 1%   -0.82% (p=0.019 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=false-12      142.6n ±  1%   140.2n ± 1%   -1.61% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=true-12        21.60n ±  0%   22.35n ± 0%   +3.50% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=false-12       150.3n ±  4%   129.3n ± 1%  -13.94% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12         26.14n ±  2%   29.07n ± 1%  +11.21% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=false-12        143.8n ±  1%   129.8n ± 1%   -9.74% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12       67.97n ±  0%   39.77n ± 3%  -41.48% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12      1.743µ ±  2%   1.671µ ± 6%   -4.10% (p=0.014 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12      69.95n ±  2%   68.27n ± 2%   -2.41% (p=0.009 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=false-12     1.721µ ±  0%   1.756µ ± 0%   +2.06% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=true-12       71.70n ± 14%   70.92n ± 0%        ~ (p=0.122 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=false-12      1.779µ ±  2%   1.700µ ± 1%   -4.47% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12        147.7n ±  1%   163.5n ± 2%  +10.73% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=false-12       1.632µ ±  2%   1.627µ ± 0%   -0.31% (p=0.002 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12      550.6n ±  4%   401.5n ± 2%  -27.07% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12     14.79µ ±  1%   14.26µ ± 0%   -3.58% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12     554.6n ±  1%   548.4n ± 0%   -1.13% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12    14.67µ ±  0%   14.91µ ± 1%   +1.68% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12      542.8n ±  2%   556.1n ± 4%        ~ (p=0.796 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12     14.74µ ±  0%   14.44µ ± 0%   -2.01% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12       1.206µ ±  1%   1.445µ ± 1%  +19.87% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12      13.76µ ±  1%   13.60µ ± 0%   -1.15% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12      2.173µ ±  0%   1.594µ ± 1%  -26.67% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12     58.72µ ±  1%   56.95µ ± 1%   -3.01% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12     2.196µ ±  1%   2.159µ ± 1%   -1.71% (p=0.002 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12    58.27µ ±  0%   59.81µ ± 0%   +2.64% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12      2.147µ ±  1%   2.198µ ± 3%        ~ (p=0.617 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12     58.73µ ±  1%   57.30µ ± 4%   -2.43% (p=0.022 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12       4.800µ ±  0%   5.691µ ± 1%  +18.57% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12      54.47µ ±  0%   54.02µ ± 0%   -0.82% (p=0.000 n=10)
geomean                                                         951.6n         880.6n        -7.46%

                                                             │     main     │                    new                    │
                                                             │     B/op     │     B/op      vs base                     │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12         16.00 ± 0%      0.00 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12       25.000 ± 0%     9.000 ± 0%   -64.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12        16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=first/ascii=false-12       20.00 ± 0%     17.00 ± 0%   -15.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=true-12         16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=false-12        27.00 ± 0%     11.00 ± 0%   -59.26% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12          16.00 ± 0%     16.00 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=all/ascii=false-12         32.00 ± 0%     22.00 ± 0%   -31.25% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12        112.0 ± 0%       0.0 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12     1.159Ki ± 0%   1.050Ki ± 0%    -9.44% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12       112.0 ± 0%     112.0 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=first/ascii=false-12    1.126Ki ± 0%   1.104Ki ± 0%    -1.91% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=true-12        112.0 ± 0%     112.0 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=false-12     1.170Ki ± 0%   1.061Ki ± 0%    -9.35% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12         112.0 ± 0%     112.0 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=all/ascii=false-12      1.203Ki ± 0%   1.137Ki ± 0%    -5.52% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12     1.000Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12    5.862Ki ± 0%   4.862Ki ± 0%   -17.06% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12    1.000Ki ± 0%   1.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12   5.524Ki ± 0%   5.375Ki ± 0%    -2.70% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12     1.000Ki ± 0%   1.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12    5.975Ki ± 0%   4.975Ki ± 0%   -16.74% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1.000Ki ± 0%   1.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12     6.312Ki ± 0%   5.712Ki ± 0%    -9.51% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12     4.000Ki ± 0%   0.000Ki ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12    21.41Ki ± 0%   17.41Ki ± 0%   -18.68% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12    4.000Ki ± 0%   4.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12   19.99Ki ± 0%   19.49Ki ± 0%    -2.50% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12     4.000Ki ± 0%   4.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12    21.89Ki ± 0%   17.89Ki ± 0%   -18.28% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12      4.000Ki ± 0%   4.000Ki ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12     23.31Ki ± 0%   20.91Ki ± 0%   -10.30% (p=0.000 n=10)
geomean                                                          647.2                      ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

                                                             │    main    │                   new                   │
                                                             │ allocs/op  │ allocs/op   vs base                     │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12       1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12      1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=first/ascii=false-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=true-12       1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=false-12      1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12        1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=all/ascii=false-12       2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12      1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12     5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=first/ascii=false-12    5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=false-12     5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12       1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=all/ascii=false-12      6.000 ± 0%   5.000 ± 0%   -16.67% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12     1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12    1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12   5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12     6.000 ± 0%   5.000 ± 0%   -16.67% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12     1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12    1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12   5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12     1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12    5.000 ± 0%   4.000 ± 0%   -20.00% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12      1.000 ± 0%   1.000 ± 0%         ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12     6.000 ± 0%   5.000 ± 0%   -16.67% (p=0.000 n=10)
geomean                                                        1.901                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

All in all, I think the new version is not just shorter but also easier to read because of that: it fits on one screen. Some other things I changed:

  • Renamed written to pos: it's more descriptive, IMO.
  • Moved that variable into the scope of the loop: outside the loop we can just use b.Len(), which might be slightly slower to call, but the benchmarks show no overall slowdown (it's inlined).
  • Removed the check before writing the rest of the string: the compiled code will check that anyway, so there's no need to do it twice.
  • Grew the strings.Builder lazily, only when needed.
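
The changes listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this PR: the name `lowerSketch` is hypothetical, and the fallback to `strings.ToLower` for non-ASCII input is a stand-in for whatever slower path the real function takes. It is not tuned for performance.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// lowerSketch is a hypothetical stand-in for the function discussed here.
func lowerSketch(s string) string {
	var b strings.Builder
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c >= utf8.RuneSelf {
			// Assumption: non-ASCII input takes a separate, slower path.
			return strings.ToLower(s)
		}
		if 'A' <= c && c <= 'Z' {
			pos := b.Len() // input bytes copied so far, scoped to the loop
			if pos == 0 {
				b.Grow(len(s)) // grow lazily: only once an uppercase byte is seen
			}
			b.WriteString(s[pos:i]) // copy the untouched run before c
			b.WriteByte(c + 'a' - 'A')
		}
	}
	if b.Len() == 0 {
		return s // nothing changed: return the input without allocating
	}
	b.WriteString(s[b.Len():]) // write the tail without checking whether it is empty
	return b.String()
}

func main() {
	fmt.Println(lowerSketch("already lower")) // already lower
	fmt.Println(lowerSketch("Mixed CASE"))    // mixed case
}
```

Returning the input unchanged when no uppercase byte was seen is what would keep an already-lowercase ASCII string at zero allocations in a sketch like this.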

I tried doing:

		if 'A' <= c && c <= 'Z' {
			b.WriteString(s[b.Len():i])
			b.WriteByte(c + 'a' - 'A')
		}

But that's way slower when all chars are uppercase.

Interesting things I took from this:

  • Replacing strings.Builder with a bytes.Buffer backed by a stack array is much slower. I thought the reason was that, unfortunately, bytes.Buffer.WriteString is not inlined, as opposed to strings.Builder.WriteString().
  • Okay, so I tried the good old slice-append approach, backed by a stack array and then allocating a string from that: it's also much slower.
  • I also tried just allocating a slice on the heap and then building an unsafe string from that: also much slower.
  • I wonder if one of the reasons is that strings.Builder calls bytealg.MakeNoZero to grow, which is something that simple mortals like me can't do... or maybe I can?
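
For readers who want to picture the two slice-based alternatives mentioned above, here is a minimal sketch, ASCII-only for brevity. All names and the 128-byte array size are hypothetical, the code is not what was benchmarked in this PR, and `unsafe.String` / `unsafe.SliceData` assume Go 1.20 or newer.

```go
package main

import (
	"fmt"
	"unsafe"
)

// lowerByte lowercases a single ASCII byte and leaves other bytes untouched.
func lowerByte(c byte) byte {
	if 'A' <= c && c <= 'Z' {
		return c + 'a' - 'A'
	}
	return c
}

// lowerStackAppend appends into a slice backed by a fixed-size array, which
// the compiler can keep on the stack while the input fits, and then copies
// the result into a newly allocated string.
func lowerStackAppend(s string) string {
	var backing [128]byte // hypothetical size
	buf := backing[:0]
	for i := 0; i < len(s); i++ {
		buf = append(buf, lowerByte(s[i])) // spills to the heap past 128 bytes
	}
	return string(buf)
}

// lowerHeapUnsafe fills a heap-allocated slice and reinterprets it as a string
// without copying. The slice must not be modified after the conversion.
func lowerHeapUnsafe(s string) string {
	buf := make([]byte, 0, len(s))
	for i := 0; i < len(s); i++ {
		buf = append(buf, lowerByte(s[i]))
	}
	return unsafe.String(unsafe.SliceData(buf), len(buf))
}

func main() {
	fmt.Println(lowerStackAppend("Hello World")) // hello world
	fmt.Println(lowerHeapUnsafe("Hello World"))  // hello world
}
```

Both variants copy every byte unconditionally, so unlike a lazily grown builder they cannot return an already-lowercase input without allocating.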

I tried this ugly hack (I wasn't going to propose it as a production solution, but since we're already down this rabbit hole...):

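// noZeroSlice returns an empty byte slice of the given capacity that borrows
// the buffer grown by a strings.Builder.
// Needs the "reflect", "strings" and "unsafe" imports. Not for production use.
func noZeroSlice(cap int) []byte {
	sb := strings.Builder{}
	sb.Grow(cap) // let the builder allocate the buffer
	s := sb.String()

	var b []byte
	// Overwrite the slice's data pointer and length with the string's.
	*(*string)(unsafe.Pointer(&b)) = s
	// A string header carries no capacity, so set it by hand.
	(*reflect.SliceHeader)(unsafe.Pointer(&b)).Cap = cap
	return b
}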

I then used that as a slice, with the usual appends, yolo-ing the string at the end: this way we can skip all the copyChecks that strings.Builder does. It turned out to be ~20% faster when all chars are ASCII uppercase.

Ugly hack benchmarks
goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
                                                             │     main      │                 new                 │                yolo                 │
                                                             │    sec/op     │    sec/op     vs base               │   sec/op     vs base                │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12        19.84n ±  0%    18.96n ± 2%  -4.41% (p=0.000 n=10)   19.52n ± 1%   -1.61% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=none/ascii=false-12       138.2n ±  0%    131.0n ± 1%  -5.14% (p=0.000 n=10)   137.3n ± 9%        ~ (p=0.490 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=true-12       20.66n ±  1%    19.86n ± 0%  -3.87% (p=0.000 n=10)   20.30n ± 5%        ~ (p=0.159 n=10)
ToNormalizedLower/length=10/uppercase=first/ascii=false-12      142.6n ±  1%    149.5n ± 1%  +4.88% (p=0.001 n=10)   154.0n ± 2%   +8.07% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=true-12        21.60n ±  0%    22.23n ± 1%  +2.92% (p=0.000 n=10)   22.27n ± 2%   +3.13% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=false-12       150.3n ±  4%    145.3n ± 1%  -3.29% (p=0.000 n=10)   147.2n ± 2%   -2.06% (p=0.003 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=true-12         26.14n ±  2%    26.64n ± 0%  +1.89% (p=0.022 n=10)   24.45n ± 1%   -6.50% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=all/ascii=false-12        143.8n ±  1%    136.8n ± 4%  -4.83% (p=0.001 n=10)   138.6n ± 1%   -3.55% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=true-12       67.97n ±  0%    68.26n ± 2%  +0.43% (p=0.002 n=10)   69.22n ± 3%        ~ (p=0.101 n=10)
ToNormalizedLower/length=100/uppercase=none/ascii=false-12      1.743µ ±  2%    1.718µ ± 1%  -1.43% (p=0.041 n=10)   1.721µ ± 4%        ~ (p=0.148 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=true-12      69.95n ±  2%    68.53n ± 1%  -2.03% (p=0.000 n=10)   69.03n ± 0%   -1.33% (p=0.014 n=10)
ToNormalizedLower/length=100/uppercase=first/ascii=false-12     1.721µ ±  0%    1.781µ ± 0%  +3.49% (p=0.000 n=10)   1.806µ ± 1%   +4.94% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=true-12       71.70n ± 14%    70.61n ± 1%       ~ (p=0.089 n=10)   72.08n ± 0%        ~ (p=0.670 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=false-12      1.779µ ±  2%    1.752µ ± 2%  -1.55% (p=0.014 n=10)   1.757µ ± 1%   -1.26% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=true-12        147.7n ±  1%    140.6n ± 2%  -4.84% (p=0.000 n=10)   113.6n ± 2%  -23.09% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=all/ascii=false-12       1.632µ ±  2%    1.649µ ± 2%  +1.04% (p=0.030 n=10)   1.634µ ± 0%        ~ (p=0.423 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12      550.6n ±  4%    540.6n ± 1%  -1.81% (p=0.002 n=10)   556.0n ± 3%        ~ (p=0.218 n=10)
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12     14.79µ ±  1%    14.56µ ± 1%  -1.57% (p=0.001 n=10)   14.95µ ± 4%   +1.04% (p=0.029 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12     554.6n ±  1%    540.7n ± 1%  -2.51% (p=0.000 n=10)   562.0n ± 9%   +1.34% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12    14.67µ ±  0%    15.05µ ± 1%  +2.60% (p=0.000 n=10)   15.36µ ± 2%   +4.69% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12      542.8n ±  2%    525.8n ± 3%  -3.13% (p=0.003 n=10)   555.7n ± 1%   +2.38% (p=0.009 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12     14.74µ ±  0%    14.48µ ± 2%  -1.73% (p=0.009 n=10)   14.70µ ± 1%        ~ (p=0.529 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1205.5n ±  1%   1200.0n ± 0%       ~ (p=0.867 n=10)   944.7n ± 1%  -21.63% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12      13.76µ ±  1%    13.75µ ± 1%       ~ (p=0.684 n=10)   13.84µ ± 0%        ~ (p=0.325 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12      2.173µ ±  0%    2.158µ ± 1%       ~ (p=0.085 n=10)   2.207µ ± 9%   +1.56% (p=0.001 n=10)
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12     58.72µ ±  1%    57.99µ ± 1%  -1.24% (p=0.002 n=10)   59.18µ ± 6%        ~ (p=0.579 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12     2.196µ ±  1%    2.143µ ± 1%  -2.44% (p=0.000 n=10)   2.195µ ± 5%        ~ (p=1.000 n=10)
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12    58.27µ ±  0%    60.13µ ± 0%  +3.20% (p=0.000 n=10)   60.91µ ± 1%   +4.54% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12      2.147µ ±  1%    2.154µ ± 6%       ~ (p=0.898 n=10)   2.237µ ± 0%   +4.19% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12     58.73µ ±  1%    58.71µ ± 5%       ~ (p=0.837 n=10)   58.52µ ± 1%        ~ (p=0.424 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12       4.800µ ±  0%    4.968µ ± 3%  +3.51% (p=0.000 n=10)   3.717µ ± 1%  -22.55% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12      54.47µ ±  0%    56.57µ ± 3%  +3.86% (p=0.000 n=10)   54.58µ ± 1%        ~ (p=0.165 n=10)
geomean                                                         951.6n          945.1n       -0.68%                  934.7n        -1.77%

                                                             │     main     │                  new                   │                  yolo                  │
                                                             │     B/op     │     B/op      vs base                  │     B/op      vs base                  │
ToNormalizedLower/length=10/uppercase=none/ascii=true-12         16.00 ± 0%     16.00 ± 0%        ~ (p=1.000 n=10) ¹     16.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=none/ascii=false-12        25.00 ± 0%     25.00 ± 0%        ~ (p=1.000 n=10) ¹     25.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=first/ascii=true-12        16.00 ± 0%     16.00 ± 0%        ~ (p=1.000 n=10) ¹     16.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=first/ascii=false-12       20.00 ± 0%     27.00 ± 0%  +35.00% (p=0.000 n=10)       27.00 ± 0%  +35.00% (p=0.000 n=10)
ToNormalizedLower/length=10/uppercase=last/ascii=true-12         16.00 ± 0%     16.00 ± 0%        ~ (p=1.000 n=10) ¹     16.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=last/ascii=false-12        27.00 ± 0%     27.00 ± 0%        ~ (p=1.000 n=10) ¹     27.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=all/ascii=true-12          16.00 ± 0%     16.00 ± 0%        ~ (p=1.000 n=10) ¹     16.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=10/uppercase=all/ascii=false-12         32.00 ± 0%     32.00 ± 0%        ~ (p=1.000 n=10) ¹     32.00 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=none/ascii=true-12        112.0 ± 0%     112.0 ± 0%        ~ (p=1.000 n=10) ¹     112.0 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=none/ascii=false-12     1.159Ki ± 0%   1.159Ki ± 0%        ~ (p=1.000 n=10) ¹   1.159Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=first/ascii=true-12       112.0 ± 0%     112.0 ± 0%        ~ (p=1.000 n=10) ¹     112.0 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=first/ascii=false-12    1.126Ki ± 0%   1.170Ki ± 0%   +3.90% (p=0.000 n=10)     1.170Ki ± 0%   +3.90% (p=0.000 n=10)
ToNormalizedLower/length=100/uppercase=last/ascii=true-12        112.0 ± 0%     112.0 ± 0%        ~ (p=1.000 n=10) ¹     112.0 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=last/ascii=false-12     1.170Ki ± 0%   1.170Ki ± 0%        ~ (p=1.000 n=10) ¹   1.170Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=all/ascii=true-12         112.0 ± 0%     112.0 ± 0%        ~ (p=1.000 n=10) ¹     112.0 ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=100/uppercase=all/ascii=false-12      1.203Ki ± 0%   1.203Ki ± 0%        ~ (p=1.000 n=10) ¹   1.203Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=none/ascii=true-12     1.000Ki ± 0%   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=none/ascii=false-12    5.862Ki ± 0%   5.862Ki ± 0%        ~ (p=1.000 n=10) ¹   5.862Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=first/ascii=true-12    1.000Ki ± 0%   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=first/ascii=false-12   5.524Ki ± 0%   5.975Ki ± 0%   +8.15% (p=0.000 n=10)     5.975Ki ± 0%   +8.15% (p=0.000 n=10)
ToNormalizedLower/length=1000/uppercase=last/ascii=true-12     1.000Ki ± 0%   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=last/ascii=false-12    5.975Ki ± 0%   5.975Ki ± 0%        ~ (p=1.000 n=10) ¹   5.975Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=all/ascii=true-12      1.000Ki ± 0%   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹   1.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=1000/uppercase=all/ascii=false-12     6.312Ki ± 0%   6.312Ki ± 0%        ~ (p=1.000 n=10) ¹   6.312Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=none/ascii=true-12     4.000Ki ± 0%   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=none/ascii=false-12    21.41Ki ± 0%   21.41Ki ± 0%        ~ (p=1.000 n=10) ¹   21.41Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=first/ascii=true-12    4.000Ki ± 0%   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=first/ascii=false-12   19.99Ki ± 0%   21.89Ki ± 0%   +9.51% (p=0.000 n=10)     21.89Ki ± 0%   +9.51% (p=0.000 n=10)
ToNormalizedLower/length=4000/uppercase=last/ascii=true-12     4.000Ki ± 0%   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=last/ascii=false-12    21.89Ki ± 0%   21.89Ki ± 0%        ~ (p=1.000 n=10)     21.89Ki ± 0%        ~ (p=1.000 n=10)
ToNormalizedLower/length=4000/uppercase=all/ascii=true-12      4.000Ki ± 0%   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹   4.000Ki ± 0%        ~ (p=1.000 n=10) ¹
ToNormalizedLower/length=4000/uppercase=all/ascii=false-12     23.31Ki ± 0%   23.31Ki ± 0%        ~ (p=1.000 n=10) ¹   23.31Ki ± 0%        ~ (p=1.000 n=10) ¹
geomean                                                          647.2          657.6        +1.60%                      657.6        +1.60%
¹ all samples are equal

@colega colega marked this pull request as draft June 14, 2024 07:38
@colega colega marked this pull request as ready for review June 14, 2024 08:09
@colega colega changed the title Refactor toNormalisedLower: shorter and slightly faster. Refactor toNormalisedLower: shorter and ~slightly faster~. Jun 14, 2024
@colega colega changed the title Refactor toNormalisedLower: shorter and ~slightly faster~. Refactor toNormalisedLower: shorter and slightly faster. Jun 14, 2024
@colega colega force-pushed the refactor-to-normalised-lower branch from 1654677 to f0439f0 Compare June 14, 2024 14:08
@colega colega marked this pull request as draft June 14, 2024 16:38
@colega colega marked this pull request as ready for review June 14, 2024 16:49
@colega
Contributor Author

colega commented Jun 15, 2024

This is ready for review.

@@ -215,3 +216,7 @@ func contains(s []Label, n string) bool {
	}
	return false
}

func yoloString(b []byte) string {
	return *((*string)(unsafe.Pointer(&b)))
}
Contributor

Would using string(buf) be more efficient and safer, or are we opting for *((*string)(unsafe.Pointer(&b))) solely to avoid an allocation?

Contributor Author

We're using the trick to avoid an allocation (and also avoid copying the contents, which would happen if we did string(buf))
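
To make the trade-off concrete, here is a minimal sketch contrasting the two conversions. It reuses the yoloString helper from the diff; the mutation of the slice afterwards is for demonstration only and is exactly what real callers must never do.

```go
package main

import (
	"fmt"
	"unsafe"
)

// yoloString reinterprets b as a string without copying, as in the diff above.
// The caller must not modify b after the conversion.
func yoloString(b []byte) string {
	return *((*string)(unsafe.Pointer(&b)))
}

func main() {
	b := []byte("foo")

	copied := string(b)     // allocates a new string and copies the bytes
	shared := yoloString(b) // no copy: shares b's backing array

	// Demonstration only: mutating b here breaks string immutability.
	b[0] = 'F'

	fmt.Println(copied) // foo
	fmt.Println(shared) // Foo
}
```

The shared string sees the later write while the copied one does not, which is why the trick is only safe when the byte slice is never touched again. On Go 1.20 and later, unsafe.String(unsafe.SliceData(b), len(b)) expresses the same zero-copy conversion.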

"FOO": "foo",
"Foo": "foo",
"foO": "foo",
"fOo": "foo",
Contributor

Would it be possible to incorporate a test case into both TestToNormalisedLower and BenchmarkToNormalizedLower? The test case should include all uppercase ASCII characters, with the final character being non-ASCII.
Something like "AAAAſ".

Contributor Author

I benchmarked this:

	b.Run("ascii uppercase then utf8", func(b *testing.B) {
		inputs := make([]string, 10)
		for i := range inputs {
			inputs[i] = benchCase(9, "all", true, i) + "Ж"
		}
		b.ResetTimer()
		for n := 0; n < b.N; n++ {
			toNormalisedLower(inputs[n%len(inputs)])
		}
	})

And the results were:

goos: darwin
goarch: arm64
pkg: github.com/prometheus/prometheus/model/labels
                                               │ main_asciithenunicode │     new_asciithenunicode     │
                                               │        sec/op         │   sec/op     vs base         │
ToNormalizedLower/ascii_uppercase_then_utf8-12             85.16n ± 0%   84.91n ± 1%  ~ (p=0.221 n=6)

                                               │ main_asciithenunicode │     new_asciithenunicode      │
                                               │         B/op          │    B/op     vs base           │
ToNormalizedLower/ascii_uppercase_then_utf8-12              32.00 ± 0%   32.00 ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

                                               │ main_asciithenunicode │     new_asciithenunicode      │
                                               │       allocs/op       │ allocs/op   vs base           │
ToNormalizedLower/ascii_uppercase_then_utf8-12              2.000 ± 0%   2.000 ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

Given that fitting this benchmark into the current benchmark structure would require a refactor, and that it doesn't measure any difference, I'd prefer to leave the benchmarks unmodified in this PR for the sake of clarity.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
There's now one in common code without build tags.

Signed-off-by: Oleg Zaytsev <mail@olegzaytsev.com>
@colega colega force-pushed the refactor-to-normalised-lower branch from 818419d to 4340269 Compare June 17, 2024 07:07
Member

@jesusvazquez jesusvazquez left a comment

LGTM

Two learnings from this PR:

  • utf8.RuneSelf lets you know if a character is single byte or multibyte
  • our yoloString method to avoid copying the string. Wondering how different the performance would be if we used strings.Builder instead.
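
As a small illustration of the first point, the sketch below checks each byte against utf8.RuneSelf (0x80). The isASCII helper is a hypothetical name used only for this example, not a function from the PR.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isASCII (hypothetical helper) reports whether every byte of s is below
// utf8.RuneSelf, i.e. whether s contains only single-byte runes.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= utf8.RuneSelf {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCII("FooBar")) // true
	fmt.Println(isASCII("AAAAſ"))  // false: 'ſ' is encoded as two bytes
}
```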

I agree the method is more readable now; I used split view to compare 👍

@jesusvazquez jesusvazquez enabled auto-merge (squash) June 18, 2024 09:47
@colega
Contributor Author

colega commented Jun 18, 2024

Wondering how different the performance would be if we used strings.Builder instead.

strings.Builder was in one of the intermediate commits. We can't modify strings.Builder's contents (boo!), so with it we would need to write the string as we go, instead of this approach where we can just copy the entire string (and rely on CPU & compiler optimizations to do that efficiently) and then modify the uppercase chars.
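
The "copy everything, then patch the uppercase chars" idea can be sketched roughly as follows. This is an ASCII-only illustration under assumed names (lowerASCII is hypothetical), not the PR's actual toNormalisedLower, which also has to handle non-ASCII input.

```go
package main

import (
	"fmt"
	"unsafe"
)

// lowerASCII (hypothetical) lowercases ASCII letters only. It copies the
// whole input once, on the first uppercase byte, and then patches bytes in
// place. Non-ASCII bytes are left untouched in this sketch.
func lowerASCII(s string) string {
	var buf []byte
	for i := 0; i < len(s); i++ {
		c := s[i]
		if 'A' <= c && c <= 'Z' {
			if buf == nil {
				buf = []byte(s) // single bulk copy of the entire string
			}
			buf[i] = c + ('a' - 'A')
		}
	}
	if buf == nil {
		return s // nothing to change, so no allocation at all
	}
	// buf is never modified after this point, so viewing it as a string
	// without copying is safe here.
	return *(*string)(unsafe.Pointer(&buf))
}

func main() {
	fmt.Println(lowerASCII("FooBAR")) // foobar
	fmt.Println(lowerASCII("foo"))    // foo
}
```

With strings.Builder the bytes already written cannot be changed, so an equivalent version would have to append each byte (or each unchanged run) individually rather than doing one bulk copy.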

@jesusvazquez jesusvazquez merged commit 4f78cc8 into prometheus:main Jun 18, 2024
25 checks passed