runtime, internal/buildcfg: 2.74% regression in Invocation/interpreter/random_mat_mul_size_20-88 sec/op on linux/amd64 at 2a93576 #79703

@dr2chase

Discovered a regression in sec/op of 2.74% for benchmark Invocation/interpreter/random_mat_mul_size_20-88 at 2a93576.

This was "internal/buildcfg: enable SizeSpecializedMalloc by default"; a performance regression seems unpossible, yet here we are.

TLDR: to write the repro instruction I ran the benchmark, and at least for me on my laptop, it got faster, not slower.
But for posterity...

It's a bent benchmark. To reproduce:

# choose directories as appropriate
export BASELINE=/tmp/baseline
export TEST=/tmp/test

go install golang.org/x/benchmarks/cmd/bent@latest
go install golang.org/x/perf/cmd/benchstat@latest

git clone https://go.googlesource.com/go $BASELINE
git clone https://go.googlesource.com/go $TEST
(cd $BASELINE/src; git fetch; git checkout 2a93576965^ ; ./make.bash )
(cd $TEST/src;         git fetch; git checkout 2a93576965  ; ./make.bash )

mkdir foo; cd foo
bent -I # <- capital letter "i" , as in GH I JKL

# You need a configurations.toml
cat > configurations.toml <<\\EOF
[[Configurations]]
  Name = "Baseline"
  Root = "$BASELINE"
 
[[Configurations]]
  Name = "Test"
  Root = "$TEST"
\EOF

# run 25 iterations of the wazero benchmark, randomly linked
bent -b wazero -R=25

# look at the benchmark results
cd bench
alias bs='benchstat -col toolchain -ignore pkg,shortname'

# you want the last two stdout files; this is a real example run
# and, uh, looks like it did not reproduce -- the new version is faster, at least on an Apple laptop.
bs 20260527T190219.Baseline.stdout 20260527T190219.Test.stdout
goos: darwin
goarch: arm64
cpu: Apple M4
                                                      │  Baseline   │                Test                │
                                                      │   sec/op    │   sec/op     vs base               │
Invocation/interpreter/fib_for_20-10                    1.148m ± 0%   1.056m ± 1%  -8.04% (p=0.000 n=25)
Invocation/interpreter/string_manipulation_size_50-10   479.4µ ± 0%   463.6µ ± 0%  -3.28% (p=0.000 n=25)
Invocation/interpreter/random_mat_mul_size_20-10        3.485m ± 0%   3.464m ± 1%  -0.60% (p=0.006 n=25)
Compilation/with_extern_cache-10                        142.9µ ± 0%   142.8µ ± 0%       ~ (p=0.430 n=25)
Compilation/without_extern_cache-10                     7.393m ± 1%   7.396m ± 1%       ~ (p=0.893 n=25)
geomean                                                 1.152m        1.124m       -2.44%
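
As a quick sanity check on the "vs base" column, the percent change can be recomputed from the printed sec/op values. This is only a sketch: `pct_change` is a hypothetical helper, not part of bent or benchstat, and it works from the rounded figures in the table, so other rows may differ from benchstat's output in the last digit.

```shell
# Hypothetical helper (not part of bent or benchstat):
# percent change from a baseline value to a candidate value, same units.
pct_change() {
  awk -v base="$1" -v cand="$2" \
    'BEGIN { printf "%+.2f%%\n", (cand - base) / base * 100 }'
}

# random_mat_mul_size_20 sec/op from the table above, in milliseconds
pct_change 3.485 3.464
```

A negative result means the Test toolchain took less time per operation than Baseline, which is the opposite direction from the regression reported on linux/amd64.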

Labels: NeedsInvestigation, Performance, compiler/runtime
