Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[perf experiment] Ignore inline(always) in unoptimized builds #121417

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

saethlin
Copy link
Member

Yes I know we have a codegen test for this. But based on this perf run I'm concerned this is having unexpected perf implications so I want to measure what they are: #121369 (comment)

r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Feb 21, 2024
@saethlin
Copy link
Member Author

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Feb 21, 2024
@saethlin saethlin removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 21, 2024
@bors
Copy link
Contributor

bors commented Feb 21, 2024

⌛ Trying commit 7a29818 with merge 393ef12...

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 21, 2024
[perf experiment] Ignore inline(always) in unoptimized builds

Yes I know we have a codegen test for this. But based on this perf run I'm concerned this is having unexpected perf implications so I want to measure what they are: rust-lang#121369 (comment)

r? `@ghost`
@rust-log-analyzer
Copy link
Collaborator

The job x86_64-gnu-llvm-16 failed! Check out the build log: (web) (plain)

Click to see the possible cause of the failure (guessed by this bot)
GITHUB_ENV=/home/runner/work/_temp/_runner_file_commands/set_env_f7399dcc-57d6-406b-bdc5-3a3508673bf1
GITHUB_EVENT_NAME=pull_request
GITHUB_EVENT_PATH=/home/runner/work/_temp/_github_workflow/event.json
GITHUB_GRAPHQL_URL=https://api.github.com/graphql
GITHUB_HEAD_REF=no-opt-no-inline
GITHUB_JOB=pr
GITHUB_PATH=/home/runner/work/_temp/_runner_file_commands/add_path_f7399dcc-57d6-406b-bdc5-3a3508673bf1
GITHUB_REF=refs/pull/121417/merge
GITHUB_REF_NAME=121417/merge
GITHUB_REF_PROTECTED=false
---
#12 writing image sha256:943f04602bd649d1b31cb5133d3394653cdbb758d5a05d5021c19524730c733e done
#12 naming to docker.io/library/rust-ci done
#12 DONE 9.8s
##[endgroup]
Setting extra environment values for docker:  --env ENABLE_GCC_CODEGEN=1 --env GCC_EXEC_PREFIX=/usr/lib/gcc/
[CI_JOB_NAME=x86_64-gnu-llvm-16]
##[group]Clock drift check
  local time: Wed Feb 21 22:46:43 UTC 2024
  network time: Wed, 21 Feb 2024 22:46:43 GMT
  network time: Wed, 21 Feb 2024 22:46:43 GMT
##[endgroup]
sccache: Starting the server...
##[group]Configure the build
configure: processing command line
configure: 
configure: build.configure-args := ['--build=x86_64-unknown-linux-gnu', '--llvm-root=/usr/lib/llvm-16', '--enable-llvm-link-shared', '--set', 'rust.thin-lto-import-instr-limit=10', '--set', 'change-id=99999999', '--enable-verbose-configure', '--enable-sccache', '--disable-manage-submodules', '--enable-locked-deps', '--enable-cargo-native-static', '--set', 'rust.codegen-units-std=1', '--set', 'dist.compression-profile=balanced', '--dist-compression-formats=xz', '--set', 'build.optimized-compiler-builtins', '--disable-dist-src', '--release-channel=nightly', '--enable-debug-assertions', '--enable-overflow-checks', '--enable-llvm-assertions', '--set', 'rust.verify-llvm-ir', '--set', 'rust.codegen-backends=llvm,cranelift,gcc', '--set', 'llvm.static-libstdcpp', '--enable-new-symbol-mangling']
configure: target.x86_64-unknown-linux-gnu.llvm-config := /usr/lib/llvm-16/bin/llvm-config
configure: llvm.link-shared     := True
configure: rust.thin-lto-import-instr-limit := 10
configure: change-id            := 99999999
---
##[endgroup]
Testing GCC stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
   Compiling y v0.1.0 (/checkout/compiler/rustc_codegen_gcc/build_system)
    Finished release [optimized] target(s) in 1.26s
     Running `/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-codegen/x86_64-unknown-linux-gnu/release/y test --use-system-gcc --use-backend gcc --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools/cg_gcc --release --no-default-features --mini-tests --std-tests`
Using system GCC
Using system GCC
[BUILD] example
[AOT] mini_core_hello_world
/checkout/obj/build/x86_64-unknown-linux-gnu/stage1-tools/cg_gcc/mini_core_hello_world
abc
---
---- [run-make] tests/run-make/inline-always-many-cgu stdout ----

error: make failed
status: exit status: 2
command: cd "/checkout/tests/run-make/inline-always-many-cgu" && env -u CARGO_MAKEFLAGS -u MAKEFLAGS -u MFLAGS -u RUSTFLAGS AR="ar" CC="cc -ffunction-sections -fdata-sections -fPIC -m64" CXX="c++ -ffunction-sections -fdata-sections -fPIC -m64" HOST_RPATH_DIR="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib" LD_LIB_PATH_ENVVAR="LD_LIBRARY_PATH" LLVM_BIN_DIR="/usr/lib/llvm-16/bin" LLVM_COMPONENTS="aarch64 aarch64asmparser aarch64codegen aarch64desc aarch64disassembler aarch64info aarch64utils aggressiveinstcombine all all-targets amdgpu amdgpuasmparser amdgpucodegen amdgpudesc amdgpudisassembler amdgpuinfo amdgputargetmca amdgpuutils analysis arm armasmparser armcodegen armdesc armdisassembler arminfo armutils asmparser asmprinter avr avrasmparser avrcodegen avrdesc avrdisassembler avrinfo binaryformat bitreader bitstreamreader bitwriter bpf bpfasmparser bpfcodegen bpfdesc bpfdisassembler bpfinfo cfguard codegen core coroutines coverage debuginfocodeview debuginfodwarf debuginfogsym debuginfologicalview debuginfomsf debuginfopdb demangle dlltooldriver dwarflinker dwarflinkerparallel dwp engine executionengine extensions filecheck frontendhlsl frontendopenacc frontendopenmp fuzzercli fuzzmutate globalisel hexagon hexagonasmparser hexagoncodegen hexagondesc hexagondisassembler hexagoninfo instcombine instrumentation interfacestub interpreter ipo irprinter irreader jitlink lanai lanaiasmparser lanaicodegen lanaidesc lanaidisassembler lanaiinfo libdriver lineeditor linker loongarch loongarchasmparser loongarchcodegen loongarchdesc loongarchdisassembler loongarchinfo lto m68k m68kasmparser m68kcodegen m68kdesc m68kdisassembler m68kinfo mc mca mcdisassembler mcjit mcparser mips mipsasmparser mipscodegen mipsdesc mipsdisassembler mipsinfo mirparser msp430 msp430asmparser msp430codegen msp430desc msp430disassembler msp430info native nativecodegen nvptx nvptxcodegen nvptxdesc nvptxinfo objcarcopts objcopy object objectyaml option orcjit orcshared orctargetprocess passes perfjitevents powerpc powerpcasmparser powerpccodegen powerpcdesc powerpcdisassembler powerpcinfo profiledata remarks riscv riscvasmparser riscvcodegen riscvdesc riscvdisassembler riscvinfo riscvtargetmca runtimedyld scalaropts selectiondag sparc sparcasmparser sparccodegen sparcdesc sparcdisassembler sparcinfo support symbolize systemz systemzasmparser systemzcodegen systemzdesc systemzdisassembler systemzinfo tablegen target targetparser textapi transformutils ve veasmparser vecodegen vectorize vedesc vedisassembler veinfo webassembly webassemblyasmparser webassemblycodegen webassemblydesc webassemblydisassembler webassemblyinfo webassemblyutils windowsdriver windowsmanifest x86 x86asmparser x86codegen x86desc x86disassembler x86info x86targetmca xcore xcorecodegen xcoredesc xcoredisassembler xcoreinfo xray" LLVM_FILECHECK="/usr/lib/llvm-16/bin/FileCheck" NODE="/usr/bin/node" PYTHON="/usr/bin/python3" RUSTC="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc" RUSTDOC="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustdoc" RUST_BUILD_STAGE="stage2-x86_64-unknown-linux-gnu" S="/checkout" TARGET="x86_64-unknown-linux-gnu" TARGET_RPATH_DIR="/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib/rustlib/x86_64-unknown-linux-gnu/lib" TMPDIR="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/inline-always-many-cgu/inline-always-many-cgu" "make"
--- stdout -------------------------------
LD_LIBRARY_PATH="/checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/inline-always-many-cgu/inline-always-many-cgu:/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/lib:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-bootstrap-tools/x86_64-unknown-linux-gnu/release/deps:/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/lib" '/checkout/obj/build/x86_64-unknown-linux-gnu/stage2/bin/rustc' --out-dir /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/inline-always-many-cgu/inline-always-many-cgu -L /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/inline-always-many-cgu/inline-always-many-cgu  -Ainternal_features foo.rs --emit llvm-ir -C codegen-units=2
if cat /checkout/obj/build/x86_64-unknown-linux-gnu/test/run-make/inline-always-many-cgu/inline-always-many-cgu/*.ll | "/checkout/src/etc/cat-and-grep.sh" -e '\bcall\b'; then \
 echo "found call instruction when one wasn't expected"; \
Build completed unsuccessfully in 0:34:28
fi
fi
[[[ begin stdout ]]]
; ModuleID = 'foo.5be5606e1f6aa79b-cgu.0'
source_filename = "foo.5be5606e1f6aa79b-cgu.0"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; foo::a::foo
; Function Attrs: inlinehint nonlazybind uwtable
define internal void @_ZN3foo1a3foo17h9ed28b0d5896c05dE() unnamed_addr #0 {
  ret void
}


; Function Attrs: nonlazybind uwtable
define void @bar() unnamed_addr #1 {
start:
; call foo::a::foo
  call void @_ZN3foo1a3foo17h9ed28b0d5896c05dE()
}


attributes #0 = { inlinehint nonlazybind uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" }
attributes #1 = { nonlazybind uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 8, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}
!2 = !{!"rustc version 1.78.0-nightly (e244ff2c7 2024-02-21)"}
; ModuleID = 'foo.5be5606e1f6aa79b-cgu.1'
source_filename = "foo.5be5606e1f6aa79b-cgu.1"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

; foo::a::bar
; Function Attrs: nonlazybind uwtable
define void @_ZN3foo1a3bar17h11af45a6c59c3fd7E() unnamed_addr #0 {
  ret void
}


attributes #0 = { nonlazybind uwtable "probe-stack"="inline-asm" "target-cpu"="x86-64" }

!llvm.module.flags = !{!0, !1}
!llvm.ident = !{!2}

!0 = !{i32 8, !"PIC Level", i32 2}
!1 = !{i32 2, !"RtLibUseGOT", i32 1}
!2 = !{!"rustc version 1.78.0-nightly (e244ff2c7 2024-02-21)"}

[[[ end stdout ]]]
found call instruction when one wasn't expected
--- stderr -------------------------------
--- stderr -------------------------------
warning: ignoring emit path because multiple .ll files were produced
warning: 1 warning emitted

make: *** [Makefile:5: all] Error 1
------------------------------------------

@bors
Copy link
Contributor

bors commented Feb 22, 2024

☀️ Try build successful - checks-actions
Build commit: 393ef12 (393ef12c970fbc7f294cd96c35cb76f9591bc1d6)

@rust-timer

This comment has been minimized.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (393ef12): comparison URL.

Overall result: ❌✅ regressions and improvements - ACTION NEEDED

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

Next Steps: If you can justify the regressions found in this try perf run, please indicate this with @rustbot label: +perf-regression-triaged along with sufficient written justification. If you cannot justify the regressions please fix the regressions and do another perf run. If the next run shows neutral or positive results, the label will be automatically removed.

@bors rollup=never
@rustbot label: -S-waiting-on-perf +perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.7% [0.2%, 1.0%] 5
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-8.9% [-45.3%, -0.5%] 19
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -6.9% [-45.3%, 1.0%] 24

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.0% [1.0%, 1.0%] 1
Regressions ❌
(secondary)
3.7% [3.7%, 3.7%] 1
Improvements ✅
(primary)
-11.4% [-25.0%, -3.4%] 11
Improvements ✅
(secondary)
-3.6% [-3.6%, -3.6%] 1
All ❌✅ (primary) -10.4% [-25.0%, 1.0%] 12

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.5% [2.2%, 2.7%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-8.8% [-43.8%, -1.1%] 19
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -7.7% [-43.8%, 2.7%] 21

Binary size

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.5% [0.0%, 1.7%] 40
Regressions ❌
(secondary)
0.8% [0.0%, 2.2%] 18
Improvements ✅
(primary)
-3.4% [-9.3%, -0.0%] 29
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -1.2% [-9.3%, 1.7%] 69

Bootstrap: 649.698s -> 651.565s (0.29%)
Artifact size: 310.95 MiB -> 310.97 MiB (0.01%)

@rustbot rustbot added perf-regression Performance regression. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Feb 22, 2024
@saethlin
Copy link
Member Author

The comment that motivated the codegen test #45201 (comment) suggests that things will break if we stop doing this. Considering how harmful inlining at -Copt-level=0 is, I'm going to challenge that statement.

@craterbot run mode=build-and-test

@craterbot
Copy link
Collaborator

👌 Experiment pr-121417 created and queued.
🤖 Automatically detected try build 393ef12
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🚧 Experiment pr-121417 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🎉 Experiment pr-121417 is completed!
📊 97 regressed and 113 fixed (419702 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Mar 3, 2024
@saethlin
Copy link
Member Author

saethlin commented Mar 3, 2024

@craterbot
Copy link
Collaborator

👌 Experiment pr-121417-1 created and queued.
🤖 Automatically detected try build 393ef12
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 3, 2024
@craterbot
Copy link
Collaborator

🚧 Experiment pr-121417-1 is now running

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot
Copy link
Collaborator

🎉 Experiment pr-121417-1 is completed!
📊 23 regressed and 12 fixed (97 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the blacklist!
ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
perf-regression Performance regression. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants