BOLT error while instrumenting Clang Executable on AArch64 #63255

Closed
liamhouston opened this issue Jun 11, 2023 · 15 comments
Labels: BOLT; question (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!)

Comments

@liamhouston

liamhouston commented Jun 11, 2023

I ran llvm-bolt <clang-17 executable> -instrument -o <output file> and received the following output:

BOLT-INFO: shared object or position-independent executable detected
BOLT-INFO: Target architecture: aarch64
BOLT-INFO: BOLT version: 4fcbe5fbeda15220bbbd8f4dbd6909a66a19b779
BOLT-INFO: first alloc address is 0x0
BOLT-INFO: creating new program header table at address 0x5e00000, offset 0x5e00000
BOLT-INFO: enabling relocation mode
BOLT-INFO: forcing -jump-tables=move for instrumentation
BOLT-INFO: disabling -align-macro-fusion on non-x86 platform
BOLT-WARNING: 1 collisions detected while hashing binary objects. Use -v=1 to see the list.
BOLT-INFO: number of removed linker-inserted veneers: 0
BOLT-INFO: 0 out of 136265 functions in the binary (0.0%) have non-empty execution profile
BOLT-INFO: UCE removed 0 blocks and 0 bytes of code.
BOLT-INFO: Starting stub-insertion pass
BOLT-INFO: Inserted 114 stubs in the hot area and 0 stubs in the cold area. Shared 0 times, iterated 2 times.
 #0 0x0000aaaaab77ec64 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (./llvm-bolt+0xcdec64)
 #1 0x0000aaaaab77ce84 llvm::sys::RunSignalHandlers() (./llvm-bolt+0xcdce84)
 #2 0x0000aaaaab77f4dc SignalHandler(int) Signals.cpp:0:0
 #3 0x00004000000707a0 (linux-vdso.so.1+0x7a0)
 #4 0x0000aaaaabac2f7c llvm::bolt::InstrumentationRuntimeLibrary::emitBinary(llvm::bolt::BinaryContext&, llvm::MCStreamer&) (./llvm-bolt+0x1022f7c)
 #5 0x0000aaaaac296584 llvm::bolt::emitBinaryContext(llvm::MCStreamer&, llvm::bolt::BinaryContext&, llvm::StringRef) (./llvm-bolt+0x17f6584)
 #6 0x0000aaaaab7c9400 llvm::bolt::RewriteInstance::emitAndLink() (./llvm-bolt+0xd29400)
 #7 0x0000aaaaab7c2884 llvm::bolt::RewriteInstance::run() (./llvm-bolt+0xd22884)
 #8 0x0000aaaaab417324 main (./llvm-bolt+0x977324)
 #9 0x0000400000564384 __libc_start_main (/lib64/libc.so.6+0x24384)
#10 0x0000aaaaab4157b4 _start (./llvm-bolt+0x9757b4)
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.

Running on LLVM version 17.0.0, targeting AArch64

@llvmbot
Collaborator

llvmbot commented Jun 12, 2023

@llvm/issue-subscribers-bolt

@liamhouston changed the title from "BOLT error while instrumenting Clang Executable" to "BOLT error while instrumenting Clang Executable on AArch64" on Jun 12, 2023

@ElvinaYakubova
Contributor

Hi! Instrumentation for AArch64 is not merged yet, but you can try applying patches 1, 2, and 3.

@liamhouston
Author

Hi Elvina,

I have applied the patches, and I'm now encountering this error message in the build step for bolt_rt:

[image: 3_patches]

@ElvinaYakubova
Contributor

Thanks for reporting this! Could you please try commenting out the if block here https://github.com/llvm/llvm-project/blob/main/bolt/runtime/CMakeLists.txt#L42-L53 for now (assuming you're not using MachO) and then build again?

@liamhouston
Author

liamhouston commented Jun 14, 2023

With the patches and the if block commented out, the build ran successfully. Then I ran bolt with llvm-bolt [clang-17 executable] -instrument -o [output file] and it produced the following output:
[image: screenshot of the llvm-bolt output]

@ElvinaYakubova
Contributor

@liamhouston Hello! Could you please provide instructions on how to reproduce your error? I see your directory name is "stage3-train"; are you following the "Optimizing Clang" tutorial?

@liamhouston
Author

@ElvinaYakubova Yes, I'm following the Optimizing Clang tutorial.

Here is the configure command for the stage3-train clang-17 that I run the BOLT instrument command on:
cmake ../llvm-project/llvm -DLLVM_TARGETS_TO_BUILD=AArch64 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" -DLLVM_USE_LINKER=lld -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_FLAGS="-ftime-trace" -DCMAKE_CXX_FLAGS="-ftime-trace" -G Ninja

I applied the three patches and commented out the if block before configuring the BOLT build.

Here is the configure command for BOLT:
cmake ../llvm-project/llvm -DLLVM_TARGETS_TO_BUILD=AArch64 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;bolt;lld;compiler-rt" -DLLVM_USE_LINKER=lld -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -G Ninja

Then I tried to run that BOLT build with -instrument on the stage3-train clang-17 executable.

@ElvinaYakubova
Contributor

@liamhouston Do you use the clang from stage2-prof-gen here, in -DCMAKE_C_COMPILER=clang? I'm trying to reproduce the error, but on my side an error occurs earlier, at the buildCFG step.

@liamhouston
Author

@ElvinaYakubova I use the clang from stage2-prof-gen to build the stage3-train clang. To build BOLT I use the baseline stage1 clang.

@liamhouston
Author

liamhouston commented Jun 21, 2023

@ElvinaYakubova Here's a script that recreates the error on my machine.
It should work as is once you adjust three variables: point LLVM at the top level of a clone of LLVM, point LLVM_PATCH at a second clone that has the 3 patches applied and the if block in CMakeLists.txt commented out, and set CORES to suit your machine.


#!/usr/bin/env bash

DIR=$PWD
# adjust these variables accordingly ###################
LLVM=~/llvm-project
LLVM_PATCH=~/llvm-project-patched
CORES=36
########################################################

module load llvm/16.0.0
# build our baseline clang
mkdir -p baseline
cd baseline
cmake $LLVM/llvm -DLLVM_TARGETS_TO_BUILD=AArch64 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" -DLLVM_USE_LINKER=lld -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_FLAGS="-ftime-trace" -DCMAKE_CXX_FLAGS="-ftime-trace" -G Ninja && ninja -j $CORES

if [ $? -ne 0 ]; then
  exit
fi

cd ..
mkdir -p stage1
cd stage1
# build the clang whose binary will be instrumented
export LDFLAGS="-Wl,-q"
export PATH=$DIR/baseline/bin:$PATH
cmake $LLVM/llvm -DLLVM_TARGETS_TO_BUILD=AArch64 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" -DLLVM_USE_LINKER=lld -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_FLAGS="-ftime-trace" -DCMAKE_CXX_FLAGS="-ftime-trace" -G Ninja && ninja -j $CORES

if [ $? -ne 0 ]; then
  exit
fi

cd ..
mkdir -p bolt
cd bolt
# build bolt from the patched llvm
export PATH=$DIR/stage1/bin:$PATH
cmake $LLVM_PATCH/llvm -DLLVM_TARGETS_TO_BUILD=AArch64 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="bolt;clang;lld;compiler-rt" -DLLVM_USE_LINKER=lld -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_FLAGS="-ftime-trace" -DCMAKE_CXX_FLAGS="-ftime-trace" -G Ninja && ninja -j $CORES

if [ $? -ne 0 ]; then
  exit
fi

cd ..
./bolt/bin/llvm-bolt -instrument stage1/bin/clang-17 -o stage1/bin/clang-17-instrumented
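
The script above repeats the same `$?` check after every build step. A more compact way to get the same stop-on-failure behaviour is sketched below. It assumes a hypothetical helper named `run_step`, which is not part of the script above; the directory names in the commented usage lines mirror the ones the script uses.

```shell
#!/usr/bin/env bash
# Sketch only: run_step is a hypothetical helper, not part of the original script.

# Create a build directory, run the given command inside it, and return
# that command's exit status unchanged.
run_step() {
  local dir=$1
  shift
  mkdir -p "$dir" || return 1
  # A subshell keeps the caller's working directory untouched,
  # so no matching "cd .." is needed afterwards.
  ( cd "$dir" && "$@" )
}

# Possible usage (commented out so the sketch has no side effects):
# run_step baseline ninja -j "$CORES" || exit 1
# run_step stage1   ninja -j "$CORES" || exit 1
```

Because each step runs in a subshell, a failure in one step cannot leave later steps running in the wrong directory, and `|| exit 1` makes the script report a non-zero status to its caller.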

@ElvinaYakubova
Contributor

@liamhouston Thank you! I'll try it, because I still can't get the same error as you.

@aaupov
Contributor

aaupov commented Jun 22, 2023

As a suggestion: please try building with the clang/cmake/caches/BOLT-PGO.cmake cache file, which simplifies applying BOLT on top of a (bootstrapped) PGO build. It uses a different profiling mechanism for PGO and BOLT (perf-training), but otherwise it follows the same steps.

An example build configuration is given here: https://github.com/aaupov/llvm-devmtg-2022/blob/main/driver.sh. There, the BOLT_LTO_PGO_ARGS cmake args set up a bootstrapped Clang build with InstrumentationPGO and ThinLTO, and BOLT on top of that.
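
As a rough illustration of the configure step with that cache file, the sketch below preloads it through CMake's `-C` option. The function name and its two parameters are assumptions made for this example. Which target to build afterwards is determined by the cache file itself and is not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: configure_with_bolt_pgo_cache is a hypothetical helper.
# $1 = top level of an llvm-project checkout, $2 = build directory.
configure_with_bolt_pgo_cache() {
  local llvm_src=$1 build_dir=$2
  local cache="$llvm_src/clang/cmake/caches/BOLT-PGO.cmake"
  # Fail early with a clear message if the checkout lacks the cache file.
  if [ ! -f "$cache" ]; then
    echo "cache file not found: $cache" >&2
    return 1
  fi
  # -C preloads the cache script before the project is configured.
  cmake -G Ninja -S "$llvm_src/llvm" -B "$build_dir" -C "$cache"
}

# Possible usage (commented out so the sketch has no side effects):
# configure_with_bolt_pgo_cache ~/llvm-project ~/build-bolt-pgo
```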

@liamhouston
Author

I removed the flag -DCMAKE_[C|CXX]_FLAGS="-ftime-trace" in the configuration for the clang that BOLT was trying to instrument (the stage3-train clang). This allowed BOLT to instrument properly.
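
For reference, a configure line without the `-ftime-trace` flags might be assembled as in the sketch below. The flag values are copied from the configure commands earlier in this thread; the function name and the default checkout path are assumptions for illustration. This workaround helped on the reporter's machine only, so it should be treated as something to try rather than a confirmed fix.

```shell
#!/usr/bin/env bash
# Sketch only: instrumentable_clang_args is a hypothetical helper.
# It prints, one per line, the cmake arguments for the clang that BOLT
# will later instrument. $1 = top level of an llvm-project checkout.
instrumentable_clang_args() {
  local llvm_src=${1:-$HOME/llvm-project}
  printf '%s\n' \
    "$llvm_src/llvm" \
    -DLLVM_TARGETS_TO_BUILD=AArch64 \
    -DCMAKE_BUILD_TYPE=Release \
    '-DLLVM_ENABLE_PROJECTS=clang;lld;compiler-rt' \
    -DLLVM_USE_LINKER=lld \
    -DCMAKE_C_COMPILER=clang \
    -DCMAKE_CXX_COMPILER=clang++ \
    -G Ninja
  # Deliberately omitted: -DCMAKE_C_FLAGS and -DCMAKE_CXX_FLAGS with "-ftime-trace".
}

# Possible usage (bash 4+ for mapfile; commented out to avoid side effects):
# mapfile -t args < <(instrumentable_clang_args "$LLVM")
# LDFLAGS="-Wl,-q" cmake "${args[@]}" && ninja -j "$CORES"
```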

@EugeneZelenko added the "question" label (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!) on Jul 5, 2023

@ElvinaYakubova
Contributor

> I removed the flag -DCMAKE_[C|CXX]_FLAGS="-ftime-trace" in the configuration for the clang that BOLT was trying to instrument (the stage3-train clang). This allowed BOLT to instrument properly.

That's good to hear. On my side, the error is still a different one, and yours is not reproducible whether or not I remove the "-ftime-trace" flag.
Did you manage to collect the profile? Does everything work after that?

@liamhouston
Author

@ElvinaYakubova
Hmm, that's very strange. On my side, though, I'm able to collect the profile with the instrumented compiler and then use BOLT.


5 participants