BOLT crashes with --update-debug-sections on DWARF v5 when optimizing libpython.so #67966

@indygreg

I can reliably crash BOLT when optimizing an x86-64 ELF binary that contains DWARF v5 debug info. The crash appears identical to #56277:

cpython-3.12> BOLT-INFO: FRAME ANALYSIS: 36855 function(s) were not optimized.
cpython-3.12> BOLT-INFO: FRAME ANALYSIS: 651 function(s) (51.5% dyn cov) could not have its frame indices restored.
cpython-3.12> BOLT-INFO: Shrink wrapping moved 45 spills inserting load/stores and 5 spills inserting push/pops
cpython-3.12> BOLT-INFO: Shrink wrapping reduced 5692310 store executions (0.1% total instructions executed, 1.3% store instructions)
cpython-3.12> BOLT-INFO: Shrink wrapping failed at reducing 0 store executions (0.0% total instructions executed, 0.0% store instructions)
cpython-3.12> BOLT-INFO: Allocation combiner: 38 empty spaces coalesced (dyn count: 3197588).
cpython-3.12> #0 0x0000556536b4d188 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/tools/llvm/bin/llvm-bolt+0x16b7188)
cpython-3.12>  #1 0x0000556536b4b23c llvm::sys::RunSignalHandlers() (/tools/llvm/bin/llvm-bolt+0x16b523c)
cpython-3.12>  #2 0x0000556536b4d91d SignalHandler(int) Signals.cpp:0:0
cpython-3.12>  #3 0x00007f0a984dc890 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0xf890)
cpython-3.12>  #4 0x000055653712a0fb llvm::bolt::BinaryContext::addDebugFilenameToUnit(unsigned int, unsigned int, unsigned int) (/tools/llvm/bin/llvm-bolt+0x1c940fb)
cpython-3.12>  #5 0x0000556537139d3e (anonymous namespace)::BinaryEmitter::emitFunctionBody(llvm::bolt::BinaryFunction&, llvm::bolt::FunctionFragment&, bool) BinaryEmitter.cpp:0:0
cpython-3.12>  #6 0x000055653713ac9b (anonymous namespace)::BinaryEmitter::emitFunction(llvm::bolt::BinaryFunction&, llvm::bolt::FunctionFragment&) BinaryEmitter.cpp:0:0
cpython-3.12>  #7 0x000055653713a677 (anonymous namespace)::BinaryEmitter::emitFunctions()::$_0::operator()(std::vector<llvm::bolt::BinaryFunction*, std::allocator<llvm::bolt::BinaryFunction*>> const&) const BinaryEmitter.cpp:0:0
cpython-3.12>  #8 0x0000556537138d41 llvm::bolt::emitBinaryContext(llvm::MCStreamer&, llvm::bolt::BinaryContext&, llvm::StringRef) (/tools/llvm/bin/llvm-bolt+0x1ca2d41)
cpython-3.12>  #9 0x0000556536b9ce82 llvm::bolt::RewriteInstance::emitAndLink() (/tools/llvm/bin/llvm-bolt+0x1706e82)
cpython-3.12> #10 0x0000556536b94af3 llvm::bolt::RewriteInstance::run() (/tools/llvm/bin/llvm-bolt+0x16feaf3)
cpython-3.12> #11 0x0000556535797032 main (/tools/llvm/bin/llvm-bolt+0x301032)
cpython-3.12> #12 0x00007f0a971a7b45 __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x21b45)
cpython-3.12> #13 0x00005565357952db _start (/tools/llvm/bin/llvm-bolt+0x2ff2db)
cpython-3.12> PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
cpython-3.12> Stack dump:
cpython-3.12> 0.	Program arguments: /tools/llvm/bin/llvm-bolt libpython3.12.so.1.0.prebolt -o libpython3.12.so.1.0.bolt -data=libpython3.12.so.1.0.fdata -update-debug-sections -reorder-blocks=ext-tsp -reorder-functions=hfsort+ -split-functions -icf=1 -inline-all -split-eh -reorder-functions-use-hot-size -peepholes=none -jump-tables=aggressive -inline-ap -indirect-call-promotion=all -dyno-stats -use-gnu-stack -frame-opt=hot
cpython-3.12> Segmentation fault (core dumped)
cpython-3.12> make[1]: *** [profile-bolt-stamp] Error 139
cpython-3.12> Makefile:789: recipe for target 'profile-bolt-stamp' failed
cpython-3.12> make[1]: Leaving directory '/build/Python-3.12.0rc3'

The crash reproduces with at least versions 16.0.3 and 17.0.1.

The crash goes away if I compile all inputs with -fdebug-default-version=4, so it appears to be related to updating DWARF v5 debug info.
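To confirm which DWARF version an input actually carries before handing it to BOLT, something like the following sketch may help. It is not part of my build; the helper name dwarf_versions is made up for illustration, and it assumes llvm-dwarfdump prints unit headers of the form "Compile Unit: ... version = 0x0005, ...".

```shell
# Hypothetical helper: read llvm-dwarfdump output on stdin and print the
# distinct DWARF versions found in the compile unit headers, one per line.
# Assumes header lines look like:
#   0x00000000: Compile Unit: length = ..., format = DWARF32, version = 0x0005, ...
dwarf_versions() {
  grep -E 'Compile Unit:' \
    | sed -E 's/.*version = 0x0*([0-9a-fA-F]+).*/\1/' \
    | sort -u
}

# Example usage (needs llvm-dwarfdump on PATH; the file name is the BOLT
# input from the command line in the log above):
#   llvm-dwarfdump --debug-info libpython3.12.so.1.0.prebolt | dwarf_versions
```

If this prints 5 for the crashing input and only 4 after rebuilding with -fdebug-default-version=4, that would be consistent with the DWARF v5 theory.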

Steps to reproduce (sorry for not being minimal):

  1. git clone https://github.com/indygreg/python-build-standalone
  2. git checkout fa5d1fcebdde07aa04b6c661a55c77c85f414508 (astral-sh/python-build-standalone@fa5d1fc from the bolt-crash branch)
  3. ./build-linux.py --optimizations pgo --python cpython-3.12 --break-on-failure

This builds CPython and its dependencies from source inside highly deterministic Docker containers. After a few minutes it should reach the BOLT optimization step and crash. The --break-on-failure flag keeps the container from exiting on failure, so you can docker exec into it to debug.

CI logs showing the crash should appear at https://github.com/indygreg/python-build-standalone/actions/runs/6379047670 within a few hours.

If you tell me how to figure out which object file / symbol it is crashing on, I can dump DWARF of the offending file / symbol if you aren't able to reproduce.
