
Fix Symbol mapping issue when we have multiple executable segments #765

Merged: 7 commits, Jul 31, 2025

Conversation

@yinghai (Contributor) commented May 9, 2025

It turns out that we can have multiple executable segments for Python binaries:

➜  py-spy git:(master) ✗ grep libpython3 /proc/3033958/maps
7b9d5718a000-7b9d5718b000 r-xp 0065e000 09:00 271849319                  /tmp/yinghai/cpython-3.12.9-linux-x86_64-gnu/lib/libpython3.12.so.1.0
7b9d580f1000-7b9d58311000 r--p 00000000 09:00 271849319                  /tmp/yinghai/cpython-3.12.9-linux-x86_64-gnu/lib/libpython3.12.so.1.0
7b9d58311000-7b9d5902e000 r-xp 00220000 09:00 271849319                  /tmp/yinghai/cpython-3.12.9-linux-x86_64-gnu/lib/libpython3.12.so.1.0
7b9d5902e000-7b9d5962a000 r--p 00f3d000 09:00 271849319                  /tmp/yinghai/cpython-3.12.9-linux-x86_64-gnu/lib/libpython3.12.so.1.0
7b9d5962a000-7b9d5974f000 r--p 01538000 09:00 271849319                  /tmp/yinghai/cpython-3.12.9-linux-x86_64-gnu/lib/libpython3.12.so.1.0
7b9d5974f000-7b9d598e6000 rw-p 0165d000 09:00 271849319                  /tmp/yinghai/cpython-3.12.9-linux-x86_64-gnu/lib/libpython3.12.so.1.0

Note that the segments at 7b9d5718a000 and 7b9d58311000 are both executable, but the first has a lower address and a higher offset than the second. When parsing the binary, we pick the executable segment with the lowest address. During symbol map resolution on Linux, however, we use the ELF program headers to find the first executable segment, the one with the lowest offset, to compute the absolute address of a symbol.
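To make the mismatch concrete, here is a minimal sketch that parses `/proc/<pid>/maps` lines like the ones above and checks whether "lowest address" and "lowest offset" select the same executable mapping. The `ExecMapping` type and the function names are hypothetical and are not py-spy's API.

```rust
/// One executable mapping of a file, as listed in /proc/<pid>/maps.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ExecMapping {
    /// Virtual address where the mapping starts.
    pub start: u64,
    /// File offset the mapping starts at.
    pub offset: u64,
}

/// Parse one maps line ("start-end perms offset dev inode path").
/// Returns None for malformed lines and for non-executable mappings.
pub fn parse_exec_mapping(line: &str) -> Option<ExecMapping> {
    let mut fields = line.split_whitespace();
    let range = fields.next()?;
    let perms = fields.next()?;
    let offset = fields.next()?;
    if !perms.contains('x') {
        return None;
    }
    let (start, _end) = range.split_once('-')?;
    Some(ExecMapping {
        start: u64::from_str_radix(start, 16).ok()?,
        offset: u64::from_str_radix(offset, 16).ok()?,
    })
}

/// Collect the executable mappings from a block of maps text.
pub fn exec_mappings(maps_text: &str) -> Vec<ExecMapping> {
    maps_text.lines().filter_map(parse_exec_mapping).collect()
}

/// True when the lowest-address mapping is not the lowest-offset mapping,
/// which is the situation described in this PR.
pub fn orderings_disagree(maps: &[ExecMapping]) -> bool {
    let by_address = maps.iter().min_by_key(|m| m.start);
    let by_offset = maps.iter().min_by_key(|m| m.offset);
    by_address != by_offset
}
```

For the maps output above, the two `r-xp` lines give mappings with offsets 0x65e000 and 0x220000, and `orderings_disagree` returns true for that pair.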

py-spy/src/binary_parser.rs

Lines 139 to 145 in 1fa3a6d

let program_header = elf
    .program_headers
    .iter()
    .find(|header| {
        header.p_type == goblin::elf::program_header::PT_LOAD
            && header.p_flags & goblin::elf::program_header::PF_X != 0
    })

This inconsistency causes the symbol addresses to be wrong, so py-spy fails to find the Python context of the process. The typical error message is

Failed to find python version from target process
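A rough sketch of the arithmetic behind the failure, under a simplifying assumption that is not stated in the PR: the executable segment's virtual address in the ELF file equals its file offset (0x220000 here). The helper names are hypothetical, and py-spy's actual computation may differ in detail.

```rust
/// Load bias implied by pairing a mapping's start address with a segment's
/// ELF virtual address: bias = mapping_start - p_vaddr.
/// Returns None if the subtraction would underflow.
pub fn load_bias(mapping_start: u64, p_vaddr: u64) -> Option<u64> {
    mapping_start.checked_sub(p_vaddr)
}

/// Runtime address of a symbol whose ELF symbol value is `st_value`.
pub fn symbol_address(bias: u64, st_value: u64) -> u64 {
    bias + st_value
}

/// How far off every resolved symbol is when the wrong mapping is paired
/// with the segment. Assumes wrong_start <= right_start, as in this PR.
pub fn address_error(right_start: u64, wrong_start: u64) -> u64 {
    right_start - wrong_start
}
```

Pairing the 0x220000 segment with the mapping at 7b9d58311000 gives a bias of 7b9d580f1000, which matches the start of the offset-0 mapping in the listing above. Pairing it with the mapping at 7b9d5718a000 instead gives 7b9d56f6a000, so every symbol address is shifted by 0x1187000 and reads from the target process land on the wrong memory.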

There are a few such issues (#756, #550) that could be related, although other causes can lead to the same error.

The fix here is to scan all the executable segments and pick the one with the lowest offset when parsing the binary, so that both code paths are consistent.
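The idea can be sketched as follows: pick the lowest-offset executable program header and the lowest-offset executable mapping, then check that the two agree. This is an illustration rather than the merged code. `ProgramHeader` and `Mapping` are hypothetical, trimmed-down types, the 4096-byte page size is an assumption, and the header values in the usage note are made up for the example.

```rust
pub const PT_LOAD: u32 = 1; // ELF: loadable segment
pub const PF_X: u32 = 1; // ELF: segment is executable

/// Assumed page size; real code should query the system.
const PAGE_SIZE: u64 = 4096;

/// Hypothetical, trimmed-down ELF program header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ProgramHeader {
    pub p_type: u32,
    pub p_flags: u32,
    pub p_offset: u64,
}

/// Hypothetical executable mapping taken from /proc/<pid>/maps.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Mapping {
    pub start: u64,
    pub offset: u64,
}

/// Choose the executable segment and the executable mapping by the same
/// rule (lowest file offset) and return them only if they line up.
pub fn pick_consistent_pair(
    headers: &[ProgramHeader],
    exec_maps: &[Mapping],
) -> Option<(ProgramHeader, Mapping)> {
    let header = headers
        .iter()
        .copied()
        .filter(|h| h.p_type == PT_LOAD && h.p_flags & PF_X != 0)
        .min_by_key(|h| h.p_offset)?;
    let mapping = exec_maps.iter().copied().min_by_key(|m| m.offset)?;
    // Mappings start on page boundaries, so compare against the
    // page-aligned file offset of the segment.
    let aligned_offset = header.p_offset - header.p_offset % PAGE_SIZE;
    if aligned_offset == mapping.offset {
        Some((header, mapping))
    } else {
        None
    }
}
```

With an executable header at offset 0x220000 and the two executable mappings from the description, this returns the mapping at 7b9d58311000 rather than the lower-address one at 7b9d5718a000.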

@korniltsev

Can you please share the libpython binary or a way to reproduce / obtain it?

@korniltsev

Nit: It would be nice to have a regression test.

@yinghai
Contributor Author

yinghai commented May 10, 2025

I actually don't think this is library specific. It tends to happen in spawned subprocesses. The binaries just get remapped, and the segments end up ordered differently by memory address than by offset.

@petersalas

FWIW I ran into this issue with the latest py-spy, saw segments similar to those described above, and confirmed that @yinghai's branch worked perfectly.

@fleimgruber

fleimgruber commented Jun 13, 2025

@yinghai I tried your branch on Windows and got a compile error:

error[E0609]: no field `offset` on type `&&&MapRange`
   --> src\python_process_info.rs:166:70
    |
166 |             if let Some(libpython) = libmaps.iter().min_by_key(|m| m.offset) {
    |                                                                      ^^^^^^ unknown field

@yinghai
Contributor Author

yinghai commented Jun 16, 2025

@fleimgruber yeah, sorry, I probably only covered the Linux case.

@yinghai
Contributor Author

yinghai commented Jun 16, 2025

Pushed a change to keep the behavior unchanged for Windows.

@fleimgruber

fleimgruber commented Jun 16, 2025

@yinghai thanks for the quick change, py-spy compiles fine now, but I still get "Error: Failed to find python version from target process" on Python 3.10.11

@yinghai
Contributor Author

yinghai commented Jun 16, 2025

If it's Windows, then I don't know how to solve it, but the cause may be along the same lines.

Owner

@benfred benfred left a comment

thanks for the PR!

@benfred benfred added the bug Something isn't working label Jul 31, 2025
@benfred benfred merged commit dac3c09 into benfred:master Jul 31, 2025
97 checks passed
@benfred benfred changed the title [Fix] Symbol mapping issue when we have multiple executable segment Fix Symbol mapping issue when we have multiple executable segments Jul 31, 2025
@yinghai yinghai deleted the yinghai/symbol branch July 31, 2025 22:49
@fleimgruber

@benfred The addressed issues are still unresolved for Windows - should we create a tracking issue for a similar fix for Windows?

Labels
bug Something isn't working

5 participants