Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve ProcSyms module path handling #4319

Merged

Conversation

lenticularis39
Copy link
Contributor

@lenticularis39 lenticularis39 commented Nov 7, 2022

Another, better attempt at implementing loading symbols from exited processes, following cc44177. Unlike the previous one, this one should respect mount namespaces. I tried opening a file inside a container-specific tmpfs and it worked using the approach described below.

TODO list:

  • Add tests to make sure everything works as intended

Instead of loading modules from /proc/<pid>/root/, open /proc/<pid>/root in advance with open(..., O_PATH) first, then use openat to read the module file. This enables the module file to be read even if <pid> is not running.

A class named ModulePath is added, which does the openat call and manages the returned file descriptor, enabling functions expecting a path to use /proc/self/fd/... as the path to the module.

Edit: prepend backslash to < and > symbols to fix display on GitHub

@chenhengqi
Copy link
Collaborator

The CI still failed.

@lenticularis39
Copy link
Contributor Author

/tmp/perf-pid.map maps tests fixed with a workaround disabling the openat and /proc/self/fd/... trick for perf maps (they rely on the filename ending with .map, which does not work well with the latter).

Looking at failing Python resolve_name tests now.

Besides loading modules from /proc/<pid>/root/<path>, open /proc/<pid>/root
in advance with open(..., O_PATH) first, then use openat to read the module file
in case /proc/<pid>/root/<path> cannot be accessed.
This enables the module file to be read even if <pid> is not running.

A class named ModulePath is added, which does the openat call and manages the
returned file descriptor, enabling functions expecting a path to use
/proc/self/fd/... as the path to the module.

This approach has a few limitations. Code using the module path for something
different than open (e.g. debug info in .debug file or perf map detection) does not
work with the /proc/self/fd/... path, and the accessibility check of
/proc/<pid>/root/<path> is prone to a race condition.
@lenticularis39
Copy link
Contributor Author

There is too much functionality which breaks after substituting /proc/.../root/<module-name> with /proc/self/fd/..., so I opted for only using the latter when the former is not accessible (e.g. when the process has already exited). This should avoid any regressions at the cost of being prone to a race condition (the process might exit in between the accessibility check and reading the module).

@davemarchevsky
Copy link
Collaborator

Sorry for the delayed review.

In general I think this openat strategy is a good idea. I had some concerns about whether procfs would behave as expected since it's a 'special' fs, but skimming the code makes me think it'll work.

At Meta we have a profiling daemon that splits its execution into a few conceptual phases:

  • discovery: Gather information about processes running on the host for use in next phase
  • data collection: Attach BPF prog(s) and collect data. Usually this includes collecting stacks and passing them back to userspace.
  • post-processing: Symbolicate stacks, etc

We had a similar issue with symbolicating exited process' stacks in the past, as we were trying to grab /proc/PID/root/... files in post-processing, but the process may have died between beginning of data collection and beginning of post-processing. To solve this we do something similar to this PR - grab fd during discovery phase (similar to ProcStat::refresh_root), use it later during post-processing. This way processes which were alive during discovery but died during data collection didn't have the "exited process" issue.

I'm mentioning this all because I'm not aware of any similar 'discovery' phase in tools using ProcSyms. The cpp and python APIs are initializing ProcSyms for pids when the first stack for the pid is requested (BPFStackTable::get_stack_symbol and BPF._sym_cache, respectively). For the changes in this PR to be useful 'by default', IMO we should also modify the API to make separating 'discovery' and 'post-processing' easy for tools.

@lenticularis39
Copy link
Contributor Author

lenticularis39 commented Nov 29, 2022

I'm mentioning this all because I'm not aware of any similar 'discovery' phase in tools using ProcSyms. The cpp and python APIs are initializing ProcSyms for pids when the first stack for the pid is requested (BPFStackTable::get_stack_symbol and BPF._sym_cache, respectively). For the changes in this PR to be useful 'by default', IMO we should also modify the API to make separating 'discovery' and 'post-processing' easy for tools.

lenticularis39/bpftrace@4f07cc1, a part of a bpftrace PR I referenced this one from earlier, does a similar thing. In bpftrace, BPF code is not separated from userspace code, hence the "discovery phase" is triggered by symbol resolution being used in the code; perhaps in BCC it could be a function called by the user with a way to specify for which processes the symcache should be created.

@davemarchevsky davemarchevsky merged commit e44ece0 into iovisor:master Dec 8, 2022
@davemarchevsky
Copy link
Collaborator

Cool, sounds like we're on the same page. Since you have a concrete usecase already, I think this can be merged w/o waiting for implementation of 'discovery' phase in normal BCC framework

lenticularis39 added a commit to lenticularis39/bpftrace that referenced this pull request May 22, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
lenticularis39 added a commit to lenticularis39/bpftrace that referenced this pull request May 22, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
lenticularis39 added a commit to lenticularis39/bpftrace that referenced this pull request May 22, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
lenticularis39 added a commit to lenticularis39/bpftrace that referenced this pull request May 22, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
lenticularis39 added a commit to lenticularis39/bpftrace that referenced this pull request Jun 12, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
lenticularis39 added a commit to lenticularis39/bpftrace that referenced this pull request Jun 12, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
ajor pushed a commit to bpftrace/bpftrace that referenced this pull request Jun 13, 2023
Note: test for ASLR enabled is disabled because of race condition,
see iovisor/bcc#4319.
ddelnano added a commit to ddelnano/bcc that referenced this pull request May 17, 2024
…xe-absolute-path"

This reverts commit e44ece0, reversing
changes made to 8634441.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants