Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_percpu failing on Fedora 36 #4203

Closed
chantra opened this issue Aug 29, 2022 · 0 comments · Fixed by #4204
Closed

test_percpu failing on Fedora 36 #4203

chantra opened this issue Aug 29, 2022 · 0 comments · Fixed by #4204

Comments

@chantra
Copy link
Contributor

chantra commented Aug 29, 2022

As part of updating what version of Fedora BCC can be tested against, it appears that test_percpu.py is failing on Fedora 36:
https://gist.github.com/chantra/d37f3daa969bdbc07862dc11f8384a38

The underlying reason is that the test expects calls to clone, but on Fedora 36, vfork is called instead:

[root@940cbfdf11e3 /]# strace -f python3 -c 'import os; os.popen("hostname");' 2>&1 | egrep '(clone|fork)'
vfork(strace: Process 46 attached
[pid    45] <... vfork resumed>)        = 46

Python version is:

[root@940cbfdf11e3 /]# python3 -V
Python 3.10.6

os.popen uses subprocess.Popen under the hood

This is due to a change in 3.10: https://docs.python.org/3/whatsnew/changelog.html#id150

bpo-35823: Use vfork() instead of fork() for subprocess.Popen() on Linux to improve performance in cases where it is deemed safe.

Per

gh-91401: Provide a fail-safe way to disable subprocess use of vfork() via a private subprocess._USE_VFORK attribute. While there is currently no known need for this, if you find a need please only set it to False. File a CPython issue as to why you needed it and link to that from a comment in your code. This attribute is documented as a footnote in 3.11.

it can be disabled:

[root@940cbfdf11e3 /]# strace -f python3 -c 'import os; import subprocess; subprocess._USE_VFORK = False; os.popen("hostname");' 2>&1 | egrep '(clone|fork)'
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLDstrace: Process 67 attached
[root@940cbfdf11e3 /]# strace -f python3 -c 'import os; import subprocess; os.popen("hostname");' 2>&1 | egrep '(clone|fork)'
vfork(strace: Process 73 attached
[pid    72] <... vfork resumed>)        = 73

regardless, this either need to be gated based on python version, which is sub-optimal and add more conditionals to the code... or use another syscall.

I am suggesting tracing execve instead.

chantra added a commit to chantra/bcc that referenced this issue Aug 29, 2022
Fixes iovisor#4203

The issue goes into more details as to why this change is needed. The TL;DR is
that the implementation of `os.popen` change somewhere between python 3.9 and
3.10 and depending on which version of python is used, either `fork/clone` is
called, or `vfork`.

In both cases, we are going to end up calling `execve`. This change switches
the syscall traced from `clone` to `execve`.

At the same time, it is now a constant that can be used across the tests.

Test plan:
Before https://gist.github.com/chantra/d37f3daa969bdbc07862dc11f8384a38
After https://gist.github.com/chantra/2860e9812f00eaa0758ccdb5d3192689
chantra added a commit to chantra/bcc that referenced this issue Aug 30, 2022
Fixes iovisor#4203

The issue goes into more details as to why this change is needed. The TL;DR is
that the implementation of `os.popen` change somewhere between python 3.9 and
3.10 and depending on which version of python is used, either `fork/clone` is
called, or `vfork`.

In both cases, we are going to end up calling `execve`. This change switches
the syscall traced from `clone` to `execve`.

At the same time, it is now a constant that can be used across the tests.

Test plan:
Before https://gist.github.com/chantra/d37f3daa969bdbc07862dc11f8384a38
After https://gist.github.com/chantra/2860e9812f00eaa0758ccdb5d3192689
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant