-
-
Notifications
You must be signed in to change notification settings - Fork 625
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Read stop signals from the process and update the process state. #5361
Conversation
for more information, see https://pre-commit.ci
…suspended # Conflicts: # xonsh/procs/posix.py
Helpful things for knightDocs
Interactive debugging in IDESetup PyCharm and have an experience with breakpoints. Tools for modeling the process behaviorsleep 10 # Tool that are easy to run in background and interrupt.
python -c 'input()' # Easy way to run unkillable (in some cases) input-blocked app.
fzf # Crazy tool. It reads input from STDIN, prints UI to STDERR, waits input from /dev/tty, prints result to STDOUT. See also "Python app to test catching all signals" below. Test in pure Linux environmentdocker run --rm -it xonsh/xonsh:slim bash -c "pip install -U 'xonsh[full]' && xonsh"
# a1b2c3 # docker container id
apt update && apt install -y vim git procps strace # to run `ps` # Connect to container from other terminal:
docker exec -it a1b2c3 bash # Save docker container state to reuse:
docker ps
docker commit c3f279d17e0a local/my_xonsh # the same for update
docker run --rm -it local/my_xonsh xonsh Trace signals with python -c 'input()' &
# pid 123
strace -p 123
# strace: Process 123 attached
# [ Process PID=123 runs in x32 mode. ]
# --- stopped by SIGTTIN ---
kill -SIGCONT 72 # From another terminal.
# --- SIGCONT {si_signo=SIGCONT, si_code=SI_USER, si_pid=48, si_uid=0} ---
# syscall_0x7ffffff90558(0x5555555c5a00, 0, 0x7fffff634940, 0, 0, 0x3f) = 0x2
# --- SIGTTIN {si_signo=SIGTTIN, si_code=SI_KERNEL} ---
# --- stopped by SIGTTIN --- Monitor process state codes (STAT)ps ax # Tool to monitor processes and have the process state in STAT column.
Hot to send signalsps ax | grep fzf # get pid=123
kill -SIGINT 123
kill -SIGCONT 123
watch -n1 'ps ax | grep fzf' # Monitor process state after sending signals (STAT). Decode process signals from
|
try: | |
_, wcode = os.waitpid(obj.pid, os.WUNTRACED) | |
except ChildProcessError as e: # No child processes | |
if return_error: | |
return e | |
else: | |
return _safe_wait_for_active_job( | |
last_task=active_task, backgrounded=backgrounded | |
) | |
if os.WIFSTOPPED(wcode): | |
active_task["status"] = "stopped" | |
backgrounded = True | |
elif os.WIFSIGNALED(wcode): | |
print() # get a newline because ^C will have been printed | |
obj.signal = (os.WTERMSIG(wcode), os.WCOREDUMP(wcode)) | |
obj.returncode = None | |
else: | |
obj.returncode = os.WEXITSTATUS(wcode) | |
obj.signal = None | |
return wait_for_active_job(last_task=active_task, backgrounded=backgrounded) |
When we run sleep 10
the system process go to wait and we stuck on os.waitpid
. After pressing Ctrl+C
(sending SIGINT) process get the SIGINT signal and we catch that signal appeared using WIFSIGNALED
(line 281) but have no actions about stopping the execution entirely and we continue execution.
It would be cool to considering the cases around this (i.e. can we really stop the execution of the whole code?) and stopping the execution carefully here.
Linux Process Management Commands
for more information, see https://pre-commit.ci
I converted this to draft because I want to try to create test. |
### Motivation There is annoying behavior when you run command in the loop and can't interrupt e.g. [this report](#5371) and a bit #5342. After diving into this I see the issue around return code. ### The essence Basically ``p = subprocess.Popen()`` populates ``p.returncode`` after ``p.wait()``, ``p.poll()`` or ``p.communicate()`` ([doc](https://docs.python.org/3/library/os.html#os.waitpid)). But if you're using `os.waitpid()` BEFORE these functions you're capturing return code from a signal subsystem and ``p.returncode`` will be ``0`` like success but it's not success. So after ``os.waitid`` call you need to set return code manually ``p.returncode = -os.WTERMSIG(status)`` like in Popen. Example: ```xsh python # python interactive import os, signal, subprocess as sp p = sp.Popen(['sleep', '100']) p.pid # 123 p.wait() # Ctrl+C or `kill -SIGINT 123` from another terminal p.returncode # -2 # BUT: p = sp.Popen(['sleep', '100']) p.pid # 123 pid, status = os.waitpid(p.pid, os.WUNTRACED) # status=2 # From another terminal: kill -SIGINT 123 p.wait() # 0 p.returncode # 0 ``` ```xsh from xonsh.tools import describe_waitpid_status describe_waitpid_status(2) # WIFEXITED - False - Return True if the process returning status exited via the exit() system call. # WEXITSTATUS - 0 - Return the process return code from status. # WIFSIGNALED - True - Return True if the process returning status was terminated by a signal. # WTERMSIG - 2 - Return the signal that terminated the process that provided the status value. # WIFSTOPPED - False - Return True if the process returning status was stopped. # WSTOPSIG - 0 - Return the signal that stopped the process that provided the status value. ``` See also: [Helpful things for knight](#5361 (comment)). ### Before ```xsh $RAISE_SUBPROC_ERROR = True sleep 100 # Ctrl+C _.rtn # 0 # It's wrong and RAISE_SUBPROC_ERROR ignored. for i in range(5): print(i) sleep 5 # 0 # Ctrl+C # Can't interrupt # 1 # 2 ``` ### After ```xsh sleep 100 # Ctrl+C _.rtn # -2 # Like in default Popen $RAISE_SUBPROC_ERROR = True for i in range(5): print(i) sleep 5 # 0 # Ctrl+C # subprocess.CalledProcessError ``` ### Notes * We need to refactor `xonsh.jobs`. It's pretty uncomfortable work with module. * The logic is blurry between Specs, Pipelines and Jobs. We need to bring more clear functions. * The `captured` variable looks like "just the way of capturing (stdout, object)" but in fact it affects all logic very much. We need to create table where we can see the logic difference for every type of capturing. E.g. in `captured=stdout` mode we use `xonsh.jobs` to monitor the command but in `captured=object` we avoid this and some logic from `xonsh.jobs` applied in `stdout` mode but missed in `object` mode. We need clear map. ## For community ⬇️ **Please click the 👍 reaction instead of leaving a `+1` or 👍 comment** --------- Co-authored-by: a <1@1.1> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
@jnoortheen @gforsyth After epic struggle during 3 weeks I'm happy to say that it's ready to review. |
I came across this lib https://github.com/oconnor663/duct.py/blob/master/gotchas.md . Can this be helpful? |
@jnoortheen it's very helpful library! Cudos to you and to the oconnor663! I will take a look deeper. |
Yes this is great work. We can take pointers from this repo. |
@gforsyth hey! It's ready to review, please take a look! Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good @anki-code -- I made a few suggestions for the error message and we should update the formatting for jobs --posix
, but otherwise this can go in.
Co-authored-by: Gil Forsyth <gforsyth@users.noreply.github.com>
Co-authored-by: Gil Forsyth <gforsyth@users.noreply.github.com>
@gforsyth comments resolved. Thank you for review! Waiting for final approve. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @anki-code -- I think the helpful notes for the knight is an especially good write-up. Really awesome stuff here.
Reading stop signals from the process and update the process state.
The issue
Technically. In a couple of places that critical for processing signals we have
os.waitpid()
. The function behavior is pretty unobvious and one of things is processing return code after catching the signal. We had no good signal processing around this and this PR fixes this. See alsoproc_untraced_waitpid
function description.From user perspective. For example we have process that is waiting for user input from terminal e.g.
python -c "input()"
orfzf
. If this process will be in captured pipeline e.g.!(echo 1 | fzf | head)
it will be suspended by OS and the pipeline will be in the endless loop with future crashing and corrupting std at the end. This PR fixes this.The solution
Technically. The key function is
proc_untraced_waitpid
- it catches the stop signals and updates the process state.From user perspective. First of all we expect that users will use captured object
!()
only for capturable processes. Because of it our goal here is to just make the behavior in this case stable.In this PR we detect that process in the pipeline is suspended and we need to finish the command pipeline carefully:
jobs
. The same behavior we can see in bash. This is good because we don't know what process suspended and why. May be experienced user will want to continue it manually.Before
After
Closes #4752 #4577
Notes
Mentions
#2159
For community
⬇️ Please click the 👍 reaction instead of leaving a
+1
or 👍 comment