during "rear mkbackup" [Ctrl]+[C] does not terminate 'tar' background process #1712
Comments
As a first idea I tried in lib/_input-output-functions.sh adding

    ... pkill -P $MASTER_PID ; wait ...

before finally

    builtin trap " ... pkill -P $MASTER_PID ; wait ; kill $MASTER_PID" USR1

but that did not help. I also tried in lib/_input-output-functions.sh

    function DoExitTasks () {
        LogPrint "Running exit tasks"
        LogPrint "Terminating child processes"
        LogPrint "$( pgrep -a -P $MASTER_PID )"
        pkill -P $MASTER_PID
        LogPrint "Waiting for child processes to finish..."
        wait
        LogPrint "Child processes finished"
        ...

but that also did not help. Somehow that "kill all running jobs" in the DoExitTasks() function
Oddly enough, I'm not suffering from this. Recently I had to interrupt ReaR with ^C multiple times as well, but I did not have
After ^C
V.
Phew! The analysis took much longer than expected:

Also for me Ctrl+C usually works - but sometimes not.

Summary:
At least sometimes that "kill all running jobs" in the DoExitTasks() function
Usually 'jobs -p' reports only one child and that gets killed with a brutal SIGKILL
Therefore the "kill all running jobs" in the DoExitTasks() function

Details on my SLES12-SP2 system:

For analysis I enhanced lib/_input-output-functions.sh

    function DoExitTasks () {
        Log "Running exit tasks."
        # terminate all running jobs
        JOBS=( $( jobs -p ) )
        # when "jobs -p" results nothing then JOBS is still an unbound variable so that
        # an empty default value is used to avoid 'set -eu' error exit if $JOBS is unset:
        if test -n ${JOBS:-""} ; then
            LogPrint "Terminating..."
            LogPrint "The following ReaR (background) jobs are active and get terminated:"
            LogPrint "Output of: jobs -p"
            LogPrint "$( jobs -p )"
            LogPrint "Output of: pstree -plau $MASTER_PID"
            LogPrint "$( pstree -plau $MASTER_PID )"
            LogPrint "Output of: pgrep -a -P $MASTER_PID"
            LogPrint "$( pgrep -a -P $MASTER_PID )"
            LogPrint "Output of: ps auxw | egrep 'rear|tar'"
            LogPrint "$( ps auxw | egrep 'rear|tar' )"
            LogPrint "Terminating jobs (PIDs ${JOBS[*]})"
            for job in "${JOBS[@]}" ; do
                LogPrint "Terminating job (PID $job)"
                kill $job 1>&2
            done
            sleep 1
            ...

and backup/NETFS/default/500_make_backup.sh

    sleep 1 # Give the backup software a good chance to start working
    echo "ps auxw | egrep 'rear|tar'" 1>&7
    ps auxw | egrep -i1 'rear|tar' 1>&7
    echo "pstree -plau $MASTER_PID" 1>&7
    pstree -plau $MASTER_PID 1>&7

then I did "rear mkbackup" with Ctrl+C during "archive operation"

    # usr/sbin/rear -D mkbackup
    ...
    Preparing archive operation
    ps auxw | egrep 'rear|tar'
    USER   PID %CPU %MEM   VSZ  RSS TTY   STAT START TIME COMMAND
    root  9056  4.6  0.6 16080 6260 pts/1 S+   13:43 0:01 /bin/bash usr/sbin/rear -D mkbackup
    root 27352  0.0  0.4 16080 4996 pts/1 S+   13:44 0:00 /bin/bash usr/sbin/rear -D mkbackup
    root 27366  8.1  0.2 18256 2480 pts/1 S+   13:44 0:00 tar --warning=no-xdev --sparse ...
    root 27368  1.0  0.0  4248  804 pts/1 S+   13:44 0:00 dd of=/tmp/rear.czvXpnbkbMPDjq1/outputfs/f48/back.gz
    root 27377  0.0  0.3 16080 3328 pts/1 R+   13:44 0:00 /bin/bash usr/sbin/rear -D mkbackup
    pstree -plau 9056
    rear,9056 usr/sbin/rear -D mkbackup
      |-pstree,27378 -plau 9056
      `-rear,27352 usr/sbin/rear -D mkbackup
          |-cat,27367
          |-dd,27368 of=/tmp/rear.czvXpnbkbMPDjq1/outputfs/f48/backup.tar.gz
          `-tar,27366 --warning=no-xdev --sparse ...
              `-gzip,27371
    Archived 46 MiB [avg 11960 KiB/sec]
    Archived 64 MiB [avg 8220 KiB/sec]
    ^CTerminating...
    The following ReaR (background) jobs are active and get terminated:
    Output of: jobs -p
    27352
    Output of: pstree -plau 9056
    rear,9056 usr/sbin/rear -D mkbackup
      |-rear,27352 usr/sbin/rear -D mkbackup
      |   `-dd,27368 of=/tmp/rear.czvXpnbkbMPDjq1/outputfs/f48/backup.tar.gz
      `-rear,27434 usr/sbin/rear -D mkbackup
          `-pstree,27435 -plau 9056
    Output of: pgrep -a -P 9056
    27352 /bin/bash usr/sbin/rear -D mkbackup
    27440 /bin/bash usr/sbin/rear -D mkbackup
    Output of: ps auxw | egrep 'rear|tar'
    root  9056  3.7  0.5 16080 5932 pts/1 S+   13:43 0:01 /bin/bash usr/sbin/rear -D mkbackup
    root 27352  0.0  0.4 16080 4996 pts/1 S+   13:44 0:00 /bin/bash usr/sbin/rear -D mkbackup
    root 27368  1.2  0.0  4248  804 pts/1 D+   13:44 0:00 dd of=/tmp/rear.czvXpnbkbMPDjq1/outputfs/f48/back.gz
    root 27446  0.0  0.4 16080 4612 pts/1 S+   13:44 0:00 /bin/bash usr/sbin/rear -D mkbackup
    root 27448  0.0  0.0  4320  780 pts/1 S+   13:44 0:00 egrep rear|tar
    Terminating jobs (PIDs 27352)
    Terminating job (PID 27352)
    You should also rm -Rf /tmp/rear.czvXpnbkbMPDjq1
    rear mkbackup failed, check /root/rear.master/var/log/rear/rear-f48.log for details

Note how directly after Ctrl+C the following sub-sub-processes are already gone

    `-tar,27366 --warning=no-xdev --sparse ...
        `-gzip,27371

only the 'dd' is left (shown by
In some cases even the 'dd' was already gone directly after Ctrl+C for me.
This seems to indicate that under normal circumstances
In the end perhaps the root of all evil is only the SIGKILL
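The point that 'jobs -p' reports only one PID while the real work happens in processes below it can be reproduced outside of ReaR. The following is only a minimal sketch, not ReaR code: it assumes bash and the procps 'ps' (for --ppid), and the sleep/cat pipeline is a made-up stand-in for the backup subshell with its tar and dd children.

```shell
# Sketch only, not ReaR code. Assumes bash and procps "ps" (for --ppid).

# One background job: a subshell that runs a two-process pipeline,
# loosely comparable to the backup subshell with its tar and dd children.
{ sleep 20 | cat ; } &
job_pid=$!
sleep 1    # give the pipeline time to start

# What a "kill all running jobs" loop gets to see: one PID per job.
job_pids=$( jobs -p )
echo "jobs -p reports: $job_pids"

# What actually does the work: the children of that one PID.
child_pids=$( ps --ppid "$job_pid" -o pid= )
echo "children of $job_pid:" $child_pids

# Clean up the demo: signal the children and the job itself.
kill -TERM $child_pids "$job_pid" 2>/dev/null || true
# wait returns the job's non-zero exit status, hence the "|| true"
wait "$job_pid" 2>/dev/null || true
```

A loop over the 'jobs -p' output alone would signal only the subshell here; whether its children go away too depends on how they react, which fits the "sometimes it works, sometimes not" observation above.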
According to my above analysis this issue is no longer a minor bug
Via #1720 I found the basic idea (determine recursively children of children) in
What is the expected behavior when a user presses Ctrl-C? For me it would be to cleanly abort ReaR which means:
So yes, we should kill everything. Maybe process groups can help us to catch all subprocesses by sending the TERM or KILL signal to the entire group.
We should never "just kill" any process (i.e. send SIGKILL to it)

Process groups do not help, at least not with bash default behaviour.
It was my main finding that with bash default behaviour

On my SLES12-SP2 system:

    # type -a descendants_pids
    descendants_pids is a function
    descendants_pids ()
    {
        local parent_pid=$1;
        kill -0 $parent_pid 2> /dev/null || return 0;
        local child_pid="";
        local children_pids=$( ps --ppid $parent_pid -o pid= );
        if test "$children_pids"; then
            for child_pid in $children_pids;
            do
                descendants_pids $child_pid;
            done;
        fi;
        kill -0 $parent_pid 2> /dev/null && echo $parent_pid || return 0
    }

    # { sleep 5 | grep foo | grep $( sleep 2 ; echo bar ) ; } & { sleep 7 | grep this | grep $( sleep 2 ; echo that ) ; } & echo ; sleep 1 ; pstree -plaug $$ ; descendants_pids $$ ; echo ; sleep 2 ; pstree -plaug $$ ; descendants_pids $$
    [1] 25527
    [2] 25528

    bash,15851,15851
      ├─bash,25527,25527
      │   ├─bash,25535,25527
      │   │   └─bash,25537,25527
      │   │       └─sleep,25539,25527 2
      │   ├─grep,25534,25527 --color=auto foo
      │   └─sleep,25533,25527 5
      ├─bash,25528,25528
      │   ├─bash,25532,25528
      │   │   └─bash,25536,25528
      │   │       └─sleep,25538,25528 2
      │   ├─grep,25531,25528 --color=auto this
      │   └─sleep,25530,25528 7
      └─pstree,25544,25544 -plaug 15851
    25533
    25534
    25539
    25537
    25535
    25527
    25530
    25531
    25538
    25536
    25532
    25528
    15851

    bash,15851,15851
      ├─bash,25527,25527
      │   ├─grep,25534,25527 --color=auto foo
      │   ├─grep,25535,25527 --color=auto bar
      │   └─sleep,25533,25527 5
      ├─bash,25528,25528
      │   ├─grep,25531,25528 --color=auto this
      │   ├─grep,25532,25528 --color=auto that
      │   └─sleep,25530,25528 7
      └─pstree,25784,25784 -plaug 15851
    25533
    25534
    25535
    25527
    25530
    25531
    25532
    25528
    15851

FYI: The '-g' option is not supported by older 'pstree' like the 'pstree' on SLE11.

Sub-processes that get run via

To facilitate the implementation of the user interface to job control, the operating system maintains the notion of a current terminal process group ID. Members of this process group (processes whose process group ID is equal to the current terminal process group ID) receive keyboard-generated signals such as SIGINT. These processes are said to be in the foreground. Background processes are those whose process group ID differs from the terminal's; such processes are immune to keyboard-generated signals.

To cleanly terminate all descendant processes in any case
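To make the "never just SIGKILL" and "children of children" ideas concrete, here is a minimal sketch of how they could be combined. It is not the code that went into ReaR: terminate_descendants and its grace period parameter are hypothetical names, descendants_pids is a condensed variant of the function shown above, and the procps 'ps --ppid' option is assumed to be available.

```shell
# Sketch only, not ReaR code. Assumes procps "ps" (for --ppid).

# Print all descendant PIDs of the given PID, deepest first,
# followed by the given PID itself (condensed from the function above).
descendants_pids () {
    local parent_pid=$1
    local child_pid
    kill -0 "$parent_pid" 2>/dev/null || return 0
    for child_pid in $( ps --ppid "$parent_pid" -o pid= ) ; do
        descendants_pids "$child_pid"
    done
    kill -0 "$parent_pid" 2>/dev/null && echo "$parent_pid"
    return 0
}

# Hypothetical helper: send SIGTERM to all descendants of root_pid
# (not to root_pid itself), wait a grace period, and send SIGKILL
# only to those that are still there afterwards.
terminate_descendants () {
    local root_pid=$1
    local grace_seconds=${2:-3}
    local pid pids
    # The list may contain short-lived helper PIDs that are already gone
    # when the signals are sent, therefore kill errors are ignored.
    pids=$( descendants_pids "$root_pid" )
    for pid in $pids ; do
        test "$pid" = "$root_pid" && continue
        kill -TERM "$pid" 2>/dev/null || true
    done
    sleep "$grace_seconds"
    for pid in $pids ; do
        test "$pid" = "$root_pid" && continue
        if kill -0 "$pid" 2>/dev/null ; then
            kill -KILL "$pid" 2>/dev/null || true
        fi
    done
    return 0
}
```

Called as "terminate_descendants $MASTER_PID" from an exit task, this would give well-behaved children the chance to finish cleanly and fall back to SIGKILL only as a last resort. The fixed sleep is the simplest possible grace period; polling until the processes are gone would be an obvious refinement.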
With #1720 merged
Current master code after ReaR 2.3.
When during "rear mkbackup" the 'tar' background process is already running
pressing [Ctrl]+[C] (i.e. sending SIGINT) only terminates the foreground
process (i.e. the bash script) but neither the 'tar' background process
nor the 'dd' background process:
The background 'tar' and 'dd' processes run until they finish
and at the very end the mounted NFS share is still mounted:
I think pressing [Ctrl]+[C] during any "rear WORKFLOW"
should also terminate all background processes
and clean up e.g. umount ReaR specific things
(i.e. ensure the EXIT_TASKS can run successfully).
I think the current behaviour is at most a "minor bug".
Or is there a reason why the current behaviour
is perhaps even intentional?
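For illustration, the requested behaviour (cleanup runs no matter how the script ends) can be sketched with plain shell traps. This is only an assumption about one possible shape, not how ReaR implements its EXIT_TASKS: cleanup_tasks, install_cleanup_traps and the CLEANUP_LOG variable are hypothetical names.

```shell
# Sketch only, not ReaR code. All names here are hypothetical.

# Stand-in for the real cleanup work, e.g. umount what was mounted
# and remove temporary directories.
cleanup_tasks () {
    echo "cleanup done" >> "${CLEANUP_LOG:-/dev/null}"
}

install_cleanup_traps () {
    # Run the cleanup on every exit path.
    trap cleanup_tasks EXIT
    # Turn Ctrl+C (SIGINT) and SIGTERM into a regular exit so that the
    # EXIT trap fires; 130 and 143 are the conventional 128+signal codes.
    trap 'exit 130' INT
    trap 'exit 143' TERM
}
```

A script would call install_cleanup_traps once at startup. Terminating the background processes themselves is a separate step that the cleanup function would have to perform before it unmounts anything.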