From bb15f7fb190b94f4dc6796c644cf2d0339d1f1da Mon Sep 17 00:00:00 2001 From: Martijn Dekker Date: Sun, 27 Sep 2020 12:51:09 +0200 Subject: [PATCH] Fix elusive/intermittent DEBUG trap crash (rhbz#1200534) For one Red Hat customer, the following reproducer consistently crashed, tough I was not able to reproduce it and neither was RH. However, the crash analysis is sound (see below). function dlog { fc -ln -0 } trap dlog DEBUG >/tmp/blah Original patch: https://src.fedoraproject.org/rpms/ksh/blob/642af4d6/f/ksh-20140801-arraylen.patch The Red Hat bug thread is closed to the public as it also contains some correspondence with their customer. But it has an excellent crash analysis from Thomas Gardner which I'm including here for the record (the line numbers are for their ksh at the time, not 93u+m). ===begin analysis=== > The creation of an empty file instead of a command that executes > anything causes the coredump. [...] > Here is my analysis on the core that was provided by the customer: > > (gdb) bt > #0 sh_fmtq (string=0x1
) > at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/string.c:340 > #1 0x0000000000457e96 in out_string (cp=, c=32, > quoted=, iop=) > at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/xec.c:444 > #2 0x000000000045804c in sh_debug (shp=0x76d180, trap=0x7f2f13a821e0 "dlog", > name=, subscript=, > argv=0x76e070, flags=) > at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/xec.c:548 > #3 0x000000000045a867 in sh_exec (t=0x7f2f13aafad0, flags=4) > at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/xec.c:1265 > [...need go no further...] > > In frame 2, we can see it cycling through your classic > (char **)argv array like: > > 543 while(cp = *argv++) > 544 { > 545 if((flags&ARG_EXP) && argv[1]==0) > 546 out_pattern(iop, cp,' '); > 547 else > 548 out_string(iop, cp,' ',n?0: (flags&(ARG_RAW|ARG_NOGLOB))||*argv); > 549 } > 550 if(flags&ARG_ASSIGN) > 551 sfputc(iop,')'); > 552 else if(iop==stkstd) > > (we seg-fault after going down the out_string function in line > 548 up there). The string pointer that points to = 0x1 up in > frame #0 (sh_fmtq) traces back to the "cp" variable in line 548 > up there. The "argv" variable being referenced up there just gets > passed in as the fifth argument to this function. > > In frame #3 (sh_exec, line 1265), the line that makes the call > that takes us to frame 2 is: > > 1265 int n = sh_debug(shp,trap,(char*)0,(char*)0, com, ARG_R AW); > > so "com" (the fifth argument) is what's going wrong as it > descends down through these calls. Looking at where it comes > from, well, it's assigned here: > > 1241 if(argn==0) > 1242 { > 1243 /* fake 'true' built-in */ > 1244 np = SYSTRUE; > 1245 *argv = nv_name(np); > 1246 com = argv; > 1247 } > > because as we can see: > > (gdb) f 3 > #3 0x000000000045a867 in sh_exec (t=0x7f2f13aafad0, flags=4) > at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/xec.c:1265 > 1265 int n = sh_debug(shp,trap,(char*)0,(char*)0, com, ARG_RAW); > (gdb) p argn > $2 = 0 > (gdb) > > argn is == 0 here. The tip-off here is that nv_name clearly > returns a simple pointer to an array of characters, not an array > of pointers to arrays of characters as is evidenced by the fact > that the assignment is "*argv = nv_name(np);" not "argv = > nv_name(np);". Looking at the function nv_name proves that it > does indeed return a single pointer to an array of characters, > not a pointer to an array of pointers to arrays of characters. > Now, com is defined as a 'char **': > > 1002 char *cp=0, **com=0, *comn; > > (as it is expected to be in the calls that follow) also, that > argv is also defined as the effective equivalent a 'char **': > > 1237 static char *argv[1]; > > Yup, argv is actually an array of pointers (char ** equivalent), > but that array is restricted to having exactly one element. > Recalling the assignment in the previously quoted line: > > 1245 *argv = nv_name(np); > > we see that the one and only element in that argv array is > getting assigned a pointer to an array of characters here. > Nothing necessarily wrong with that, but remember the loop we > looked at earlier in frame #2 (sh_debug). It went like: > > 543 while(cp = *argv++) > 544 { > 545 if((flags&ARG_EXP) && argv[1]==0) > 546 out_pattern(iop, cp,' '); > 547 else > 548 out_string(iop, cp,' ',n?0: (flags&(ARG_RAW|ARG_NOGLOB))||*argv); > 549 } > > which is clearly expecting argv in this context (com in frame 3, > which really points to that static local single element array > that is also pointed to by argv in frame 2) to be an array of > pointers of indefinite size, each element being a pointer, but > whose last element will be a null pointer. Well, in frame 3 it is > clearly an array with only a single element, and that one element > is NOT pointing to null. Watch this: > > (gdb) f 3 > #3 0x000000000045a867 in sh_exec (t=0x7f2f13aafad0, flags=4) > at /usr/src/debug/ksh-20120801/src/cmd/ksh93/sh/xec.c:1265 > 1265 int n = sh_debug(shp,trap,(char*)0,(char*)0, com, ARG_RAW); > (gdb) p com > $8 = (char **) 0x76e060 > (gdb) p &argv > $9 = (char *(*)[1]) 0x76e060 > (gdb) p com[0] > $11 = 0x5009c6 "true" > (gdb) p com[1] > $10 = 0x1
> (gdb) p argv[0] > $12 = 0x5009c6 "true" > (gdb) p argv[1] > $13 = 0x1
> (gdb) > > So, as expected, com and &argv point to the same place, the first > element points to the constant string "true", but since the array > is defined as having only one element, when you refer to a second > element in that array, you get well, whatever random crap happens > to be in that memory location. When we try to reproduce this > problem, apparently we're getting 0 there (or we're not quite > following this same code path, which is also possible), but the > customer happens to have a "1" in that memory location. ===end analysis=== src/cmd/ksh93/sh/xec.c: sh_exec(): - When processing TCOM (simple command) with an empty/null command, increase the size of the static dummy argv[1] array to argv[2], ensuring a terminating NULL element so that 'while(cp = *argv++)' loops don't crash. (Note that static objects are automatically initialised to zero in C.) src/cmd/ksh93/tests/io.sh: - Adapt the reproducer, testing a null-command redirection 1000x. --- src/cmd/ksh93/sh/xec.c | 2 +- src/cmd/ksh93/tests/io.sh | 16 ++++++++++++++++ 2 files changed, 17 insertions(+), 1 deletion(-) diff --git a/src/cmd/ksh93/sh/xec.c b/src/cmd/ksh93/sh/xec.c index 55ebaf39f471..308cfc97e7bb 100644 --- a/src/cmd/ksh93/sh/xec.c +++ b/src/cmd/ksh93/sh/xec.c @@ -1154,7 +1154,7 @@ int sh_exec(register const Shnode_t *t, int flags) if((io||argn)) { Shbltin_t *bp=0; - static char *argv[1]; + static char *argv[2]; int tflags = 1; if(np && nv_isattr(np,BLT_DCL)) tflags |= 2; diff --git a/src/cmd/ksh93/tests/io.sh b/src/cmd/ksh93/tests/io.sh index 175de0ed3ce2..037ace820f0a 100755 --- a/src/cmd/ksh93/tests/io.sh +++ b/src/cmd/ksh93/tests/io.sh @@ -633,5 +633,21 @@ err=$( ) || err_exit 'Process substitution leaks file descriptors when used as argument to function' \ "(got $(printf %q "$err"))" +# ====== +# A redirection with a null command could crash under certain circumstances (rhbz#1200534) +"$SHELL" -i >/dev/null -c ' + function dlog + { + fc -ln -0 + } + trap dlog DEBUG + for((i=0;i<1000;i++)) + do >"$1/rhbz1200534" + done + exit +' empty_redir_crash_test "$tmp" +((!(e = $?))) || err_exit 'crash on null-command redirection with DEBUG trap' \ + "(got status $e$( ((e>128)) && print -n / && kill -l "$e"), $(printf %q "$got"))" + # ====== exit $((Errors<125?Errors:125))