Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ksh segfaults in job_chksave after receiving SIGCHLD #14

Merged
merged 1 commit into from
Jun 14, 2020

Conversation

aweeraman
Copy link

Upstreaming Debian patch:

Prior to this update, the compiler optimization dropped parts from the ksh job
locking mechanism from the binary code. As a consequence, ksh could terminate
unexpectedly with a segmentation fault after it received the SIGCHLD signal.
This update implements a fix to ensure the compiler does not drop parts of the
ksh mechanism and the crash no longer occurs.

Author: Michal Hlavinka mhlavink@redhat.com
Origin: Red Hat fix, ksh-20120801-locking.patch
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1123467
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1697501
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=867181

@McDutchie
Copy link

McDutchie commented Jun 14, 2020

Many thanks for the pull request and the explanation @aweeraman, but could you include the explanation (including all the reference URLs) in your commit message itself and do a force-push, please? See policy point 6 in the main README.md.

I know the issue number does get included when I merge (if I choose to have a merge commit), but a link to a GitHub issue that is unlikely to be around in 20 years is insufficient. We need to make sure people can git blame in the future to understand why a certain change has been made.

Upstreaming Debian patch:

Prior to this update, the compiler optimization dropped parts from the ksh job
locking mechanism from the binary code. As a consequence, ksh could terminate
unexpectedly with a segmentation fault after it received the SIGCHLD signal.
This update implements a fix to ensure the compiler does not drop parts of the
ksh mechanism and the crash no longer occurs.

Author: Michal Hlavinka mhlavink@redhat.com
Origin: Red Hat fix, ksh-20120801-locking.patch
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1123467
Bug-Ubuntu: https://bugs.launchpad.net/bugs/1697501
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=867181
@aweeraman
Copy link
Author

Makes sense. Just did a force push with the updated commit message.

@McDutchie
Copy link

Also, I don't really get it (this is probably where it shows that my primary competence is the shell language itself, and in C I pretty much pretend to know what I'm doing sometimes ;)). I mean, isn't this really working around a compiler bug, not a bug in ksh? If something like --job.in_critical gets wrongly optimised out, surely that should be reported to the compiler's upstream? What am I missing?

@McDutchie McDutchie merged commit f95d310 into ksh93:master Jun 14, 2020
@McDutchie
Copy link

So, given the documentation and the testing, it seems safe enough to take this one on faith; I've merged it with more data from the Ubuntu bug – in the merge commit message, so you don't have to jump through another hoop. I still would like to know why this isn't actually a compiler bug, though…

@aweeraman
Copy link
Author

Thanks. The new description is much better.

This would be a case of the compiler optimizing the unary pre/post-fix increment / decrement operations in a way that affects the mutex. Difficult to say whether it is a compiler bug or not without a reproducible test case...

McDutchie added a commit that referenced this pull request Jun 14, 2020
The more I think about it, the more it seems obvious that commit
07cc71b (PR #14) is quite simply a workaround for a GCC optimiser
bug, and (who knows?) possibly an old, long-fixed one, as the bug
report is years old.

The commit also caused ksh to fail to build on HP-UX B.11.11 with
GCC 4.2.3 (hosted at polarhome.com), because it doesn't have
__sync_fetch_and_add() and __sync_fetch_and_sub(). It may fail on
other systems. The GCC documentation says these are legacy:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html

HELP WANTED: what I would like best is if someone could come up
with some way of detecting this optimiser bug and then error out
with a message along the lines of "please upgrade your broken
compiler". It would probably need to be a new iffe test.

Meanwhile, let's try it this way for a while and see what happens:

src/cmd/ksh93/include/jobs.h:
- Restore original ksh version of job_lock()/job_unlock() macros.
- Use the workaround version only if the compiler has the builtins
  __sync_fetch_and_add() and __sync_fetch_and_sub().
@McDutchie
Copy link

Thanks. After thinking more about what those macros actually do concretely, I now find it very hard to imagine that it is not an optimiser bug.

@McDutchie
Copy link

McDutchie commented Jun 15, 2020

@aweeraman, FYI: b7f48e8 (please click for full commit message)

@aweeraman aweeraman deleted the debian-patches-2 branch June 22, 2020 00:06
McDutchie added a commit that referenced this pull request Jun 13, 2022
I didn't trust this back in e3d7bf1 (which disabled it for
interactive shells) and I trust it less now. In af6a32d/6b380572,
this was also disabled for virtual subshells as it caused program
flow corruption there. Now, on macOS 10.14.6, a crash occurs when
repeatedly running a command with this optimisation:

$ ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
0 1 2 3 4 5 6 7 Illegal instruction

Oddly enough it seems that I can only reproduce this crash on macOS
-- not on Linux, OpenBSD, or Solaris. It could be a macOS bug,
particularly given the odd message in the stack trace below.

I've had enough, though. Out it comes. Things now work fine, the
reproducer is fixed on macOS, and it didn't optimise much anyway.

The double-fork issue discussed in e3d7bf1 remains.
________
For future reference, here's an lldb debugger session with a stack
trace. It crashes on calling calloc() (via sh_calloc(), via
sh_newof()) in jobsave_create(). This is not an invalid pointer
problem as we're allocating new memory, so it does look like an OS
bug. The "BUG IN CLIENT OF LIBPLATFORM" message is interesting.

$ lldb -- arch/*/bin/ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
(lldb) target create "arch/darwin.i386-64/bin/ksh"
Current executable set to 'arch/darwin.i386-64/bin/ksh' (x86_64).
(lldb) settings set -- target.run-args  "-c" "for((i=0;i<100;i++));do print -n \"$i \";(sleep 1&);done"
(lldb) run
error: shell expansion failed (reason: lldb-argdumper exited with error 2). consider launching with 'process launch'.
(lldb) process launch
Process 35038 launched: '/usr/local/src/ksh93/ksh/arch/darwin.i386-64/bin/ksh' (x86_64)
0 1 2 3 4 5 6 7 8 9 Process 35038 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
libsystem_platform.dylib`_os_unfair_lock_recursive_abort:
->  0x7fff70deb1c2 <+23>: ud2

libsystem_platform.dylib`_os_unfair_lock_unowned_abort:
    0x7fff70deb1c4 <+0>:  movl   %edi, %eax
    0x7fff70deb1c6 <+2>:  leaq   0x1a8a(%rip), %rcx        ; "BUG IN CLIENT OF LIBPLATFORM: Unlock of an os_unfair_lock not owned by current thread"
    0x7fff70deb1cd <+9>:  movq   %rcx, 0x361cb16c(%rip)    ; gCRAnnotations + 8
Target 0: (ksh) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
    frame #1: 0x00007fff70de7c9a libsystem_platform.dylib`_os_unfair_lock_lock_slow + 239
    frame #2: 0x00007fff70daa3bd libsystem_malloc.dylib`tiny_malloc_should_clear + 188
    frame #3: 0x00007fff70daa20f libsystem_malloc.dylib`szone_malloc_should_clear + 66
    frame #4: 0x00007fff70dab444 libsystem_malloc.dylib`malloc_zone_calloc + 99
    frame #5: 0x00007fff70dab3c4 libsystem_malloc.dylib`calloc + 30
    frame #6: 0x000000010003fa5d ksh`sh_calloc(nmemb=1, size=16) at init.c:264:13
    frame #7: 0x000000010004f8a6 ksh`jobsave_create(pid=35055) at jobs.c:272:8
    frame #8: 0x000000010004ed42 ksh`job_reap(sig=20) at jobs.c:363:9
    frame #9: 0x000000010004ff6f ksh`job_waitsafe(sig=20) at jobs.c:511:3
    frame #10: 0x00007fff70de9b5d libsystem_platform.dylib`_sigtramp + 29
    frame #11: 0x00007fff70d39ac4 libsystem_kernel.dylib`__fork + 12
    frame #12: 0x00007fff70c57d80 libsystem_c.dylib`fork + 17
    frame #13: 0x000000010009590d ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:1883:16
    frame #14: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:2019:4
    frame #15: 0x0000000100096c4f ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2213:9
    frame #16: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2019:4
    frame #17: 0x000000010001c23f ksh`exfile(iop=0x0000000100405750, fno=-1) at main.c:603:4
    frame #18: 0x000000010001b23c ksh`sh_main(ac=3, av=0x00007ffeefbff4f0, userinit=0x0000000000000000) at main.c:365:2
    frame #19: 0x0000000100000776 ksh`main(argc=3, argv=0x00007ffeefbff4f0) at pmain.c:45:9
    frame #20: 0x00007fff70bfe3d5 libdyld.dylib`start + 1
McDutchie added a commit that referenced this pull request Jun 13, 2022
I didn't trust this back in e3d7bf1 (which disabled it for
interactive shells) and I trust it less now. In af6a32d/6b380572,
this was also disabled for virtual subshells as it caused program
flow corruption there. Now, on macOS 10.14.6, a crash occurs when
repeatedly running a command with this optimisation:

$ ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
0 1 2 3 4 5 6 7 Illegal instruction

Oddly enough it seems that I can only reproduce this crash on macOS
-- not on Linux, OpenBSD, or Solaris. It could be a macOS bug,
particularly given the odd message in the stack trace below.

I've had enough, though. Out it comes. Things now work fine, the
reproducer is fixed on macOS, and it didn't optimise much anyway.

The double-fork issue discussed in e3d7bf1 remains.
________
For future reference, here's an lldb debugger session with a stack
trace. It crashes on calling calloc() (via sh_calloc(), via
sh_newof()) in jobsave_create(). This is not an invalid pointer
problem as we're allocating new memory, so it does look like an OS
bug. The "BUG IN CLIENT OF LIBPLATFORM" message is interesting.

$ lldb -- arch/*/bin/ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
(lldb) target create "arch/darwin.i386-64/bin/ksh"
Current executable set to 'arch/darwin.i386-64/bin/ksh' (x86_64).
(lldb) settings set -- target.run-args  "-c" "for((i=0;i<100;i++));do print -n \"$i \";(sleep 1&);done"
(lldb) run
error: shell expansion failed (reason: lldb-argdumper exited with error 2). consider launching with 'process launch'.
(lldb) process launch
Process 35038 launched: '/usr/local/src/ksh93/ksh/arch/darwin.i386-64/bin/ksh' (x86_64)
0 1 2 3 4 5 6 7 8 9 Process 35038 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
libsystem_platform.dylib`_os_unfair_lock_recursive_abort:
->  0x7fff70deb1c2 <+23>: ud2

libsystem_platform.dylib`_os_unfair_lock_unowned_abort:
    0x7fff70deb1c4 <+0>:  movl   %edi, %eax
    0x7fff70deb1c6 <+2>:  leaq   0x1a8a(%rip), %rcx        ; "BUG IN CLIENT OF LIBPLATFORM: Unlock of an os_unfair_lock not owned by current thread"
    0x7fff70deb1cd <+9>:  movq   %rcx, 0x361cb16c(%rip)    ; gCRAnnotations + 8
Target 0: (ksh) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
    frame #1: 0x00007fff70de7c9a libsystem_platform.dylib`_os_unfair_lock_lock_slow + 239
    frame #2: 0x00007fff70daa3bd libsystem_malloc.dylib`tiny_malloc_should_clear + 188
    frame #3: 0x00007fff70daa20f libsystem_malloc.dylib`szone_malloc_should_clear + 66
    frame #4: 0x00007fff70dab444 libsystem_malloc.dylib`malloc_zone_calloc + 99
    frame #5: 0x00007fff70dab3c4 libsystem_malloc.dylib`calloc + 30
    frame #6: 0x000000010003fa5d ksh`sh_calloc(nmemb=1, size=16) at init.c:264:13
    frame #7: 0x000000010004f8a6 ksh`jobsave_create(pid=35055) at jobs.c:272:8
    frame #8: 0x000000010004ed42 ksh`job_reap(sig=20) at jobs.c:363:9
    frame #9: 0x000000010004ff6f ksh`job_waitsafe(sig=20) at jobs.c:511:3
    frame #10: 0x00007fff70de9b5d libsystem_platform.dylib`_sigtramp + 29
    frame #11: 0x00007fff70d39ac4 libsystem_kernel.dylib`__fork + 12
    frame #12: 0x00007fff70c57d80 libsystem_c.dylib`fork + 17
    frame #13: 0x000000010009590d ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:1883:16
    frame #14: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:2019:4
    frame #15: 0x0000000100096c4f ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2213:9
    frame #16: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2019:4
    frame #17: 0x000000010001c23f ksh`exfile(iop=0x0000000100405750, fno=-1) at main.c:603:4
    frame #18: 0x000000010001b23c ksh`sh_main(ac=3, av=0x00007ffeefbff4f0, userinit=0x0000000000000000) at main.c:365:2
    frame #19: 0x0000000100000776 ksh`main(argc=3, argv=0x00007ffeefbff4f0) at pmain.c:45:9
    frame #20: 0x00007fff70bfe3d5 libdyld.dylib`start + 1
McDutchie added a commit that referenced this pull request Jun 14, 2022
I didn't trust this back in e3d7bf1 (which disabled it for
interactive shells) and I trust it less now. In af6a32d/6b380572,
this was also disabled for virtual subshells as it caused program
flow corruption there. Now, on macOS 10.14.6, a crash occurs when
repeatedly running a command with this optimisation:

$ ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
0 1 2 3 4 5 6 7 Illegal instruction

Oddly enough it seems that I can only reproduce this crash on macOS
-- not on Linux, OpenBSD, or Solaris. It could be a macOS bug,
particularly given the odd message in the stack trace below.

I've had enough, though. Out it comes. Things now work fine, the
reproducer is fixed on macOS, and it didn't optimise much anyway.

The double-fork issue discussed in e3d7bf1 remains.
________
For future reference, here's an lldb debugger session with a stack
trace. It crashes on calling calloc() (via sh_calloc(), via
sh_newof()) in jobsave_create(). This is not an invalid pointer
problem as we're allocating new memory, so it does look like an OS
bug. The "BUG IN CLIENT OF LIBPLATFORM" message is interesting.

$ lldb -- arch/*/bin/ksh -c 'for((i=0;i<100;i++));do print -n "$i ";(sleep 1&);done'
(lldb) target create "arch/darwin.i386-64/bin/ksh"
Current executable set to 'arch/darwin.i386-64/bin/ksh' (x86_64).
(lldb) settings set -- target.run-args  "-c" "for((i=0;i<100;i++));do print -n \"$i \";(sleep 1&);done"
(lldb) run
error: shell expansion failed (reason: lldb-argdumper exited with error 2). consider launching with 'process launch'.
(lldb) process launch
Process 35038 launched: '/usr/local/src/ksh93/ksh/arch/darwin.i386-64/bin/ksh' (x86_64)
0 1 2 3 4 5 6 7 8 9 Process 35038 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
    frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
libsystem_platform.dylib`_os_unfair_lock_recursive_abort:
->  0x7fff70deb1c2 <+23>: ud2

libsystem_platform.dylib`_os_unfair_lock_unowned_abort:
    0x7fff70deb1c4 <+0>:  movl   %edi, %eax
    0x7fff70deb1c6 <+2>:  leaq   0x1a8a(%rip), %rcx        ; "BUG IN CLIENT OF LIBPLATFORM: Unlock of an os_unfair_lock not owned by current thread"
    0x7fff70deb1cd <+9>:  movq   %rcx, 0x361cb16c(%rip)    ; gCRAnnotations + 8
Target 0: (ksh) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0)
  * frame #0: 0x00007fff70deb1c2 libsystem_platform.dylib`_os_unfair_lock_recursive_abort + 23
    frame #1: 0x00007fff70de7c9a libsystem_platform.dylib`_os_unfair_lock_lock_slow + 239
    frame #2: 0x00007fff70daa3bd libsystem_malloc.dylib`tiny_malloc_should_clear + 188
    frame #3: 0x00007fff70daa20f libsystem_malloc.dylib`szone_malloc_should_clear + 66
    frame #4: 0x00007fff70dab444 libsystem_malloc.dylib`malloc_zone_calloc + 99
    frame #5: 0x00007fff70dab3c4 libsystem_malloc.dylib`calloc + 30
    frame #6: 0x000000010003fa5d ksh`sh_calloc(nmemb=1, size=16) at init.c:264:13
    frame #7: 0x000000010004f8a6 ksh`jobsave_create(pid=35055) at jobs.c:272:8
    frame #8: 0x000000010004ed42 ksh`job_reap(sig=20) at jobs.c:363:9
    frame #9: 0x000000010004ff6f ksh`job_waitsafe(sig=20) at jobs.c:511:3
    frame #10: 0x00007fff70de9b5d libsystem_platform.dylib`_sigtramp + 29
    frame #11: 0x00007fff70d39ac4 libsystem_kernel.dylib`__fork + 12
    frame #12: 0x00007fff70c57d80 libsystem_c.dylib`fork + 17
    frame #13: 0x000000010009590d ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:1883:16
    frame #14: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005d30, flags=4) at xec.c:2019:4
    frame #15: 0x0000000100096c4f ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2213:9
    frame #16: 0x0000000100096013 ksh`sh_exec(t=0x0000000101005a40, flags=5) at xec.c:2019:4
    frame #17: 0x000000010001c23f ksh`exfile(iop=0x0000000100405750, fno=-1) at main.c:603:4
    frame #18: 0x000000010001b23c ksh`sh_main(ac=3, av=0x00007ffeefbff4f0, userinit=0x0000000000000000) at main.c:365:2
    frame #19: 0x0000000100000776 ksh`main(argc=3, argv=0x00007ffeefbff4f0) at pmain.c:45:9
    frame #20: 0x00007fff70bfe3d5 libdyld.dylib`start + 1
JohnoKing added a commit to JohnoKing/ksh that referenced this pull request Aug 19, 2022
The ASan crash in basic.sh when sourcing multiple files is caused by
a bug that is similar to the crash fixed in 59a5672. This is the
trace for the regression test crash (note that in order to see the
trace, the 2>/dev/null redirect must be disabled):

==1899388==ERROR: AddressSanitizer: heap-use-after-free on address 0x6150000005b0 at pc 0x55a5e3f9432a bp 0x7ffeb91ea110 sp 0x7ffeb91ea100
WRITE of size 8 at 0x6150000005b0 thread T0
    #0 0x55a5e3f94329 in funct /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:967
    ksh93#1 0x55a5e3f96f77 in item /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:1349
    ksh93#2 0x55a5e3f90c9f in term /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:642
    ksh93#3 0x55a5e3f90ac1 in list /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:613
    ksh93#4 0x55a5e3f90845 in sh_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:561
    ksh93#5 0x55a5e3f909e0 in sh_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:586
    ksh93#6 0x55a5e3f8fd5e in sh_parse /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:438
    ksh93#7 0x55a5e3fc43c1 in sh_eval /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:635
    ksh93#8 0x55a5e4012172 in b_dot_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/misc.c:318
    ksh93#9 0x55a5e3fca3cb in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1254
    ksh93#10 0x55a5e3fd01d4 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1932
    ksh93#11 0x55a5e3fc4544 in sh_eval /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:651
    ksh93#12 0x55a5e4012172 in b_dot_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/misc.c:318
    ksh93#13 0x55a5e3fca3cb in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1254
    ksh93#14 0x55a5e3ecc1cd in exfile /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:604
    ksh93#15 0x55a5e3ec9e7f in sh_main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:369
    ksh93#16 0x55a5e3ec801d in main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/pmain.c:41
    ksh93#17 0x7f637b4db2cf  (/usr/lib/libc.so.6+0x232cf)
    ksh93#18 0x7f637b4db389 in __libc_start_main (/usr/lib/libc.so.6+0x23389)
    ksh93#19 0x55a5e3ec7f24 in _start ../sysdeps/x86_64/start.S:115

Code in question:
https://github.com/ksh93/ksh/blob/8d57369b0cb39074437dd82924b604155e30e1e0/src/cmd/ksh93/sh/parse.c#L963-L968

To avoid any more similar crashes, all of the fixes introduced
in 7e317c5 that set slp->slptr to null have been improved with the
fix in 59a5672.
McDutchie pushed a commit that referenced this pull request Aug 19, 2022
The ASan crash in basic.sh when sourcing multiple files is caused by
a bug that is similar to the crash fixed in 59a5672. This is the
trace for the regression test crash (note that in order to see the
trace, the 2>/dev/null redirect must be disabled):

==1899388==ERROR: AddressSanitizer: heap-use-after-free on address 0x6150000005b0 at pc 0x55a5e3f9432a bp 0x7ffeb91ea110 sp 0x7ffeb91ea100
WRITE of size 8 at 0x6150000005b0 thread T0
    #0 0x55a5e3f94329 in funct /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:967
    #1 0x55a5e3f96f77 in item /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:1349
    #2 0x55a5e3f90c9f in term /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:642
    #3 0x55a5e3f90ac1 in list /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:613
    #4 0x55a5e3f90845 in sh_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:561
    #5 0x55a5e3f909e0 in sh_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:586
    #6 0x55a5e3f8fd5e in sh_parse /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:438
    #7 0x55a5e3fc43c1 in sh_eval /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:635
    #8 0x55a5e4012172 in b_dot_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/misc.c:318
    #9 0x55a5e3fca3cb in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1254
    #10 0x55a5e3fd01d4 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1932
    #11 0x55a5e3fc4544 in sh_eval /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:651
    #12 0x55a5e4012172 in b_dot_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/misc.c:318
    #13 0x55a5e3fca3cb in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1254
    #14 0x55a5e3ecc1cd in exfile /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:604
    #15 0x55a5e3ec9e7f in sh_main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:369
    #16 0x55a5e3ec801d in main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/pmain.c:41
    #17 0x7f637b4db2cf  (/usr/lib/libc.so.6+0x232cf)
    #18 0x7f637b4db389 in __libc_start_main (/usr/lib/libc.so.6+0x23389)
    #19 0x55a5e3ec7f24 in _start ../sysdeps/x86_64/start.S:115

Code in question:
https://github.com/ksh93/ksh/blob/8d57369b0cb39074437dd82924b604155e30e1e0/src/cmd/ksh93/sh/parse.c#L963-L968

To avoid any more similar crashes, all of the fixes introduced
in 7e317c5 that set slp->slptr to null have been improved with the
fix in 59a5672.
McDutchie pushed a commit that referenced this pull request Aug 19, 2022
The ASan crash in basic.sh when sourcing multiple files is caused by
a bug that is similar to the crash fixed in f24040e. This is the
trace for the regression test crash (note that in order to see the
trace, the 2>/dev/null redirect must be disabled):

==1899388==ERROR: AddressSanitizer: heap-use-after-free on address 0x6150000005b0 at pc 0x55a5e3f9432a bp 0x7ffeb91ea110 sp 0x7ffeb91ea100
WRITE of size 8 at 0x6150000005b0 thread T0
    #0 0x55a5e3f94329 in funct /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:967
    #1 0x55a5e3f96f77 in item /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:1349
    #2 0x55a5e3f90c9f in term /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:642
    #3 0x55a5e3f90ac1 in list /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:613
    #4 0x55a5e3f90845 in sh_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:561
    #5 0x55a5e3f909e0 in sh_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:586
    #6 0x55a5e3f8fd5e in sh_parse /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/parse.c:438
    #7 0x55a5e3fc43c1 in sh_eval /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:635
    #8 0x55a5e4012172 in b_dot_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/misc.c:318
    #9 0x55a5e3fca3cb in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1254
    #10 0x55a5e3fd01d4 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1932
    #11 0x55a5e3fc4544 in sh_eval /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:651
    #12 0x55a5e4012172 in b_dot_cmd /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/misc.c:318
    #13 0x55a5e3fca3cb in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1254
    #14 0x55a5e3ecc1cd in exfile /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:604
    #15 0x55a5e3ec9e7f in sh_main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:369
    #16 0x55a5e3ec801d in main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/pmain.c:41
    #17 0x7f637b4db2cf  (/usr/lib/libc.so.6+0x232cf)
    #18 0x7f637b4db389 in __libc_start_main (/usr/lib/libc.so.6+0x23389)
    #19 0x55a5e3ec7f24 in _start ../sysdeps/x86_64/start.S:115

Code in question:
https://github.com/ksh93/ksh/blob/8d57369b0cb39074437dd82924b604155e30e1e0/src/cmd/ksh93/sh/parse.c#L963-L968

To avoid any more similar crashes, all of the fixes introduced
in 69d37d5 that set slp->slptr to null have been improved with the
fix in f24040e.
JohnoKing added a commit to JohnoKing/ksh that referenced this pull request Sep 23, 2022
The isaname, isaletter, isadigit, isexp and ismeta macros don't check if
c is a negative value before accessing sh_lexstates. This can result in
ASan crashing because of a buffer overflow in quoting2.sh when running
in a multibyte locale:
  test quoting2(C.UTF-8) begins at 2022-09-23+14:03:12
  =================================================================
  ==262224==ERROR: AddressSanitizer: global-buffer-overflow on address 0x557b201a451f at pc 0x557b1fe5e6fc bp 0x7fffcf1ac700 sp 0x7fffcf1ac6f8
  READ of size 1 at 0x557b201a451f thread T0
      #0 0x557b1fe5e6fb in sh_fmtq /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/string.c:341:5
      ksh93#1 0x557b1fe6098c in sh_fmtqf /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/string.c:473:10
      ksh93#2 0x557b1ff08dc0 in extend /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/print.c:998:14
      ksh93#3 0x557b2008a56c in sfvprintf /home/johno/GitRepos/KornShell/ksh/src/lib/libast/sfio/sfvprintf.c:531:8
      ksh93#4 0x557b2005b7f7 in sfprintf /home/johno/GitRepos/KornShell/ksh/src/lib/libast/sfio/sfprintf.c:31:7
      ksh93#5 0x557b1ff04272 in b_print /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/print.c:343:4
      ksh93#6 0x557b1ff04ebf in b_printf /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/bltins/print.c:148:9
      ksh93#7 0x557b1fe8d9a7 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1261:21
      ksh93#8 0x557b1fe7a7cf in sh_subshell /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/subshell.c:652:4
      ksh93#9 0x557b1fdedc0d in comsubst /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/macro.c:2207:9
      ksh93#10 0x557b1fdefc79 in varsub /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/macro.c:1181:3
      ksh93#11 0x557b1fde3bef in copyto /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/macro.c:620:21
      ksh93#12 0x557b1fde0b07 in sh_mactrim /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/macro.c:169:2
      ksh93#13 0x557b1fe05ab6 in nv_setlist /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/name.c:280:9
      ksh93#14 0x557b1fe8a7e8 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1051:7
      ksh93#15 0x557b1fe95b85 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:1940:5
      ksh93#16 0x557b1fe99ea6 in sh_exec /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/xec.c:2271:10
      ksh93#17 0x557b1fd23b04 in exfile /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:604:4
      ksh93#18 0x557b1fd1fe10 in sh_main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/main.c:369:2
      ksh93#19 0x557b1fd1d585 in main /home/johno/GitRepos/KornShell/ksh/src/cmd/ksh93/sh/pmain.c:41:9
      ksh93#20 0x7f55d5b5028f  (/usr/lib/libc.so.6+0x2328f) (BuildId: 26c81e7e05ebaf40bac3523b7d76be0cd71fad82)
      ksh93#21 0x7f55d5b50349 in __libc_start_main (/usr/lib/libc.so.6+0x23349) (BuildId: 26c81e7e05ebaf40bac3523b7d76be0cd71fad82)
      ksh93#22 0x557b1fc158d4 in _start /build/glibc/src/glibc/csu/../sysdeps/x86_64/start.S:115

src/cmd/ksh93/include/lexstates.h:
- Check if c is negative before accessing sh_lexstates. Backported from
  ksh2020: att@a7013320.
  I'll note that later in ksh2020 these macros became functions:
  att@adc589de. I didn't backport that
  commit because it requires the C99 bool type to avoid compiler
  warnings.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants