Commit f0b0b67 breaks building GNU Binutils when using ksh93 as /bin/sh #507

Closed
atheik opened this issue Aug 4, 2022 · 6 comments
Labels
bug Something is not working

Comments

atheik commented Aug 4, 2022

The version of GNU Binutils being built doesn't seem to make a difference.
The build succeeds but is completely broken.

If you would like, you can reproduce this in Debian Live under x86_64 QEMU:

curl -LO https://cdimage.debian.org/cdimage/release/11.4.0-live/amd64/iso-hybrid/debian-live-11.4.0-amd64-standard.iso
qemu-system-x86_64 -cpu host -accel kvm -m 4096 -drive file=debian-live-11.4.0-amd64-standard.iso,media=cdrom

Run the following in Debian Live:

sudo -i

apt-get update
apt-get install git dejagnu expect

git clone https://github.com/ksh93/ksh.git
curl -LO https://ftp.gnu.org/gnu/binutils/binutils-2.35.2.tar.xz
tar xf binutils-2.35.2.tar.xz

# f0b0b67 is bad
cd /root/ksh
git checkout f0b0b67
rm -rf arch/*
bin/package make
install -vm 755 arch/*/bin/ksh /bin/sh
cd /root/binutils-2.35.2
mkdir build-bad
cd build-bad
../configure
make
make check-ld 2>&1 | tee make-check-ld.log | grep '# of '
# # of expected passes            296
# # of unexpected failures        1119
# # of expected failures          4
# # of unresolved testcases       77
# # of untested testcases         21
# # of unsupported tests          111

# 2124f14 is good
cd /root/ksh
git checkout 2124f14
rm -rf arch/*
bin/package make
install -vm 755 arch/*/bin/ksh /bin/sh
cd /root/binutils-2.35.2
mkdir build-good
cd build-good
../configure
make
make check-ld 2>&1 | tee make-check-ld.log | grep '# of '
# # of expected passes            2586
# # of unexpected failures        5
# # of expected failures          57
# # of untested testcases         1
# # of unsupported tests          23

Sorry, I know this makes for a pretty poor reproducer.
I'll try to look into this further.

I ran into this issue during a build of a Linux system that uses ksh93 as /bin/sh. #467 and #470 were found the same way a while back. I hadn't updated the system recently, so I unfortunately missed the week-long testing of 93u+m/1.0.0-rc.1.

@McDutchie

Investigating.

McDutchie added the bug label ("Something is not working") Aug 4, 2022
@McDutchie

I compared the good and bad binutils build directories with diff -ur. (It helps to build in a build directory first and rename it afterwards, to avoid many irrelevant differences.) In the bad build directory, after eliminating the irrelevant differences (names of temporary files), what remains is that large swaths of code are missing from files in ldscripts, whatever those are. I have no idea yet how that happens.

The following files are missing large blocks of code:
+++ build.bad/ld/ldscripts/elf32_x86_64.x	2022-08-05 00:40:48.128129707 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xbn	2022-08-05 00:40:48.194129703 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xc	2022-08-05 00:40:48.234129700 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xce	2022-08-05 00:40:48.270129698 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xd	2022-08-05 00:40:48.572129679 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xdc	2022-08-05 00:40:48.634129675 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xdce	2022-08-05 00:40:48.673129672 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xde	2022-08-05 00:40:48.593129677 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xdw	2022-08-05 00:40:48.714129670 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xdwe	2022-08-05 00:40:48.751129667 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xe	2022-08-05 00:40:48.149129706 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xn	2022-08-05 00:40:48.169129705 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xr	2022-08-05 00:40:48.079129710 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xs	2022-08-05 00:40:48.374129691 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xsc	2022-08-05 00:40:48.435129687 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xsce	2022-08-05 00:40:48.473129685 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xse	2022-08-05 00:40:48.397129690 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xsw	2022-08-05 00:40:48.512129683 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xswe	2022-08-05 00:40:48.549129680 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xu	2022-08-05 00:40:48.107129708 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xw	2022-08-05 00:40:48.309129696 +0200
+++ build.bad/ld/ldscripts/elf32_x86_64.xwe	2022-08-05 00:40:48.346129693 +0200
+++ build.bad/ld/ldscripts/elf_i386.x	2022-08-05 00:40:49.406129625 +0200
+++ build.bad/ld/ldscripts/elf_i386.xbn	2022-08-05 00:40:49.482129620 +0200
+++ build.bad/ld/ldscripts/elf_i386.xc	2022-08-05 00:40:49.525129618 +0200
+++ build.bad/ld/ldscripts/elf_i386.xce	2022-08-05 00:40:49.567129615 +0200
+++ build.bad/ld/ldscripts/elf_i386.xd	2022-08-05 00:40:49.890129594 +0200
+++ build.bad/ld/ldscripts/elf_i386.xdc	2022-08-05 00:40:49.953129590 +0200
+++ build.bad/ld/ldscripts/elf_i386.xdce	2022-08-05 00:40:49.991129588 +0200
+++ build.bad/ld/ldscripts/elf_i386.xde	2022-08-05 00:40:49.914129593 +0200
+++ build.bad/ld/ldscripts/elf_i386.xdw	2022-08-05 00:40:50.032129585 +0200
+++ build.bad/ld/ldscripts/elf_i386.xdwe	2022-08-05 00:40:50.070129583 +0200
+++ build.bad/ld/ldscripts/elf_i386.xe	2022-08-05 00:40:49.429129624 +0200
+++ build.bad/ld/ldscripts/elf_i386.xn	2022-08-05 00:40:49.455129622 +0200
+++ build.bad/ld/ldscripts/elf_i386.xr	2022-08-05 00:40:49.347129629 +0200
+++ build.bad/ld/ldscripts/elf_i386.xs	2022-08-05 00:40:49.678129608 +0200
+++ build.bad/ld/ldscripts/elf_i386.xsc	2022-08-05 00:40:49.741129604 +0200
+++ build.bad/ld/ldscripts/elf_i386.xsce	2022-08-05 00:40:49.779129601 +0200
+++ build.bad/ld/ldscripts/elf_i386.xse	2022-08-05 00:40:49.702129606 +0200
+++ build.bad/ld/ldscripts/elf_i386.xsw	2022-08-05 00:40:49.818129599 +0200
+++ build.bad/ld/ldscripts/elf_i386.xswe	2022-08-05 00:40:49.858129596 +0200
+++ build.bad/ld/ldscripts/elf_i386.xu	2022-08-05 00:40:49.377129627 +0200
+++ build.bad/ld/ldscripts/elf_i386.xw	2022-08-05 00:40:49.606129612 +0200
+++ build.bad/ld/ldscripts/elf_i386.xwe	2022-08-05 00:40:49.648129610 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.x	2022-08-05 00:40:50.727129541 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xbn	2022-08-05 00:40:50.795129536 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xc	2022-08-05 00:40:50.831129534 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xce	2022-08-05 00:40:50.871129531 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xd	2022-08-05 00:40:51.194129511 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xdc	2022-08-05 00:40:51.257129507 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xdce	2022-08-05 00:40:51.300129504 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xde	2022-08-05 00:40:51.216129509 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xdw	2022-08-05 00:40:51.343129501 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xdwe	2022-08-05 00:40:51.391129498 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xe	2022-08-05 00:40:50.750129539 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xn	2022-08-05 00:40:50.773129538 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xr	2022-08-05 00:40:50.672129544 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xs	2022-08-05 00:40:50.987129524 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xsc	2022-08-05 00:40:51.051129520 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xsce	2022-08-05 00:40:51.089129517 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xse	2022-08-05 00:40:51.015129522 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xsw	2022-08-05 00:40:51.131129515 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xswe	2022-08-05 00:40:51.169129512 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xu	2022-08-05 00:40:50.698129542 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xw	2022-08-05 00:40:50.914129529 +0200
+++ build.bad/ld/ldscripts/elf_iamcu.xwe	2022-08-05 00:40:50.962129526 +0200
+++ build.bad/ld/ldscripts/elf_k1om.x	2022-08-05 00:40:53.311129375 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xbn	2022-08-05 00:40:53.375129371 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xc	2022-08-05 00:40:53.417129368 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xce	2022-08-05 00:40:53.454129366 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xd	2022-08-05 00:40:53.755129347 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xdc	2022-08-05 00:40:53.819129342 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xdce	2022-08-05 00:40:53.861129340 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xde	2022-08-05 00:40:53.776129345 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xdw	2022-08-05 00:40:53.973129333 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xdwe	2022-08-05 00:40:54.013129330 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xe	2022-08-05 00:40:53.332129374 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xn	2022-08-05 00:40:53.352129372 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xr	2022-08-05 00:40:53.261129378 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xs	2022-08-05 00:40:53.548129360 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xsc	2022-08-05 00:40:53.607129356 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xsce	2022-08-05 00:40:53.650129353 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xse	2022-08-05 00:40:53.568129359 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xsw	2022-08-05 00:40:53.686129351 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xswe	2022-08-05 00:40:53.727129348 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xu	2022-08-05 00:40:53.288129376 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xw	2022-08-05 00:40:53.489129364 +0200
+++ build.bad/ld/ldscripts/elf_k1om.xwe	2022-08-05 00:40:53.525129361 +0200
+++ build.bad/ld/ldscripts/elf_l1om.x	2022-08-05 00:40:52.060129455 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xbn	2022-08-05 00:40:52.127129451 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xc	2022-08-05 00:40:52.160129449 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xce	2022-08-05 00:40:52.203129446 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xd	2022-08-05 00:40:52.494129427 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xdc	2022-08-05 00:40:52.548129424 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xdce	2022-08-05 00:40:52.593129421 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xde	2022-08-05 00:40:52.514129426 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xdw	2022-08-05 00:40:52.632129418 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xdwe	2022-08-05 00:40:52.669129416 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xe	2022-08-05 00:40:52.084129454 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xn	2022-08-05 00:40:52.106129452 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xr	2022-08-05 00:40:52.014129458 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xs	2022-08-05 00:40:52.302129440 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xsc	2022-08-05 00:40:52.356129436 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xsce	2022-08-05 00:40:52.397129434 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xse	2022-08-05 00:40:52.323129438 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xsw	2022-08-05 00:40:52.434129431 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xswe	2022-08-05 00:40:52.470129429 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xu	2022-08-05 00:40:52.039129456 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xw	2022-08-05 00:40:52.243129443 +0200
+++ build.bad/ld/ldscripts/elf_l1om.xwe	2022-08-05 00:40:52.280129441 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.x	2022-08-05 00:40:46.829129790 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xbn	2022-08-05 00:40:46.895129786 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xc	2022-08-05 00:40:46.931129784 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xce	2022-08-05 00:40:46.978129781 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xd	2022-08-05 00:40:47.286129761 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xdc	2022-08-05 00:40:47.348129757 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xdce	2022-08-05 00:40:47.390129754 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xde	2022-08-05 00:40:47.310129760 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xdw	2022-08-05 00:40:47.427129752 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xdwe	2022-08-05 00:40:47.467129749 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xe	2022-08-05 00:40:46.846129789 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xn	2022-08-05 00:40:46.873129788 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xr	2022-08-05 00:40:46.782129793 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xs	2022-08-05 00:40:47.081129774 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xsc	2022-08-05 00:40:47.142129770 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xsce	2022-08-05 00:40:47.188129767 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xse	2022-08-05 00:40:47.106129773 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xsw	2022-08-05 00:40:47.225129765 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xswe	2022-08-05 00:40:47.262129763 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xu	2022-08-05 00:40:46.808129792 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xw	2022-08-05 00:40:47.017129778 +0200
+++ build.bad/ld/ldscripts/elf_x86_64.xwe	2022-08-05 00:40:47.058129776 +0200
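
A listing like the one above can be produced by reducing the diff -ur output to its `+++` header lines. The following is a minimal sketch on two throwaway directory trees. The file names demo.x and same.x and their contents are made up for illustration and are not taken from binutils.

```shell
# Build two tiny stand-in trees (hypothetical content, not real ldscripts).
work=$(mktemp -d)
mkdir -p "$work/build.good/ld/ldscripts" "$work/build.bad/ld/ldscripts"
printf 'SECTIONS\n{\n  .text : { *(.text) }\n}\n' > "$work/build.good/ld/ldscripts/demo.x"
# The "bad" copy is truncated, mimicking a generator script that stopped early.
printf 'SECTIONS\n{\n' > "$work/build.bad/ld/ldscripts/demo.x"
# A file that is identical in both trees should not show up in the listing.
printf 'unchanged\n' > "$work/build.good/ld/ldscripts/same.x"
printf 'unchanged\n' > "$work/build.bad/ld/ldscripts/same.x"

# Recursive unified diff, reduced to the header line of each differing file.
changed=$(cd "$work" && diff -ur build.good build.bad | grep '^+++ ')
printf '%s\n' "$changed"

rm -rf "$work"
```

Running diff from the parent directory keeps the paths relative, which is what makes the build.bad/... prefixes in the listing short and comparable.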

@McDutchie

This is a daunting problem.

I can see three possibilities. The optimisation introduced by f0b0b67 could:

  1. be invalid;
  2. be exposing some other bug in ksh;
  3. be exposing some bug in the GNU scripts.

I think 1 is unlikely. I think the optimisation is theoretically and conceptually sound. Of course I could be missing something. But the problem could also easily be number 2 or 3. In case 2, reverting it would just paper over the real ksh bug, and that's not what we're about here. In case 3, we are not the ones who should be changing.

So just blindly reverting the optimisation does not seem like a good idea. I really, really need a minimal reproducer to decide what to do about this issue. Without one we have no hope of knowing what the bug actually is. But it's very hard to derive one, as the GNU build scripts are highly complicated.

McDutchie commented Aug 5, 2022

The bug is triggered in binutils' ld/genscripts.sh and scripts sourced by it.

This patch to binutils allows tracing the execution path of these scripts in the `make` output:
diff -ur binutils-2.35.2.orig/ld/genscripts.sh binutils-2.35.2/ld/genscripts.sh
--- binutils-2.35.2.orig/ld/genscripts.sh	2021-01-30 09:38:04.000000000 +0100
+++ binutils-2.35.2/ld/genscripts.sh	2022-08-05 07:02:54.940660426 +0200
@@ -1,4 +1,10 @@
 #!/bin/sh
+case $KSH_VERSION in
+Version*)
+	PS4='+ [${.sh.subshell:+S${.sh.subshell},}${.sh.file:+${.sh.file#${.sh.file%/*/*}/},}${.sh.fun:+${.sh.fun},}${LINENO:+L$LINENO,}e$?] '
+	set -x
+	;;
+esac
 # genscripts.sh - generate the ld-emulation-target specific files
 # Copyright (C) 2004-2020 Free Software Foundation, Inc.
 #
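
The core of the patch is ordinary xtrace (set -x) with a custom PS4 prompt. As a minimal sketch of that mechanism, the snippet below uses a constant prefix in place of the ksh-specific ${.sh.file}, ${.sh.fun} and ${.sh.subshell} variables, so it should behave the same in any POSIX-style shell. The prefix text is an arbitrary choice for the example.

```shell
# Run a child shell with xtrace enabled and a custom trace prefix.
# Trace lines go to standard error, so capture stderr and discard stdout.
trace=$(sh -c 'PS4="+ [trace] "; set -x; echo hello' 2>&1 >/dev/null)
printf '%s\n' "$trace"
```

In the real patch the prefix carries the file name, function name, line number and exit status, which is what makes two make logs comparable line by line.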

As a way of finding where it goes wrong, we can then diff the make output of a bad build with that of a good build. Here are those results from my testing.

That diff does not indicate a bug in the GNU scripts. Clearly, code that should be executed is not. Specifically, it looks like the sed command on line 490 in ld/scripttempl/elf.sc is incorrectly execve()d without forking: execution abruptly returns to the parent script and the rest of the dot script is never executed. So either the optimisation is incorrect or it exposes another bug in ksh.
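
To illustrate why a wrongly applied exec is fatal to the rest of a script, here is a small sketch that uses an explicit exec. It is not the ksh code path in question, only a demonstration of the effect: once the shell process has been replaced by the command, nothing after that command can run.

```shell
# With exec, the shell process is replaced by echo, so the second
# command is never reached.
out_exec=$(sh -c 'exec echo sed-output; echo rest-of-script')

# Without exec, the shell stays alive after the first command and
# continues with the rest of the script.
out_fork=$(sh -c 'echo sed-output; echo rest-of-script')

printf 'with exec:\n%s\n\n' "$out_exec"
printf 'without exec:\n%s\n' "$out_fork"
```

Skipping the fork is only safe when the command really is the last thing the shell will ever do. Inside a dot script there is still the remainder of that script, and of its caller, left to run.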

McDutchie commented Aug 5, 2022

I have not found a minimal reproducer yet. And it looks like the problem is actually with the optimisation I introduced. It's not another bug in ksh, it was all me. :-/

More research is needed, so I'll just have to revert it. I'll make an urgent 1.0.1 point release as this is a fatal bug.

McDutchie added a commit that referenced this issue Aug 5, 2022
As of 2022-06-18, ksh 93u+m is not capable of being used as /bin/sh
while building GNU binutils. The execution of some of its build
system's dot scripts is incorrectly aborted as an external 'sed'
command is execve(2)'d without forking. This means that incorrect
exec optimization was happening.

Unfortunately I have not been able to derive a minimal reproducer
of the problem yet because the GNU binutils build scripts are very
complex. Pending further research, the optimisation is reverted.
Even if a way to make it work is found, it will not be reintroduced
to the 1.0 branch.

Thanks to @atheik for finding the problem and identifying the
commit that introduced it.

Resolves: #507

atheik commented Aug 5, 2022

@McDutchie, thank you for the thorough analysis!
I took a look at ld/genscripts.sh. Is this a reproducer for the problem? If so, then maybe you know how to reduce it further.

cat <<'EOF1' >A
cat <<'EOF2' >B
( echo B1 ) | cat
( echo B2 ) | cat
EOF2

cat <<'EOF2' >C
( echo C1 ) | cat
( echo C2 ) | cat
EOF2

(
        . ./B
        . ./C
) | cat
EOF1

git checkout f0b0b67
rm -rf arch/$(bin/package host)
bin/package make
mv arch/$(bin/package host) build-f0b0b67
build-f0b0b67/bin/ksh ./A
# B1
# B2

git checkout 2124f14
rm -rf arch/$(bin/package host)
bin/package make
mv arch/$(bin/package host) build-2124f14
build-2124f14/bin/ksh ./A
# B1
# B2
# C1
# C2
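
For repeated checks against different builds, the same script can be wrapped in a small pass/fail helper. This is only a sketch: the function name run_reproducer is made up here, and the expected output is the four lines printed by the good build above. Because the helper changes into a temporary directory, a ksh binary under test would need to be given as an absolute path.

```shell
# Hypothetical helper: write script A from the report into a scratch
# directory, run it with the given shell, and print whatever it outputs.
run_reproducer() {
    shell=${1:-sh}
    dir=$(mktemp -d) || return 1
    cat > "$dir/A" <<'EOF1'
cat <<'EOF2' >B
( echo B1 ) | cat
( echo B2 ) | cat
EOF2

cat <<'EOF2' >C
( echo C1 ) | cat
( echo C2 ) | cat
EOF2

(
        . ./B
        . ./C
) | cat
EOF1
    (cd "$dir" && "$shell" ./A)
    status=$?
    rm -rf "$dir"
    return "$status"
}

expected='B1
B2
C1
C2'
actual=$(run_reproducer sh)
if [ "$actual" = "$expected" ]; then
    echo "ok: both dot scripts ran to completion"
else
    echo "FAIL: output was truncated or altered"
fi
```

Replacing sh in the run_reproducer call with the absolute path of a freshly built ksh would exercise that build instead of the system shell.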
