Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

configure fails when run with a posix shell supporting $LINENO (e.g. dash 0.5.9 and many others) #23448

Closed
tornaria opened this issue Jul 18, 2017 · 43 comments

Comments

@tornaria
Copy link
Contributor

When using an unpatched recent version of dash to run configure, this happens:

$ wget http://gondor.apana.org.au/~herbert/dash/files/dash-0.5.9.1.tar.gz
[...]
$ tar xf dash-0.5.9.1.tar.gz
$ cd dash-0.5.9.1
$ ./configure && make
[...]
$ cd ..
$ wget http://files.sagemath.org/devel/sage-8.0.rc2.tar.gz
$ tar xf sage-8.0.rc2.tar.gz
$ cd sage-8.0.rc2
$ ../dash-0.5.9.1/src/dash ./configure
$ ../dash-0.5.9.1/src/dash ./configure
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether to enable maintainer-specific portions of Makefiles... yes
configure: WARNING: you should use --build, --host, --target
configure: WARNING: you should use --build, --host, --target
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking whether make supports nested variables... yes
checking whether to enable maintainer-specific portions of Makefiles... yes
checking for root user... no
checking build system type... Invalid configuration `x': machine `x' not recognized
configure: error: bash config/config.sub x failed

My /bin/sh happens to be this shell, so just running make is broken in my system.

After much suffering, I found out the code at fault in configure.ac:

#---------------------------------------------------------
# We need to run this configure script with bash
if test -z "$BASH_VERSION$CONFIG_SHELL"
then
    CONFIG_SHELL=bash
    export CONFIG_SHELL
    if $CONFIG_SHELL -c "exit 0"
    then
        exec $CONFIG_SHELL $0 "$@"
[...]

introduced by e03f296.

I guess the idea is to re-run configure ($0) with the same arguments ($@) but using bash. However, by the time we reach the line exec $CONFIG_SHELL $0 "$@", the positional parameters have been obliterated by previous tests. Indeed, in the example above it happens to run bash ./configure x make which produces the error.

When using debian's dash, it turns out that before reaching this code configure decides to rerun itself and for that reason it sets CONFIG_SHELL=/bin/sh (to avoid a recursion).

Hence, the test above is wrong (IMO it should only test $BASH_VERSION).

CC: @jdemeyer

Component: build: configure

Keywords: sd87

Author: Gonzalo Tornaría

Branch/Commit: 89212b4

Reviewer: Dima Pasechnik

Issue created by migration from https://trac.sagemath.org/ticket/23448

@tornaria tornaria added this to the sage-8.0 milestone Jul 18, 2017
@tornaria
Copy link
Contributor Author

comment:1

After much suffering, I found out the code at fault in configure.ac:

#---------------------------------------------------------
# We need to run this configure script with bash
if test -z "$BASH_VERSION$CONFIG_SHELL"
then
    CONFIG_SHELL=bash
    export CONFIG_SHELL
    if $CONFIG_SHELL -c "exit 0"
    then
        exec $CONFIG_SHELL $0 "$@"
[...]

introduced by e03f296.

I guess the idea is to re-run configure ($0) with the same arguments ($@) but using bash. However, by the time we reach the line exec $CONFIG_SHELL $0 "$@", the positional parameters have been obliterated by previous tests. Indeed, in the example above it happens to run bash ./configure x make which produces the error.

When using debian's dash, it turns out that before reaching this code configure decides to rerun itself and for that reason it sets CONFIG_SHELL=/bin/sh (to avoid a recursion).

Hence, the test above is wrong (IMO it should only test $BASH_VERSION).

@tornaria
Copy link
Contributor Author

New commits:

60ac900(#23448) fix configure when run with dash

@tornaria
Copy link
Contributor Author

Commit: 60ac900

@tornaria
Copy link
Contributor Author

Branch: u/tornaria/23448

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jul 18, 2017

Changed commit from 60ac900 to e77704b

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jul 18, 2017

Branch pushed to git repo; I updated commit sha1. New commits:

e77704b(#23448) fix configure when run with old dash

@tornaria
Copy link
Contributor Author

comment:4

The first patch makes configure work with new versions of dash, but it breaks with old versions of dash (unpatched 0.5.6, and also version 0.5.7-4+b1 from debian jessie.

This happens because test -z "$BASH_VERSION$CONFIG_SHELL" is incorrect, it should instead be test -z "$BASH_VERSION", which is what the second patch does.

@jdemeyer
Copy link

comment:7

I'm a bit lost here. I think I understood the problem, and I understand what your patch does. However, I don't see how that actually fixes the problem.

@jdemeyer

This comment has been minimized.

@jdemeyer
Copy link

comment:8

General side comment: when fixing a bug, three things have to be asked:

(1) what is the problem?

(2) what does the patch do?

(3) how does (2) actually fix (1)?

In this case, I am missing (3).

@tornaria
Copy link
Contributor Author

comment:9

Thanks, I tried to explain it above, but I will be more detailed:

(1) by the time configure reaches the line exec $CONFIG_SHELL $0 "$@", the positional parameters have been obliterated by previous tests, so what is actually run is exec bash ./configure x make which is bad (configure interprets x as the target machine which fails).

(2) the patch ​60ac900 moves the block of code to before calling other tests.

(3) this fixes (1) for recent dash because now exec $CONFIG_SHELL $0 "$@" uses the correct positional parameters.

@tornaria
Copy link
Contributor Author

comment:10

It remains to explain e77704b.

I'm actually confused as to the intent of doing test -z "$BASH_VERSION$CONFIG_SHELL", maybe this is so that if the user sets CONFIG_SHELL then we leave it unchanged.

But the problem is that for some older versions of dash the variable CONFIG_SHELL is written to if not set, so the test fails (that seems to be the status quo anyway, so nothing that ​60ac900 changes.

Maybe it's better to change ​60ac900 so that the code in question is run even earlier, before CONFIG_SHELL gets a chance to be written to.

@tornaria
Copy link
Contributor Author

comment:11

I understand better now.

The first thing configure does is to check whether the current shell is "good enough" to run configure. If not, it will try sh, bash, ksh and sh5 in /bin or /usr/bin.

When we run configure on debian, /bin/sh is rejected, and so CONFIG_SHELL ends up being set to /bin/bash and the whole configure script is run under bash, so everything is fine.

When I run configure on my system, /bin/sh is ok, so configure keeps going with CONFIG_SHELL unset until it reaches the test -z "$BASH_VERSION$CONFIG_SHELL" which is true. As I explained above, this ends up running exec bash $0 "$@" which is wrong unless it is run very early. This is what ​60ac900 fixes.

As to e77704b, it is necessary when the following conditions are met:
a. I run configure under an old version of dash (or any "//not// good enough" shell)
b. /bin/sh is a new version of dash (or any "good enough shell" which is not bash)
For example, I could be doing

$ ../dash-0.5.6/src/dash ./configure

What happens in this case is that configure decides that the current shell (old version of dash) is not good, so it looks for a better one. It decides /bin/sh is good enough so it sets CONFIG_SHELL=/bin/sh and re-execs itself under this shell. Now the test -z "$BASH_VERSION$CONFIG_SHELL" is false, and so configure is not reexecuted with bash and things break later on [1].

The only way to fix this I could find is to change the test to test -z "$BASH_VERSION", which is what e77704b is doing.

The suggestion of moving the test earlier is not possible: if placed before the line

AC_INIT([Sage], SAGE_VERSION, [sage-devel@googlegroups.com])

it is too early and it won't be included in configure.

But after this line it is already too late (CONFIG_SHELL is already set in the cases where it is a problem).

[1] to be explicit, the first problem is the substitution "${!need_to_install}" which is not supported by dash.

@tornaria
Copy link
Contributor Author

Author: tornaria

@jdemeyer
Copy link

comment:13

Author name

@jdemeyer
Copy link

comment:14

Wouldn't it make sense to move the block

if test -z "$BASH_VERSION"

even higher in the file, as early as possible?

@tornaria
Copy link
Contributor Author

Changed branch from u/tornaria/23448 to u/tornaria/23448-v2

@tornaria
Copy link
Contributor Author

comment:15

I've changed the patch so that this is done immediately after AC_INIT, as per this paragraph in autoconf documentation:
If your configure script does its own option processing, it should inspect ‘$@’ or ‘$’ immediately after calling AC_INIT, because other Autoconf macros liberally use the set command to process strings, and this has the side effect of updating ‘$@’ and ‘$’.
https://www.gnu.org/savannah-checkouts/gnu/autoconf/manual/autoconf-2.69/html_node/Initializing-configure.html

I put the new patch as a single commit in a new branch, I'm not sure if that's proper etiquette, please let me know otherwise.


New commits:

16d2c3aTrac #23448: fix configure when run with dash

@tornaria
Copy link
Contributor Author

Changed author from tornaria to Gonzalo Tornaría

@tornaria
Copy link
Contributor Author

Changed branch from u/tornaria/23448-v2 to u/tornaria/23448-v3

@jdemeyer
Copy link

Replying to @tornaria:

My /bin/sh happens to be this shell, so just running make is broken in my system.

Simple proposal: change the Makefile to run bash configure instead of ./configure and error out in configure if the shell is not bash.

I think that is a much simpler solution which would also solve your problem.

Comments?

@tornaria
Copy link
Contributor Author

comment:21

Replying to @jdemeyer:

Simple proposal: change the Makefile to run bash configure instead of ./configure and error out in configure if the shell is not bash.

I think that is a much simpler solution which would also solve your problem.

I wouldn't say the current proposal is not simple, in fact my current patch has this in configure.ac:

# The following needs to be immediately after calling AC_INIT
#------------------------------------------------------------
# We need to run this configure script with bash
if test -z "$BASH_VERSION"
then
    CONFIG_SHELL=`command -v bash`
    export CONFIG_SHELL
    if $CONFIG_SHELL -c "exit 0"
    then
        exec $CONFIG_SHELL $0 "$@"
    else
        AC_MSG_NOTICE([The 'bash' shell is needed to build AC_PACKAGE_NAME])
        AC_MSG_NOTICE([All modern systems will have the 'bash' shell installed somewhere])
        if test -d /opt/OpenSource/bin
        then
           AC_MSG_NOTICE([On HP-UX you may try adding /opt/OpenSource/bin to your path])
        fi  
        if test -d /opt/pware/bin
        then
           AC_MSG_NOTICE([On AIX you may try adding /opt/pware/bin to your path])
        fi  
        AC_MSG_ERROR(['bash' not found])
    fi  
fi

while for your proposal it would be:

#------------------------------------------------------------
# We need to run this configure script with bash
if test -z "$BASH_VERSION"
then
    AC_MSG_NOTICE([The 'bash' shell is needed to build AC_PACKAGE_NAME])
    AC_MSG_NOTICE([All modern systems will have the 'bash' shell installed somewhere])
    if test -d /opt/OpenSource/bin
    then
       AC_MSG_NOTICE([On HP-UX you may try adding /opt/OpenSource/bin to your path])
    fi  
    if test -d /opt/pware/bin
    then
       AC_MSG_NOTICE([On AIX you may try adding /opt/pware/bin to your path])
    fi  
    AC_MSG_ERROR(['bash' not found])
fi

So it's essentially the same except the first one will try to find bash and reexec instead, we avoid tracking all the paths that lead to running configure.

Note that I didn't write the code above, it was already written and the only problem was that it was moved to run too late. With the warning I added (needs to be immediately after calling AC_INIT) it should be safe.

@tornaria
Copy link
Contributor Author

comment:22

Replying to @jdemeyer:

I think that is a much simpler solution which would also solve your problem.

Another comment: it just happens that I'm running a new version of dash, while most others (e.g. debian, ubuntu, etc) are running an older version.

This will be a problem for everybody once the main distros pick up a new version of dash, so fixing this now is investing for the future :-)

(same applies to #23451)

@jdemeyer
Copy link

comment:23

Replying to @tornaria:

So it's essentially the same except the first one will try to find bash and reexec instead

"apples and oranges are essentially the same except they are different" :-)

As this ticket shows, it's precisely the reexec part which is fragile, so I suggest just dropping that.

@tornaria
Copy link
Contributor Author

comment:24

Replying to @jdemeyer:

Replying to @tornaria:

So it's essentially the same except the first one will try to find bash and reexec instead

"apples and oranges are essentially the same except they are different" :-)

As this ticket shows, it's precisely the reexec part which is fragile, so I suggest just dropping that.

Not quite, the "fragile" part is using the positional parameters "$0" and "$@", but that's very clearly documented in the autoconf manual that the parameters must be used immediately after calling AC_INIT since other macros will change their value.

I would argue that if configure requires bash, then the #! line should have /bin/bash instead, but that may have other problems (maybe bash is in a different location, etc), it's a bit silly to ship a script with "#! /bin/sh" for which the correct way to run it is bash ./configure.

The standard way to instal a random program from source is to run

./configure
make

which won't work in your proposal.

@dimpase
Copy link
Member

dimpase commented Jan 6, 2018

Reviewer: Dima Pasechnik

@dimpase
Copy link
Member

dimpase commented Jan 6, 2018

comment:25

I am seeing a similar problem on FreeBSD, where /bin/sh is not bash or dash. However, I have /bin/bash, pointing to /usr/local/bin/bash, and this is my default shell.

My current workaround is to simply edit ./configure to replace every /bin/sh with /bin/bash.

The branch on this ticket does not work for me, as /bin/sh is not the right shell to propagate into the build files by ./configure: I get some silly errors while running make (which then tries to use /bin/sh instead of /bin/bash).

I think we should give up on even trying to run configure with /bin/sh, make bash a Sage pre-req, and use /bin/bash (or whatever the proper shebang invocation would be e.g. /usr/bin/env bash) consistently throughout.

@dimpase dimpase modified the milestones: sage-8.0, sage-8.2 Jan 6, 2018
@dimpase dimpase changed the title configure fails when run with dash configure fails when run with dash or BSD sh Jan 6, 2018
@dimpase
Copy link
Member

dimpase commented Jan 7, 2018

comment:27

Would it be acceptable to simply add to ./bootstrap the sed editing of the generated ./configure to replace all the occurence of /bin/sh with the output of command -v bash ?

@tornaria
Copy link
Contributor Author

tornaria commented Jan 7, 2018

comment:28

If my solution doesn't fix your bug, then it may be different bug.

Can you please document clearly what fails before/after this patches? I suggest that you include detailed information of the exact shell with version and where to get it / how to build it, as I did in my report.

AFAICT the behaviour with any posix-compatible shell would be to run configure under bash. This was broken before, and it should be fixed after.

Did you remember to run ./bootstrap after switching branches?

I just tried to run configure with the three BSD shells that I have available in my distro:

[*] loksh-6.2_1                        A Linux port of OpenBSD's ksh
[*] mksh-R56b_1                        The MirBSD Korn Shell
[*] oksh-0.5.9_2                       OpenBSD's version of ksh ported to Linux

I can tell you that those three exhibit the same behaviour as dash 0.5.9.1, namely configure fails in the same way before my patch, and works after my patch.

While I was at it, I also tried all the other posix shells I have available in my distro:

[*] heirloom-sh-050706_3                A portable variant of the traditional Unix shell
[*] ksh-2017.12.31_1                    AT&T's Korn shell (ksh93)
[*] yash-2.46_1                         Yet another shell (POSIX-compliant)
[*] zsh-5.4.2_1                         The Z SHell

with the same result (configure is broken before, works after).

Note that I am NOT claiming one can build the whole of sage with those shells, but only that configure can be run successfully. In fact, #23451 reports and fixes two issues down the line.

Also note that this works by switching to bash, so it's still a requirement, but it will pick bash from your PATH so you don't need to put it in /bin/bash (which you can't unless you are root)


The patch I proposed fixes the issue I reported in a reasonable way without breaking anything that already works. It is really simple, just changing the order of some lines to match what is required explicitly by autoconf documentation, and some little details. I explained and documented everything that's going on.

I documented above 8 different shells for which configure is broken and fixed by this.

The fact that there may be OTHER bugs that my patch won't fix is a non sequitur for rejecting it, and I even reported at least one more (#23451).

I spent a fair number of hours back in SD87 to understand and diagnose this and the other bug, fix it and document the reasons for the bug and the fix, and the situation is that 6 months after I still cannot build Sage in void linux.

More important, as I argued above, is that is only matter of time until "more mainstream" distros pick the new version of dash that will break sage building.


Let me summarise the situation.

Before the patch:
a. sage can be built if /bin/sh is bash or an old version of dash (I guess any posix shell that doesn't support $LINENO)
b. configure fails if /bin/sh is a posix shell that supports $LINENO

After the patch:
c. nothing changes for situation (a)
d. configure works if /bin/sh is a posix shell that supports $LINENO (tested with 8 different shells)
e. sage can be completely built after some minor additional fixes to a couple of packages (#23451)

Can we please:

  1. limit the scope of this ticket to fixing the configure step of sage when /bin/sh is a posix shell that supports $LINENO
  2. limit the review to checking my claim that
    a. the patch won't break building in systems where it now works
    b. it fixes the configure step under other shells (including dash 0.5.9.1)

Fixing other steps of the build is a non-goal of the current ticket.

@tornaria tornaria changed the title configure fails when run with dash or BSD sh configure fails when run with a posix shell supporting $LINENO (e.g. dash 0.5.9 and many others) Jan 7, 2018
@dimpase
Copy link
Member

dimpase commented Jan 7, 2018

Changed reviewer from Dima Pasechnik to Jeroen Demeyer, Dima Pasechnik

@dimpase
Copy link
Member

dimpase commented Jan 7, 2018

comment:29

OK, sorry for noise. It looks good.

@tornaria
Copy link
Contributor Author

tornaria commented Jan 7, 2018

comment:30

Replying to @dimpase:

OK, sorry for noise. It looks good.

Thanks, let me know if I can help you diagnosing the issue you have with FreeBSD.

@jdemeyer
Copy link

jdemeyer commented Jan 7, 2018

Changed reviewer from Jeroen Demeyer, Dima Pasechnik to Dima Pasechnik

@jdemeyer
Copy link

jdemeyer commented Jan 7, 2018

comment:31

I never said that I agreed with the current branch.

@dimpase
Copy link
Member

dimpase commented Jan 7, 2018

comment:32

Replying to @jdemeyer:

I never said that I agreed with the current branch.

I take "reviewer" in sense of publishing, not as "signed off by".

This patch is an improvement for freebsd and other non-*ash shells too.

@vbraun
Copy link
Member

vbraun commented Jan 14, 2018

Changed branch from u/tornaria/23448-v3 to 89212b4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants