w64devkit and GNU Autotools #50

Open
rmyorston opened this issue Feb 26, 2023 · 17 comments


@rmyorston

Over at rmyorston/busybox-w32#297 we've been working on getting stuff that uses a configure script to build with w64devkit.

With a bit of hackery I've managed to get Expat to build. (Chosen as a guinea pig because it's included with w64devkit and it's quite small.)

GNU Autotools support for Windows platforms is mostly restricted to Cygwin/MSYS2. This sort of works for w64devkit (hence the hackery) but what's really needed is proper support for the w64devkit build environment.

Is there any interest in prodding the GNU Autotools people to do this?

@avih

avih commented Feb 26, 2023

I think it would be great if autotools could work using busybox-w32 shell and w64devkit tools.

GNU Autotools support for Windows platforms is mostly restricted to Cygwin/MSYS2. This sort of works for w64devkit...

What actually doesn't work? I presume the semicolon path separator is not great, and/or possibly the colon in absolute paths, and/or that absolute paths don't necessarily begin with a slash? (And something with uname?)
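To make the "absolute paths don't necessarily begin with a slash" concern concrete, here is a minimal sketch. Both function names are hypothetical and are not taken from any configure script. A test that only accepts a leading slash rejects a drive-letter path, while a pattern that also accepts a drive letter followed by a slash or backslash does not:

```shell
# Hypothetical helpers, for illustration only.

# Treats a path as absolute only if it starts with "/".
is_absolute_posix_only() {
    case $1 in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Also accepts Windows-style paths such as C:/foo or C:\foo.
is_absolute() {
    case $1 in
        /* | [A-Za-z]:[\\/]*) return 0 ;;
        *)                    return 1 ;;
    esac
}
```

With these definitions, `is_absolute_posix_only 'C:/foo'` fails while `is_absolute 'C:/foo'` succeeds, which is the kind of mismatch a script written only for POSIX paths can run into.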

It was mentioned before, but allowing some prefix as a virtual root and making all paths normal POSIX paths could help a lot with the many things which don't expect Windows paths [separators], in addition to accepting the colon as path separator.

Something like /bbdrive/c/whatever, similar to /cygdrive/...

The git-for-windows busybox-w32 downstream fork already has such changes (check the busybox.exe binary inside one of the "MiniGit-2.xxx-busybox...zip" downloads of the Git project).
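As a rough sketch of the virtual-root idea, the mapping from a drive-letter path to a `/bbdrive/...` path could look like the following. The function name `to_virtual_root` is hypothetical, and the `/bbdrive` prefix is just the example from this comment; this is not how busybox-w32 or the git-for-windows fork implements it.

```shell
# Hypothetical helper: map C:/dir or C:\dir to /bbdrive/c/dir.
# Paths without a drive letter are only normalised to forward slashes.
to_virtual_root() {
    p=$(printf '%s\n' "$1" | tr '\\' '/')
    case $p in
        [A-Za-z]:/*)
            drive=$(printf '%s' "${p%%:*}" | tr 'A-Z' 'a-z')
            printf '/bbdrive/%s%s\n' "$drive" "${p#?:}"
            ;;
        *)
            printf '%s\n' "$p"
            ;;
    esac
}
```

The result contains neither a colon nor a backslash, so it can sit in a colon-separated PATH and passes tests that expect a leading slash.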

@skeeto
Owner

skeeto commented Feb 26, 2023 via email

@rmyorston
Author

The details are in the busybox-w32 issue mentioned above. The tl;dr is that it's a stock w64devkit but you need to source this script before running configure:

export PATH_SEPARATOR=';'
export ac_executable_extensions='.exe'
export build_alias="$(uname -m)-pc-mingw64"

if [ -f configure ]
then
    echo "Converting configure..."
    sed -i 's/func_convert_file_msys_to_w32/func_convert_file_noop/' configure
fi
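For context on why the first two variables matter: generated configure scripts look up programs by splitting PATH on `$PATH_SEPARATOR` and trying each name with the suffixes listed in `$ac_executable_extensions`. The following is a simplified imitation of that loop, not code copied from Autoconf; the function name `find_prog` and its argument order are made up for illustration.

```shell
# Hypothetical, simplified imitation of a configure-style program lookup.
# $1: program name   $2: search path
# $3: path separator $4: space-separated list of extensions to try
find_prog() {
    old_ifs=$IFS
    IFS=$3
    for dir in $2; do
        IFS=$old_ifs
        for ext in '' $4; do
            if [ -f "$dir/$1$ext" ]; then
                printf '%s\n' "$dir/$1$ext"
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}
```

Called as `find_prog gcc "$PATH" ';' '.exe'`, this finds `gcc.exe` in a semicolon-separated PATH. With `:` as the separator, a Windows PATH would be split at the drive-letter colons instead of between directories.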

@skeeto
Owner

skeeto commented Feb 27, 2023 via email

@skeeto
Owner

skeeto commented Feb 27, 2023 via email

@skeeto
Owner

skeeto commented Feb 27, 2023

I whipped one up to try it out: cmd.c. With that cmd.exe on my path, and the three environment variables exported, I could straightforwardly build m4 (note: disable fortify), GMP, MPFR, and MPC. GCC started to build but failed in the middle looking for sys/wait.h (i.e. a misconfiguration to be sorted out).

@avih

avih commented Feb 27, 2023

I whipped one up to try it out: cmd.c.

// Converts libtool's Cygwin-style "cmd //c ..." to "cmd /c ..."
//   $ cc -nostartfiles -o cmd.exe cmd.c

This is not a cygwin thing, it's an MSYS2 thing.

In cygwin arguments are unmodified, and if you want to pass a Windows path to a Windows utility then you should convert the path yourself, using cygpath, similar to this:

$ /cygdrive/c/Windows/notepad.exe "$(cygpath -w /cygdrive/c/foo.txt)"

In MSYS2, however, they want to make it easier to integrate with Windows programs, so they do (at least these) three hacks:

  • Map the virtual roots /<drive-letter>/ to the Windows drive <drive-letter>:/.
  • When invoking a Windows binary (not a native MSYS2 one), they auto-convert existing MSYS2 paths to Windows paths, so e.g. /c anywhere in the arguments becomes c:/ when the Windows program sees it. However, this is not limited to existing paths: e.g. an argument of /foo, which is at the root of the MSYS2 filesystem view, becomes <MSYS2-ROOT-DRIVE-PATH>/foo even if it doesn't exist.
  • When invoking a Windows binary, if an argument begins with // then it escapes this auto-conversion, by doing a different auto-conversion and turning // into /. IIRC this other auto-conversion can be bypassed using some ENV var.

This behavior is baked into MSYS2's exec* C interfaces, and not only a bash hack. So if you compile a different shell it would still behave the same.

This can be tested with the following args.c when compiled as a windows binary args.exe:

#include <stdio.h>

int main(int argc, char **argv) {
    while (*argv)
        printf("[%s]\n", *argv++);
    return 0;
}

and

$ ./args.exe foo /c /foo //bar foo/bar
[D:\run\utils\args.exe]
[foo]
[C:/]
[T:/msys64/foo]
[/bar]
[foo/bar]

However, without a windows program (or with a native MSYS2 binary) the args are unmodified:

$ printf "[%s]\n" foo /c /foo //bar foo/bar
[foo]
[/c]
[/foo]
[//bar]
[foo/bar]
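The three rules listed above can be approximated in a few lines of shell. This is only a rough emulation for reasoning about the behavior: the real conversion happens inside the MSYS2 runtime and uses more elaborate heuristics. The function name `msys2_like_arg` and the explicit root argument are assumptions made for this sketch.

```shell
# Hypothetical emulation of the MSYS2 argument conversion described above.
# $1: argument as typed   $2: Windows path of the MSYS2 root, e.g. T:/msys64
msys2_like_arg() {
    arg=$1 root=$2
    case $arg in
        //*)
            # A leading double slash escapes conversion and loses one slash.
            printf '%s\n' "${arg#/}"
            ;;
        /[A-Za-z] | /[A-Za-z]/*)
            # /c or /c/dir is treated as a drive: C:/ or C:/dir.
            rest=${arg#/?}
            drive=${arg#/}
            drive=$(printf '%s' "${drive%%/*}" | tr 'a-z' 'A-Z')
            printf '%s:%s\n' "$drive" "${rest:-/}"
            ;;
        /*)
            # Any other absolute path is placed under the MSYS2 root.
            printf '%s%s\n' "$root" "$arg"
            ;;
        *)
            printf '%s\n' "$arg"
            ;;
    esac
}
```

Fed the arguments from the transcript above with `T:/msys64` as the root, it gives the same results as args.exe for `foo`, `/c`, `/foo`, `//bar` and `foo/bar`.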

All this is very unfortunate together with the (traditional) windows switches which use /whatever convention. That's also the main reason I switched from MSYS2 to cygwin - it's much more pure and without these hacks (even if not as elaborate when it comes to mingw packages).

I'm guessing they managed to upstream this behavior into autotools (or only their autogen?), based on the uname value, when it presumably indicates an MSYS2 MINGW environment (not sure about the "native" MSYS2 env).

@rmyorston
Author

rmyorston commented Feb 27, 2023

I'm guessing they managed to upstream this behavior into autotools (or only their autogen?), based on the uname value

Indeed. uname -s reports MINGW64_NT-10.0-19044 in a mingw64 environment and MINGW32_NT-10.0-19044 in mingw32. Cygwin says CYGWIN-10.0-19044. These values are recognised by autotools.

For comparison, busybox-w32 uname -s says Windows_NT. The string can be changed using the configuration option CONFIG_UNAME_OSNAME (one of my self-serving upstream contributions). w64devkit could, perhaps, use that capability to masquerade as mingw64/32 and avoid the need to set build_alias. (EDIT: sorry, that's nonsense. The Windows_NT string is hardcoded in win32/uname.c.)
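The recognition by uname value can be pictured as a case statement over the system name. The sketch below is a simplified, hypothetical stand-in: the real logic lives in config.guess and differs in detail, and `guess_build_alias` is a made-up name.

```shell
# Hypothetical stand-in for uname-based detection; not config.guess itself.
# $1: output of "uname -m"   $2: output of "uname -s"
guess_build_alias() {
    machine=$1 sysname=$2
    case $sysname in
        MINGW64*) printf '%s-pc-mingw64\n' "$machine" ;;
        MINGW32*) printf '%s-pc-mingw32\n' "$machine" ;;
        CYGWIN*)  printf '%s-pc-cygwin\n' "$machine" ;;
        *)        return 1 ;;   # name not handled by this sketch
    esac
}
```

In this sketch `Windows_NT` matches none of the patterns, which illustrates why the script earlier in the thread sets build_alias by hand.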

@skeeto Excellent progress on those builds! This looks very promising.

@avih

avih commented Feb 27, 2023

I do wonder though why they use cmd to begin with?

Is there some functionality which cmd.exe can do and standard posix (/busybox/cygwin/msys2) tools cannot?

Presumably, if they apply the cmd //c ... MSYS2 hack then they do know that the environment should have the standard tools (like rm instead of DEL, etc).

So why cmd at all?

Indeed. uname -s reports MINGW64_NT-10.0-19044 in a mingw64 environment and MINGW32_NT-10.0-19044 in mingw32. Cygwin says CYGWIN-10.0-19044. These values are recognised by autotools.

It's a shame that MSYS2 upstreamed the "MINGW" prefix and associated it with its hacked parameter mangling.

It really should have been something like "MSYS2-MINGW...", because not all MINGW setups (current or future) would necessarily hack the arguments like MSYS2 does...

@rmyorston
Author

Presumably, if they apply the cmd //c ... MSYS2 hack then they do know that the environment should have the standard tools (like rm instead of DEL, etc).
So why cmd at all?

I was puzzled by that too. The only cases I've seen where this is used are of the form:

( cmd //c echo "$VAR" ) 2>/dev/null | sed ...

A subshell to run cmd to echo a variable?

This message from the author of the patch which added the cmd says:

You'd end up calling func_msys_to_mingw, which relies on the msys "magic" path
conversion logic:

( cmd //c echo "$1" )

MSYS (but not cygwin) notices that cmd is a native win32 program, and
that there is a path-like argument "$1". MSYS (but not cygwin) will then
automatically convert $1 to DOS format, before spawning the cmd process
-- which echos the converted path to stdout, where we can grab it. MSYS
(but not cygwin) also turns '//c' into '/c' (the extra slash means
"don't use the MSYS mount table to convert this "path") -- this is how
you pass win32-style switches to native programs. IIRC, there's some
complicated logic to determine whether a given argument that begins with
two slashes is a "switch" like /EHsc or a unix-format SMB path like
//server/share/path/to/file.

It seems cmd is being used to trick MSYS into converting the path to DOS format. We don't need to do that, so if the wrapper detects the //c echo case could it just echo the path itself without invoking cmd at all?
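The idea of short-circuiting the `//c echo` case can be sketched as a shell function. This only shows the logic: a function defined in an interactive shell is not visible inside a configure script, so a real solution would need to be an executable on PATH or a change in the shell itself.

```shell
# Sketch only: intercept exactly "cmd //c echo ARG" and print ARG directly,
# since no path conversion is needed here. Anything else is passed through
# to the real cmd found on PATH.
cmd() {
    if [ "$#" -eq 3 ] && [ "$1" = //c ] && [ "$2" = echo ]; then
        printf '%s\n' "$3"
    else
        command cmd "$@"
    fi
}
```

With this in place, `( cmd //c echo "$VAR" ) 2>/dev/null` prints the value of VAR unchanged without starting cmd.exe.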

@avih

avih commented Feb 27, 2023

It seems cmd is being used to trick MSYS into converting the path to DOS format. We don't need to do that, so if the wrapper detects the //c echo case could it just echo the path itself without invoking cmd at all?

Well, MSYS2 does have cygpath, and converting $PATH itself would work like this (in MINGW32 env):

$ cygpath -p -w -- "$PATH"
T:\msys64\usr\local32\bin;T:\msys64\mingw32\bin;T:\msys64\usr\local\bin;T:\msys64\usr\bin;T:\msys64\usr\bin;C:\Windows\System32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0\;T:\msys64\usr\bin\site_perl;T:\msys64\usr\bin\vendor_perl;T:\msys64\usr\bin\core_perl

cygpath is quite versatile, and I find it hard to believe that it doesn't cover their needs to convert MSYS2 paths to windows/dos paths (but this brings up the question again - why do they need dos paths at all?).

However, if they did that, which would be the correct way to handle path conversion instead of relying on some magic MSYS2 arguments hacks, then it still won't work in busybox-w32 (unless cygpath is implemented sufficiently for their use patterns - which should be possible I think).

So yeah, being able to detect the usage pattern and possibly replace it automatically with something which works in busybox-w32 should work.

Alternatively, maybe they do have some simple mode which doesn't rely on MSYS2 hacks, which could be enabled/enforced using some env vars, and which would happen to work in busybox-w32 (considering its paths are indeed non-posix...).

@rmyorston
Author

Alternatively, maybe they do have some simple mode which doesn't rely on MSYS2 hacks,

Cygwin doesn't depend on MSYS2 hacks.

Using build_alias=x86_64-pc-cygwin works to build Expat. It doesn't need to work around the cmd //c issue. But you end up with cygexpat-1.dll rather than libexpat-1.dll.

@avih

avih commented Feb 27, 2023

Using build_alias=x86_64-pc-cygwin works to build Expat. It doesn't need to work around the cmd //c issue. But you end up with cygexpat-1.dll rather than libexpat-1.dll.

Hmm.. that's a bit hacky I think, and it's possible IMHO that some things could break either during configure or the actual build later, even if they didn't for expat.

However, it seems that, at least for expat, the env vars and the cmd //c thing are enough, so a script like this, saved on the PATH as e.g. bbconf or w64dkconf or whatever, can streamline this conversion without actually touching the original configure file. It also supports arguments, e.g.

bbconf ./configure --host=whatever --prefix=whatever

The script:

#!/bin/sh

# run autotools "configure" script in w64devkit env, using busybox-w32 sh,
# utilizing the MSYS2 MINGW code paths, and work around the following:
# - export some env vars which bypass the detection by "configure".
# - "cmd //c ..." is replaced with "cmd /c ...". originally //c works around
#   MSYS2 auto args conversion to get "/c" - not needed with busybox-w32.
# The modification is written to a temporary file at the same dir, which is
# deleted [before startup and] after it's invoked.

echo() { printf %s\\n "$*"; }

error() {
    [ "${1+x}" ] && >&2 echo "${0##*/}: $*"
    >&2 echo "Usage: ${0##*/} path/to/autotools-configure [ARG...]"
    exit 1
}

[ "${1-}" ] || error
[ -e "$1" ] || error cannot find file -- "$1"

conf=$1
bbconf=$conf.bb.tmp
rm -f -- "$bbconf" || error "cannot remove file -- $bbconf"

shift


# The following might be needed too in some cases, but currently not applied
#   's/func_convert_file_msys_to_w32/func_convert_file_noop/'

sed -e 's/cmd \/\/c/cmd \/c/g' < "$conf" > "$bbconf" \
    || error "cannot create file -- $bbconf"
chmod +x "$bbconf"

export PATH_SEPARATOR=';'
export ac_executable_extensions='.exe'
export build_alias="$(uname -m)-pc-mingw64"

"$bbconf" "$@"

e=$?
rm -- "$bbconf"
exit "$e"

This makes it explicit that some conversion is applied.
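To see what the sed step in the script does, it can be run on a single line of the form quoted earlier in the thread. The wrapper name `fix_cmd_switch` is made up for this example, and `|` is used as the sed delimiter only to avoid escaping the slashes; the substitution is otherwise the same.

```shell
# Same substitution as in the script above, with "|" as the delimiter.
fix_cmd_switch() {
    sed -e 's|cmd //c|cmd /c|g'
}

# Sample line of the form quoted earlier in the thread.
printf '%s\n' '( cmd //c echo "$1" ) 2>/dev/null' | fix_cmd_switch
# prints: ( cmd /c echo "$1" ) 2>/dev/null
```

Only the literal text `cmd //c` is rewritten, so other double slashes in the configure script are left alone.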

However, if indeed the //c and exports are enough, then it would be more user friendly to export the stuff unconditionally automatically, and handle the //c thing using some script/binary wrapper.

@skeeto
Owner

skeeto commented Feb 27, 2023 via email

@avih

avih commented Feb 27, 2023

which is what led to my cmd.exe wrapper idea

Right. I think it can also be implemented as a shell script, either named cmd or cmd.sh (which invokes cmd.exe, therefore bypassing itself at $PATH).

If the cmd //c thing is the main/only issue, then a wrapper executable (sh/binary) and the exported env should have it covered.

@rmyorston
Author

Here's a patch to busybox-w32 ash to intercept commands with exactly the form cmd //c echo arg and replace them with the internal echo. Its action is unconditional at the moment. We might want it only to take effect when running a configure script. Or we might not bother.

diff --git a/shell/ash.c b/shell/ash.c
index 742067216..886498640 100644
--- a/shell/ash.c
+++ b/shell/ash.c
@@ -8970,6 +8970,21 @@ static int builtinloc = -1;     /* index in path of %builtin, or -1 */
 static void
 tryexec(IF_FEATURE_SH_STANDALONE(int applet_no,) const char *cmd, char **argv, char **envp)
 {
+#if ENABLE_PLATFORM_MINGW32
+   int a;
+
+   if (strcmp(argv[0], "cmd") == 0 &&
+           argv[1] && strcmp(argv[1], "//c") == 0 &&
+           argv[2] && strcmp(argv[2], "echo") == 0 &&
+           argv[3] && !argv[4] && (a = find_applet_by_name("echo")) > 0) {
+       argv += 2;
+       cmd = "echo";
+#if ENABLE_FEATURE_SH_STANDALONE
+       applet_no = a;
+#endif
+   }
+#endif
+
 #if ENABLE_FEATURE_SH_STANDALONE
    if (applet_no >= 0) {
 # if ENABLE_PLATFORM_MINGW32

@avih

avih commented Mar 1, 2023

Here's a patch to busybox-w32 ash ...

I didn't try that, but (together with the exported env vars) I did try a cmd shell script wrapper - which didn't work (./configure completes, make has some errors). I'm guessing it searches for cmd.exe.

So I tried the same shell script, this time saved as cmd.exe:

#!/bin/sh

# save me as "cmd.exe" and place it early in PATH

# autotools assumes MINGW setup uses MSYS[2] env, and uses "cmd //c echo..."
# to invoke "cmd /c echo..." (MSYS converts //c into /c for windows prog args).
# this breaks in busybox-w32 sh, so replace the //c with normal /c

# only handle exactly cmd //c echo...
# could be enhanced for more cases, but for autotools this seems enough
# and without breaking "cmd.exe" arguments in general
case ${COMSPEC-} in *\\cmd.exe)
    if [ "${1-}" = //c ] && [ "${2-}" = echo ]; then
        shift
        set -- /c "$@"
    fi
esac

exec "$COMSPEC" "$@"

And this does seem to work.

I'm guessing this could break with unicode paths, but then again, if unicode paths are used then I'm guessing it would break elsewhere too regardless of this cmd.exe wrapper, because currently all busybox-w32 tools and sh don't support unicode paths.

EDIT:
busybox-w32 does convert common env vars to the "mixed" path style (\ into /, e.g. in PATH and APPDATA etc), but not COMSPEC. This is important, because the real cmd.exe doesn't like being invoked with such mixed paths. We could also replace / with \ in COMSPEC before exec as a future-proofing measure.
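A minimal sketch of that future-proofing step, assuming a helper named `to_backslashes` (hypothetical, not part of the wrapper above):

```shell
# Hypothetical helper: turn every "/" in a path into "\".
to_backslashes() {
    printf '%s\n' "$1" | tr '/' '\\'
}
```

In the wrapper, the last line would then become something like `exec "$(to_backslashes "$COMSPEC")" "$@"`, so that cmd.exe is always started via a pure backslash path even if COMSPEC were ever converted to the mixed style.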

skeeto added a commit that referenced this issue Jul 3, 2024
Introduce a "system-wide" profile loaded before ~/.profile, and populate
it with variables to guide GNU Autoconf. In general, "configure" scripts
should now work out-of-the-box.

Libtool assumes the host environment is MSYS2 and will be confused by
Windows-style command switches. The "/c" in "cmd /c" looks like a path,
which MSYS2 incorrectly decodes to "c:/". Anticipating this, libtool
encodes these calls as "cmd //c" which does not work outside MSYS2. A
busybox-w32 patch makes it behave like MSYS2 in just this one case.
rmyorston added a commit to rmyorston/busybox-w32 that referenced this issue Jul 7, 2024
Libtool assumes the host environment is MSYS2 and will be confused by
Windows-style command switches. The "/c" in "cmd /c" looks like a path,
which MSYS2 incorrectly decodes to "c:/". Anticipating this, libtool
encodes these calls as "cmd //c" which does not work outside MSYS2. A
busybox-w32 patch makes it behave like MSYS2 in just this one case.

Adds 88-96 bytes.

(GitHub issue #297 and skeeto/w64devkit#50)