Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build-type Configure on Windows with all-numeric path segments causes sed failure #5386

Closed
snoyberg opened this issue Jun 18, 2018 · 12 comments

Comments

@snoyberg
Copy link
Collaborator

If that repro sounds far-fetched, let me justify it a bit: the temporary paths on AppVeyor look something like c:\foo\bar\1\baz, meaning that whenever using a temporary directory to unpack and build, say, network, you get a build failure. Original issue: commercialhaskell/stack#3944.

To repro:

  1. Create the directory c:\1
  2. cd \1
  3. cabal unpack network-2.7.0.0
  4. cd network-2.7.0.0
  5. runghc Setup.hs configure --user

When using Cabal 2.2, I get the output at the end:

sed: -e expression #1, char 213: invalid reference \1 on `s' command's RHS

Note that the \1 changes based on the directory name chosen. I do not get a sed error for Cabal 2.0. I confirmed further that GHC 8.2.2 with its default-shipped Cabal (2.0) works, but upgrading to Cabal 2.2 does not.

@snoyberg
Copy link
Collaborator Author

I've reproduced this on AppVeyor using the script:

https://github.com/snoyberg/bug-network-ghc84/blob/44fb43df669afc08415af83a96fbdfdd6d9ec6b3/appveyor.yml

With AppVeyor results available at:

https://ci.appveyor.com/project/snoyberg/bug-network-ghc84

Note that, at the time of writing this, it failed for GHC 8.4.3, but succeeded for GHC 8.2.2.

@snoyberg
Copy link
Collaborator Author

Note that, while this is a terrible approach, the following change does in fact fix the bug:

diff --git a/Cabal/Distribution/Simple.hs b/Cabal/Distribution/Simple.hs
index e45c8312a..06da8d264 100644
--- a/Cabal/Distribution/Simple.hs
+++ b/Cabal/Distribution/Simple.hs
@@ -734,7 +734,9 @@ runConfigureScript verbosity backwardsCompatHack flags lbi = do
                 ((intercalate spSep extraPath ++ spSep)++) $ lookup "PATH" env
       overEnv = ("CFLAGS", Just cflagsEnv) :
                 [("PATH", Just pathEnv) | not (null extraPath)]
-      args' = configureFile:args ++ ["CC=" ++ ccProgShort]
+      args' = map (map replace) $ configureFile:args ++ ["CC=" ++ ccProgShort]
+      replace '\\' = '/'
+      replace c = c
       shProg = simpleProgram "sh"
       progDb = modifyProgramSearchPath
                (\p -> map ProgramSearchPathDir extraPath ++ p) emptyProgramDb

@quasicomputational
Copy link
Contributor

I think there's a deeper bug here, possibly in the network's configure script that has been uncovered by a change in Cabal. I can reproduce the failure on Linux by building in a directory named '\9':

sed: -e expression #1, char 216: invalid reference \9 on `s' command's RHS

My guess is it's doing something like sed s/something/$CWD/ and not escaping metacharacters on the RHS, but I haven't looked at its invocations of sed to see if they're safe or not. If that is the case, then it's not Cabal's bug: this was always broken and, at some point between 2.0 and 2.2, Cabal changed to passing Windows paths with backslashes rather than slashes, causing the observed failures.

@snoyberg
Copy link
Collaborator Author

AFAICT, this was originally caused by af49513. Prior to this commit, the configure script was referenced by relative path, bypassing this issue. We can get back that prior behavior without losing the goal of current-directory indifference with:

diff --git a/Cabal/Distribution/Simple.hs b/Cabal/Distribution/Simple.hs
index e45c8312a..514a5ef23 100644
--- a/Cabal/Distribution/Simple.hs
+++ b/Cabal/Distribution/Simple.hs
@@ -724,8 +724,6 @@ runConfigureScript verbosity backwardsCompatHack flags lbi = do
   -- to ccFlags
   -- We don't try and tell configure which ld to use, as we don't have
   -- a way to pass its flags too
-  configureFile <- makeAbsolute $
-    fromMaybe "." (takeDirectory <$> cabalFilePath lbi) </> "configure"
   let extraPath = fromNubList $ configProgramPathExtra flags
   let cflagsEnv = maybe (unwords ccFlags) (++ (" " ++ unwords ccFlags))
                   $ lookup "CFLAGS" env
@@ -734,7 +732,7 @@ runConfigureScript verbosity backwardsCompatHack flags lbi = do
                 ((intercalate spSep extraPath ++ spSep)++) $ lookup "PATH" env
       overEnv = ("CFLAGS", Just cflagsEnv) :
                 [("PATH", Just pathEnv) | not (null extraPath)]
-      args' = configureFile:args ++ ["CC=" ++ ccProgShort]
+      args' = "./configure":args ++ ["CC=" ++ ccProgShort]
       shProg = simpleProgram "sh"
       progDb = modifyProgramSearchPath
                (\p -> map ProgramSearchPathDir extraPath ++ p) emptyProgramDb
@@ -743,7 +741,7 @@ runConfigureScript verbosity backwardsCompatHack flags lbi = do
   case shConfiguredProg of
       Just sh -> runProgramInvocation verbosity $
                  (programInvocation (sh {programOverrideEnv = overEnv}) args')
-                 { progInvokeCwd = Just (buildDir lbi) }
+                 { progInvokeCwd = fmap takeDirectory (cabalFilePath lbi) }
       Nothing -> die notFoundMsg

   where

And it may be slightly blasphemous to say this, but it seems like this may in fact be a bug in autotools not having proper escaping support for backslashes. I have yet to demonstrate that. However, given the great difficulty in getting a patch into autotools, and the fact that many existing generated configure scripts would still have this bug (including all existing versions of the network package!), I think it's highly desirable to fix this in Cabal itself.

@snoyberg
Copy link
Collaborator Author

@quasicomputational I agree with your analysis. However, I still think the fix for this should go into Cabal itself, see my previous comment (which I was writing concurrently with yours).

@quasicomputational
Copy link
Contributor

I don't know autotools but I'd be slightly shocked (though maybe not surprised) to find that it didn't deal with the full range of POSIX paths.

Preferring slashes to backslashes is sane, I think? I think your patch needs to be careful not to mangle a POSIX path like /tmp/\9; i.e., it ought to be something like intercalate "/" . splitDirectories. That won't fully fix the root problem, but I guess it'd keep it from happening on Appveyor, which is a win.

Some suspicious sed invocations from network:

Those files seem to have come straight from autotools, so a bug there is looking more and more likely.

@quasicomputational
Copy link
Contributor

Yeah, autotools bug. Here's the offending sed call in configure, in code generated by autotools:

# Neutralize VPATH when `$srcdir' = `.'.
# Shell code in configure.ac might set extrasub.
# FIXME: do we really want to maintain this feature?
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_sed_extra="$ac_vpsub
$extrasub
_ACEOF
cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
:t
/@[a-zA-Z_][a-zA-Z_0-9]*@/!b
s|@configure_input@|$ac_sed_conf_input|;t t
s&@top_builddir@&$ac_top_builddir_sub&;t t
s&@top_build_prefix@&$ac_top_build_prefix&;t t
s&@srcdir@&$ac_srcdir&;t t
s&@abs_srcdir@&$ac_abs_srcdir&;t t
s&@top_srcdir@&$ac_top_srcdir&;t t
s&@abs_top_srcdir@&$ac_abs_top_srcdir&;t t
s&@builddir@&$ac_builddir&;t t
s&@abs_builddir@&$ac_abs_builddir&;t t
s&@abs_top_builddir@&$ac_abs_top_builddir&;t t
$ac_datarootdir_hack
"
eval sed \"\$ac_sed_extra\" "$ac_file_inputs" | $AWK -f "$ac_tmp/subs.awk" \
  >$ac_tmp/out || as_fn_error $? "could not create $ac_file" "$LINENO" 5

$abs_top_srcdir is, as you might expect given the name, /tmp/\9/network-2.7.0.0. Unless it's a documented thing that you just can't use autotools in directories containing sed metacharacters, this is broken.

@quasicomputational
Copy link
Contributor

I've sent a bug report to the autoconf people but we should probably also mitigate this in Cabal. The basic idea of Michael's patch is a good one; I'll try to make it robust and add some tests as well.

@snoyberg
Copy link
Collaborator Author

Thanks for sending the bug report upstream, and isolating this further. Can you share the link to the bug report?

quasicomputational added a commit to quasicomputational/cabal that referenced this issue Jun 18, 2018
@quasicomputational
Copy link
Contributor

autoconf doesn't seem to use a bug-tracker. This is the message I sent to their bug list.

#5388 is where I'm hacking on the tests + fix but it's not baked yet; I've just realised how to test the doesn't-mangle-weird-POSIX-paths property, so that's next!

ndmitchell added a commit to ndmitchell/neil that referenced this issue Jun 18, 2018
quasicomputational added a commit to quasicomputational/cabal that referenced this issue Jun 19, 2018
On all platforms, warn about their presence. On Windows, we should use
slashes (as opposed to backslashes) where possible, to avoid causing
things like haskell#5386.

Closes haskell#5386.
quasicomputational added a commit to quasicomputational/cabal that referenced this issue Jun 19, 2018
On all platforms, warn about their presence. On Windows, we should use
slashes (as opposed to backslashes) where possible, to avoid causing
things like haskell#5386.

Closes haskell#5386.
@hvr
Copy link
Member

hvr commented Jun 20, 2018

If OP had invested a little bit of time to consult the GNU Autoconf manual instead, he might have saved himself some time by reading up on the filesystem limitations https://www.gnu.org/software/autoconf/manual/autoconf.html#File-System-Conventions, stating

Autoconf uses shell-script processing extensively, so the file names that it processes should not contain characters that are special to the shell. Special characters include space, tab, newline, NUL, and the following:

" # $ & ' ( ) * ; < = > ? [ \ ` |

Also, file names should not begin with ~ or -, and should contain neither - immediately after / nor ~ immediately after :. On Posix-like platforms, directory names should not contain :, as this runs afoul of : used as the path separator.

These restrictions apply not only to the files that you distribute, but also to the absolute file names of your source, build, and destination directories.

...

@23Skidoo
Copy link
Member

23Skidoo commented Jun 20, 2018

On Posix-like platforms, directory names should not contain :,

Does this mean that c:/foo/bar is illegal? Or Windows does not count as Posix-like?

quasicomputational added a commit to quasicomputational/cabal that referenced this issue Jun 20, 2018
On all platforms, warn about their presence. On Windows, we should use
slashes (as opposed to backslashes) where possible, to avoid causing
things like haskell#5386.

Closes haskell#5386.
ndmitchell added a commit to ndmitchell/bug-network-ghc84 that referenced this issue Jun 22, 2018
Work around haskell/cabal#5386 by setting the TMP env var
tfausak added a commit to tfausak/rattletrap that referenced this issue Jul 10, 2018
tfausak added a commit to tfausak/github-release that referenced this issue Jul 14, 2018
mwotton added a commit to mwotton/cacher that referenced this issue Aug 4, 2018
snoyberg added a commit to snoyberg/snoyman.com-content that referenced this issue Aug 6, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants