Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect special character handling in Haddock response files #3004

Closed
snoyberg opened this issue Dec 28, 2015 · 14 comments · Fixed by #3012
Closed

Incorrect special character handling in Haddock response files #3004

snoyberg opened this issue Dec 28, 2015 · 14 comments · Fixed by #3012

Comments

@snoyberg
Copy link
Collaborator

This has been causing us quite some trouble with our Stackage builds (see: commercialhaskell/stackage#1018). I recommend considering this a major issue worthy of an emergency patch, since the following combination seems to always cause failed Haddock builds:

  • Newest released Cabal (1.22.6.0)
  • Upcoming Haddock release (> 2.16.1)
  • Any cabal file with a newline in the synopsis

When generating a response file for the Haddock arguments, the current code in Cabal simply calls hPutStrLn on the argument without any escaping. When an argument has a newline (like the synopsis), anything after the newline is treated as its own argument.

Solution: use response file escaping to pass in all arguments. As an example, that's how I implemented response file support in GHC, see: ghc/ghc@296bc70#diff-782eb6a039b579ec019e5a4009f45058R1281

@ezyang ezyang self-assigned this Dec 28, 2015
@ezyang
Copy link
Contributor

ezyang commented Dec 28, 2015

Reproduced.

@ezyang
Copy link
Contributor

ezyang commented Dec 28, 2015

If I understand correctly, we can't fix this until haskell/haddock#379 is fixed, because Haddock does not yet support the standard GCC response file format.

@23Skidoo
Copy link
Member

/cc @randen

@ezyang
Copy link
Contributor

ezyang commented Dec 28, 2015

If you like, we can disable response files for now until Haddock fixes this.

@23Skidoo
Copy link
Member

Yes, needs to be fixed in Haddock first.

ezyang added a commit to ezyang/cabal that referenced this issue Dec 28, 2015
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
ezyang added a commit to ezyang/cabal that referenced this issue Dec 28, 2015
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
@ezyang ezyang removed their assignment Dec 28, 2015
@ezyang
Copy link
Contributor

ezyang commented Dec 28, 2015

The relevant code to fix is Distribution.Simple.Haddock, in renderArgs. Here's a hokey patch that doesn't work:

--- a/Cabal/Distribution/Simple/Haddock.hs
+++ b/Cabal/Distribution/Simple/Haddock.hs
@@ -497,7 +497,7 @@ renderArgs verbosity tmpFileOpts version comp args k = do
                  withTempFileEx tmpFileOpts outputDir "haddock-response.txt" $
                     \responseFileName hf -> do
                          when haddockSupportsUTF8 (hSetEncoding hf utf8)
-                         mapM_ (hPutStrLn hf) renderedArgs
+                         mapM_ (hPutStrLn hf . escape) renderedArgs
                          hClose hf
                          let respFile = "@" ++ responseFileName
                          k ([respFile], result)
@@ -515,6 +515,19 @@ renderArgs verbosity tmpFileOpts version comp args k = do
               pkgstr = display $ packageName pkgid
               pkgid = arg argPackageName
       arg f = fromFlag $ f args
+      -- This was cribbed from compiler/main/Systools.hs
+      escape x = concat
+          [ "\""
+          , concatMap
+              (\c ->
+                  case c of
+                      '\\' -> "\\\\"
+                      '\n' -> "\\n"
+                      '\"' -> "\\\""
+                      _    -> [c])
+              x
+          , "\""
+          ]

@randen
Copy link
Contributor

randen commented Dec 28, 2015

Arrrgv! Right, so even with cabal doing the escaping, e.g., snoyberg's ghc patch for response-file-generation, the called program must also be aware of the escaping and do the unescaping. Haddock > 2.16.1 was only used in building the docs for the recent HP releases on exactly one machine, so that is why this problem has laid dormant until we are now post-7.10.3.

Is escaping backslashes, newlines, and double-quote sufficient? When supported, we encode into UTF-8 in the above quoted code, but is anyone aware of any other caveats that are not obvious to me?

Short of reading the gcc code (e.g., the 2005 gcc patch referenced by https://gcc.gnu.org/wiki/Response_Files), is anyone aware of an actual "spec" for the " gcc -compatible syntax" of a response file?

edit: If we disable response file generation in cabal, as a work-around, it has not been depended upon by anyone, yet (except the HP build).

@23Skidoo
Copy link
Member

Short of reading the gcc code is anyone aware of an actual "spec" for the " gcc -compatible syntax" of a response file?

This part of gcc's source is actually quite readable and well-commented, so I suggest doing exactly that.

@randen
Copy link
Contributor

randen commented Dec 29, 2015

This may sound gross, but what if we do something like base64-encode each argument chunk, separating each by newlines? No escaping necessary (other than whatever the receiving program expects for its own argument parameters, such as whether it allows newlines, etc.). Not "standard" (like there is a standard for this) but should be completely reliable and robust. Since it is only for cabal-haddock interaction, it is rather limited. Would humans build such a response file? Since getting all the escaping right manually would likely involve a tool of some kind to be reliable and repeatable (no one is going to hand-create a response file and get it right the first time), it is not so onerous to have the user use a base64-encoding of each argument chunk. Also, the manual use case is probably not really the requirement here, is it? Anyway, I'm working on this problem, the gcc-spec-way, but please consider this other alternative for fun.

@23Skidoo
Copy link
Member

I'd much prefer to solve this the standard way. People may want to inspect Cabal-generated response files; other tools/scripts may want to generate them.

garetxe pushed a commit to garetxe/cabal that referenced this issue Jan 9, 2016
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
@23Skidoo
Copy link
Member

23Skidoo commented Mar 3, 2016

For anyone following this ticket: the issue is fixed in Cabal >= 1.22.7.0 and Haddock >= 2.16.2. Specifically, the version of Haddock that comes with GHC 8 RC2 (2.17.0) includes the fix.

@geraldus
Copy link

geraldus commented Mar 3, 2016

@23Skidoo do you mean

Haddock >= 2.16.2 … GHC 8 RC2 (2.17.0) …

?

@23Skidoo
Copy link
Member

23Skidoo commented Mar 3, 2016

Thanks, fixed.

@randen
Copy link
Contributor

randen commented Mar 3, 2016

To complete the story for archeologists (including myself--I wasn't able to find it immediately), I should have linked the PR that was used for the Haddock part of this: haskell/haddock/pull/470

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants