Skip to content

Emulate GNU ar's deterministic mode #1537

Merged
merged 6 commits into from Nov 6, 2013

5 participants

@liyang
liyang commented Oct 11, 2013

After invoking ar(1), replace each object file's metadata with zero for the mtime, UID and GID, and 0644 for the file mode. Do not touch the existing archive if the contents are the same.

Relevant discussion in #1177; we can probably close the previous attempts in #1410 and #1501.

@nh2 originally wrote this but he disappeared. We would like to see this in production eventually, so I cleaned it up and added more sanity checks.

@tibbe tibbe and 1 other commented on an outdated diff Oct 11, 2013
Cabal/Distribution/Simple/Program/Ar.hs
simple = programInvocation ar (simpleArgs ++ extraArgs)
initial = programInvocation ar (initialArgs ++ extraArgs)
middle = initial
final = programInvocation ar (finalArgs ++ extraArgs)
- in sequence_
+ removeFile targetTmp `catchIO` \ ioe -> unless (isDoesNotExistError ioe) $
@tibbe
Haskell member
tibbe added a note Oct 11, 2013

I forget, don't we already have utilities to create a temporary filename that's guaranteed to not be used?

@liyang
liyang added a note Oct 12, 2013

I think I saw one. I'll look into using some existing thing, but I remembered it giving a file Handle which I don't need, and I wasn't sure if I could just hClose it. (Also it should preferably be on the same FS as target.)

@liyang
liyang added a note Oct 14, 2013

I eventually realised withTempFile (& cetera) was just too painful to use and ended up with withTempDirectory instead.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tibbe tibbe commented on the diff Oct 11, 2013
Cabal/Distribution/Simple/Program/Ar.hs
where
verbosityOpts v | v >= deafening = ["-v"]
| v >= verbose = []
| otherwise = ["-c"]
+
+-- | @ar@ by default includes various metadata for each object file in their
+-- respective headers, so the output can differ for the same inputs, making
+-- it difficult to avoid re-linking. GNU @ar@(1) has a deterministic mode
+-- (@-D@) flag that always writes zero for the mtime, UID and GID, and 0644
@tibbe
Haskell member
tibbe added a note Oct 11, 2013

Will setting the mtime in the past and using 0644 for the mode cause any other issues? I know it's hard to predict, but can anyone think of any right now?

Perhaps we could instead set the mtime etc to the same as the old file (or just compare the two files with the mtime etc stripped) and thus avoid any such issues?

@nh2
nh2 added a note Oct 11, 2013

As far as I know, mtimes are never used in any pipeline that uses are for dealing with program code, and probably that is one why the -D flag was introduced (as this usage is the most common usage of ar).

In the same way file permissions are ignored by compiler/linker pipelines, and they will only matter if you extract the contents. If you do that extraction as part of a compiler/linker pipeline, you will again not care about maintaining those - it's just that ar (a file archiver like tar) was "abused" for combining object files because it was already there.

@liyang
liyang added a note Oct 12, 2013

What @nh2 said. I'm inclined to go with what GNU ar(1)'s -D does.

@tibbe
Haskell member
tibbe added a note Oct 14, 2013

Alright I'm convinced. Lets try this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tibbe tibbe commented on an outdated diff Oct 11, 2013
Cabal/Distribution/Simple/Program/Ar.hs
where
verbosityOpts v | v >= deafening = ["-v"]
| v >= verbose = []
| otherwise = ["-c"]
+
+-- | @ar@ by default includes various metadata for each object file in their
+-- respective headers, so the output can differ for the same inputs, making
+-- it difficult to avoid re-linking. GNU @ar@(1) has a deterministic mode
+-- (@-D@) flag that always writes zero for the mtime, UID and GID, and 0644
+-- for the file mode. However detecting whether @-D@ is supported seems
+-- rather harder than just re-implementing this feature.
+wipeArchiveMetadata :: FilePath -> IO ()
+wipeArchiveMetadata path = do
+
+ -- Check for existence first (ReadWriteMode would create one otherwise)
+ doesFileExist path >>= (`unless` die "No such file")
@tibbe
Haskell member
tibbe added a note Oct 11, 2013

Please make the error message more descriptive e.g.

Cabal.Distribution.Simple.Program.Ar.wipeArchiveMetadata: No such file: $FILENAME

Since the message is so long it might easier to read if the code is written as

exists <- doesFileExist path
unless exists $ die ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tibbe tibbe and 1 other commented on an outdated diff Oct 11, 2013
Cabal/Distribution/Simple/Program/Ar.hs
+
+-- | @ar@ by default includes various metadata for each object file in their
+-- respective headers, so the output can differ for the same inputs, making
+-- it difficult to avoid re-linking. GNU @ar@(1) has a deterministic mode
+-- (@-D@) flag that always writes zero for the mtime, UID and GID, and 0644
+-- for the file mode. However detecting whether @-D@ is supported seems
+-- rather harder than just re-implementing this feature.
+wipeArchiveMetadata :: FilePath -> IO ()
+wipeArchiveMetadata path = do
+
+ -- Check for existence first (ReadWriteMode would create one otherwise)
+ doesFileExist path >>= (`unless` die "No such file")
+ withBinaryFile path ReadWriteMode $ \ h -> hFileSize h >>= wipeArchive h
+
+ where
+ die msg = error $ "wipeArchiveMetadata: " ++ msg ++ ": " ++ path
@tibbe
Haskell member
tibbe added a note Oct 11, 2013

Please include module names in error message. Also use Cabal's standard die function instead of calling error.

@liyang
liyang added a note Oct 12, 2013

Ah, the die here was left over from @nh2's original patch. Will fix.

@liyang
liyang added a note Oct 14, 2013

I have now done this. Was tempted to abuse the Maybe Int line number argument of dieWithLocation, but I wasn't happy converting the Integer offset to an Int, so kept the atOffset helper.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tibbe tibbe and 2 others commented on an outdated diff Oct 11, 2013
Cabal/Distribution/Simple/Program/Ar.hs
+ die msg = error $ "wipeArchiveMetadata: " ++ msg ++ ": " ++ path
+ archLF = "!<arch>\x0a" -- 8 bytes
+ magic = "\x60\x0a" -- 2 bytes
+ metadata = BS.concat
+ [ "0 " -- mtime, 12 bytes
+ , "0 " -- UID, 6 bytes
+ , "0 " -- GID, 6 bytes
+ , "0644 " -- mode, 8 bytes
+ ]
+ hSize = 60
+
+ -- http://en.wikipedia.org/wiki/Ar_(Unix)#File_format_details
+ wipeArchive :: Handle -> Integer -> IO ()
+ wipeArchive h archiveSize = do
+ global <- BS.hGet h (BS.length archLF)
+ unless (global == archLF) (die "Bad global header")
@tibbe
Haskell member
tibbe added a note Oct 11, 2013

Is this the same on all platforms?

@liyang
liyang added a note Oct 12, 2013

Any platform that claims to use ar(1) ought to be compatible. I wrote this on GNU but will test OS X and Windows soon (but if anyone else has a setup for those, please have a look; I don't.)

@23Skidoo
Haskell member
23Skidoo added a note Oct 12, 2013

will test OS X and Windows soon

Please do!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@nh2
nh2 commented Oct 11, 2013

@liyang It would have been nicer if you had created another commit instead of amending my ones; I can't easily look at what's changed compared to my version this ways, or comment on it. Would you mind changing?

@liyang
liyang commented Oct 12, 2013

@nh2

git diff liyang/deterministic-ar nh2/avoid-linking -- Cabal/Distribution/Simple/Program/Ar.hs
@nh2
nh2 commented Oct 12, 2013

git diff liyang/deterministic-ar nh2/avoid-linking -- Cabal/Distribution/Simple/Program/Ar.hs

@liyang I'm aware. It still attributes code which I did not write to me (!), and as I said, forbids me to make github comments on what you changed since my last proposal.

@liyang
liyang commented Oct 13, 2013

@nh2 I essentially rewrote the whole thing. At the time, it seemed like a choice between crediting[0] you for the original code or just claiming that I wrote everything. (I generally dislike including refactors of earlier commits in a feature branch.)

Okay, I'll rebase my refactor on top of your original when I get in to work on Monday.

[0] Or blaming. ;)

@nh2
nh2 commented Oct 13, 2013

Much appreciated!

@liyang
liyang commented Oct 14, 2013

I think I've addressed most of the issues mentioned earlier. The one remaining task is to test this on Windows and OS X (and FreeBSD? I have never used FreeBSD. Someone else will have to do this, or we just trust Wikipedia.)

@liyang
liyang commented Oct 15, 2013

On OS X (with Homebrew removed from $PATH), I successfully built and installed Cabal followed by cabal-install.

However I noticed that the metadata for the symbol table (first entry in libHSCabal-1.19.0.a) are non-zero (or 0644) in the installed location, although not under dist/build. (Under Ubuntu/GNU/Linux it's fine in both locations.) I guess ranlib was invoked at some point during install. Will fix this when I get a chance.

Haven't checked with Homebrew enabled yet.

Will look at Windows when I'm home. (Windows GHC uses mingw32 and GNU tools, so I don't foresee any compatibility problems there.)

@23Skidoo
Haskell member

@liyang Thanks for doing that work.

@liyang
liyang commented Oct 16, 2013

How embarrassing. Under Windows I actually got some .o files of odd size, and found that I'd messed up. (Didn't seek forward 1 byte when offset was odd.) Have force-pushed and fixed that case.

Checking OS X (with and without Homebrew (again)) now.

@liyang liyang added a commit to liyang/cabal that referenced this pull request Oct 16, 2013
@liyang liyang Distribution.Simple: remove ranlibProgram; was workaround for old OS …
…X ld issues

∙ At one point under OS X ld required that the mtime of the symbol table to
  be not older than that of the archive itself. This means that even copying
  an .a file required invoking ranlib to update the symbol table mtime.
∙ However running ranlib renders pointless the deterministic mode of #1537.
∙ As of at least 2008 according to the BUGS section of ranlib(1), this has
  no longer been necessary.

First introduction: 12b3590
Current implementation: af3da76
e764580
@liyang
liyang commented Oct 16, 2013

See comments for the latest commit. Under OS X, I managed to install cabal-install from a bare GHC install without encountering any complaints at all from ld, and the installed archives do indeed have 0 mtime throughout.

I think it's time to ship now.

@tibbe
Haskell member
tibbe commented Oct 16, 2013

Any chance we can add a test so we don't accidentally break this feature in the future?

@liyang
liyang commented Oct 17, 2013

Break which aspect of this feature? Could you be more specific?

I could add a test that builds a simple archive and checks that all the metadata in the headers are as expected, but I'm not sure what kind of breakage this would help to protect against.

@23Skidoo
Haskell member

@liyang Well, having at least some tests is better than no tests at all.

@liyang liyang added a commit to liyang/cabal that referenced this pull request Nov 5, 2013
@liyang liyang Distribution.Simple: remove ranlibProgram; was workaround for old OS …
…X ld issues

∙ At one point under OS X, ld required the mtime of the symbol table to be
  not older than that of the archive itself. This means that even copying an
  .a file required invoking ranlib to update the symbol table mtime.
∙ However running ranlib renders pointless the deterministic mode of #1537.
∙ As of at least 2008 according to the BUGS section of ranlib(1), this has
  no longer been necessary.

First introduction: 12b3590
Current implementation: af3da76
f1f5fda
nh2 and others added some commits Aug 5, 2013
@nh2 nh2 Distribution.Simple.Program.Ar: Implement re-linking avoidance.
We do this by not putting timestamps into the .a file,
and not touching it if is already there and the contents are the same.
75c39af
@nh2 nh2 Distribution.Simple.GHC.buildLib: don't remove old files before building
We need them to compare against, so that we can decide if we want to
write new files or keep the old ones untouched.
ab54640
@nh2 nh2 Distribution.Compat.CopyFile: add filesEqual for comparing contents 91fa8fa
@liyang liyang Distribution.Simple.Program.Ar: Emulate GNU ar's deterministic mode.
After invoking ar(1), replace each object file's metadata with zero for the
mtime, UID and GID, and 0644 for the file mode. Do not touch the existing
archive if the contents are the same.

This rewrites the previous patch to,
 ∙ also wipe UID, GID and file mode;
 ∙ do a single read/write for each header, and less seeking in general;
 ∙ do its work inside a temporary directory, via 'withTempDirectory';
   ∘ Which also deletes the temporary target when it's identical to the old.
 ∙ use Distribution.Compat.CopyFile.filesEqual;
 ∙ use the standard 'dieWithLocation' instead of invoking 'error'.
2b69a58
@liyang liyang Distribution.Simple: remove ranlibProgram; was workaround for old OS …
…X ld issues

∙ At one point under OS X, ld required the mtime of the symbol table to be
  not older than that of the archive itself. This means that even copying an
  .a file required invoking ranlib to update the symbol table mtime.
∙ However running ranlib renders pointless the deterministic mode of #1537.
∙ As of at least 2008 according to the BUGS section of ranlib(1), this has
  no longer been necessary.

First introduction: 12b3590
Current implementation: af3da76
831b3da
@liyang
liyang commented Nov 5, 2013

Sorry for taking so long over this, had been busy with other things.

The latest commit builds and installs a dummy library and checks that the resulting archives in both the dist/build directory and the final installed location (user's ~/.cabal) have zero timestamps, UIDs, GIDs and 0644 modes.

I've tested this test by commenting out the call to wipeMetadata in Distribution.Simple.Program.Ar, and rebased on top of the current master.

Does it look good to merge now?

@tibbe tibbe and 1 other commented on an outdated diff Nov 5, 2013
Cabal/tests/PackageTests/DeterministicAr/Check.hs
+import Control.Monad
+import qualified Data.ByteString as BS
+import qualified Data.ByteString.Char8 as BS8
+import Data.Char (isSpace)
+import Data.List
+import Data.Traversable
+import PackageTests.PackageTester
+import System.Exit
+import System.FilePath
+import System.IO
+import Test.HUnit (Assertion, Test (TestCase), assertFailure)
+
+-- Perhaps these should live in PackageTester.
+
+assertFailure' :: String -> IO a
+assertFailure' msg = assertFailure msg >> error "assertFailure didn't."
@tibbe
Haskell member
tibbe added a note Nov 5, 2013

How can assertFailure fail? In other words: why does this function exist?

@liyang
liyang added a note Nov 5, 2013

It can't, but it has the type Assertion = IO () whereas I wanted a polymorphic IO a. If you like, I can rewrite it as:

assertFailure' msg = assertFailure msg >> return {-unpossible!-}undefined

and add a comment--I thought it was obvious what it was doing but maybe it's not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@tibbe
Haskell member
tibbe commented Nov 5, 2013

One comment. Looks good to merge otherwise.

@tibbe tibbe merged commit bc0e2a4 into haskell:master Nov 6, 2013
@liyang liyang deleted the liyang:deterministic-ar branch Nov 11, 2013
@lukexi
lukexi commented Jan 27, 2014

Hi guys,

This seems to be crashing me when using cabal with GHC iOS:

Resolving dependencies...
Configuring random-1.0.1.1...
Building random-1.0.1.1...
Preprocessing library random-1.0.1.1...
[1 of 1] Compiling System.Random    ( System/Random.hs, dist/arm/build/System/Random.o )
[1 of 1] Compiling System.Random    ( System/Random.hs, dist/arm/build/System/Random.p_o )
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ranlib: archive library: dist/arm/build/libHSrandom-1.0.1.1.a-90308/libHSrandom-1.0.1.1.a will be fat and ar(1) will not be able to operate on it
cabal: dist/arm/build/libHSrandom-1.0.1.1.a-90308/libHSrandom-1.0.1.1.a:
Distribution.Simple.Program.Ar.wipeMetadata: Bad global header
Failed to install random-1.0.1.1

I'm using the arm scripts from https://github.com/ghc-ios/ghc-ios-scripts.

Happy to help debug, or we can just skip this optimization if the metadata can't be parsed?

@nh2
nh2 commented Jan 27, 2014

Happy to help debug

The .a file should start with (see here):

archLF = "!<arch>\x0a" -- global magic, 8 bytes
...
   -- http://en.wikipedia.org/wiki/Ar_(Unix)#File_format_details
   wipeArchive :: Handle -> Integer -> IO ()
   wipeArchive h archiveSize = do
       global <- BS.hGet h (BS.length archLF)
       unless (global == archLF) $ wipeError "Bad global header"
       wipeHeader (toInteger $ BS.length archLF)

Could you check if your .a file starts with !<arch>\x0a (== !<arch>\n) or with something else?

@23Skidoo 23Skidoo added a commit that referenced this pull request Jan 30, 2014
@23Skidoo 23Skidoo Don't wipe .a metadata on ARM. 58cdd44
@23Skidoo
Haskell member

@lukexi I've disabled this behaviour on ARM for now. Please tell whether my patch fixed the problem.

@lukexi
@23Skidoo
Haskell member

@lukexi

Thanks. Would be nice to know why it broke in the first place.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Something went wrong with that request. Please try again.