Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some ghc-options should not be considered part of build hash #4247

Closed
ezyang opened this issue Jan 19, 2017 · 26 comments
Closed

Some ghc-options should not be considered part of build hash #4247

ezyang opened this issue Jan 19, 2017 · 26 comments

Comments

@ezyang
Copy link
Contributor

ezyang commented Jan 19, 2017

This is related to #3883 but it is a distinct problem so I wanted to make a ticket for it by itself.

Some ghc-options don't have any affect on build outputs, e.g., -v, but if someone specifies it via ghc-options, e.g., in their cabal.project file, it will cause us to rebuild all deps because elabProgramArgs has changed, which changes the hash, which means we need to rebuild. This is not good.

It's unclear how to go about fixing this. We could build in knowledge to cabal-install what arguments should and should not be hashed. But we'd be constantly chasing the most up-to-date set, and there would always be some time where we wouldn't have gotten it exactly right. Alternately, we could let the user specify, "use this option when building, but don't include it in the hash." But here it seems easy for a user to shoot themselves in the foot, ESP when it's a package being installed to the global store; worst case scenario you'd have to blow away the entire store because you corrupted it.

CC @bgamari who reported this most recently.

@DanielG
Copy link
Collaborator

DanielG commented Jan 19, 2017

How about adding a command to GHC that tells cabal which flags should influence the hash? For old versions this knowledge would still have to be included but at least then we don't need to chase after GHC.

@ezyang
Copy link
Contributor Author

ezyang commented Jan 19, 2017

That's an interesting idea. The relevant code in GHC is fingerprintDynFlags in FlagChecker, we'd have to reverse engineer the flags that affect fields in DynFlags here.

It's worth noting GHC's recompilation avoidance doesn't always get this right. See #3891. So it' still challenging to compute the flags, even on GHC's side ;)

@23Skidoo
Copy link
Member

Yep, just bumped into this when I tried to enable -Werror for my project.

@mightybyte
Copy link
Collaborator

I also hit this when I did cabal new-build --ghc-option=-fno-code

@hvr
Copy link
Member

hvr commented Jun 19, 2017

I think that -fno-code is a bit special here, as it does affect the .o output... by not producing any! This could have the ability to confuse cabal quite a lot. Maybe this rather demands a special "dry" mode to invoke, i.e. cabal new-build --no-code which takes care of passing this to ghc and not get confused about the .o files not being generated. What's your use-case for -fno-code?

@mightybyte
Copy link
Collaborator

No-code is great for getting something to compile. It's much faster than anything else I know of, and significantly speeds up the compile-fix-errors loop. The ideal situation in my mind would be to have a cabal tc (for type check) command that does this. Making it a separate command makes it more convenient and discoverable (as well as being a little quicker to type).

@merijn
Copy link
Collaborator

merijn commented Mar 27, 2018

I think this issue is a bit stuck in idealism over pragmatism. As far as I can tell everyone agrees that:

  1. cabal needs a way to filter flags that don't affect build outputs from the hash
  2. hard-coding isn't great because that means manually keeping the flags list in cabal in sync with GHC.

So now this ticket is stuck on the "devise scheme to figure out which flags can be safely used". I (of course) support that idea, but would like to argue that we shouldn't let "perfect" be the enemy of "good". Since we need a code-path to filter out flags anyway, why add said code and, for now, hard-code a set of common flags that can be safely filtered out. Then once we figure out a more sustainable/future-proof way of dealing with GHC options that don't affect output we just have to drop the hard-coded set and replace it with whatever method people come up with.

This would make new-build a significant bit more pleasant to use right now while introducing no obstacles for a more perfect future solution. If someone can point to where the hashing code is I'm even willing to see if I can hack this in somehow.

@alexbiehl
Copy link
Member

alexbiehl commented Mar 27, 2018

@merijn PackageHashConfigInputs is the input to the package hash function for package and program configuration. The ghc-options property is put into the pkgHashProgramArgs currently.

Construction of PackageHashConfigInputs happens here:

packageHashConfigInputs :: ElaboratedSharedConfig
-> ElaboratedConfiguredPackage
-> PackageHashConfigInputs
packageHashConfigInputs
ElaboratedSharedConfig{..}
ElaboratedConfiguredPackage{..} =
PackageHashConfigInputs {
pkgHashCompilerId = compilerId pkgConfigCompiler,
pkgHashPlatform = pkgConfigPlatform,
pkgHashFlagAssignment = elabFlagAssignment,
pkgHashConfigureScriptArgs = elabConfigureScriptArgs,
pkgHashVanillaLib = elabVanillaLib,
pkgHashSharedLib = elabSharedLib,
pkgHashDynExe = elabDynExe,
pkgHashGHCiLib = elabGHCiLib,
pkgHashProfLib = elabProfLib,
pkgHashProfExe = elabProfExe,
pkgHashProfLibDetail = elabProfLibDetail,
pkgHashProfExeDetail = elabProfExeDetail,
pkgHashCoverage = elabCoverage,
pkgHashOptimization = elabOptimization,
pkgHashSplitSections = elabSplitSections,
pkgHashSplitObjs = elabSplitObjs,
pkgHashStripLibs = elabStripLibs,
pkgHashStripExes = elabStripExes,
pkgHashDebugInfo = elabDebugInfo,
pkgHashProgramArgs = elabProgramArgs,
pkgHashExtraLibDirs = elabExtraLibDirs,
pkgHashExtraFrameworkDirs = elabExtraFrameworkDirs,
pkgHashExtraIncludeDirs = elabExtraIncludeDirs,
pkgHashProgPrefix = elabProgPrefix,
pkgHashProgSuffix = elabProgSuffix
}

As a first approximation you could strip irrelevant options with something like this: Map.alter strip_irrelevant_ghc_options "ghc" pkgHashProgramArgs

@hvr
Copy link
Member

hvr commented Mar 27, 2018

hard-coding isn't great because that means manually keeping the flags list in cabal in sync with GHC.

to be fair, we do already hardcode knowledge about GHC (and other compilers) in lib:Cabal (and that's where we have to encode that information); so there is a precedent to embed knowledge about GHC. It's just annoying that GHC's CLI grammar is non-trivial, so we can't just blindly filter out tokens from the list of CLI args; we have to do a bit more work.

@merijn
Copy link
Collaborator

merijn commented Mar 27, 2018

ok, looking at the code the hashing appears to initiate in one single point: buildAndInstallUnpackedPackage, which is also the first function in the chain to be in IO. So my idea is to propagate a set of filters from there into the hashing functions, which means that any future replacement process can just hook into the IO in buildAndInstallUnpackedPackage to compute a filter dynamically to replace any initial static filter.

@merijn
Copy link
Collaborator

merijn commented Mar 27, 2018

ok, so as a bunch of you probably already noticed in #hackage I had a bit of a setback in that most of the datatypes are being serialised for the rebuild cache and we can't easily serialise functions, making things Difficult (TM). The earlier places I suggested for adding code are just not really easily doable without either 1) gimping the functionality or 2) requiring major work.

But there's is a hope. After some gym downtime of not staring at cabal code I've found another path:
the Program type in Cabal is already not serialisable, so adding a field there is fairly straightforward. We also have an already existing registry of programs we care about built into Cabal. Specifically Distribution.Simple.Program.Builtin.builtinPrograms which has Program entries for every tool Cabal cares about, which probably covers everything new-build should realistically deal with, so with an option filter added to Program we have an easily accessible place to find which flags we can filter from programs Cabal knows about.

Two open questions:

  1. Should compilers be treated differently? On the one hand we have a separate Compiler type stored in ElaboratedSharedConfig which might be more appropriate to use, but that brings us back to the "can't serialise functions nightmare". On the other hand all compilers Cabal knows about are already listed in builtinPrograms, so why not have the interface nicely uniform regardless of program.

  2. The Program type doesn't really deal with versions and thus anything stored in it can't really be program version specific. We could parameterise the function stored in Program over the version of said program and then look up what version we currently have configured in the ProgramDb (which is stored in ElaboratedSharedConfig and also easily accessible, which might render this issue moot.

merijn added a commit to merijn/cabal that referenced this issue Mar 28, 2018
…grams from

the new-build hash input to reduce redundant recompilation.

See haskell#4247.
@merijn
Copy link
Collaborator

merijn commented Mar 28, 2018

Ok, I made a (surprisingly small and minimal) PR that adds the prerequisite infrastructure for filtering tool version specific command-line arguments from the new-build hash computation. Next step will be to extend the builtin programs in Distribution.Simple.Program.Builtin with any relevant filtering logic.

@merijn
Copy link
Collaborator

merijn commented Mar 28, 2018

In other news, cabal new-build --ghc-options=-Wall no longer starts recompiling my entire transitive dependencies! Progress!

@merijn
Copy link
Collaborator

merijn commented Mar 29, 2018

I'm working on a filter for GHC >=8.0 and there are some flags of which I'm not sure whether they can be ignored by new-build:

Mystery flags (not documented in GHC user guide)

-sig-of
-H
-relative-dynlib-paths
-dno-llvm-mangler
-haddock
-haddock-opts
-fmax-pmcheck-iterations (possibly equivalent to -Wmax-pmcheck-iterations?)

SafeHaskell

-fpackage-trust
-fno-safe-infer

Makefile dependency output

-dep-suffix
-dep-makefile
-include-pkg-degs
-exclude-module

Misc

-frule-check (rule diagnostic output)
-fno-code
-hpcdir (output dir for HPC coverage output)
-tmpdir
-stubdir (ffi stubs)
-dumpdir
-ddump-file-prefix

@angerman
Copy link
Collaborator

  • -dno-llvm-mangler is not documented on purpose; it's supposed to be a debug flag only. It changes the LLVM code gen from

    Cmm -(llvm codegen)-> LLVM (textual) IR -(opt)-> Assembly -(Mangler)-> Assembly -(llc)-> Object
    

    to

    Cmm -(llvm codegen)-> LLVM (textual) IR -(opt)-> Assembly -(llc)-> Object
    

    there is a similar flag -fast-llvm that changes this even to

    Cmm -(llvm codegen)-> LLVM (textual) IR -(opt)-> Object
    

    As such the -dno-llvm-manger can change the produced object code.

@merijn
Copy link
Collaborator

merijn commented Mar 29, 2018

Honourable mention in the category "fucking up beautiful ideas": -Werror

Specifically, should -Werror affect the new-build hash? It can potentially cause no output to be produced. OTOH, if output is produced -Werror should leave it unchanged.

Additionally, if we deem that -Werror should affect the new-build hash, then that ruins the entire idea of filtering warning flags out, since a package might have -Werror in it's GHC options and thus any warning flag we add could potentially stop build outputs from being created.

So I'm leaning somewhat to "-Werror should be filtered out", but I dunno if that would break something in new-build...

@merijn
Copy link
Collaborator

merijn commented Apr 9, 2018

Some new flags added in 8.4 I'm unsure about:
-keep-hi-file
-no-keep-hi-file
-keep-hi-files
-no-keep-hi-files
-keep-o-file
-no-keep-o-file
-keep-o-files
-no-keep-o-files

@merijn
Copy link
Collaborator

merijn commented Apr 9, 2018

Also, the updated PR in #5237 lets me work around the -Werror issue by passing in the PackageDescription and verifying the -Werror is nowhere in the package or the arguments.

GHC 8.4 actually adds -Werror= and -Wwarn= variants, because FML. But I've got an idea to handle those (and -Werror) slightly more generally and flexibly that I hope to finish tomorrow.

@merijn
Copy link
Collaborator

merijn commented Apr 9, 2018

Initial implementation of normalising flags for haddock and GHC 8.0-8.4 (modulo buggy handling of -Werror in 8.4) is here: https://github.com/merijn/cabal/blob/filter-flag-prototype/Cabal/Distribution/Simple/Program/GHC.hs#L51-L216

merijn added a commit that referenced this issue Apr 17, 2018
…grams from

the new-build hash input to reduce redundant recompilation.

See #4247.
@merijn
Copy link
Collaborator

merijn commented Apr 23, 2018

Ping to everyone in this thread to have a look at #5266 and chime in if it looks ok and/or help dogfood it.

@merijn
Copy link
Collaborator

merijn commented Apr 25, 2018

Ok, I've been using this code for about a week now with no issues, none of the people in the original issue or that I've asked have objected, so unless someone has a complaint before tomorrow I'm merging this.

@merijn
Copy link
Collaborator

merijn commented Apr 25, 2018

On a related note, fixing this so that local packages exhibit the same behaviour looks to be a trivial fix. Right now any configuration change will always trigger local packages to rebuild, so cabal new-build --ghc-option=-j2 (uselessly) rebuilds everything for no good reason. I'm experimenting with some code that will not rebuild local packages if the only differences is in the GHC flags that don't affect the result.

@alexbiehl
Copy link
Member

Do it. We can always improve if we find things to be missing or wrong.

@merijn
Copy link
Collaborator

merijn commented Apr 25, 2018

The main downside of applying this code to local packages would be that enabling a warning flag would no longer toggle every local package to build, and thus you lose "easily verify some warning doesn't occur" functionality we now (unintentionally) have. OTOH, adding -Werror should disable this filtering and thus trigger a rebuild of local packages anyway.

Although adding -Werror would trigger rebuilding your entire transitive dependency tree too, which is a bit unfortunate. Someone more knowledgeable about new-build and think through whether the -Werror case is actually an issue.

@merijn
Copy link
Collaborator

merijn commented Jun 8, 2018

#5287 is updated. Now also applies all the flag ignoring stuff (including ghci options) to in-place packages and adds a new --repl-options to pass ephemeral REPL flags to (new-)repl.

@jneira
Copy link
Member

jneira commented May 1, 2022

Afaiu this can be closed after the merge of #5237, #5266 and #5287. Feel free to reopen if it needs something else

@jneira jneira closed this as completed May 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants