Static linking with GHC 9.8.1 is broken #275304

Open
bgamari opened this issue Dec 19, 2023 · 26 comments · Fixed by #275609
Labels
0.kind: regression (Something that worked before working no longer), 6.topic: haskell, 6.topic: static

Comments

@bgamari
Contributor

bgamari commented Dec 19, 2023

Describe the bug

Haskell packages in nixpkgs.pkgsStatic.haskell.packages.ghc98 are unable to be built.

Steps To Reproduce

Steps to reproduce the behavior:

  1. nix build nixpkgs#legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc98.Diff

Expected behavior

Diff is built, linking against musl.

Observed behavior

nix-repl> :b legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc98.Diff
error: build of '/nix/store/ickzn6az4asiwrh1wq20blm0dvcr5w17-Diff-static-x86_64-unknown-linux-musl-0.4.1.drv' on 'ssh://ben@maurer.local' failed: builder for '/nix/store/ickzn6az4asiwrh1wq20blm0dvcr5w17-Diff-static-x86_64-unknown-linux-musl-0.4.1.drv' failed with exit code 1;
       last 10 log lines:
       >    | ^^^^^^^^^^^^^^^^^^^^^^^^^^
       >
       > src/Data/Algorithm/Diff.hs:29:1: error: [GHC-47808]
       >     Failed to load dynamic interface file for Data.Array:
       >       Exception when reading interface file  /nix/store/zb7g1q1vza1x0fmb8qk8cv9y23b9w81g-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data/Array.dyn_hi
       >         /nix/store/zb7g1q1vza1x0fmb8qk8cv9y23b9w81g-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data/Array.dyn_hi: withBinaryFile: does not exist (No such file or directory)
       >    |
       > 29 | import Data.Array (listArray, (!))
       >    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       > load' failed
       For full logs, run 'nix log /nix/store/ickzn6az4asiwrh1wq20blm0dvcr5w17-Diff-static-x86_64-unknown-linux-musl-0.4.1.drv'.
error: builder for '/nix/store/ickzn6az4asiwrh1wq20blm0dvcr5w17-Diff-static-x86_64-unknown-linux-musl-0.4.1.drv' failed with exit code 1;
       last 10 log lines:
       >    | ^^^^^^^^^^^^^^^^^^^^^^^^^^
       >
       > src/Data/Algorithm/Diff.hs:29:1: error: [GHC-47808]
       >     Failed to load dynamic interface file for Data.Array:
       >       Exception when reading interface file  /nix/store/zb7g1q1vza1x0fmb8qk8cv9y23b9w81g-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data/Array.dyn_hi
       >         /nix/store/zb7g1q1vza1x0fmb8qk8cv9y23b9w81g-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data/Array.dyn_hi: withBinaryFile: does not exist (No such file or directory)
       >    |
       > 29 | import Data.Array (listArray, (!))
       >    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       > load' failed
       For full logs, run 'nix log /nix/store/ickzn6az4asiwrh1wq20blm0dvcr5w17-Diff-static-x86_64-unknown-linux-musl-0.4.1.drv'.

Additional context

The problem here appears to manifest during building of Haddock documentation. For instance,

$ nix repl
nix-repl> :lf nixpkgs#
nix-repl> :b legacyPackages.x86_64-linux.haskell.lib.dontHaddock (legacyPackages.x86_64-linux.pkgsStatic.haskell.packages.ghc98.Diff)

This derivation produced the following outputs:
  out -> /nix/store/5096rq1664b9qq2f2g9ih0sv6is48d8z-Diff-static-x86_64-unknown-linux-musl-0.4.1
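
For anyone who prefers a one-shot command over the repl session, the same workaround can probably be scripted. This is only a sketch under assumptions: it uses the channel-style `<nixpkgs>` lookup rather than the flake reference above, and `dont_haddock_expr` and `build_without_haddock` are hypothetical helper names, not part of nixpkgs.

```shell
#!/bin/sh
# Hypothetical helper: print a Nix expression that applies haskell.lib.dontHaddock
# to a package from pkgsStatic.haskell.packages.
# $1 = attribute path below pkgsStatic.haskell.packages, e.g. ghc98.Diff
dont_haddock_expr() {
  printf 'with import <nixpkgs> {}; haskell.lib.dontHaddock pkgsStatic.haskell.packages.%s\n' "$1"
}

# Hypothetical helper: build the package with Haddock disabled.
# Assumes nix-build is installed and <nixpkgs> resolves via NIX_PATH.
build_without_haddock() {
  nix-build --no-out-link -E "$(dont_haddock_expr "$1")"
}
```

Calling `build_without_haddock ghc98.Diff` should then correspond to the `:b` line above, assuming `<nixpkgs>` points at the same nixpkgs revision.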

Notify maintainers

@nh2

@bgamari
Contributor Author

bgamari commented Dec 19, 2023

My suspicion here is that this is either a Cabal or Haddock bug, although I'm not yet sure which.

@angerman
Contributor

Of note: something assumes the existence of dynamic files, while there are none. Not quite sure why GHC would try to load dynamic files.
Trying to read

/nix/store/s13v3xsi60z627ic821fm70mlw43a3za-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data/Array.dyn_hi

however

/nix/store/s13v3xsi60z627ic821fm70mlw43a3za-x86_64-unknown-linux-musl-ghc-9.8.1/lib/x86_64-unknown-linux-musl-ghc-9.8.1/lib/../lib/x86_64-linux-ghc-9.8.1/array-0.5.6.0-inplace/Data
total 27K
dr-xr-xr-x 3 root root    5 Jan  1  1970 .
dr-xr-xr-x 3 root root    7 Jan  1  1970 ..
dr-xr-xr-x 6 root root   22 Jan  1  1970 Array
-r--r--r-- 2 root root 2.9K Jan  1  1970 Array.hi
-r--r--r-- 2 root root 2.9K Jan  1  1970 Array.p_hi

Maybe someone has an idea where the dyn_hi load comes from.
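
To confirm which interface flavours a package directory actually ships, one can count them. The following is a minimal sketch: `count_interfaces` is a hypothetical helper name (not part of GHC or nixpkgs), and the directory argument is whatever store path is being inspected.

```shell
#!/bin/sh
# Hypothetical helper: count GHC interface files of each flavour below a directory.
# .hi = vanilla, .p_hi = profiling, .dyn_hi = dynamic.
# $1 = directory to scan recursively
count_interfaces() {
  dir=$1
  for ext in hi p_hi dyn_hi; do
    # "*.hi" does not match Array.p_hi or Array.dyn_hi,
    # because those names end in "_hi" rather than ".hi".
    n=$(find "$dir" -type f -name "*.$ext" | wc -l | tr -d ' ')
    printf '%s %s\n' "$ext" "$n"
  done
}
```

Run against the array-0.5.6.0-inplace directory, a `dyn_hi 0` line would be consistent with the listing above, where only Array.hi and Array.p_hi are present.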

@rnhmjoj
Contributor

rnhmjoj commented Dec 19, 2023

ping: @NixOS/static

@sternenseemann sternenseemann added 6.topic: haskell 0.kind: regression Something that worked before working no longer and removed 0.kind: bug labels Dec 19, 2023
@sternenseemann
Member

Good to know that the hadrian regression from #208959 has been fixed, so we can at least build GHC now.

@sternenseemann
Member

My diagnosis is the following:

  • Hadrian silently disables building the haddock executable, presumably because docs are disabled (so maybe building core lib docs and building the haddock executable is still the same flag?)
  • This is not known to haskellPackages.mkDerivation, since I dropped the enableHaddockProgram flag when I ported the GHC expression to Hadrian, presumably either because cross was completely broken initially or I assumed it got fixed.
  • When Cabal gets told to build docs, it uses the only haddock in scope, the one from the build->build compiler which doesn't work of course.

I can fix that by just disabling haddock in the same way as we do for GHC < 9.6. I'll try doing that later.

@bgamari @angerman The question is of course, and you can answer that better than me, has anything changed w.r.t. haddock and cross with Hadrian?

@bgamari
Contributor Author

bgamari commented Dec 19, 2023

Thanks @sternenseemann! Your hypothesis does sound plausible.

Recently we did rework Haddock to take documentation from Haskell interface (.hi) files. I can't help but wonder whether this logic may be culpable: https://gitlab.haskell.org/ghc/haddock/-/blob/b0b0e0366457c9aefebcc94df74e5de4d00e17b7/haddock-api/src/Haddock.hs#L170. This was apparently introduced due to haskell/haddock#256.

@sternenseemann
Member

sternenseemann commented Dec 19, 2023

Seems plausible. I'm personally not too fussed that this change means that haddock is not “retargetable”, i.e. you always need to use the precise haddock bundled with the GHC you are using to compile the documented code. In fact, we probably should explicitly tell Cabal which haddock to use, so this kind of issue doesn't happen or is easier to diagnose.

I'll need to investigate, though, under which circumstances we can build haddock with hadrian now.

@sternenseemann
Member

The problem seems to be that the haddock package is only built using the stage1 compiler (so as part of stage2) which we necessarily never reach in the case of cross compilation. Presumably we can work around this in UserSettings somehow (although IME you are quite limited if your solution is to be maintainable), but I feel like this is a genuine gap and there ought to be a better way to build a cross-compiler with hadrian…

@angerman
Contributor

I've just skimmed the code, but why do we do this:

  -- Inject dynamic-too into ghc options if the ghc we are using was built with
  -- dynamic linking
  flags'' <- ghc flags $ do
        df <- getDynFlags
        case lookup "GHC Dynamic" (compilerInfo df) of
          Just "YES" -> return $ Flag_OptGhc "-dynamic-too" : flags
          _ -> return flags

What's the rationale for adding -dynamic-too here? I can somewhat extract the rationale from haskell/haddock#256, but the comment above this is rather poor. Also, there is no option one can pass to haddock to prevent this automagic.

I guess the proper thing here is to just disable haddocks for cross, and rely on native compilers haddocks.
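
The "GHC Dynamic" entry that the snippet looks up is, as far as I know, the same key/value list that `ghc --info` prints, so the build->build and build->host compilers can be compared from the shell. A minimal sketch, assuming the usual `,("key","value")` line layout of `ghc --info`; `ghc_info_field` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: extract one field from `ghc --info` output read on stdin.
# Assumes each entry sits on its own line in the form ,("key","value").
# $1 = field name, e.g. "GHC Dynamic"
ghc_info_field() {
  awk -v key="$1" '
    {
      prefix = "(\"" key "\",\""
      i = index($0, prefix)
      if (i > 0) {
        # Keep what follows the prefix, then cut at the closing quote and paren.
        rest = substr($0, i + length(prefix))
        sub(/"\).*$/, "", rest)
        print rest
        exit
      }
    }'
}
```

Usage would be `ghc --info | ghc_info_field "GHC Dynamic"`, once with the native ghc and once with the cross compiler's ghc binary. If the native one reports YES and the static one reports NO, that would match the theory that the native haddock injects -dynamic-too and then looks for .dyn_hi files that the static GHC never produced.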

@sternenseemann
Member

I guess the proper thing here is to just disable haddocks for cross, and rely on native compilers haddocks.

Do you mean the native compiler's haddock executable or re-using the documentation built natively? The former currently happens (unintentionally) and seems to be the source of the problem…

sternenseemann added a commit to sternenseemann/nixpkgs that referenced this issue Dec 20, 2023
In this situation, haddock would not be built by hadrian, as there is no
stage0:exe:haddock target by default. (We should eventually try adding
one.) If haddock is enabled and the build->host haddock missing, Cabal
tries using the build->build haddock which may fail to load the
documentation from the interface files produced by the build->host
GHC (e.g. due to a mismatch between dynamic and static linking).

Resolves NixOS#275304.
sternenseemann added a commit to sternenseemann/nixpkgs that referenced this issue Dec 20, 2023
In this situation, haddock would not be built by hadrian, as there is no
stage0:exe:haddock target by default. (We should eventually try adding
one.) If haddock is enabled and the build->host haddock missing, Cabal
tries using the build->build haddock which may fail to load the
documentation from the interface files produced by the build->host
GHC (e.g. due to a mismatch between dynamic and static linking).

Add regression tests to haskell-updates jobset.

Resolves NixOS#275304.
@sternenseemann sternenseemann linked a pull request Dec 20, 2023 that will close this issue
13 tasks
@angerman
Contributor

@sternenseemann

re-using the documentation built natively

This :D

@domenkozar
Member

domenkozar commented Jan 24, 2024

It's still broken: https://github.com/domenkozar/nixpkgs-static-repo

@domenkozar domenkozar reopened this Jan 24, 2024
@domenkozar
Member

Even easier reproducer:

nix-build -A pkgsStatic.haskell.packages.ghc98.th-orphans

error: builder for '/nix/store/dibiy3qjbg2l5ahlqf28axfqz5xw91xn-th-orphans-static-x86_64-unknown-linux-musl-0.13.14.drv' failed with exit code 1;
       last 10 log lines:
       > /nix/store/20rsi77ny2i4i1rbd63h4392a245j5dz-gnutar-1.35/bin/tar
       > No uhc found
       > Running phase: buildPhase
       > Preprocessing library for th-orphans-0.13.14..
       > Building library for th-orphans-0.13.14..
       > [1 of 2] Compiling Language.Haskell.TH.Instances.Internal ( src/Language/Haskell/TH/Instances/Internal.hs, dist/build/Language/Haskell/TH/Instances/Internal.o )
       > [2 of 2] Compiling Language.Haskell.TH.Instances ( src/Language/Haskell/TH/Instances.hs, dist/build/Language/Haskell/TH/Instances.o )
       >
       > <no location info>: error:
       >     Couldn't find a target code interpreter. Try with -fexternal-interpreter
       For full logs, run 'nix log /nix/store/dibiy3qjbg2l5ahlqf28axfqz5xw91xn-th-orphans-static-x86_64-unknown-linux-musl-0.13.14.drv'.

@angerman
Contributor

Even easier reproducer:

nix-build -A pkgsStatic.haskell.packages.ghc98.th-orphans

error: builder for '/nix/store/dibiy3qjbg2l5ahlqf28axfqz5xw91xn-th-orphans-static-x86_64-unknown-linux-musl-0.13.14.drv' failed with exit code 1;
       last 10 log lines:
       > /nix/store/20rsi77ny2i4i1rbd63h4392a245j5dz-gnutar-1.35/bin/tar
       > No uhc found
       > Running phase: buildPhase
       > Preprocessing library for th-orphans-0.13.14..
       > Building library for th-orphans-0.13.14..
       > [1 of 2] Compiling Language.Haskell.TH.Instances.Internal ( src/Language/Haskell/TH/Instances/Internal.hs, dist/build/Language/Haskell/TH/Instances/Internal.o )
       > [2 of 2] Compiling Language.Haskell.TH.Instances ( src/Language/Haskell/TH/Instances.hs, dist/build/Language/Haskell/TH/Instances.o )
       >
       > <no location info>: error:
       >     Couldn't find a target code interpreter. Try with -fexternal-interpreter
       For full logs, run 'nix log /nix/store/dibiy3qjbg2l5ahlqf28axfqz5xw91xn-th-orphans-static-x86_64-unknown-linux-musl-0.13.14.drv'.

That suggests that the GHC was not built as a stage2 compiler, or some of the new cross target logic prohibits native codegen now as well.

@sternenseemann
Member

Yes, we are only building stage 1 here. As it turns out, for GHC < 9.6 we used to build the stage 2 compiler in this case, so seems like a detail I missed when porting the expression to hadrian.

#283773

@sternenseemann
Member

Unfortunately, also building Stage 2 doesn't fix the problem according to my testing, maybe @domenkozar can confirm on #283773.

@domenkozar
Member

new cross target logic prohibits native codegen now as well.

Does now refer to hadrian, ghc, or nixpkgs?

@wolfgangwalther
Contributor

Unfortunately, also building Stage 2 doesn't fix the problem according to my testing, maybe @domenkozar can confirm on #283773.

I looked into this a little bit and the problem seems to be that hadrian-based builds don't build ghc-iserv anymore, which leads to "Couldn't find a target code interpreter. Try with -fexternal-interpreter". GHC 9.4 without hadrian was still building it and thus succeeds.

I think the logic in hadrian is kind of the same as before 853c121 - the full platform is compared and not something like "can execute".

@angerman
Contributor

Just sidestep the whole braindead install logic from hadrian. It's so bad...

Just build and install the compiler with cp.

@angerman
Contributor

The haskell.nix builder for GHC works around this by sidestepping Hadrian's build and install process and doing it a bit more explicitly.

https://github.com/input-output-hk/haskell.nix/blob/6eaafcdf04bab7be745d1aa4f74d2cc85700042b/compiler/ghc/default.nix#L787

I'm not even sure Hadrian can (or should) be fixed. The proper solution seems to just bin it outright and build GHC with cabal only.

@domenkozar
Member

I can confirm it works with haskell.nix: https://github.com/domenkozar/nixpkgs-static-repo/tree/haskell.nix

@wolfgangwalther
Contributor

I looked into this a little bit and the problem seems to be that hadrian-based builds don't build ghc-iserv anymore, which leads to "Couldn't find a target code interpreter. Try with -fexternal-interpreter". GHC 9.4 without hadrian was still building it and thus succeeds.

As pointed out by @sternenseemann in #287794 (comment), the missing ghc-iserv is not exactly the reason for this error message, but as I mentioned in #287794 (comment) probably closely related:

I think iserv can't be built, because it needs a GHCi built with -internal-interpreter, which is not built via hadrian

If we build with -finternal-interpreter, then maybe the "Couldn't find a target code interpreter." would be solved. I am referring to this part in the hadrian source:

https://gitlab.haskell.org/ghc/ghc/-/blob/master/hadrian/src/Settings/Packages.hs#L127-156

A workaround mentioned there could be to build the static/cross compiler with the same version of GHC as bootstrap:

          -- The workaround we use is to check if the bootstrap compiler has
          -- the same version as the one we are building. In this case we can
          -- avoid the first step above and directly build with
          -- `-finternal-interpreter`.

FTR, I tried that. To do so, I had to patch the hadrian source to allow a newer Cabal first. I then changed the bootPkgs to use ghc981 when building cross:

      bootPkgs =
        if stdenv.hostPlatform != stdenv.targetPlatform then
          buildPackages.haskell.packages.ghc981
        else
          packages.ghc947;

The build then fails with a lot of this:

ghc/Main.hs:18:1: error: [GHC-53693]
    Something is amiss; requested module  ghc-9.8.1:GHC differs from name found in the interface file ghc:GHC (if these names look the same, try again with -dppr-debug)

I have no idea what that means and haven't gone further yet. Just putting this here in case somebody has an idea.

@angerman
Contributor

angerman commented Feb 12, 2024

You can explicitly build iserv using hadrian. It's not a default target for some reason. And then sidestep the broken install phase of hadrian. You don't need any of this anyway with nix as you a priori know all your install locations. So you can replace the convoluted install phase with a simple cp.

EDIT: just use the same logic we have in Haskell.nix, it should be translatable to the nixpkgs GHC builder: #275304 (comment), both builders are still fairly similar.

@wolfgangwalther
Contributor

Just confirmed this is still a problem with GHC 9.10.1.

@sternenseemann
Member

The build then fails with a lot of this:

This is a hadrian bug, there's apparently a patch on GHC master (9.10?). That being said, self bootstrapping isn't exactly tested upstream and has become annoying with hadrian due to the strict bounds.

@wolfgangwalther
Contributor

Just confirmed this is still a problem with GHC 9.10.1.

To be precise: I tested that pkgsStatic.haskell.packages.ghc9101.th-orphans still fails with the above external interpreter error message.

I have not yet tested the self-bootstrapping approach I tried earlier with GHC 9.8. That might still be worth a try.


6 participants