Cabal should probably add extra-lib-dirs to RPATH #7339

Closed
bgamari opened this issue Mar 21, 2021 · 13 comments · Fixed by #9554
Labels
re: dynamic-linking Concerning dynamic linking (e.g. flags "shared", "*-dynamic") type: bug

Comments

@bgamari
Contributor

bgamari commented Mar 21, 2021

Describe the bug
In GHC #19350 it was noted that dynamic libraries findable via extra-lib-dirs are not found in GHCi on POSIX platforms. While in principle GHC has the information that it needs to find these libraries from the package database registration, due to limitations of the POSIX dynamic linking interface, it is not possible for GHC to do better here.

I believe the solution here is to (on POSIX platforms) add Cabal's extra-lib-dirs paths to RPATH during the final link of libraries. This isn't a perfect solution (since it may result in spurious RPATH entries) but this seems to be the best we can do.

To Reproduce
See this testcase.

Expected behavior
GHCi is able to load libraries with foreign dependencies findable only via extra-lib-dirs.

@phadej
Collaborator

phadej commented Mar 21, 2021

How does this interact with the change proposed in #7094?

@bgamari
Contributor Author

bgamari commented Mar 22, 2021

In #7094 we were essentially forced into placing more RPATH logic in GHC (specifically when targeting Darwin) since the patching we are forced to do on Darwin is rather involved. This was a pragmatic choice, but ultimately it was a hack that meant conflating two distinct notions. Specifically, it means that GHC uses -L inconsistently with how other compilers treat that flag (usually it merely affects the compile-time search path; in GHC it now also affects the run-time search behavior).

To address GHC #19350 we could use a similar approach, teaching GHC to implicitly add -L directories to RPATH. However I think we should avoid this. -L and RPATH affect two distinct things; while on Darwin, due to the nature of the environment, we can conflate these, doing so on Linux would cause trouble for packagers (since you would almost certainly end up with redundant RPATH entries in some cases; e.g. when you are building a package in a temporary prefix for installation elsewhere). By keeping these two concepts distinct, GHC remains flexible enough to be usable in such cases.

In short, there is no appreciable interaction with #7094. #7094 is a pragmatic exception to the rule that -L and RPATH are orthogonal concerns. Here I am suggesting that the compiler continue to consider them orthogonal, and that we add the policy logic for determining RPATH to Cabal.

@phadej
Collaborator

phadej commented Mar 22, 2021

I don't understand what this issue suggests?

Change extra-lib-dirs: directories to generate -rpath flags instead of -L flags? But on Darwin we won't, as rpath support is not hasInstallNameTool, and then the -rpath stuff is passed as -L anyway?

@bgamari
Contributor Author

bgamari commented Mar 22, 2021

Concretely, extra-lib-dirs should generate both -L flags and -rpath flags.
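
As a rough illustration of that proposal, here is a minimal Python sketch of the mapping from extra-lib-dirs to flags. It is not Cabal's actual implementation: the function name `link_flags` and the `add_rpath` switch are hypothetical, and the flag spellings `-L` and `-optl-Wl,-rpath,` are simply the ones Cabal is described as passing to GHC elsewhere in this thread. It only covers the POSIX case discussed in this issue.

```python
from typing import List


def link_flags(extra_lib_dirs: List[str], add_rpath: bool = True) -> List[str]:
    """Build GHC flags for each extra-lib-dirs entry (illustrative only).

    -L<dir>                affects where the linker looks at build time.
    -optl-Wl,-rpath,<dir>  asks the linker to record <dir> as a run-time
                           search path in the output.
    """
    flags: List[str] = []
    for lib_dir in extra_lib_dirs:
        flags.append(f"-L{lib_dir}")
        if add_rpath:
            flags.append(f"-optl-Wl,-rpath,{lib_dir}")
    return flags


if __name__ == "__main__":
    # Hypothetical directory, chosen only for the example.
    print(link_flags(["/opt/foo/lib"]))
```

With `add_rpath=False` the sketch reproduces the behaviour this issue complains about: only the `-L` flag is generated, so the directory never reaches the run-time search path.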

@phadej
Collaborator

phadej commented Mar 22, 2021

I was also going to suggest

Perhaps GHC should grow a -rpath flag of its own to abstract this away?

But you did it yourself.

@phadej
Collaborator

phadej commented Mar 22, 2021

For reference, since rpath is a mystical thing to me: https://en.wikipedia.org/wiki/Rpath

@phadej
Collaborator

phadej commented Mar 22, 2021

And to clarify: I'm just asking questions. I have no opinion on this. I'm slightly worried that linking is becoming ad-hoc spaghetti, but maybe it is just what it is.

@bgamari
Contributor Author

bgamari commented Mar 22, 2021

Unfortunately it does tend to be quite platform specific. There was sadly a reason why libtool was necessary for so many years.

@angerman
Collaborator

Concretely, extra-lib-dirs should generate both -L flags and -rpath flags.

I'm not sure I agree, it should generate -L sure, but leave the rpath logic on darwin up to ghc. The whole idea that both cabal and ghc take care of rpath is not very good. Ideally cabal would just delegate to the Haskell compiler (hc), and hc would deal with the actual linking; let's not replicate linking logic in both.

@bgamari
Contributor Author

bgamari commented Jun 22, 2021

GHC #20026 is relevant here.

@Mikolaj
Member

Mikolaj commented Jun 25, 2021

Also, #7382 seems related.

@gbaz gbaz added the type: bug label Aug 28, 2021
@andreasabel andreasabel added the re: dynamic-linking Concerning dynamic linking (e.g. flags "shared", "*-dynamic") label Nov 13, 2021
@Mikolaj
Member

Mikolaj commented Jun 2, 2022

Could somebody with competence and a stake please drive this issue? There is a narrow window to get it into cabal 3.8.

@bgamari
Contributor Author

bgamari commented Nov 14, 2023

I'm not sure I agree, it should generate -L sure, but leave the rpath logic on darwin up to ghc.

I don't believe that we are actually in disagreement. I agree that it is necessary for GHC to handle the invocation of the linker and post-processing of objects to realize the RUNPATH needed for a particular link.

However, I do not believe it is necessary for GHC to conflate static and runtime library search paths. Cabal currently uses GHC's -optl-Wl,-rpath and -L flags to precisely specify the compile-time and runtime library search paths needed for a particular install plan. From the perspective of Darwin (the issue fixed in #7094) this status quo is unfortunate since we do not benefit from the clever rpath handling introduced in ghc@4ff93292243888545da452ea4d4c1987f2343591.

However, this is not a fundamental issue: it can be easily avoided by introducing a -rpath flag in GHC's command-line interface, providing a reliable and precise way for the user to influence the runtime library search path. This would allow us to fix the issue pointed out in this ticket (that extra-lib-dirs don't make it into RPATH) while giving us a more robust way to address the Darwin issue addressed in #7094.

One question that remains is whether Cabal should make a distinction between compile-time and run-time library search paths. Arguably it should (e.g. by perhaps introducing extra-run-time-lib-dirs and extra-compile-time-lib-dirs fields in the .project syntax and LocalBuildInfo).

alt-romes added a commit to alt-romes/cabal that referenced this issue Nov 16, 2023
alt-romes added a commit to alt-romes/cabal that referenced this issue Dec 22, 2023
Runtime search paths are hard. Here's the current picture to understand
why this patch exists:

* When linking a shared library, GHC will include in the rpath entries
  of the shared library all the paths listed in the library dirs section
  of the installed package info of all packages the shared library
  depends on.
    * On darwin, GHC has special logic to inject the library dirs listed
      in the installed dependent packages info into the rpath section
      instead of passing the dirs as -rpath flags to the linker.
      However, only the dirs where used libraries are found are
      actually injected. The others are ignored. This works around
      limitations of the darwin loader.

* Cabal, in addition, passes directly to the linker (via
  -optl-Wl,-rpath,...) the library dirs of packages the
  shared library for the package being built depends on.
    * In a vanilla cabal installation, this will typically only be the
      path to the cabal store and the path to the installed GHC's boot
      libraries store.
    * When using nix there will be a different library dir per installed
      package. Since these lib dirs are passed directly to the linker as
      rpaths, we bypass the darwin loader logic and, for very big
      packages, on darwin, we could end up reaching the load command
      limit and fail linking. We don't address this situation in this MR.

When we specify `extra-lib-dirs` in Cabal, these extra-lib-dirs will be
added to the library dirs listed in the installed package info of the
library they were specified for. Furthermore, when building a shared
library, extra-lib-dirs will be passed as `-L` flags to the linker
invocation. However, the same extra-lib-dirs will not be passed as
`-rpath` to the linker.

The end situation is as follows:

    1. The shared library `libA` built for a package `A` will be linked
       against some libraries `libExtra` found in extra-lib-dirs
       `extraA`.

    2. The RPATH section of `A` will NOT contain `extraA`, because we
       don't pass -rpath extra-lib-dirs when linking the library, but it
       will depend on `libExtra`.

    3. The installed package info of that package `A` will contain, in
       the library dirs section, the extra-lib-dirs `extraA` and the
       path to `libA`.

    4. When a package `B` depends on package `A`, it will include in the
       RPATH section of the shared library `libB` the lib dirs from the
       installed package info of `A`, i.e. `/path/to/libA` and `extraA`,
       and depends on `libA` and, transitively, on `libExtra`.

The conclusion is:

    5. When we load `libB`, we will load `libA`, which is found in
       `/path/to/libA`, and, transitively, load `libExtra` which is
       found in `extraA` -- they are both found because both
       `/path/to/libA` and `extraA` are listed in the RPATH entries.

    6. However, if we load `libA` directly we will /NOT/ find
       `libExtra`, because `extraA` is not included in the RPATH
       entries.

So, ultimately, this commit fixes the failure described in (6), which is
caused by the incorrect behaviour of (2). It does so by specifying `-rpath
extra-lib-dirs` when linking the shared library of a package, so that
the extra lib dirs are included in the RPATH entries of that shared library.
(Dependents of this library would already get the extra-lib-dirs
in their RPATH, but the library itself didn't, resulting in cabal#7339 and
ghc#19350.)

Fixes haskell#7339
Fixes ghc#19350
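
The scenario in steps 1 to 6 of the commit message above can be modelled with a small Python sketch. This is a deliberately simplified model, not how a real dynamic loader works: it assumes that a dependency is found only if its directory appears in the RPATH entries of the library that needs it or of a library further up the loading chain, and it ignores system default directories, environment variables, and platform differences. The names `SharedLib`, `build_libs`, and `unresolved`, and the path `/path/to/libB`, are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SharedLib:
    directory: str                                    # where the file lives
    rpath: List[str] = field(default_factory=list)    # RPATH entries
    needed: List[str] = field(default_factory=list)   # direct dependencies


def build_libs(fixed: bool) -> Dict[str, SharedLib]:
    """Build the libA/libB/libExtra scenario.

    fixed=False: libA's RPATH lacks `extraA` (step 2).
    fixed=True:  `extraA` is added to libA's RPATH (what the commit does).
    """
    return {
        "libExtra": SharedLib(directory="extraA"),
        "libA": SharedLib(
            directory="/path/to/libA",
            rpath=["extraA"] if fixed else [],
            needed=["libExtra"],
        ),
        "libB": SharedLib(
            directory="/path/to/libB",
            rpath=["/path/to/libA", "extraA"],  # from A's package info (step 4)
            needed=["libA"],
        ),
    }


def unresolved(root: str, libs: Dict[str, SharedLib]) -> List[str]:
    """Return the dependencies that cannot be located when loading `root`.

    `root` itself is assumed to be opened by its full path.
    """
    missing: List[str] = []

    def visit(name: str, inherited: List[str]) -> None:
        lib = libs[name]
        search = lib.rpath + inherited  # own RPATH first, then the chain's
        for dep in lib.needed:
            if libs[dep].directory in search:
                visit(dep, search)
            else:
                missing.append(dep)

    visit(root, [])
    return missing


if __name__ == "__main__":
    for fixed in (False, True):
        libs = build_libs(fixed)
        for root in ("libB", "libA"):
            print(f"fixed={fixed} load {root}: missing {unresolved(root, libs)}")
```

In this model, loading `libB` succeeds in both configurations (step 5), while loading `libA` directly reports `libExtra` as missing unless `extraA` is in `libA`'s own RPATH (step 6).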
alt-romes added a commit to alt-romes/cabal that referenced this issue Dec 22, 2023
alt-romes added a commit to alt-romes/cabal that referenced this issue Dec 22, 2023
alt-romes added a commit to alt-romes/cabal that referenced this issue Jan 3, 2024
alt-romes added a commit to alt-romes/cabal that referenced this issue Jan 3, 2024
Mikolaj pushed a commit to alt-romes/cabal that referenced this issue Jan 20, 2024
@mergify mergify bot closed this as completed in #9554 Jan 20, 2024
mergify bot pushed a commit that referenced this issue Jan 20, 2024
Fixes #7339
Fixes ghc#19350

(cherry picked from commit addbcbf)

# Conflicts:
#	Cabal/src/Distribution/Simple/GHC/BuildGeneric.hs
#	Cabal/src/Distribution/Simple/GHC/BuildOrRepl.hs
#	Cabal/src/Distribution/Simple/Program/GHC.hs
alt-romes added a commit to alt-romes/cabal that referenced this issue Jan 25, 2024
Runtime search paths are hard. Here's the current picture to understand
why this patch exists:

* When linking a shared library, GHC will include in the rpath entries
  of the shared library all the paths listed in the library dirs section
  of the installed package info of all packages the shared library
  depends on.
    * On darwin, GHC has special logic to inject the library dirs listed
      in the installed dependent packages info into the rpath section
      instead of passing the dirs as -rpath flags to the linker.
      However, only the dirs where used libraries are found are
      actually injected. The others are ignored. This works around
      limitations of the darwin loader.

* Cabal, in addition, passes directly to the linker (via
  -optl-Wl,-rpath,...) the library dirs of packages the
  shared library for the package being built depends on.
    * In a vanilla cabal installation, this will typically only be the
      path to the cabal store and the path to the installed GHC's boot
      libraries store.
    * When using nix there will be a different library dir per installed
      package. Since these lib dirs are passed directly to the linker as
      rpaths, we bypass the darwin loader logic and, for very big
      packages, on darwin, we could end up reaching the load command
      limit and fail linking. We don't address this situation in this MR.
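
As a rough illustration of the second bullet, the sketch below builds arguments of the `-optl-Wl,-rpath,...` shape quoted above from a list of dependency library dirs. The helper name `rpath_flags` and the example paths are hypothetical; this is a model of the flag shape only, not Cabal's code.

```python
def rpath_flags(lib_dirs):
    """Build one GHC argument per library dir, forwarding -rpath to the linker.

    Hypothetical helper: it only mirrors the `-optl-Wl,-rpath,...` spelling
    quoted in the commit message and does no deduplication or filtering.
    """
    return ["-optl-Wl,-rpath," + d for d in lib_dirs]


# Made-up paths: one flag is produced per library dir.
flags = rpath_flags(["/store/a/lib", "/store/b/lib"])
```

In this model the number of flags grows with the number of library dirs, which is the situation the nix bullet describes: one dir per installed package means one rpath entry per dependency.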

When we specify `extra-lib-dirs` in Cabal, these extra-lib-dirs will be
added to the library dirs listed in the installed package info of the
library they were specified for. Furthermore, when building a shared
library, extra-lib-dirs will be passed as `-L` flags to the linker
invocation. However, the same extra-lib-dirs will not be passed as
`-rpath` to the linker.

The end situation is as follows:

    1. The shared library `libA` built for a package `A` will be linked
       against some libraries `libExtra` found in extra-lib-dirs
       `extraA`.

    2. The RPATH section of `libA` will NOT contain `extraA`, because we
       don't pass `-rpath` for the extra-lib-dirs when linking the
       library, even though `libA` depends on `libExtra`.

    3. The installed package info of that package `A` will contain, in
       the library dirs section, the extra-lib-dirs `extraA` and the
       path to `libA`.

    4. When a package `B` depends on package `A`, the RPATH section of
       the shared library `libB` will include the lib dirs from the
       installed package info of `A`, i.e. `/path/to/libA` and `extraA`;
       `libB` depends on `libA` and, transitively, on `libExtra`.

The conclusion is:

    5. When we load `libB`, we will load `libA`, which is found in
       `/path/to/libA`, and, transitively, load `libExtra` which is
       found in `extraA` -- they are both found because both
       `/path/to/libA` and `extraA` are listed in the RPATH entries.

    6. However, if we load `libA` directly we will /NOT/ find
       `libExtra`, because `extraA` is not included in the RPATH
       entries.
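
Points 1 to 6 can be traced with a toy model. The sketch below is not a real dynamic loader: the `LIBS` table and the `load` function are hypothetical, and it assumes, as the message describes, that the RPATH entries of the object being loaded also apply to its transitive dependencies.

```python
# Toy model of points 1-6. Each entry maps a library name to
# (directory holding the file, its RPATH entries, its direct dependencies).
LIBS = {
    "libExtra": ("extraA", [], []),
    # Point 2: libA depends on libExtra, but extraA is not in its RPATH.
    "libA": ("/path/to/libA", [], ["libExtra"]),
    # Point 4: libB's RPATH holds the lib dirs from A's package info.
    "libB": ("/path/to/libB", ["/path/to/libA", "extraA"], ["libA"]),
}


def load(name, inherited=()):
    """Return the names of dependencies that cannot be located.

    Assumption of this sketch: search dirs accumulated from the objects
    loaded so far are also used for their transitive dependencies.
    """
    _, rpath, deps = LIBS[name]
    search = list(inherited) + rpath
    missing = []
    for dep in deps:
        dep_dir = LIBS[dep][0]
        if dep_dir in search:
            missing += load(dep, search)
        else:
            missing.append(dep)
    return missing
```

Under these assumptions `load("libB")` returns an empty list (point 5), while `load("libA")` reports `libExtra` as missing (point 6).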

Ultimately, this commit fixes the failure described in (6), which is
caused by the incorrect behaviour in (2). It does so by passing `-rpath`
for each of the extra-lib-dirs when linking the shared library of a
package, so that the extra lib dirs end up in the RPATH entries of that
shared library. Dependents of the library already got the extra-lib-dirs
in their RPATH, but the library itself did not, which resulted in
cabal#7339 and ghc#19350.
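
The change can be pictured as a difference in the flags derived from extra-lib-dirs. The sketch below is only a model: `link_flags` is a hypothetical name, and the `-optl-Wl,-rpath,` spelling is borrowed from the bullet earlier in this message as an assumption about how the rpath is forwarded, not taken from the patch itself.

```python
def link_flags(extra_lib_dirs, add_rpath):
    """Model the linker-related flags derived from extra-lib-dirs.

    add_rpath=False mirrors the behaviour described before this commit
    (only -L); add_rpath=True mirrors the described fix (-L and -rpath).
    The exact flag spelling is an assumption of this sketch.
    """
    flags = ["-L" + d for d in extra_lib_dirs]
    if add_rpath:
        flags += ["-optl-Wl,-rpath," + d for d in extra_lib_dirs]
    return flags


before = link_flags(["extraA"], add_rpath=False)
after = link_flags(["extraA"], add_rpath=True)
```

Here `before` contains only the `-L` entry, so `extraA` is used at link time but is absent from the library's RPATH, while `after` also carries the rpath entry for `extraA`.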

Fixes haskell#7339
Fixes ghc#19350
alt-romes added a commit that referenced this issue Jan 25, 2024
Mikolaj pushed a commit that referenced this issue Jan 25, 2024
erikd pushed a commit to erikd/cabal that referenced this issue Apr 22, 2024