Build dynamic GHC in github CI#137
Closed
hasufell wants to merge 19 commits into stable-ghc-9.14
Of note: from the #45 work, the compiler conceptually worked. The issue was mostly getting the test suite green on Linux. On Darwin, linking worked well; on Linux it did not, because we were missing the library paths. I tried really hard not to rely on the LD_LIBRARY_PATH/LD_PRELOAD solution. Ideally GHC would set these correctly. I remember writing about this somewhere (Slack, Discord?) at length, but that was months ago; I should have written it here :(
- Add -no-ghc-internal flag to prevent auto-injection of ghc-internal when building RTS sublibraries (mirrors the existing -no-rts flag). This prevents circular dependency issues during the RTS build.
- Move CMM sources into a common rts-cmm-sources-base stanza shared by all sublibraries (threaded, debug, etc.).
- Implement AutoApply.cmm.h workaround: generate .cmm.h files in the main library and include them via wrapper .cmm files in the sublibraries. This ensures each sublibrary gets properly parameterized CMM code.
- Double all build flags (ghc-options, cpp-options, cmm-options, cc-options) in rts.cabal to ensure they are definitely propagated.
- Remove auto-link from hp2ps and unlit (C-only utilities that don't need Haskell runtime linking).
Add libffi-clib package configuration to suppress C compiler warnings that are promoted to errors by default. The bundled libffi code produces warnings that would otherwise fail the build.

This is needed because:
- GHC uses bundled libffi for FFI support
- The libffi-clib wrapper exposes libffi to the RTS
- Upstream libffi code triggers compiler warnings
- With -Werror (often set by default), these become fatal errors
This commit fixes several dynamic linking issues that arise when using GHC with RTS sublibraries and the GHC API:
- Inject rpath for RTS and libffi-clib in dynamic builds. This ensures the dynamic linker can find these libraries at runtime, especially when cabal passes -dynload deploy.
- Promote ghc-internal to RTLD_GLOBAL when loaded via dlopen. This prevents duplicate symbol errors when multiple shared libraries reference the same ghc-internal symbols.
- Export RTS symbols from ghc-iserv for dynamic builds. Programs using the GHC API load shared libraries via dlopen() that reference RTS symbols like stg_INTLIKE_closure.
- Apply -rdynamic unconditionally for GHC API programs (Linux/FreeBSD) and -flat_namespace for macOS. This makes RTS symbols visible to dynamically loaded libraries even when the main executable wasn't compiled with -dynamic.

See Note [Export dynamic symbols for GHC API programs] in Linker/Static.hs and Note [ghc-iserv and dynamic symbol export] in ghc-iserv.cabal.in.
Improve error handling and path resolution in the test driver:
- Add proper error handling for missing directories and files when searching for shared objects, raising StatsException with clear error messages instead of silently failing
- Fix path resolution in collect_size_func to handle both absolute and relative paths correctly
- Improve ghc-pkg output parsing to handle various output formats
- Add fallback logic for finding shared objects: try inplace first, fall back to non-inplace if that fails
- Convert silent failures to explicit StatsException raises so test failures are properly reported
Adjust the test suite for the RTS sublibrary split:
- Prefix the ghcconfig filename with a hash of the TEST_HC binary to ensure we recompute the config when the compiler changes. This prevents stale config values when switching between different GHC versions.
- Disable the rts test, which is invalid since the RTS split (the test assumes a monolithic RTS structure)
- Mark T2228 as not broken (it was incorrectly marked)
- Add testsuite-specific .gitignore entries
Replace hardcoded ["rts", "libffi-clib"] list with a function that dynamically computes which packages need rpath injection by checking:
1. Any package named "rts" (covers all RTS sublibraries)
2. Any direct dependency of an RTS package

This is more robust as it will automatically handle any future library dependencies the RTS might gain.

Also adds Note [RTS sublibrary rpath injection] explaining why GHC must always inject rpaths for RTS-related libraries regardless of Cabal's -dynload deploy setting: Cabal fundamentally cannot see the RTS sublibrary selection, which happens at GHC link time.
Static.hs imported GHC.Linker.ExtraObj, which doesn't exist in the stable-ghc-9.14 base branch. The required functions are already available in GHC.Linker.Executable with an ExecutableLinkOpts-based API.
- Export mkExtraObjToLinkIntoBinary and mkNoteObjsToLinkIntoBinary from Executable.hs
- Update Static.hs to import from Executable instead of ExtraObj
- Convert DynFlags to ExecutableLinkOpts using initExecutableLinkOpts
The linker module APIs differ between stable-ghc-9.14 and the rebased branch. Update to use the correct function signatures:
- maybeCreateManifest: use initManifestOpts dflags instead of dflags
- initLinkerConfig: takes only DynFlags (no require_cxx parameter)
- runLink: pass require_cxx as 4th argument
- runInjectRPaths: use configureOtool/configureInstallName instead of toolSettings
- runRanlib: use configureRanlib dflags instead of dflags

Import the required config functions from GHC.SysTools.Tasks.
On Darwin, install_name_tool -add_rpath fails if the rpath already exists. When building with DYNAMIC=1, the GHC linker already injects rpaths during linking (via runInjectRPaths), so binaries already have the @executable_path rpath when the bindist target tries to add it again. Fix by checking if the rpath already exists before attempting to add it.
On Unix systems, fundamental libraries like libc, libm, pthread, dl, and rt are always linked into any process at startup. When GHCi tries to load these via dlopen (e.g., because ghc-internal has "extra-libraries: c m"), it can cause problems on NixOS where gcc may find a different version than the one the interpreter is linked against. Loading multiple copies of libc causes memory corruption and "strange closure type" GC crashes.

The fix adds an isAlwaysLinkedLib check in load_dyn to skip loading these fundamental system libraries, as they're always available. This fixes ghci-ext test failures on DYNAMIC=1 builds with NixOS.
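As an illustration only, the skip check described above can be sketched in C. The library list is taken from the commit message; the function name `is_always_linked_lib` and the exact matching rule (bare names as they appear in "extra-libraries", without a "lib" prefix or ".so" suffix) are assumptions, not the code from this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Libraries the commit message names as always linked on Unix.
 * The list is from the message; everything else here is hypothetical. */
static const char *const always_linked[] = { "c", "m", "pthread", "dl", "rt" };

/* Returns true if `name` is one of the always-linked system libraries,
 * in which case a loader could skip the dlopen() call entirely. */
bool is_always_linked_lib(const char *name)
{
    assert(name != NULL);
    size_t count = sizeof always_linked / sizeof always_linked[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, always_linked[i]) == 0) {
            return true;
        }
    }
    return false;
}
```

A loader would call this before dlopen and return early for "c" or "m", while still loading anything else (for example "ffi") as usual.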
The dlerror() function can only be called once per error; subsequent calls return NULL. Debug logging was consuming dlerror() before it could be saved for the linker script fallback handler.

On Linux, system libraries like libc.so and libm.so are often GNU ld linker scripts rather than actual ELF files. The RTS has code in loadNativeObjFromLinkerScript_ELF() to handle this case by parsing the linker script and extracting the real library path (e.g., libc.so -> libc.so.6). However, this fallback requires the error message from dlerror(), which contains the filename. When debug logging called dlerror() first, the error message was lost and the linker script handler received NULL, causing it to fail silently.

Fix by saving dlerror() immediately after dlopen fails, before any debug logging. This allows the linker script fallback to work correctly on systems (like NixOS) where libc.so is a linker script.

Also disables non-Nix CI PR triggers in release.yml to save resources.
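The ordering described above (save first, log afterwards) might look roughly like the following sketch. `open_saving_error` and `copy_string` are hypothetical names, not RTS functions; the only real APIs used are dlopen() and dlerror(). On older glibc versions this needs to be linked with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Heap-allocate a copy of `s`; returns NULL if allocation fails. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *out = malloc(n);
    if (out != NULL) {
        memcpy(out, s, n);
    }
    return out;
}

/* Try to dlopen `path`. On failure, copy the dlerror() message into
 * *errmsg_out BEFORE doing any logging, so a later fallback (such as a
 * linker script parser) can still read it. The caller frees *errmsg_out. */
void *open_saving_error(const char *path, char **errmsg_out)
{
    assert(path != NULL && errmsg_out != NULL);
    *errmsg_out = NULL;

    void *hdl = dlopen(path, RTLD_LAZY | RTLD_LOCAL);
    if (hdl == NULL) {
        const char *msg = dlerror();  /* the only dlerror() call */
        *errmsg_out = copy_string(msg != NULL ? msg : "unknown dlopen error");
        /* Logging uses the saved copy and never calls dlerror() again. */
        fprintf(stderr, "dlopen(%s) failed: %s\n", path,
                *errmsg_out != NULL ? *errmsg_out : "(message lost)");
    }
    return hdl;
}
```

The saved string stays valid regardless of later dl* calls, which is the property the linker script fallback depends on.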
The RTLD_NOLOAD check was in a preprocessor guard that occurred before including <dlfcn.h>, so it was never defined and the promoteBootLibrariesToGlobal function was being compiled out entirely. This fixes the ghci-ext test failures on Linux with DYNAMIC=1 builds.
On some Linux/glibc configurations, RTLD_NOLOAD requires _GNU_SOURCE to be defined. This must be done before any headers are included. This ensures promoteBootLibrariesToGlobal() is compiled on Linux.
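The two fixes above come down to ordering: define _GNU_SOURCE before any include, and test for RTLD_NOLOAD only after <dlfcn.h> has been included. A minimal sketch of that layout follows. `promote_to_global` is a hypothetical stand-in for promoteBootLibrariesToGlobal, and its return convention is an assumption made for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* must come before every #include */
#endif
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Re-open an already loaded library with RTLD_GLOBAL so that its symbols
 * become visible to libraries loaded later.
 * Returns 1 if promoted, 0 if the library is not currently loaded,
 * -1 if RTLD_NOLOAD is unavailable on this platform. */
int promote_to_global(const char *soname)
{
    assert(soname != NULL);
#if defined(RTLD_NOLOAD)  /* checked after <dlfcn.h>, so the macro is visible */
    void *hdl = dlopen(soname, RTLD_NOLOAD | RTLD_GLOBAL | RTLD_LAZY);
    if (hdl == NULL) {
        return 0;  /* not loaded; RTLD_NOLOAD never loads it for us */
    }
    /* The handle is intentionally kept open: it only bumps the reference
     * count of a library that is already resident. */
    return 1;
#else
    (void)soname;
    return -1;
#endif
}
```

If the `#if defined(RTLD_NOLOAD)` test were placed above the `#include <dlfcn.h>` line, the first branch would be compiled out, which matches the failure mode described in the earlier commit.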
Superseded by #45