
fix: certain macos sdks set symbols to NULL, resulting in segfaults #363

Merged
bgamari merged 5 commits into haskell:master from MangoIV:mangoiv/experiments
Apr 28, 2026
Conversation

@MangoIV
Contributor

@MangoIV MangoIV commented Apr 22, 2026

    In certain toolchain versions, Apple will act as if symbols exist
    because the toolchain *can be used* for targets that have them.
    At runtime, however, they will be NULL.
    This resulted in a segfault when trying to use the
    posix_spawn_file_actions_addchdir symbol.
    The resolution is as follows: if both symbols are available, first
    check whether the _np version *should* be available; if it is, add
    a runtime check for whether the non-_np version is NULL and fall back
    to the _np version if it is.

    The easiest and safest fix here would be to move both checks entirely to
    runtime, but to keep the change conservative this was not done.
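
The fallback described in the commit message can be sketched in C. This is a minimal illustration, not the code merged in this PR: `addchdir_fn`, `pick_addchdir`, and the two stubs are hypothetical names, and the real functions take a `posix_spawn_file_actions_t *` rather than the `void *` used here to keep the sketch portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the signature shared by
 * posix_spawn_file_actions_addchdir and its _np variant. */
typedef int (*addchdir_fn)(void *file_actions, const char *path);

/* Prefer the standard symbol; fall back to the _np variant when the
 * standard one turned out to be NULL at runtime. */
static addchdir_fn pick_addchdir(addchdir_fn standard, addchdir_fn np_variant)
{
    if (standard != NULL) {
        return standard;
    }
    assert(np_variant != NULL); /* the caller must supply a usable fallback */
    return np_variant;
}

/* Stubs used only to exercise the selection logic. */
static int stub_standard(void *fa, const char *path) { (void)fa; (void)path; return 1; }
static int stub_np(void *fa, const char *path) { (void)fa; (void)path; return 2; }
```

Calling `pick_addchdir(NULL, stub_np)` returns `stub_np`, which models the situation where the SDK declares the standard symbol but it is NULL on the running system. In real code the first argument would presumably be the address of the weakly linked symbol.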

Also a couple changes to ci

  • make build on ghc 9.12
  • make build on newer macos
  • build a test that only runs upstream (ghc) normally

Resolves #356.

@tomjaguarpaw
Member

Looks like

  1. we need to skip older GHCs on MacOS
  2. there's some type error triggered on 9.12 across all platforms

@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

@tomjaguarpaw I'm experimenting here b/c I need to know where a segfault comes from on macos, so nvm this MR for now ^^

@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

> there's some type error triggered on 9.12 across all platforms

yeah that's #362

@MangoIV MangoIV force-pushed the mangoiv/experiments branch 3 times, most recently from e53f040 to a256e89 Compare April 22, 2026 14:25
@MangoIV MangoIV changed the title from "chore: test on macos 15, test ghc test, too" to "fix: certain macos sdks set symbols to NULL, resulting in segfaults" Apr 22, 2026
@MangoIV MangoIV force-pushed the mangoiv/experiments branch 3 times, most recently from abb7f2c to 352dba4 Compare April 22, 2026 15:16
@MangoIV MangoIV marked this pull request as ready for review April 22, 2026 15:19
@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

I hope I didn't run us out of actions budget.

@MangoIV MangoIV force-pushed the mangoiv/experiments branch 2 times, most recently from afe6266 to c84ac5a Compare April 22, 2026 15:23
@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

> I hope I didn't run us out of actions budget.

ah no just typo

@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

This fixes the issue. Unfortunately the tests we're running wouldn't be able to confirm it because their toolchain version is too old.

I don't know how to fix this weird custom setup nonsense for ghc version 9.12 though. Seems like it's just bugged.

On another note, it's interesting that this wasn't happening on uploads to hackage done by you, @tomjaguarpaw. I assume this is because the configure script you uploaded doesn't even configure correctly for posix_spawnp and uses fork in all cases, see #356. The configure script uploaded by you gives the following output:

`./configure`
checking for gcc... gcc
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
checking for grep that handles long lines and -e... /usr/bin/grep
checking for egrep... /usr/bin/grep -E
checking for ANSI C header files... yes
checking for sys/types.h... yes
checking for sys/stat.h... yes
checking for stdlib.h... yes
checking for string.h... yes
checking for memory.h... yes
checking for strings.h... yes
checking for inttypes.h... yes
checking for stdint.h... yes
checking for unistd.h... yes
checking for pid_t... yes
checking vfork.h usability... no
checking vfork.h presence... no
checking for vfork.h... no
checking for fork... yes
checking for vfork... yes
checking for working fork... yes
checking for working vfork... (cached) yes
checking signal.h usability... yes
checking signal.h presence... yes
checking for signal.h... yes
checking sys/wait.h usability... yes
checking sys/wait.h presence... yes
checking for sys/wait.h... yes
checking fcntl.h usability... yes
checking fcntl.h presence... yes
checking for fcntl.h... yes
checking for setitimer... yes
checking for sysconf... yes
checking value of SIG_DFL... -1
checking value of SIG_IGN... -1
configure: creating ./config.status

Crucially, all the checks for posix_spawnp-related symbols are missing:

checking for posix_spawnp... yes
checking for posix_spawn_file_actions_addchdir_np... yes
checking for posix_spawn_file_actions_addchdir... yes
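
To illustrate why the missing checks matter, here is a small C sketch of how such configure results are typically consumed. The `HAVE_*` names follow Autoconf's usual convention for function checks; whether this package keys on exactly these macros is an assumption, and `spawn_strategy`, `configured_strategy`, and `strategy_name` are hypothetical names.

```c
#include <assert.h>

typedef enum {
    SPAWN_WITH_ADDCHDIR,
    SPAWN_WITH_ADDCHDIR_NP,
    SPAWN_VIA_FORK
} spawn_strategy;

/* Select a strategy at compile time from the macros that configure would
 * define. If the checks never ran, no macro exists and fork is chosen. */
static spawn_strategy configured_strategy(void)
{
#if defined(HAVE_POSIX_SPAWNP) && defined(HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCHDIR)
    return SPAWN_WITH_ADDCHDIR;
#elif defined(HAVE_POSIX_SPAWNP) && defined(HAVE_POSIX_SPAWN_FILE_ACTIONS_ADDCHDIR_NP)
    return SPAWN_WITH_ADDCHDIR_NP;
#else
    return SPAWN_VIA_FORK;
#endif
}

/* Human-readable name, e.g. for logging which path was compiled in. */
static const char *strategy_name(spawn_strategy s)
{
    switch (s) {
    case SPAWN_WITH_ADDCHDIR:    return "posix_spawnp + addchdir";
    case SPAWN_WITH_ADDCHDIR_NP: return "posix_spawnp + addchdir_np";
    case SPAWN_VIA_FORK:         return "fork";
    }
    assert(0 && "unreachable: unknown strategy");
    return "unknown";
}
```

Compiled without any of the `HAVE_*` macros, as would happen with a configure script that skips these checks, `configured_strategy()` yields `SPAWN_VIA_FORK`.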

Could you check whether you could upgrade your autotools version? Alternatively, we could consider creating the configure script as a job artifact, that way it's at least transparent what is being uploaded to hackage and why.

Thank you!

@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

There's another thing worth noting here. I tried to keep the change as localized to this one bug as possible. The cleanest solution here would probably be to move this check entirely to runtime, since apparently some platforms won't actually make symbols available, even if they configure with that symbol.

Please tell me if you want to do that instead and get rid of all the CPP.
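
One way a fully runtime check could look is a dynamic symbol lookup. The sketch below uses the POSIX `dlopen`/`dlsym` calls; `lookup_symbol` is a hypothetical helper, not something in this package, and older glibc versions need `-ldl` at link time.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Returns the address of `name` if the running process can resolve it,
 * or NULL otherwise. dlopen(NULL, ...) yields a handle for the main
 * program together with the libraries loaded alongside it. */
static void *lookup_symbol(const char *name)
{
    assert(name != NULL);
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self == NULL) {
        return NULL;
    }
    void *addr = dlsym(self, name);
    dlclose(self); /* drops only the reference taken above */
    return addr;
}
```

A caller could try `posix_spawn_file_actions_addchdir` first, then `posix_spawn_file_actions_addchdir_np`, and use a fork-based path if both lookups return NULL. That ordering is an assumption for illustration only.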

@tomjaguarpaw
Member

> Alternatively, we could consider creating the configure script as a job artifact, that way it's at least transparent what is being uploaded to hackage and why.

I think this one is the correct way. It's still not even clear to me why the system of the uploader (or rather, probably the one who runs cabal sdist) is relevant. Quite possibly something is just broken because of that. (Quite honestly I am out of my depth maintaining this package. See haskell/core-libraries-committee#411.)

@MangoIV
Contributor Author

MangoIV commented Apr 22, 2026

> It's still not even clear to me why the system of the uploader (or rather, probably the one who runs cabal sdist) is relevant

For sure. It should be. It’s weird as well in the sense that

  1. hackage distributes process pre-configured (I think because cabal can’t run autotools)
  2. GHC itself reconfigures it before building it as a boot lib
  3. My version of autotools, 2.69 (the one that you seem to be using, too), does something different than yours

I found this all very obscure. I think both the workflow for hackage uploads (which requires „extra steps“ instead of uploading the same thing that is on GitHub) and the fact that different distributions of autotools don’t seem to do the same thing even if they’re the same version, are at fault here.

@MangoIV MangoIV force-pushed the mangoiv/experiments branch from c84ac5a to 9e54be9 Compare April 23, 2026 08:24
@MangoIV
Contributor Author

MangoIV commented Apr 23, 2026

Another thing worth noting is that other languages (like Rust) prefer the _np version. They also have special handling for relative paths on macos, though, since the _np version is also broken on macos. We don't have this handling, so if we generally want to prefer the _np version, we would probably increase the number of failures we get.
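
The kind of special handling mentioned here could look roughly like the C sketch below. It assumes the problematic case is a relative program path combined with a requested working-directory change; the comment does not spell out the details, so the policy and all names (`program_kind`, `classify_program`, `can_use_addchdir_np`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { PROGRAM_BARE, PROGRAM_RELATIVE, PROGRAM_ABSOLUTE } program_kind;

/* Classify how the program to spawn was named. */
static program_kind classify_program(const char *program)
{
    assert(program != NULL);
    if (program[0] == '/') {
        return PROGRAM_ABSOLUTE;          /* e.g. "/bin/ls" */
    }
    if (strchr(program, '/') != NULL) {
        return PROGRAM_RELATIVE;          /* e.g. "../myprogram" */
    }
    return PROGRAM_BARE;                  /* e.g. "ls", resolved via PATH */
}

/* Assumed policy: when a chdir is requested, avoid the _np path for
 * relative program paths and let the caller use another strategy. */
static int can_use_addchdir_np(const char *program, int wants_chdir)
{
    if (!wants_chdir) {
        return 1;
    }
    return classify_program(program) != PROGRAM_RELATIVE;
}
```

With this policy, `can_use_addchdir_np("../myprogram", 1)` is 0, so the caller would have to pick a different spawning path for that input.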

@tomjaguarpaw
Copy link
Copy Markdown
Member

> Also a couple changes to ci
>
>   • make build on ghc 9.12
>   • make build on newer macos
>   • build a test that only runs upstream (ghc) normally

Would you mind moving anything that is not dependent on this PR to its own PR? I can merge changes to CI very quickly but looking at code changes takes longer.

Comment thread cbits/posix/posix_spawn.c
@bgamari
Contributor

bgamari commented Apr 28, 2026

The content of this change looks reasonable. I do think we should try to understand the autoconf dependence reported in #356 but I would be happy to put out another release in the meantime.

Beyond that, I believe we should improve the traceability of our release process. The fact that we have multiple maintainers putting out releases with different autoconf versions which differ in observable ways is very far from ideal. This really should be reproducibly scripted at the very least.

@MangoIV
Contributor Author

MangoIV commented Apr 28, 2026

Yes, I already proposed and am willing to implement outputting release-ready artifacts (including the configure script) via github actions. I think this would improve the reproducibility of the releases.

MangoIV added 3 commits April 28, 2026 16:24
In certain toolchain versions, Apple will act as if symbols exist
because the toolchain *can be used* for targets that have them.
At runtime, however, they will be NULL.
This resulted in a segfault when trying to use the
posix_spawn_file_actions_addchdir symbol.
The resolution is as follows: if both symbols are available, first
check whether the _np version *should* be available; if it is, add
a runtime check for whether the non-_np version is NULL and fall back
to the _np version if it is.

The easiest and safest fix here would be to move both checks entirely to
runtime, but to keep the change conservative this was not done.

Resolves haskell#356.
The pragma for the changes needs to be applied for cabal version 3.14
and upwards, not 3.16
@MangoIV MangoIV force-pushed the mangoiv/experiments branch from 9e54be9 to 786f829 Compare April 28, 2026 14:25
@bgamari
Contributor

bgamari commented Apr 28, 2026

Yes, I think adding a workflow which produces a ready-to-upload source distribution would be a good improvement.

@bgamari bgamari merged commit 0175b4d into haskell:master Apr 28, 2026
48 checks passed

Development

Successfully merging this pull request may close these issues.

process can segfault on macOS Sequoia with a configure script from Autoconf 2.72

3 participants