RTS ticker thread can cause trouble #3

Open
sorki opened this issue Nov 12, 2023 · 2 comments · May be fixed by #5

@sorki
Contributor

sorki commented Nov 12, 2023

Recently a testsuite using this package started failing with unshare: invalid argument. I wasn't sure what was going on, because the command-line unshare worked just fine. Comparing the two calls, I didn't see much difference, but then I stumbled on a clone3 call made by GHC to spawn a thread called ghc_ticker. Whether the ticker is used seems to depend on compile-time options and on the packages available during the GHC build.

Some more info https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts/signals#the-rts-timer-signal

This started manifesting in CI, which used the latest Ubuntu, and on NixOS as well.
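
To check whether a given GHC build runs the ticker as a separate OS thread, one option is to list the threads of the running process before calling unshare. The following is a minimal sketch, assuming Linux with /proc mounted and the directory package as a dependency; threadNames is a hypothetical helper, not part of this package.

```haskell
module Main (main) where

import Control.Monad (forM)
import System.Directory (listDirectory)

-- | Names of all OS threads of the current process, read from
-- /proc/self/task/<tid>/comm (Linux only).
threadNames :: IO [String]
threadNames = do
  tids <- listDirectory "/proc/self/task"
  forM tids $ \tid -> do
    raw <- readFile ("/proc/self/task/" ++ tid ++ "/comm")
    let name = takeWhile (/= '\n') raw
    -- Force the whole (lazy) read so the file handle is closed promptly.
    length raw `seq` pure name

main :: IO ()
main = do
  names <- threadNames
  mapM_ putStrLn names
  if "ghc_ticker" `elem` names
    then putStrLn "ticker thread present: the process is multithreaded"
    else putStrLn "no ghc_ticker thread found"
```

Running the same executable with and without +RTS -V0 should show whether the ghc_ticker entry disappears on a particular GHC build.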

The fix is to disable the timer with

ghc-options:       -rtsopts "-with-rtsopts -V0"

From help:

hnix-store-remote-tests:   -V<secs>  Master tick interval in seconds (0 == disable timer).
hnix-store-remote-tests:             This sets the resolution for -C and the heap profile timer -i,
hnix-store-remote-tests:             and is the frequency of time profile samples.
hnix-store-remote-tests:             Default: 0.01 sec.
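
Since -rtsopts still allows the baked-in default to be overridden at run time, a program could also check the effective tick interval before calling unshare. This is a sketch only, using GHC.RTS.Flags from base; tickerDisabled is a hypothetical helper name, and treating a zero interval as "no ticker thread" is an assumption based on the -V0 help text above.

```haskell
module Main (main) where

import GHC.RTS.Flags (MiscFlags (..), getMiscFlags)

-- | True when the RTS master tick interval is 0, i.e. the program
-- was started with -V0 (via -with-rtsopts or +RTS -V0).
tickerDisabled :: IO Bool
tickerDisabled = do
  misc <- getMiscFlags
  pure (tickInterval misc == 0)

main :: IO ()
main = do
  disabled <- tickerDisabled
  putStrLn $
    if disabled
      then "RTS timer disabled (-V0)"
      else "RTS timer enabled; unshare may fail with invalid argument"
```

A testsuite could use such a check to fail early with a clear message instead of a bare invalid argument error.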

Should we add this to the comments (or README) that already mention issues with -threaded?

While debugging I also extracted the example into a separate cabal executable. Want a PR? I can also PR a simple testsuite + CI if you want.

sorki added a commit to sorki/hnix-store that referenced this issue Nov 12, 2023
@redneb
Owner

redneb commented Dec 6, 2023

Thanks for this report and apologies for the late reply.

I have never run into this problem. Maybe it is caused in the scenario when the ticker thread uses signals? I thought that this only happens on old kernels. I would be happy to add a note to the documentation, and the place where -threaded is mentioned is probably the best spot for it. But I would be reluctant to suggest disabling the ticker as the default recommended practice, as this can have undesired consequences.

So a PR would be welcome.

sorki added a commit to sorki/hs-linux-namespaces that referenced this issue Dec 19, 2023
sorki linked a pull request Dec 19, 2023 that will close this issue
@sorki
Contributor Author

sorki commented Dec 19, 2023

> I have never run into this problem. Maybe it is caused in the scenario when the ticker thread uses signals? I thought that this only happens on old kernels.

I think it is the opposite: signals work fine but are less efficient, so newer configurations of GHC spawn a separate thread for the ticker. I wish I had the exact commit where this changed, but the test and CI setup where this started failing wasn't rolling with nixpkgs versions, so I would have to bisect a lot, and I guess I would arrive at some GHC bump.

It also depends on the build environment and configuration, so some GHC builds might exhibit this on one distro while the exact same version does not on another.

I'm mostly using this package for testsuites: in hnix-store-remote, to trick nix-daemon into thinking it's running as root, and before that in rtnetlink-hs, so that its testsuite doesn't need to run as root.

Btw, if you have a more elaborate example of unshare followed by a subprocess using unshare with multiple mappings, that would help me a lot in improving the test environment for nix-daemon, as it spawns more processes that are not happy with just a single mapping.
