Bypass library finalisation to fix libgomp crash #1202

Merged
lionel- merged 2 commits into main from bugfix/shutdown
May 6, 2026

lionel- (Contributor) commented May 6, 2026

(Hopefully) fixes these flakes:

     SIGABRT [   0.980s] ( 667/1297) ark::dap_notebook test_notebook_debug_info
  stdout ───

    running 1 test
    test test_notebook_debug_info ... ok

    test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 18 filtered out; finished in 0.21s

  stderr ───
    dap_notebook-d63c677b2730ef53: ../../../src/libgomp/oacc-init.c:84: goacc_register: Assertion `!dispatchers[disp->type]' failed.

    (test aborted with signal 6: SIGABRT)

libgomp is a library loaded via BLAS on our Ubuntu runners. The stderr output indicates that the library is getting initialised a second time, triggering this assertion abort.

We still don't know why libgomp is getting initialised a second time, but we know that it happens after the test has passed (i.e. after the ok result in the output). We also know from LD_DEBUG output that it happens during linked library teardown when the process exits:

calling fini: libR.so [0]
calling fini: libgomp.so.1 [0]
find library=libR.so [0]; searching
calling init: libgomp.so.1    ← second init triggers assertion failure

According to glibc's DeepWiki, library finalisation happens via an atexit hook that the dynamic linker registers very early at process startup. The workaround implemented in this PR is to register an atexit hook in the test harness on Linux that calls _exit(), which is documented to exit immediately without any further cleanup (https://man7.org/linux/man-pages/man2/_exit.2.html):

The function _exit() is like exit(3), but does not call any functions registered with atexit(3) or on_exit(3).

Furthermore, from exit() (https://man7.org/linux/man-pages/man3/exit.3.html):

If one of these functions [from atexit] does not return (e.g., it calls _exit(2), or kills itself with a signal), then none of the remaining functions is called, and further exit processing (in particular, flushing of stdio(3) streams) is abandoned.

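So exiting with code 0 from an atexit hook should bypass the library finalisation procedure entirely. atexit hooks run in reverse order of registration, so a hook registered by the test harness runs before the one the dynamic linker registered at startup.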
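
To make the mechanism concrete, here is a minimal sketch in C of the behaviour the man pages describe. It is not the code from this PR (that lives in the Rust test harness). The names late_cleanup, bypass_finalisation and run_demo are hypothetical, and late_cleanup only stands in for the finalisation work we want to skip. The demo forks a child so that the early exit can be observed from the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the finalisation we want to skip. It is
 * registered first, so exit() would call it last. If it ever ran, the
 * child would terminate with status 42. */
static void late_cleanup(void) {
    _exit(42);
}

/* The workaround: terminate immediately with status 0. Because this
 * hook never returns, exit() does not call the remaining handlers. */
static void bypass_finalisation(void) {
    _exit(0);
}

/* Forks a child that registers both hooks and then calls exit().
 * Returns the child's exit status, or -1 on error. */
int run_demo(void) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Handlers run in reverse order of registration, so the hook
         * registered last (bypass_finalisation) runs first. */
        if (atexit(late_cleanup) != 0) _exit(1);
        if (atexit(bypass_finalisation) != 0) _exit(1);
        exit(7); /* status 7 is replaced by the hook's _exit(0) */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_demo() from a main function should return 0: the child never reaches late_cleanup, and the status passed to exit() is replaced by the one passed to _exit() in the hook. Note that this also means the hook's exit code wins over whatever the process intended to exit with.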

lionel- requested a review from DavisVaughan May 6, 2026 09:10
lionel- (Contributor Author) commented May 6, 2026

40 successful runs of Linux CI 🤞

Comment thread crates/ark_test/src/dummy_frontend.rs
lionel- force-pushed the bugfix/shutdown branch from 5bdab74 to cdce2d9 May 6, 2026 13:43
Comment on lines +43 to +47
      - name: Configure R compilation flags
        run: |
          mkdir -p ~/.R
          echo "CPPFLAGS += -I$(brew --prefix gettext)/include" >> ~/.R/Makevars
          echo "LDFLAGS += -L$(brew --prefix gettext)/lib" >> ~/.R/Makevars
Contributor

I think a note about this being for data.table when we need to compile from source would be useful, otherwise I'm left going 'wtf is this for'

Contributor Author

It's for any package we need to compile, not just data.table. It's basic Makevars configuration for that runner.

Contributor Author

      # This configures Makevars to look for brew headers and libraries. Useful
      # to compile packages used in tests when they have just been released to
      # CRAN and there is no binary we can download yet. We've needed this for
      # data.table.

Contributor

I rarely if ever need to fiddle with Makevars on CI anymore, even for compiled packages, so some kind of justification always feels necessary when I need to have something like this

Contributor

I think we should do this instead:

https://github.com/Rdatatable/data.table/blob/cd3ef291a67b28daeaa2669d8e98aaf5dbfb0ac4/.github/workflows/R-CMD-check.yaml#L49-L53

It's tailor-made for this use case:

stan-dev/cmdstanr#1072

R no longer bundles libintl, and the current setup-r-dependencies action does not handle this (r-lib/actions#998), so it was recommended we use a separate action for this

and libintl was the problem

Contributor Author

Not sure about tailor-made for this use case. It's presented as one of the potential solutions, along with installing the dep yourself.

In this case we already have the system lib, somehow, so downloading all sysreqs again (326 MB) to extract a copy in /opt/R seems a bit heavy-handed compared to pointing R to the brew files. Perhaps it's more future-proof than whatever we did to the CI setup that caused the sysreq to be already installed by brew. I wouldn't worry about this for now though.

lionel- force-pushed the bugfix/shutdown branch from cdce2d9 to d4e47e9 May 6, 2026 13:59
lionel- merged commit 82f5214 into main May 6, 2026
15 checks passed
lionel- deleted the bugfix/shutdown branch May 6, 2026 13:59
github-actions bot locked and limited conversation to collaborators May 6, 2026