ci: asan build randomly failing #5801

Closed
grondo opened this issue Mar 18, 2024 · 2 comments · Fixed by #5802

Comments

@grondo
Contributor

grondo commented Mar 18, 2024

The most recent PR, #5800, did not fully fix the address sanitizer build: even with those changes, the build sometimes succeeds and sometimes fails. The random failures appear to occur in the Lua unit tests. I used tmate to inspect the actual CI builder and noticed that make check in src/bindings/lua sometimes succeeds, but more often fails with this error:

ERROR: tests/t0000-json
=======================

AddressSanitizer:DEADLYSIGNAL
=================================================================
==986==ERROR: AddressSanitizer: SEGV on unknown address 0x62e53ae85328 (pc 0x79a0c6bae3e0 bp 0x79a0c60b5d6f sp 0x7ffd01713d60 T-1)
==986==The signal is caused by a READ memory access.
AddressSanitizer:DEADLYSIGNAL
AddressSanitizer: nested bug in the same thread, aborting.
ERROR: tests/t0000-json.t - missing test plan
ERROR: tests/t0000-json.t - exited with status 1

I'm not sure what would cause random failures in asan, but I'm looking into it.

@vchuravy
Contributor

This might be https://zolutal.github.io/aslrnt/ | google/sanitizers#1614
See libuv/libuv#4365 for a potential fix.

@grondo
Contributor Author

grondo commented Mar 18, 2024

Thank you @vchuravy! That is indeed the problem.

grondo added a commit to grondo/flux-core that referenced this issue Mar 18, 2024
Problem: Recent kernel address space layout randomization (ASLR)
appears to confuse some versions of address sanitizers, leading to
random failures in github CI.

Disable kernel.randomize_va_space for the github asan build to avoid
these failures. Thanks to @vchuravy for pointing out the issue.

Since LD_PRELOAD of libasan wasn't the root cause, reinstate preloading
the libasan library to get as full coverage as possible.

Fixes flux-framework#5801.
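
For anyone who wants to check the same setting on their own builder, here is a minimal sketch in Python. It is not taken from the flux-core commit: the helper names (read_aslr_mode, describe_aslr_mode, disable_aslr_command) are hypothetical, and the use of sudo and sysctl -w is an assumption about how a CI step might apply the change. The meaning of the values 0, 1 and 2 follows the Linux kernel sysctl documentation.

```python
from pathlib import Path
from typing import List, Optional

# Standard procfs location of the kernel.randomize_va_space sysctl.
ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")

# Value meanings as described in the Linux kernel sysctl documentation.
ASLR_MODES = {
    0: "disabled",
    1: "randomize mmap base, stack and VDSO",
    2: "randomize mmap base, stack, VDSO and heap",
}


def read_aslr_mode(path: Path = ASLR_SYSCTL) -> Optional[int]:
    """Return the current setting, or None if it cannot be read or parsed."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe_aslr_mode(mode: Optional[int]) -> str:
    """Translate a numeric setting into a short human-readable description."""
    if mode is None:
        return "unknown"
    return ASLR_MODES.get(mode, "unrecognized value %d" % mode)


def disable_aslr_command() -> List[str]:
    """Build (but do not run) a command that would turn the setting off.

    Assumption: the builder allows sudo and has sysctl installed.
    """
    return ["sudo", "sysctl", "-w", "kernel.randomize_va_space=0"]


if __name__ == "__main__":
    print("kernel.randomize_va_space:", describe_aslr_mode(read_aslr_mode()))
    print("to disable:", " ".join(disable_aslr_command()))
```

Running the script on the builder before the asan tests shows whether the setting is already off. Actually running the generated command is left to the CI configuration, since it changes a system-wide setting.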
@mergify mergify bot closed this as completed in #5802 Mar 18, 2024