ci: set docker run --ulimit to workaround Valgrind assertion #27364

Closed


fanquake
Member

Running the native_fuzz_with_valgrind_job, on aarch64 (Fedora 37), I've seen the following:

Run addr_info_deserialize with args ['valgrind', '--quiet', '--error-exitcode=1', '/home/fedora/ci_scratch/ci/scratch/build/bitcoin-aarch64-unknown-linux-gnu/src/test/fuzz/fuzz', '-runs=1', '/home/fedora/ci_scratch/ci/scratch/qa-assets/fuzz_seed_corpus/addr_info_deserialize']
valgrind: m_libcfile.c:66 (vgPlain_safe_fd): Assertion 'newfd >= VG_(fd_hard_limit)' failed.


valgrind: m_libcfile.c:66 (vgPlain_safe_fd): Assertion 'newfd >= VG_(fd_hard_limit)' failed.

Target "valgrind --quiet --error-exitcode=1 /home/fedora/ci_scratch/ci/scratch/build/bitcoin-aarch64-unknown-linux-gnu/src/test/fuzz/fuzz -runs=1 /home/fedora/ci_scratch/ci/scratch/qa-assets/fuzz_seed_corpus/addr_info_deserialize" failed with exit code -11
./ci/test/04_install.sh: line 98: pop_var_context: head of shell_variables not a function context

This was first reported as a Valgrind bug, https://bugs.kde.org/show_bug.cgi?id=465435, however:

I really think that the problem is with Docker. It's advertising some ridiculously high value for ulimit -n like 1048576. Valgrind wants to put its own files in the top 12 of those slots, and is trying to do an fcntl(oldfd, F_DUPFD, 1048576-12) - note that 1048576-12 matches the 1048564 that you get from the patch message. Then Docker fails to honour its promised file descriptor limit and the fcntl fails.
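To make the arithmetic in the quote concrete, here is a minimal Python sketch of the same pattern: compute a floor just below the advertised descriptor limit and ask the kernel for a duplicate at or above it with `F_DUPFD`. The figure of 12 reserved slots comes from the quoted comment; the helper names are hypothetical and this is not Valgrind's actual code.

```python
import fcntl
import os
import resource

# Number of top descriptor slots reserved, per the quoted bug comment.
# Treat this as an assumption about Valgrind's behaviour, not a verified constant.
RESERVED_SLOTS = 12


def reserved_fd_floor(nofile_limit, reserved=RESERVED_SLOTS):
    """Lowest descriptor number inside the reserved top-of-range block."""
    return nofile_limit - reserved


def dup_at_or_above(fd, floor):
    """Duplicate fd onto the lowest free descriptor >= floor.

    Raises OSError if the kernel cannot satisfy the request, for example
    when floor is at or beyond the limit that is really enforced.
    """
    return fcntl.fcntl(fd, fcntl.F_DUPFD, floor)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("advertised soft/hard limit:", soft, hard)
    print("floor for a limit of 1048576:", reserved_fd_floor(1048576))

    read_end, write_end = os.pipe()
    try:
        # A small floor that should succeed under any ordinary limit.
        new_fd = dup_at_or_above(read_end, 10)
        print("duplicated onto fd", new_fd)
        os.close(new_fd)
    finally:
        os.close(read_end)
        os.close(write_end)
```

If the advertised limit is higher than what the container really permits, a call like `dup_at_or_above(fd, reserved_fd_floor(limit))` is the step that would fail, which is consistent with the assertion in the log above.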

So the easiest thing to do here might just be to set some sane ulimit values (during docker run), that still work for all other jobs, and avoid the Valgrind assertion (which should become a more useful error message at some point?).
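As a rough illustration of what setting the limit during `docker run` could look like, the sketch below builds the argument list in Python. `--ulimit nofile=<soft>:<hard>` is the documented Docker flag; the image name, the command, and the value 1024 are placeholders chosen for illustration and are not the values used in this PR.

```python
def docker_run_with_nofile(image, command, soft=1024, hard=1024):
    """Build a `docker run` argv that caps RLIMIT_NOFILE inside the container.

    The default soft/hard values are illustrative assumptions only.
    """
    return [
        "docker", "run", "--rm",
        "--ulimit", f"nofile={soft}:{hard}",
        image, *command,
    ]


if __name__ == "__main__":
    # Hypothetical image name; print the limit the container actually sees.
    argv = docker_run_with_nofile("example-ci-image", ["sh", "-c", "ulimit -n"])
    print(" ".join(argv))
```

Passing the resulting list to `subprocess.run` would start the container; running `ulimit -n` as the command is a quick way to see which limit is in effect.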

Opening a PR for discussion/brainstorming. The changes in this PR (from the bug report) "fix" this particular issue, but I haven't yet tested all jobs etc. Maybe we'd rather only do this on the affected test.

@DrahtBot
Contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.
A summary of reviews will appear here.

@Sjors
Member

Sjors commented Mar 30, 2023

I was able to run this, at least to get past addr_info_deserialize, on x86_64 Ubuntu 22.10 with Docker version 23.0.2.

Potential caveat: if you're running Docker in rootless mode then --ulimit seems to be ignored by default. See https://docs.docker.com/engine/security/rootless/#limiting-resources (the instructions there worked for me).
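Since the flag can be silently ignored in rootless mode, it may help to check the limit from inside the container before running Valgrind. A minimal sketch, assuming Python is available in the image; the function name and the expected value are hypothetical:

```python
import resource


def nofile_limit_applied(expected_soft, current=None):
    """Return True if the soft RLIMIT_NOFILE equals the value that was requested.

    `current` may be passed as a (soft, hard) tuple; by default it is read
    from the running process.
    """
    if current is None:
        current = resource.getrlimit(resource.RLIMIT_NOFILE)
    soft, _hard = current
    return soft == expected_soft


if __name__ == "__main__":
    # 1024 is an illustrative value, not the one used in this PR.
    if not nofile_limit_applied(1024):
        print("warning: --ulimit nofile does not appear to have been applied")
```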

@maflcko
Member

maflcko commented Mar 31, 2023

Have you tried podman? The podman-docker on Ubuntu/Debian might help (haven't tried it), but not sure if it exists on your Fedora.

@fanquake fanquake force-pushed the native_fuzz_valgrind_docker_ulimit branch from ee86f7f to c20a3b8 Compare April 13, 2023 12:28
@maflcko
Member

maflcko commented Apr 13, 2023

I tried to reproduce on Fedora 37 on current master, and it passed with podman.

@fanquake
Member Author

fanquake commented May 4, 2023

Migrated to podman on Fedora 38.

@fanquake fanquake closed this May 4, 2023
@fanquake fanquake deleted the native_fuzz_valgrind_docker_ulimit branch May 4, 2023 12:57
@bitcoin bitcoin locked and limited conversation to collaborators May 3, 2024