Fix coredump deadlock with overly long backtraces #25055
Conversation
So I tried a systemd with this patch on a gnome-calendar crash, and still didn't get a coredump, probably because of this: it took over five minutes to deal with the core dump, and timed out...
On my machine, generating the crash takes 5 seconds. So if it timed out, either something is wrong with the patch, or it wasn't applied properly, or there's some other bug. I'll test my patch again here.

Edit: yeah, it seems to work fine here.

$ journalctl -b _PID=2667772 -o cat | head
Process 2667524 (3) of user 1000 dumped core.
Module linux-vdso.so.1 with build-id b21929a49b68b52a4059932eb797eb7182e48f08
Module libopensc.so.8 with build-id 7e70dbbc60cc7358472bcd2a198f1d548af8ab62
Metadata for module libopensc.so.8 owned by FDO found: {
        "type" : "rpm",
        "name" : "opensc",
        "version" : "0.22.0-7.fc37",
        "architecture" : "x86_64",
        "osCpe" : "cpe:/o:fedoraproject:fedora:37"

$ journalctl -b _PID=2667772 -o cat | wc -c
69644
Force-pushed from 37cd36a to 1d89509.
I'm not sure why this fails. It works for me locally, and reading the man page, it should work. I changed it to open
It is useful to distinguish whether json_parse_file() got no input or invalid input. Use different return codes for the two cases.
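For illustration, a minimal sketch of that convention in the negative-errno style systemd functions use. The function name and the concrete error codes here are hypothetical; the codes the commit actually picks may differ.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: read a stream and distinguish "no input"
 * from "invalid input" via different negative errno codes, so the
 * caller can tell the two failure modes apart. */
static int parse_stream_sketch(FILE *f, char **ret) {
        char buf[4096];
        size_t n = fread(buf, 1, sizeof(buf) - 1, f);

        if (n == 0)
                return -ENODATA;     /* got no input at all */
        buf[n] = '\0';

        if (buf[0] != '{' && buf[0] != '[')
                return -EINVAL;      /* input present, but not valid JSON */

        *ret = strdup(buf);
        return *ret ? 0 : -ENOMEM;
}
```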
We would deadlock when passing the data back from the forked-off process that was doing backtrace generation to the coredump parent. This is because we fork the child and wait for it to exit. The child tries to write too much data to the output pipe, and after the first 64k it blocks, because the pipe is full and the parent is not reading from it yet.

The bug surfaced in Fedora because of a combination of four factors:
- 8770778 was backported to v251.5, which allowed coredump processing to be successful.
- 1a0281a was NOT backported, so the output was very verbose.
- Fedora has the ELF package metadata available, so a lot of output can be generated. Most other distros just don't have this information.
- gnome-calendar crashes and has a bazillion modules, and 69596 bytes of output are generated for it.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2135778.

The code is changed to try to write data opportunistically. If we get partial information, that is still logged. It is generally better to log partial backtrace information than nothing at all.
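To make the failure mode concrete, here is a standalone sketch of the deadlock pattern the message describes (not systemd's actual code): the parent reaps the child before draining the pipe, so once the child has written a pipe-capacity's worth of data (64 KiB by default on Linux), both sides wait on each other forever.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
        int p[2];
        char chunk[1024];

        memset(chunk, 'x', sizeof(chunk));
        if (pipe(p) < 0)
                return 1;

        pid_t pid = fork();
        if (pid == 0) {                     /* child: produce >64 KiB */
                close(p[0]);
                for (int i = 0; i < 128; i++)
                        if (write(p[1], chunk, sizeof(chunk)) < 0)  /* blocks at ~64 KiB */
                                _exit(1);
                close(p[1]);
                _exit(0);
        }

        close(p[1]);
        waitpid(pid, NULL, 0);   /* BUG: reaps before draining the pipe */
        /* ...only now would we read from p[0]; we never get here. */
        return 0;
}
```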
Force-pushed from 1d89509 to 076b807.
Hmm, usually when reading output from a child we wait for EOF on that before waiting for the exit code of the child. Here it is done the other way round, which is the bug, no?
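For reference, a minimal sketch of the conventional ordering described here, with a hypothetical helper name: drain the pipe to EOF first, then reap the child, so the child can never wedge on a full pipe.

```c
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read everything the child writes until EOF, and only then
 * collect its exit status. */
static int drain_then_reap(int readfd, pid_t child) {
        char buf[4096];
        ssize_t n;
        int status;

        while ((n = read(readfd, buf, sizeof(buf))) > 0)
                fwrite(buf, 1, (size_t) n, stdout);  /* or append to a buffer */

        close(readfd);
        if (waitpid(child, &status, 0) < 0)
                return -1;
        return status;
}
```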
I considered that too. We could read and accumulate input from the child until EOF. But that'd make the receiving side more complicated, because it'd need to grow a buffer, do a loop, and so on. But realistically, I don't think this is useful. The child generates the whole string first, so it knows how big it is, and can size the pipe appropriately. And also, the output should not be arbitrarily long. After all, we're going to stuff this in the journal, and we don't want to put a million lines in the message. So a more generic transfer mechanism would be more complicated but it wouldn't really give us any gains.
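A rough sketch of the "size the pipe appropriately" idea, using the Linux-specific F_SETPIPE_SZ fcntl; the helper name is hypothetical. Note that unprivileged processes are capped by /proc/sys/fs/pipe-max-size (1 MiB by default), so this only helps up to that limit.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Grow the pipe's capacity to fit the whole message before
 * writing, so a single write() cannot block on a full pipe. */
static int write_whole_message_sketch(int pipefd, const char *msg) {
        size_t len = strlen(msg);

        if (fcntl(pipefd, F_SETPIPE_SZ, (int) len) < 0)
                return -1;   /* e.g. over pipe-max-size; caller must cope */

        return write(pipefd, msg, len) == (ssize_t) len ? 0 : -1;
}
```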
The C8S fail is unrelated (#25060).
Hmm, one question: maybe the transfer mechanism should not actually be a pipe, but a memfd? A memfd can auto-grow, and we wouldn't be subject to pipe size limits. I.e. instead of a pipe, just allocate a memfd and pass it to the child? The child writes its stuff there; afterwards we seek back to the front and read it. I think the code would be more robust and even simpler?
memfds are 3.17 stuff, which is older than our recommended baseline, hence I think it would be OK to simply use them?
memfds are just memory that is accounted like other memory, so they're generally a good idea to use, since they're less special than pipes.
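A minimal sketch of the proposed memfd variant (this is not the code the PR merged): the parent creates the memfd, the child writes into it, and after the child exits the parent seeks back to the start and reads. No pipe capacity is involved, so the child can never block on a full buffer.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
        int fd = memfd_create("backtrace", 0);   /* fd is inherited across fork() */
        if (fd < 0)
                return 1;

        pid_t pid = fork();
        if (pid == 0) {                           /* child: write into the memfd */
                dprintf(fd, "pretend this is a very long backtrace\n");
                _exit(0);
        }

        waitpid(pid, NULL, 0);    /* safe: nothing can block on a memfd */
        lseek(fd, 0, SEEK_SET);   /* rewind and read what the child wrote */

        char buf[256];
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        if (n > 0) {
                buf[n] = '\0';
                fputs(buf, stdout);
        }
        return 0;
}
```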
Probably best to leave that for the next release, and just fix the bug as-is for 252? It's a different model, so there is inherent risk of introducing new regressions by changing over.
Dunno, I think switching this to memfds would remove the entire category of bugs here, instead of just trying to massage it to not happen anymore.
Yes, but it could introduce new bugs that we aren't aware of - post-rc2 is late, surely it can wait a couple of weeks till after the next release?
The CentOS CI (CentOS Stream 8) failure is TEST-15-DROPIN, i.e. #25060. Not related.
Yep. I think we should merge this. Memfd might be nicer, but it'll likely touch a lot of code and has higher potential for regressions.
Hmm, that's odd. Note I did a build of the systemd from F37, so 251, with the patch backported. But yeah, then all I did was make gnome-calendar crash and wait. I did do it in a VM with 2G of RAM, so maybe that was a problem...
Just to make it easier to fill out some form, I'll leave a script bringing down systemd-coredump.socket (see systemd/units/systemd-coredump.socket, line 21 at 09cd639):
#!/bin/bash
# Build a binary that overflows its stack, bury it under an overly
# long path, and crash 16 copies of it at once.
set -ex
ulimit -c unlimited
d=$(mktemp -d)
pushd "$d"
cat <<'EOL' >hey.c
/* Unbounded recursion: overflows the stack, dies with SIGSEGV. */
void f() { f(); }
int main(int argc, char *argv[]) {
    f();
    return 0;
}
EOL
gcc hey.c
# Nest a.out ten directories deep, each directory with a 255-character
# name, to make the executable's path as long as possible.
a=$(perl -e 'print "A" x 255')
for i in {1..10}; do
    mkdir "$a"
    mv a.out "$a"
    cd "$a"
done
# Crash 16 instances in parallel.
for i in {1..16}; do
    ./a.out &
done