This repository has been archived by the owner. It is now read-only.

Systemd core dump management gets process name wrong #172

Closed
bshi opened this issue Oct 23, 2014 · 13 comments

Comments

@bshi bshi commented Oct 23, 2014

Mostly, the new systemd core dump management gets file names right, e.g. a "node" process crash in a docker container is correctly labelled as

    core.node.0.cdbcf55dec1346a3af6aa35edc809e7a.542.1414038423000000

Occasionally, though, a crash will get mislabelled

    core.kworker\x2f3:1H.0.cb71c07955f946aa989231bb17969232.540.1414023975000000

or

    core.btrfs-freespace.0.24dd567d1f4f4c4cac8749c823f0b369.415.1414013963000000

We have verified that these are core dumps for "node" rather than kworker or btrfs-freespace.
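
To compare the label on a dump against the process that actually crashed, it can help to split the file name into fields. The following is a minimal sketch, assuming the layout is `core.<comm>.<uid>.<boot id>.<pid>.<timestamp>` and that `\xNN` sequences in the name are escaped bytes (as in `kworker\x2f3:1H`). The field names and the `parse_core_name` helper are hypothetical, not part of systemd.

```python
import re
from typing import NamedTuple


class CoreName(NamedTuple):
    # Field names are assumptions inferred from the examples in this issue.
    comm: str       # process name recorded by the dump handler
    uid: int
    boot_id: str
    pid: int
    timestamp: int


def parse_core_name(name: str) -> CoreName:
    """Split a core dump file name into its (assumed) fields."""
    prefix = "core."
    if not name.startswith(prefix):
        raise ValueError(f"not a core dump file name: {name!r}")
    # Split from the right so that dots inside the process name survive.
    comm, uid, boot_id, pid, timestamp = name[len(prefix):].rsplit(".", 4)
    # Undo \xNN escapes, e.g. "\x2f" becomes "/".
    comm = re.sub(r"\\x([0-9a-fA-F]{2})",
                  lambda m: chr(int(m.group(1), 16)), comm)
    return CoreName(comm, int(uid), boot_id, int(pid), int(timestamp))
```

Applied to the second file name above, this yields the process name `kworker/3:1H` and PID 540, which makes it easier to see that the name and the PID belong to different processes.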

@loe loe commented Dec 16, 2014

I see this as well on Alpha 509.

Member

@crawford crawford commented Oct 20, 2015

Please reopen if this is still an issue with later versions of CoreOS.

@crawford crawford closed this Oct 20, 2015

@metamatt metamatt commented Oct 20, 2015

@crawford Yes, this still happens. Has anyone investigated? I don't seem to have permission to reopen the bug report.

@marineam marineam commented Oct 20, 2015

@crawford yes, this has always been a pain. The systemd coredump utility hasn't a clue how to deal with PID namespaces so it uses the container PID but grabs process info from the main namespace.

I don't know why this is the case, it seems like a pretty surprising oversight for systemd...
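
A toy model of the mismatch described above, purely for illustration: the process tables, the host PID 23817, and all function names are hypothetical. It only shows why looking up a container-local PID in the host's process table can return an unrelated process.

```python
# Hypothetical process table in the initial (host) PID namespace: PID -> name.
HOST_COMM = {540: "kworker/3:1H", 23817: "node"}

# Hypothetical mapping for one container: PID inside the container -> host PID.
CONTAINER_TO_HOST = {540: 23817}


def comm_seen_by_handler(pid: int) -> str:
    """The dump handler runs on the host, so it resolves PIDs there."""
    return HOST_COMM[pid]


def label_with_container_pid(container_pid: int) -> str:
    # The handler is handed the PID as seen inside the container and looks
    # it up on the host: it finds whatever host process has that number.
    return comm_seen_by_handler(container_pid)


def label_with_host_pid(container_pid: int) -> str:
    # If the handler were handed the host-side PID, the lookup would match.
    return comm_seen_by_handler(CONTAINER_TO_HOST[container_pid])


print(label_with_container_pid(540))  # unrelated host process
print(label_with_host_pid(540))       # the process that actually crashed
```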

@marineam marineam commented Oct 20, 2015

@metamatt no, I don't think any of us has looked into it

Author

@bshi bshi commented Oct 21, 2015

@marineam Is this something we need to bring up with the systemd folks? Is it possible to disable/replace systemd-coredump on CoreOS?

@vcaputo vcaputo commented Oct 21, 2015

@marineam is this another one of those things the systemd folks consider fixed by kdbus?

@antrik antrik commented Nov 13, 2015

The actual problem here is that the default /usr/lib/sysctl.d/50-coredump.conf shipped with systemd (and CoreOS) sets the core dump command to '/usr/lib/systemd/systemd-coredump %p %u %g %s %t %e', where %p is the "PID of dumped process, as seen in the PID namespace in which the process resides" (according to http://man7.org/linux/man-pages/man5/core.5.html ) -- so systemd-coredump already gets the wrong PID passed in. Just changing %p to %P ("PID of dumped process, as seen in the initial PID namespace (since Linux 3.12)") seems to fix this.

I don't know whether there is a rationale for using %p rather than %P in systemd upstream, or it's just an oversight -- but I don't see any obvious reason not to change this default in CoreOS at least?
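
The `%p` to `%P` change described above can be sketched as a small string rewrite. This is only an illustration: the command string is the one quoted in this comment, and `use_initial_namespace_pid` is a hypothetical helper, not something shipped by systemd or CoreOS. Per core(5), `%%` is a literal percent sign, so the rewrite walks the specifiers one at a time rather than doing a blind text replace.

```python
import re

# Handler command as quoted in the comment above.
DEFAULT_COMMAND = "/usr/lib/systemd/systemd-coredump %p %u %g %s %t %e"


def use_initial_namespace_pid(command: str) -> str:
    """Replace the %p specifier with %P, leaving other specifiers alone."""
    # Matching "%" plus the following character consumes "%%" as a unit,
    # so a literal percent followed by "p" is never rewritten.
    return re.sub(
        r"%(.)",
        lambda m: "%P" if m.group(1) == "p" else m.group(0),
        command,
    )


print(use_initial_namespace_pid(DEFAULT_COMMAND))
```

Since `%P` is documented as available since Linux 3.12, a change like this assumes a kernel at least that new.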

@vcaputo vcaputo commented Nov 13, 2015

@antrik That's a great catch, are you going to create an issue for it @ https://github.com/systemd/systemd/issues? This should be fixed upstream.

@antrik antrik commented Nov 17, 2015

I submitted an upstream issue. Note though that I'm totally new to all this stuff (systemd, namespaces, CoreOS etc.) -- so someone else might need to chime in if there is some follow-up discussion there...

@vcaputo vcaputo commented Nov 17, 2015

@antrik thanks

@marineam what do you think, wait and see what upstream does on this or go s/%p/%P/ in sysctl.d/50-coredump.conf.in on v225-coreos?

@marineam marineam commented Nov 17, 2015

We should absolutely make that change.

@crawford crawford added this to the CoreOS 835.5.0 milestone Nov 17, 2015
@crawford crawford removed this from the CoreOS 835.5.0 milestone Nov 17, 2015
@crawford crawford added this to the CoreOS 871.0.0 milestone Nov 17, 2015
Member

@crawford crawford commented Nov 17, 2015

Fixed by coreos/systemd#25.
