pmdalinux crash #109

Closed
test-account-0 opened this Issue Aug 26, 2016 · 3 comments

test-account-0 commented Aug 26, 2016

Version: 3.11.2
OS: ubuntu 12.04

What I know is that pmdalinux stops working and just hangs as a zombie:

pcp      27674  0.3  0.0  51696  2172 ?        Ss   10:46   0:00 /usr/lib/pcp/bin/pmcd
root     27677  9.3  0.0      0     0 ?        Z    10:46   0:08  \_ [pmdalinux] <defunct>

It stops working after a script similar to pcp2graphite (but for Nagios) queries it for most of the available metrics.
I'm not sure how to debug it. The logs and strace do not say much. gdb shows nothing either, but maybe I'm using it the wrong way (I just run gdb /var/lib/pcp/pmdas/linux/pmdalinux, and it exits after a moment).

test-account-0 commented Aug 26, 2016

Sorry, it was just a timeout. I have just read that pmcd kills an agent if it is too slow.

But shouldn't it be restarted after a timeout and not just killed?
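
Editorial aside on the `<defunct>` entry in the original report: on Linux, a process that has been killed stays in the process table as a zombie (state `Z`) until its parent collects its exit status. The sketch below is a minimal, Linux-only Python illustration of that behaviour. The helper names `proc_state` and `kill_slow_child` are hypothetical, the 0.2 second timeout is arbitrary, and none of this is pmcd's code or a description of how pmcd restarts agents.

```python
import signal
import subprocess
import sys
import time


def proc_state(pid):
    """Return the one-letter process state from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # The state field follows the closing parenthesis of the command name.
    return stat.rsplit(")", 1)[1].split()[0]


def kill_slow_child(timeout=0.2):
    """Kill a child that exceeds `timeout`; report its state before reaping."""
    # Stand-in for a slow agent: a child that sleeps far longer than the timeout.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.send_signal(signal.SIGKILL)

    # The kill is asynchronous, so poll briefly until the kernel marks it 'Z'.
    state = proc_state(child.pid)
    for _ in range(200):
        if state == "Z":
            break
        time.sleep(0.01)
        state = proc_state(child.pid)

    # wait() collects the exit status, which removes the zombie entry.
    returncode = child.wait()
    return state, returncode
```

Calling `kill_slow_child()` should report the child as `Z` for as long as the parent has not called `wait()`, which is the same state `ps` prints as `<defunct>`.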

fche commented Aug 26, 2016

Yeah, there are a couple of mechanisms there for restarting. These aren't perfect.
Ideally, the PMDA should be recoded so it doesn't time out at all during routine operations.

natoscott commented Sep 28, 2016

> But shouldn't it be restarted after a timeout and not just killed?

I've written a FAQ entry on this topic: http://pcp.io/faq.html#T12. The first restart mechanism appears to have failed for you here, @test-account-0, but the second is more persistent and should give you some relief.

natoscott closed this Sep 28, 2016