oom diagnostic logging #54

Merged
merged 1 commit into cloudfoundry:master on Mar 7, 2014

7 participants

@glyn
Cloud Foundry member

Disable the oom killer for a container's memory cgroup when the memory
limit is set. If oom occurs later, the problem task (that is, the one
which triggered oom) will be suspended ([1]) and the oom notifier will
be driven. This gives us the opportunity to gather diagnostics showing
details of the out of memory condition.
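
For reference, the mechanism involved is the cgroup v1 memory controller's OOM control and notification interface. Below is a minimal C sketch (illustrative only, not warden's actual notifier; the cgroup path and command-line handling are assumptions) showing how the OOM killer can be disabled and an eventfd registered so the kernel signals an OOM event:

```c
/*
 * Illustrative sketch (not warden's code): disable the OOM killer for a
 * memory cgroup and register an eventfd so the kernel signals us when
 * the cgroup hits its memory limit. Assumes the cgroup v1 memory
 * controller; the cgroup path is hypothetical.
 */
#include <fcntl.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    const char *cgroup = argc > 1 ? argv[1]
                                  : "/sys/fs/cgroup/memory/some-container";
    char path[PATH_MAX], buf[64];

    /* Writing "1" to memory.oom_control disables the OOM killer, so a
     * task that exceeds the limit sleeps on the cgroup's OOM waitqueue
     * instead of being killed. */
    snprintf(path, sizeof(path), "%s/memory.oom_control", cgroup);
    int oom_fd = open(path, O_RDWR);
    if (oom_fd < 0 || write(oom_fd, "1", 1) != 1) {
        perror("disable OOM killer");
        return 1;
    }

    /* Register "<eventfd> <memory.oom_control fd>" with
     * cgroup.event_control; the kernel then writes to the eventfd
     * whenever the cgroup goes OOM. */
    int efd = eventfd(0, 0);
    snprintf(path, sizeof(path), "%s/cgroup.event_control", cgroup);
    int ctl_fd = open(path, O_WRONLY);
    int len = snprintf(buf, sizeof(buf), "%d %d", efd, oom_fd);
    if (efd < 0 || ctl_fd < 0 || write(ctl_fd, buf, len) != len) {
        perror("register OOM notification");
        return 1;
    }

    /* Block until an OOM event occurs in the cgroup. */
    uint64_t events;
    if (read(efd, &events, sizeof(events)) == sizeof(events))
        printf("OOM event received\n");
    return 0;
}
```

Once the read on the eventfd returns, the notifier can collect diagnostics as described below.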

In the oom notifier, log memory usage, memory limit, swap + memory
usage, swap + memory limit, and memory statistics, then re-enable the
oom killer.
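
To make that concrete, here is a hedged sketch of such a handler: once the OOM event fires, it dumps the standard cgroup v1 diagnostic files and writes "0" back to memory.oom_control to re-enable the OOM killer. The function name and cgroup path are illustrative assumptions, not warden's actual code:

```c
/*
 * Hedged sketch of an OOM handler: dump the cgroup's diagnostic files,
 * then re-enable the OOM killer. The file names are the standard
 * cgroup v1 memory controller files; handle_oom and the cgroup path
 * are hypothetical.
 */
#include <stdio.h>

static void dump_file(const char *cgroup, const char *name)
{
    char path[4096], line[256];
    snprintf(path, sizeof(path), "%s/%s", cgroup, name);
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        perror(path);
        return;
    }
    printf("--- %s ---\n", name);
    while (fgets(line, sizeof(line), f) != NULL)
        fputs(line, stdout);
    fclose(f);
}

void handle_oom(const char *cgroup)
{
    /* Log usage, limit, memory+swap usage/limit, and detailed stats.
     * memory.memsw.* only exists when swap accounting is enabled. */
    dump_file(cgroup, "memory.usage_in_bytes");
    dump_file(cgroup, "memory.limit_in_bytes");
    dump_file(cgroup, "memory.memsw.usage_in_bytes");
    dump_file(cgroup, "memory.memsw.limit_in_bytes");
    dump_file(cgroup, "memory.stat");

    /* Re-enable the OOM killer by writing "0" to memory.oom_control so
     * container teardown cannot deadlock on an out-of-memory wshd. */
    char path[4096];
    snprintf(path, sizeof(path), "%s/memory.oom_control", cgroup);
    FILE *ctl = fopen(path, "w");
    if (ctl != NULL) {
        fputs("0", ctl);
        fclose(ctl);
    }
}

int main(int argc, char **argv)
{
    handle_oom(argc > 1 ? argv[1] : "/sys/fs/cgroup/memory/some-container");
    return 0;
}
```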

Re-enabling the oom killer is necessary in spite of the fact that the
oom notifier proceeds to terminate the container. This is because the
current method of terminating the container can, at least in theory,
deadlock when the container is out of memory (wshd can require more
memory, e.g. for a stack frame, and be suspended due to lack of
memory).

Once re-enabled, the oom killer will kill a task (usually the
application) in the container ([2]) and this will enable container
termination to successfully kill the remaining tasks via wshd. If wshd
hits the container's memory limit with the oom killer enabled, the oom
killer will kill it and this will kill all the other processes in the
container (since wshd is the PID namespace parent).

IntelliJ IDEA .iml files are ignored.

Footnotes:

[1] Linux kernel's ./Documentation/cgroups/memory.txt states:

"If OOM-killer is disabled, tasks under cgroup will hang/sleep in
 memory cgroup's OOM-waitqueue when they request accountable
 memory."

[2] Although memory.txt does not specify that re-enabling the oom
killer in the oom notifier will cause it to kill a task, this
seems like the only reasonable behaviour and it seems to work
that way in practice.

@glyn glyn oom diagnostic logging
1447b38
@cf-gitbot

We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: http://www.pivotaltracker.com/story/show/65709772

@ematpl
Cloud Foundry member

Hi, @glyn,

Looks like there were a few Travis failures on the warden-client running on Ruby 2.0.0. Could you investigate those?

Thanks,
-CF Community Pair (@ematpl & @mmb)

@glyn
Cloud Foundry member
@ematpl
Cloud Foundry member

Sure, it's at https://travis-ci.org/cloudfoundry/warden/builds/18796998. That link is also available from the little red X to the left of the commit sha in this thread, but that's far from obvious. :)

Thanks,
CF Community Pair (@ematpl & @mmb)

@glyn
Cloud Foundry member
@ematpl
Cloud Foundry member

Hi, @glyn,

Looks like this may be a warden test that has become flaky. We've created a story for Runtime to investigate the test failures. In the meantime, we rekicked that Travis job and it passed, so we'll move the story for this PR over to Runtime as well for processing. Thanks!

-CF Community Pair (@ematpl & @mmb)

@glyn
Cloud Foundry member
@kelapure

Glyn, this is fantastic. Thank you for making this change!

@rmorgan

Glyn, I did some digging into this PR today; I see it's scheduled in the Runtime backlog for the week of the 24th. See: https://www.pivotaltracker.com/story/show/65709772

@ruthie ruthie merged commit dc6d938 into cloudfoundry:master Mar 7, 2014

1 check passed: The Travis CI build passed
@MarkKropf
Cloud Foundry member

@vito @mkocher Should similar logic be considered for garden?

@glyn
Cloud Foundry member