oom diagnostic logging #54

Merged
merged 1 commit into cloudfoundry:master

7 participants

@glyn
Owner

Disable the oom killer for a container's memory cgroup when the memory
limit is set. If oom occurs later, the problem task (that is, the one
which triggered oom) will be suspended ([1]) and the oom notifier will
be driven. This gives us the opportunity to gather diagnostics showing
details of the out of memory condition.

In the oom notifier, log memory usage, memory limit, memory + swap usage,
memory + swap limit, and memory statistics, then re-enable the oom killer.

Re-enabling the oom killer is necessary in spite of the fact that the
oom notifier proceeds to terminate the container. This is because the
current method of terminating the container can, at least in theory,
deadlock when the container is out of memory (wshd can require more
memory, e.g. for a stack frame, and be suspended due to lack of
memory).

Once re-enabled, the oom killer will kill a task (usually the
application) in the container ([2]) and this will enable container
termination to successfully kill the remaining tasks via wshd. If wshd
hits the container's memory limit with the oom killer enabled, the oom
killer will kill it and this will kill all the other processes in the
container (since wshd is the PID namespace parent).

IntelliJ IDEA .iml files are ignored.

Footnotes:

[1] Linux kernel's ./Documentation/cgroups/memory.txt states:

"If OOM-killer is disabled, tasks under cgroup will hang/sleep in
 memory cgroup's OOM-waitqueue when they request accountable
 memory."

[2] Although memory.txt does not specify that re-enabling the oom
killer in the oom notifier will cause it to kill a task, this
seems like the only reasonable behaviour and it seems to work
that way in practice.
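
For readers less familiar with the cgroup v1 memory controller, here is a minimal sketch (not Warden code; the helper names are illustrative and the cgroup path is assumed to follow the layout used in the spec below) of the control files the description relies on: writing "1"/"0" to memory.oom_control toggles the per-cgroup oom killer, and the usage/limit files supply the numbers that get logged.

    require "logger"

    # Illustrative only -- `memory_cgroup` is assumed to be the container's memory
    # cgroup directory, e.g. /tmp/warden/cgroup/memory/instance-<handle>.
    def set_oom_killer(memory_cgroup, enabled)
      # "1" disables the memory cgroup's oom killer; "0" re-enables it.
      File.write(File.join(memory_cgroup, "memory.oom_control"), enabled ? "0" : "1")
    end

    def log_oom_diagnostics(memory_cgroup, logger)
      %w[memory.usage_in_bytes memory.limit_in_bytes
         memory.memsw.usage_in_bytes memory.memsw.limit_in_bytes].each do |file|
        value = begin
                  File.read(File.join(memory_cgroup, file)).chomp
                rescue SystemCallError
                  "-" # memory.memsw.* cannot be read when swap accounting is off
                end
        logger.warn("#{file}: #{value}")
      end
    end

    # Flow described above (sketch only):
    #   set_oom_killer(cgroup, false)                      # when the memory limit is set
    #   log_oom_diagnostics(cgroup, Logger.new($stderr))   # in the oom notifier
    #   set_oom_killer(cgroup, true)                       # before terminating the container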

@glyn glyn oom diagnostic logging
(commit message is the same as the pull request description above)
1447b38
@cf-gitbot
Owner

We have created an issue in Pivotal Tracker to manage this. You can view the current status of your issue at: http://www.pivotaltracker.com/story/show/65709772

@ematpl
Owner

Hi, @glyn,

Looks like there were a few Travis failures on the warden-client running on Ruby 2.0.0. Could you investigate those?

Thanks,
-CF Community Pair (@ematpl & @mmb)

@glyn
Owner
@glyn
Owner
@ematpl
Owner

Sure, it's at https://travis-ci.org/cloudfoundry/warden/builds/18796998. That link is also available from the little red X to the left of the commit sha in this thread, but that's far from obvious. :)

Thanks,
CF Community Pair (@ematpl & @mmb)

@glyn
Owner
@glyn
Owner
@ematpl
Owner

Hi, @glyn,

Looks like this may be a warden test that has become flaky. We've created a story for Runtime to investigate the test failures. In the meantime, we rekicked that Travis job and it passed, so we'll move the story for this PR over to Runtime as well for processing. Thanks!

-CF Community Pair (@ematpl & @mmb)

@glyn
Owner
@kelapure

Glyn, this is fantastic. Thank you for making this change!

@rmorgan
Owner

Glyn, I did some digging into this PR today; I see it's scheduled in the Runtime backlog for the week of the 24th. See: https://www.pivotaltracker.com/story/show/65709772

@ruthie ruthie merged commit dc6d938 into cloudfoundry:master

1 check passed

default: The Travis CI build passed (Details)
@MarkKropf
Owner

@vito @mkocher Should similar logic be considered for garden?

@glyn
Owner
Commits on Feb 13, 2014
  1. @glyn

    oom diagnostic logging

    glyn authored
    (commit message as above)
1  .gitignore
@@ -12,3 +12,4 @@ ci-working-dir
*.swp
.rvmrc
*.pid
+*.iml
46 warden/lib/warden/container/features/mem_limit.rb
@@ -22,7 +22,7 @@ def initialize(container)
   @container = container
   oom_notifier_path = Warden::Util.path("src/oom/oom")
-  @child = DeferredChild.new(oom_notifier_path, container.cgroup_path(:memory))
+  @child = DeferredChild.new(oom_notifier_path, container.memory_cgroup_path)
   @child.logger = logger
   @child.run
   @child_exited = false
@@ -65,14 +65,49 @@ def restore
 end
 def oomed
-  logger.warn("OOM happened for #{handle}")
+  memory = memory_cgroup_file_contents('memory.usage_in_bytes')
+  memory_limit = memory_cgroup_file_contents('memory.limit_in_bytes')
+  swap = memory_cgroup_file_contents('memory.memsw.usage_in_bytes')
+  swap_limit = memory_cgroup_file_contents('memory.memsw.limit_in_bytes')
+  stats = format_memory_stats(memory_cgroup_file_contents('memory.stat'))
+  logger.warn("OOM happened for container with handle '#{handle}', memory usage: #{memory}, memory limit: #{memory_limit}, memory + swap usage: #{swap}, memory + swap limit: #{swap_limit}, #{stats}")
   events << "out of memory"
+
+  oom_killer true
+
   if state == State::Active
     dispatch(Protocol::StopRequest.new)
   end
 end
+def format_memory_stats(memory_stats)
+  memory_stats.gsub(' ', ': ').gsub("\n", ', ')
+end
+
+private :format_memory_stats
+
+def oom_killer(enable)
+  File.open(File.join(memory_cgroup_path, "memory.oom_control"), 'w') do |f|
+    f.write(enable ? '0' : '1')
+  end
+end
+
+private :oom_killer
+
+def memory_cgroup_file_contents(filename)
+  File.read(File.join(memory_cgroup_path, filename)).chomp
+rescue
+  # memory.memsw.* files cannot be read when swapping is off
+  '-'
+end
+
+private :memory_cgroup_file_contents
+
+def memory_cgroup_path
+  cgroup_path(:memory)
+end
+
 def start_oom_notifier_if_needed
   unless @oom_notifier
     @oom_notifier = OomNotifier.new(self)
@@ -89,6 +124,9 @@ def start_oom_notifier_if_needed
 private :start_oom_notifier_if_needed
 def limit_memory(limit_in_bytes)
+  # Disable the oom killer before setting up the oom notifier.
+  oom_killer false
+
   # Need to set up the oom notifier before we set the memory
   # limit to avoid a race between when the limit is set and
   # when the oom notifier is registered.
@@ -106,7 +144,7 @@ def limit_memory(limit_in_bytes)
   # successfully. To mitigate this, both limits are written twice.
   2.times do
     ["memory.limit_in_bytes", "memory.memsw.limit_in_bytes"].each do |path|
-      File.open(File.join(cgroup_path(:memory), path), 'w') do |f|
+      File.open(File.join(memory_cgroup_path, path), 'w') do |f|
         f.write(limit_in_bytes.to_s)
       end
     end
@@ -126,7 +164,7 @@ def do_limit_memory(request, response)
     end
   end
-  limit_in_bytes = File.read(File.join(cgroup_path(:memory), "memory.limit_in_bytes"))
+  limit_in_bytes = File.read(File.join(memory_cgroup_path, "memory.limit_in_bytes"))
   response.limit_in_bytes = limit_in_bytes.to_i
   nil
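
For illustration, format_memory_stats above simply flattens memory.stat's "key value" lines into the single warning logged by oomed; with hypothetical values:

    format_memory_stats("cache 0\nrss 1048576\n")
    # => "cache: 0, rss: 1048576, "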
10 warden/spec/container/linux_spec.rb
@@ -204,7 +204,11 @@ def trigger_oom
 describe "setting limits" do
   def integer_from_memory_cgroup(file)
-    File.read(File.join("/tmp/warden/cgroup/memory", "instance-#{@handle}", file)).to_i
+    memory_cgroup_file_contents(file).to_i
+  end
+
+  def memory_cgroup_file_contents(file)
+    File.read(File.join("/tmp/warden/cgroup/memory", "instance-#{@handle}", file))
   end
   let(:hundred_mb) { 100 * 1024 * 1024 }
@@ -214,6 +218,10 @@ def integer_from_memory_cgroup(file)
     response.limit_in_bytes.should == hundred_mb
   end
+  it 'disables oom killer' do
+    memory_cgroup_file_contents("memory.oom_control").should match(/oom_kill_disable\s*1/)
+  end
+
   it "sets `memory.limit_in_bytes`" do
     integer_from_memory_cgroup("memory.limit_in_bytes").should == hundred_mb
   end
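
For context: on the cgroup v1 memory controller, reading memory.oom_control returns key/value lines roughly of the form below, which is what the "disables oom killer" expectation above matches:

    oom_kill_disable 1
    under_oom 0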