Setting oom_score_adj while launching applications #115

wking · 2015-08-26T18:31:56Z

This spins off a more tightly-scoped version of #114. The oom_score_adj is more of a host-side and/or multi-container-orchestration issue, and less of a bundle issue, which means it probably should be in runtime.json if/once #88 lands. Possible approaches for setting this include:

1. The runtime writes to /proc/<pid>/oom_score_adj, which would need a config-side setting.
2. The host injects a pre-start hook to write to /proc/<pid>/oom_score_adj. This would require a hook with sufficient permissions for the write.
3. The application has a startup phase where it handles this sort of thing before execing the main process. This would require the application to have sufficient permissions for the write, although it could drop them after writing and before execing the “real” application.
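
To make approach (3) concrete, here is a minimal sketch of a startup wrapper that adjusts its own score and then execs the real application. The function name `adjust_then_exec` and the `PROC_ROOT` override are hypothetical (the override only exists so the sketch can be tried against a scratch directory), and the sketch assumes `echo` is a shell builtin, so that the shell itself opens the file under `self`.

```shell
#!/bin/sh
# Hypothetical wrapper for approach (3); not taken from any runtime.
# PROC_ROOT is an assumed override so the sketch can be exercised
# against a scratch directory instead of the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Usage: adjust_then_exec ADJ COMMAND [ARGS...]
adjust_then_exec() {
    adj="$1"
    shift
    # With a builtin echo the shell opens the redirect itself, so
    # "self" refers to this process, whose PID is preserved by exec.
    echo "${adj}" > "${PROC_ROOT}/self/oom_score_adj" || return 1
    # Replace the wrapper with the "real" application.
    exec "$@"
}

# As a container entry point, the last line would be:
#   adjust_then_exec "$@"
```

A container would then be configured to run something like `wrapper.sh 10 /usr/bin/app` (both names are placeholders) instead of the application directly. Dropping permissions between the write and the exec is not shown here.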

A number of attributes that could be handled via (2) currently have explicit, (1)-style configs instead (e.g. setting up networking and creating cgroups and namespaces). I'd guess the balance involves “how easy is it to handle without (1)?” and “how frequently will folks be tweaking this attribute?”, with high-cost or high-frequency attributes being handled via (1). So which way do we think makes the most sense for this particular setting?

Is it easy to handle via (2) or (3)? It seems like (2) would be easy assuming sufficient hook permissions, but (3) is probably too annoying to be worth the trouble.

How frequently do we expect folks will use this? I can't weigh in here, since I haven't set this. And I expect most runtime managers that set this will be doing it automatically, so in that case it's a wash between (1) and (2) for difficulty.

If those assumptions are correct, then I think we should go with (2), since that is the least work on the spec/runtime-implementation side. If nobody chimes in with anti-(2) thoughts in the next few days, I'll merge opencontainers/runc#160 locally and see whether I can get it working ;).

wking · 2015-09-06T18:00:25Z

I merged opencontainers/runc#160 (at opencontainers/runc@4f7ff04) into runc's master (at opencontainers/runc@0f85e4e), reverted opencontainers/runc@cc232c47 (part of opencontainers/runc#232) to avoid clobbering my adjustment, and wrote a host-side hook to handle the adjustment:

$ cat /tmp/oom-score-adj.sh
#!/bin/sh
ADJ="$1"
#PID=$(jq -r .pid)
PID=$(jq -r .init_process_pid)
echo "${ADJ}" > "/proc/${PID}/oom_score_adj"

The init_process_pid vs. pid in the state JSON is a difference between runC's state as of opencontainers/runc#160 and the spec that landed in #87 (after the initial runC work).
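
A hook that needs to work with either field name could try `pid` first and fall back to `init_process_pid`. The following is only a sketch: it assumes jq is installed, and `state_pid`, `adjust_oom_score` and the `PROC_ROOT` override are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# Sketch of a prestart hook that tolerates both state-JSON field names.
# PROC_ROOT is an assumed override so the hook can be exercised against
# a scratch directory instead of the real /proc.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Read the state JSON on stdin and print the container process's PID.
# Prefers "pid" and falls back to "init_process_pid"; prints nothing
# if neither field is set.
state_pid() {
    jq -r '.pid // .init_process_pid // empty'
}

# Usage: adjust_oom_score ADJ   (state JSON on stdin)
adjust_oom_score() {
    adj="$1"
    pid=$(state_pid)
    if [ -z "${pid}" ]; then
        echo "no PID found in state JSON" >&2
        return 1
    fi
    echo "${adj}" > "${PROC_ROOT}/${pid}/oom_score_adj"
}

# As a hook script, the last line would be:
#   adjust_oom_score "$1"
```

With that final line added, the script would be called the same way as the hook above, with the adjustment as its first argument and the state JSON on stdin.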

Then I added the hook to my config.json:

$ cat config.json
{
  …
  "hooks": {
    "prestart": [
      {
        "path": "/tmp/oom-score-adj.sh",
        "args": ["oom-score-adj.sh", "+10"]
      }
    ],
    "poststop": null
  },
  …
}

Then launching the container gives the right score:

# runc
/ # cat /proc/1/oom_score
10

So the hook approach ((2) in my original post here) is valid, and not particularly difficult. I propose we document that approach in this spec (for the reasons outlined in my original post here) and roll back opencontainers/runc#232 in runC.

vishh · 2015-10-12T20:18:26Z

The init process is a runC thing. In my mind, pre-start hooks are launched when the container environment has been established. Whether there is an init process involved or not is an implementation detail.

Do we have other use cases which demand the need for giving access to the init process from a hook?

wking · 2015-10-12T20:35:24Z

On Mon, Oct 12, 2015 at 01:18:28PM -0700, Vish Kannan wrote:

> The init process is a runC thing.

Regardless of implementation, a runtime will almost certainly need to
call setns(2), unshare(2), pivot_root(2) or some such activity before
execing the user-specified container process. I don't know how you'd
do that without some runtime-specified code being run in the container
process before the user-specified code is executed.

> In my mind, pre-start hooks are launched when the container
> environment has been established.

Agreed. And the container isn't established until there's a process
in there holding open PID namespaces, mount namespaces, etc., etc.

> Do we have other use cases which demand the need for giving access
> to the init process from a hook?

What do you mean “giving access”? Just “listing the
container-process's PID in the pre-start state JSON”?

wking · 2015-10-13T00:07:55Z

On Mon, Oct 12, 2015 at 01:33:20PM -0700, W. Trevor King wrote:

> Mon, Oct 12, 2015 at 01:18:28PM -0700, Vish Kannan:
>
> > The init process is a runC thing.
>
> Regardless of implementation, a runtime will almost certainly need
> to call setns(2), unshare(2), pivot_root(2) or some such activity
> before execing the user-specified container process. I don't know
> how you'd do that without some runtime-specified code being run in
> the container process before the user-specified code is executed.

@vishh pointed out that the unshare and pivot_root calls can live in a
separate container-side process that exits before the process that
will become the user-specified container process starts [1]. You'll
still need runtime-specified setns calls in the process that will
become the user-specified container process, but if you allow for a
missing PID namespace at pre-start time those hooks may land in the
window between the two container processes:

1. Launch container-side setup process to unshare, pivot_root, etc.
2. Bind mount those new namespaces somewhere persistent.
3. Container-side setup process exits.
4. Run pre-start hooks.
5. Launch container-side process to unshare a PID namespace (if the
   user wanted a new one), configure uid/group mappings, setns,
   setuid, setgid, setgroups, drop caps, and exec user-specified code.

You can get a fair ways toward handling its unshare/setns bits with
util-linux's unshare(1) and nsenter(1). But by the time you add all
the rest of the logic I still think you're outside the realm of what's
easily possible with stock binaries, and we might as well just use the
same container process for (1) and (5).
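
For illustration, the unshare/setns part of that flow might look roughly like this with util-linux alone. This is a sketch under several assumptions: a util-linux new enough that unshare(1) accepts `--uts=FILE` to bind mount the new namespace, root privileges, and only a UTS namespace (no pivot_root, PID namespace, ID mappings or capability handling). `NS_DIR`, `DRY_RUN`, `run`, `setup_namespace` and `enter_and_exec` are hypothetical names.

```shell
#!/bin/sh
# Sketch of the two-process flow using util-linux; all paths and
# names here are hypothetical.
NS_DIR="${NS_DIR:-/run/example-ns}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Steps 1-3: a short-lived setup process unshares a UTS namespace.
# unshare(1) bind mounts it at ${NS_DIR}/uts, so the namespace
# persists after the setup process (hostname) exits.
setup_namespace() {
    run mkdir -p "${NS_DIR}"
    run touch "${NS_DIR}/uts"
    run unshare --uts="${NS_DIR}/uts" hostname container
}

# Step 4 would run the pre-start hooks here. Note that no container
# process exists at this point, so there is no PID to hand to a hook.

# Step 5: a second process joins the persisted namespace and runs
# the user-specified command.
enter_and_exec() {
    run nsenter --uts="${NS_DIR}/uts" "$@"
}
```

Running `DRY_RUN=1 setup_namespace` prints the commands without needing root, which is a cheap way to inspect the sequence.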

So I don't think it's a particularly large win to support the “two
separate container processes” runtime implementation. And once you've
got a single container process, your pre-start hooks can rely on
having a PID namespace and a useful PID in their state JSON.

Am I missing something?

vbatts · 2016-03-16T17:23:59Z

with hooks being in the spec, the setting of this should be supported.
Closing this issue.

wking · 2016-03-16T21:40:45Z

On Wed, Mar 16, 2016 at 10:24:02AM -0700, Vincent Batts wrote:

> with hooks being in the spec, the setting of this should be supported.

This issue was “how do we want to support things like oom_score_adj”,
with me arguing for folks to handle it in a hook (option 2 in the
topic post) instead of adding a new config-side setting (option 1) or
handling it in the user-configured application (option 3). The
bigger-picture issue is which kernel APIs this spec wraps and how thin
a wrapper it should be (see also [1]). I think those are still
important questions to sort out, but in this case #222 chose in favor
of a new config-side setting. @vishh had a few comments/links
motivating that choice [2,3,4], but I'm still not clear on why a
config-side setting was chosen over the hook-based approach [5].

Anyway, I'm happy to continue this discussion on the mailing list, or
as needed as new settings are proposed.

[1]: Subject: removal of cgroups from the OCI Linux spec
     Date: Wed, 28 Oct 2015 17:01:59 +0000
     Message-ID: <CAD2oYtO1RMCcUp52w-xXemzDTs+J6t4hS5Mm4mX+uBnVONGDfA@mail.gmail.com>

wking mentioned this issue Aug 29, 2015

Adding oom_score_adj as a container config param opencontainers/runc#232

Merged

wking mentioned this issue Sep 7, 2015

Add prestart/poststop hooks to runc opencontainers/runc#160

Merged

This was referenced Oct 12, 2015

Add oom_score_adj to the runtime Spec. #222

Merged

Support for Quality of Service #114

Closed

wking mentioned this issue Oct 12, 2015

runtime: Add more steps to the lifecycle docs #207

Closed

This was referenced Oct 13, 2015

Expand on the definition of our ops #225

Merged

Container manipulated via multiple commands #226

Closed

wking referenced this issue in duglin/runc Oct 29, 2015

Split container.Start() into create() and runProcess()

bb180ac

While in there, to show why someone may want it, I also added support for: runc run cmd args... runc batch batchFileOfCommands | - Signed-off-by: Doug Davis <dug@us.ibm.com>

wking mentioned this issue Dec 11, 2015

runtime-config: Require serial hook execution #265

Closed

vbatts closed this as completed Mar 16, 2016

wking mentioned this issue Mar 18, 2016

Support Lifecycle Hooks #20

Closed
