drivers/executor: set oom_score_adj for raw_exec #19515

Merged
shoenig merged 5 commits into hashicorp:main on Jan 2, 2024

mattrobenolt
Contributor

This might not hold for every Nomad configuration, since I don't know them all, but in our use case we run some of our tasks as `raw_exec`.

We observed that our tasks were running with `oom_score_adj = -1000`, which prevents them from being OOM-killed. The value is inherited from the parent Nomad agent process, as configured by systemd.
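For anyone who wants to check the same thing on their own hosts, here is a minimal Go sketch that reads the adjustment for a process from procfs. The helper names `parseOOMScoreAdj` and `readOOMScoreAdj` are hypothetical and are not part of Nomad; the only assumption is a Linux host with `/proc` mounted.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseOOMScoreAdj parses the contents of an oom_score_adj file.
// Hypothetical helper for illustration; the kernel accepts -1000..1000.
func parseOOMScoreAdj(raw string) (int, error) {
	v, err := strconv.Atoi(strings.TrimSpace(raw))
	if err != nil {
		return 0, fmt.Errorf("parse oom_score_adj: %w", err)
	}
	if v < -1000 || v > 1000 {
		return 0, fmt.Errorf("oom_score_adj %d out of range [-1000, 1000]", v)
	}
	return v, nil
}

// readOOMScoreAdj reads the adjustment for a pid; pass "self" for the
// current process.
func readOOMScoreAdj(pid string) (int, error) {
	raw, err := os.ReadFile("/proc/" + pid + "/oom_score_adj")
	if err != nil {
		return 0, err
	}
	return parseOOMScoreAdj(string(raw))
}

func main() {
	v, err := readOOMScoreAdj("self")
	if err != nil {
		fmt.Println("could not read oom_score_adj:", err)
		return
	}
	fmt.Println("oom_score_adj:", v)
}
```

Passing the PID of a running task instead of `"self"` should show whether it picked up the agent's value.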

Similar to #10698, we were also surprised to see this value inherited by every child process, and we believe it should be set to 0 explicitly here as well.

I have no idea whether other code paths might be affected or whether `raw_exec` can manifest in other ways, but this is how I was able to observe and fix the problem in one of our configurations.

In production, we have been running our tasks wrapped in a script that does `echo 0 > /proc/self/oom_score_adj` to avoid this issue.

@mattrobenolt
Contributor Author

To add: we applied this patch on top of the 1.7.1 tag on a production host, and it behaves as expected.

@shoenig
Member

shoenig commented Jan 2, 2024

Thanks @mattrobenolt! Hope you don't mind, I've cleaned up the code a bit and added an e2e test. Spot-checking to make sure it works:

before:

➜ go test -v
=== RUN   TestRawExec
=== RUN   TestRawExec/testOomAdj
    assert.go:14:
        rawexec_test.go:28: expected string to contain substring; it does not
        ↪ substring: 0
        ↪    string: -999
--- FAIL: TestRawExec (4.01s)
    --- FAIL: TestRawExec/testOomAdj (3.83s)

after:

➜ go test -v
=== RUN   TestRawExec
=== RUN   TestRawExec/testOomAdj
--- PASS: TestRawExec (2.96s)
    --- PASS: TestRawExec/testOomAdj (2.77s)
PASS
ok      github.com/hashicorp/nomad/e2e/rawexec  2.962s
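A note on the "before" output: the assertion is a substring check for `0`, and the inherited value in the e2e run is -999. One plausible reason for that choice (an assumption on my part, not something stated in the PR) is that a substring check would also match an inherited -1000. A small Go sketch of that effect, using a hypothetical `looksReset` function that stands in for the test's assertion:

```go
package main

import (
	"fmt"
	"strings"
)

// looksReset mimics a substring-style assertion that the task output
// contains "0". Hypothetical stand-in, not the actual e2e test code.
func looksReset(output string) bool {
	return strings.Contains(output, "0")
}

func main() {
	// "-1000" contains the character 0, so it would satisfy the check
	// even though the value was never reset; "-999" would not.
	for _, value := range []string{"-1000", "-999", "0"} {
		fmt.Printf("%6s -> %v\n", value, looksReset(value))
	}
}
```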

@mattrobenolt
Contributor Author

That's great! Thank you. :)

Do you know if this will make it into a 1.7.x release, or will it need to wait for 1.8?

@shoenig
Member

shoenig commented Jan 2, 2024

It will go into the next 1.7.x bugfix release (1.7.3, assuming we don't get a CVE release before then).

@shoenig shoenig merged commit 656bb5c into hashicorp:main Jan 2, 2024
14 of 17 checks passed
@shoenig shoenig added the backport/1.7.x backport to 1.7.x release line label Jan 2, 2024
@mattrobenolt mattrobenolt deleted the oom-score-adj branch January 4, 2024 03:56
nvanthao pushed a commit to nvanthao/nomad that referenced this pull request Mar 1, 2024
* drivers/executor: set oom_score_adj for raw_exec

* drivers/executor: minor cleanup of setting oom adjustment

* e2e: add test for raw_exec oom adjust score

* e2e: set oom score adjust to -999

* cl: add cl

---------

Co-authored-by: Seth Hoenig <shoenig@duck.com>