Make OOMScoreAdj= a runtime changeable property #29032

Closed
Werkov opened this issue Sep 1, 2023 · 5 comments
Labels
not-our-bug pid1 RFE 🎁 Request for Enhancement, i.e. a feature request

Comments

@Werkov
Contributor

Werkov commented Sep 1, 2023

Component

systemd

Is your feature request related to a problem? Please describe

Since commit ce7de0b, the systemd user instance runs with an adjusted oom_score_adj (100 by default). This is a PITA for podman and rootless containers: they may fail to start because they cannot set oom_score_adj = 0 (lower than the default 100), and they treat that failure as fatal.

A workaround is setting the score of user@$UID.service back to 0. However, that requires restarting the service, which is not very convenient in the middle of a running user session.

Describe the solution you'd like

The suggestion is to implement a setter for OOMScoreAdjust= applicable to MainPID= of a service.

Describe alternatives you've considered

  • Revert of ce7de0b.
  • Less fatal handling in podman/runc (whatever layer hits it).
  • "Soft logout", akin to recent soft reboot for systemd system instance, this would be an analogy for user instances allowing survival of user sessions (somehow).
  • echo 0 >/proc/$(systemctl -P MainPID show user@$UID.service)/oom_score_adj as a privileged user
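
For illustration, the "less fatal handling" alternative and the manual echo above could be combined roughly as follows. This is only a sketch: `set_oom_score_adj_soft` is a made-up helper name, not something podman, runc, or systemd provides, and it assumes a Linux procfs mounted at /proc.

```shell
#!/bin/sh
# Sketch only: set_oom_score_adj_soft is a hypothetical helper, not part of
# podman, runc, or systemd.

# Try to set oom_score_adj for PID $1 to value $2.
# Always returns 0: a failed write only produces a warning.
set_oom_score_adj_soft() {
    pid=$1
    want=$2
    file="/proc/$pid/oom_score_adj"

    # If the file cannot be read (no such process, no procfs), do nothing.
    current=$(cat "$file" 2>/dev/null) || return 0

    # Nothing to do if the value is already what we want.
    [ "$current" = "$want" ] && return 0

    # Lowering the value may be refused for unprivileged processes;
    # report that and carry on instead of aborting.
    if ! { echo "$want" > "$file"; } 2>/dev/null; then
        echo "warning: could not set $file to $want (still $current)" >&2
    fi
    return 0
}
```

Run as a privileged user, `set_oom_score_adj_soft "$(systemctl -P MainPID show user@$UID.service)" 0` would correspond to the last bullet; run unprivileged, it would merely warn when the write is refused.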

The systemd version you checked that didn't have the feature you are asking for

253

@Werkov Werkov added the RFE 🎁 Request for Enhancement, i.e. a feature request label Sep 1, 2023
@github-actions github-actions bot added the pid1 label Sep 1, 2023
@poettering
Member

So far we decided that runtime-changeable settings are only those that we can safely apply to the whole unit, which means the per-cgroup settings, but not the per-process settings. I don't think we should try to depart from that: don't expose settings which we cannot reasonably apply at once to the whole unit.

I am pretty sure podman/runc should handle issues around oom score adjustment gracefully anyway. Have you asked them to handle this more nicely?

Alternatively, if podman doesn't want to fix that they could ship a dropin for user@.service.d that turns the adjustment off. But of course, that would degrade system behaviour for everybody.
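
A minimal sketch of what such a drop-in could look like, written out by a small shell helper. The helper name `write_user_oom_dropin` and the file name `10-oom-score-adjust.conf` are made up for illustration; `OOMScoreAdjust=0` follows the "back to 0" workaround from the original report.

```shell
#!/bin/sh
# Sketch only: the file name 10-oom-score-adjust.conf and the helper name
# write_user_oom_dropin are made up for illustration.

# Write a drop-in for user@.service below the unit directory given as $1
# (on a real system that would typically be /etc/systemd/system).
write_user_oom_dropin() {
    unit_dir=$1
    dropin_dir="$unit_dir/user@.service.d"

    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/10-oom-score-adjust.conf" <<'EOF'
[Service]
OOMScoreAdjust=0
EOF
}
```

After writing the file, systemd would still need a `systemctl daemon-reload`, and the setting only takes effect once user@$UID.service is restarted, which is exactly the inconvenience the report describes.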

Hence I'd really just consider this a podman issue.

@poettering
Member

Why does runc even insist on resetting the oom adjust value to zero? That's pretty broken: allowing user container payloads to mark themselves as more relevant than the rest of the user code? Weird. Conceptually backwards if you ask me.

@poettering
Member

was this reported to podman/runc?

@Werkov
Contributor Author

Werkov commented Sep 1, 2023

I find touching pre-exec process info post-hoc conceptually analogous to catching a train that has already left the station, true. But it's convenient for the user if you don't have to wait for the next connection :-p
(oom_score_adj is a particular case that hit me badly).

Scoping is difficult too (hence MainPID=; iterating over the cgroup should be avoided).

I've filed a podman issue too now.

If you don't consider user convenience a strong point, I'm fine having this closed.

@poettering
Member

I guess this has been addressed in podman now:

containers/podman#19843

Closing.
