Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libct/cg: support hugetlb rsvd #4073

Merged
merged 1 commit into from
Oct 19, 2023
Merged

Conversation

kolyshkin
Copy link
Contributor

@kolyshkin kolyshkin commented Oct 14, 2023

This adds support for hugetlb.<pagesize>.rsvd limiting and accounting.

The previous non-rsvd max/limit_in_bytes does not account for reserved
huge page memory, making it possible for a processes to reserve all the
huge page memory, without being able to allocate it (due to cgroup
restrictions).

In practice this makes it possible to successfully mmap more huge page
memory than allowed via the cgroup settings, but when using the memory
the process will get a SIGBUS and crash. This is bad for applications
trying to mmap at startup (and it succeeds), but the program crashes
when starting to use the memory. eg. postgres is doing this by default.

This also keeps writing to the old max/limit_in_bytes, for backward
compatibility.

More info can be found here: https://lkml.org/lkml/2020/2/3/1153

(commit message mostly written by @odinuge)

Fixes: #3859.

This adds support for hugetlb.<pagesize>.rsvd limiting and accounting.

The previous non-rsvd max/limit_in_bytes does not account for reserved
huge page memory, making it possible for a processes to reserve all the
huge page memory, without being able to allocate it (due to cgroup
restrictions).

In practice this makes it possible to successfully mmap more huge page
memory than allowed via the cgroup settings, but when using the memory
the process will get a SIGBUS and crash. This is bad for applications
trying to mmap at startup (and it succeeds), but the program crashes
when starting to use the memory. eg. postgres is doing this by default.

This also keeps writing to the old max/limit_in_bytes, for backward
compatibility.

More info can be found here: https://lkml.org/lkml/2020/2/3/1153

(commit message mostly written by Odin Ugedal)

Co-authored-by: Odin Ugedal <odin@ugedal.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin
Copy link
Contributor Author

I'm on the verge of backporting this to 1.1 -- there is not too much code, and this helps a lot to fix issues with e.g. postgres (see #3859 (comment))

@kolyshkin kolyshkin added backport/1.1-pr A backport to 1.1.x release. backport/todo/1.1 A PR in main branch which needs to be backported to release-1.1 and removed backport/1.1-pr A backport to 1.1.x release. labels Oct 16, 2023
Comment on lines +36 to +38
if err := cgroups.WriteFile(path, prefix+".rsvd"+suffix, val); err != nil {
if errors.Is(err, os.ErrNotExist) {
skipRsvd = true
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea here, those .rsvd. files either exist or not, so if the first such file doesn't exist, we set the skipRsvd=true and do not try to use .rsvd. files any more.

Comment on lines +60 to +63
if rsvd != "" && errors.Is(err, os.ErrNotExist) {
rsvd = ""
goto again
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For getting the stats, we prefer .rsvd. files, if they exist.

return err
}
if skipRsvd {
continue

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be a break instead, so the loop stops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't want to stop the loop here. We want to write all hugetlb.XXX.limit_in_bytes and, if available, all hugetlb.XXX.rsvd.limit_in_bytes as well (and we're looping over XXX).

@odinuge
Copy link
Contributor

odinuge commented Oct 19, 2023

Thanks for pushing this @kolyshkin! Life and work happened, and I haven't had much time to spend on runc and kernel stuff after graduating from University. Very nice seeing you pick this up and pushing it over the edge!

@mrunalp mrunalp merged commit 437725c into opencontainers:main Oct 19, 2023
45 checks passed
@kolyshkin kolyshkin added backport/done/1.1 A PR in main branch which was backported to release-1.1 and removed backport/todo/1.1 A PR in main branch which needs to be backported to release-1.1 labels Oct 19, 2023
@kolyshkin kolyshkin mentioned this pull request Oct 23, 2023
@AkihiroSuda AkihiroSuda mentioned this pull request Mar 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cgroupv1 area/cgroupv2 backport/done/1.1 A PR in main branch which was backported to release-1.1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support rsvd hugetlb cgroup
5 participants