Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Correctly read container limits when running under k8s #1692

Merged
merged 8 commits into from
Aug 23, 2023
Merged

Conversation

talbii
Copy link
Contributor

@talbii talbii commented Aug 10, 2023

Closes #1683.


Background: In k8s, the memory/CPU cgroups which are reported by /proc/self/cgroup do not exist, i.e., the path /sys/fs/cgroup/<resource>/<k8s cgroup> doesn't exist. This causes Dragonfly to not read container limits at all, hence missing potential limits set by the user.

Fix: Begin by first reading /sys/fs/cgroup/<resource>/<limit>, which, in words, means "read the resource limits of the system". Later the path /sys/fs/cgroup/<resource>/<specific group> is read, updating the limit if needed.

Signed-off-by: talbii <ido@dragonflydb.io>
@talbii
Copy link
Contributor Author

talbii commented Aug 10, 2023

I would be happy if someone suggested a way to run Dragonfly under Kubernetes, as I'm rather noobish in that area 😄

Copy link
Contributor

@kostasrim kostasrim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, have you tested this manually to see if there is an issue? (0 experience with k8 here)

@kostasrim
Copy link
Contributor

I would be happy if someone suggested a way to run Dragonfly under Kubernetes, as I'm rather noobish in that area 😄

hahaha, I missed this and I explicitly asked for this! I am afraid I can't help much with k8 :(

src/server/dfly_main.cc Outdated Show resolved Hide resolved
@romange
Copy link
Collaborator

romange commented Aug 11, 2023

Please expand in the commit message what was the bug and what has been changed. It's not really easy to understand what the fix is just by reading the code.

Signed-off-by: talbii <ido@dragonflydb.io>
Signed-off-by: talbii <ido@dragonflydb.io>
Signed-off-by: talbii <ido@dragonflydb.io>
@@ -515,8 +515,10 @@ error_code GetCGroupPath(string* memory_path, string* cpu_path) {
if (groups.size() == 1) {
// for v2 we only read 0::<name>
size_t pos = cgv.rfind(':');
if (pos == string::npos)
if (pos == string::npos) {
*memory_path = cgv;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a bit hacky but I feel like it would save a bit of hassle in the future. Happy to hear some comments

(fyi: this is used in line 582)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fun of this because you transform a hard error to a passing condition.

Furthermore isn't this a bug? I mean both v1 and v2 must be of the form:

  // N:<cgroup name>:<path> -- in case of v1, in many lines
  // 0::<cgroup name> -- in case of v2, in a single line

So if : is missing, then something was read wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

realistically this branch should never enter because that is simply the expected format of cgroups. this is assuming the function which reads is correct, but I think its a fair assumption 😸

I guess originally I added that check as a sanity check. I will change this to LOG(ERROR) and exit, instead of returning.

@talbii talbii requested a review from kostasrim August 17, 2023 13:26
Signed-off-by: talbii <ido@dragonflydb.io>
Signed-off-by: talbii <ido@dragonflydb.io>
Signed-off-by: talbii <ido@dragonflydb.io>
Copy link
Contributor

@kostasrim kostasrim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tested this?

I have to look a little bit deeper on how control groups are managed by linux. I will come back to this PR, in the meantime when you get a chance reply to my questions!

@@ -515,8 +515,10 @@ error_code GetCGroupPath(string* memory_path, string* cpu_path) {
if (groups.size() == 1) {
// for v2 we only read 0::<name>
size_t pos = cgv.rfind(':');
if (pos == string::npos)
if (pos == string::npos) {
*memory_path = cgv;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fun of this because you transform a hard error to a passing condition.

Furthermore isn't this a bug? I mean both v1 and v2 must be of the form:

  // N:<cgroup name>:<path> -- in case of v1, in many lines
  // 0::<cgroup name> -- in case of v2, in a single line

So if : is missing, then something was read wrong?

src/server/dfly_main.cc Outdated Show resolved Hide resolved
if (mmax && !absl::StartsWith(*mmax, "max")) {
CHECK(absl::SimpleAtoi(*mmax, &mdata->mem_total));
}
mdata->mem_avail = min(mdata->mem_avail, mdata->mem_total);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the total memory ever be less than the available memory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh good catch! because read_mem() also updates via min I think this min is actually redundant!

Copy link
Contributor Author

@talbii talbii Aug 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

edit: wait no that is not right. If we set memory.max to 8G, not set memory.high, and the system has more than 8G of available memory then mem_avail > mem_total.

On the other hand, if the available memory is less than what was set in mem_total, then mem_avail <= mem_total.

So I don't think its possible to remove the min() here.

@talbii
Copy link
Contributor Author

talbii commented Aug 21, 2023

Have you tested this?

I have to look a little bit deeper on how control groups are managed by linux. I will come back to this PR, in the meantime when you get a chance reply to my questions!

if you have any issues feel free to contact internally :)

When I first introduced this I wrote a small guide for testing: e646476

Edit: yes I have tested, inside and outside a K8s pod, everything worked well.

Signed-off-by: talbii <ido@dragonflydb.io>
@talbii talbii merged commit 74fcd3e into main Aug 23, 2023
10 checks passed
@talbii talbii deleted the fix-cgroup-v1 branch August 23, 2023 11:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect CGroup v1 Resource Detection
3 participants