
Set 0.05 CPU guarantee for all user pods #801

Merged (2 commits) on Nov 9, 2021

Conversation

@yuvipanda (Member) commented Nov 1, 2021

If no CPU limit is set, it is possible for a single user or group of users to
starve everyone else of CPU time on a node. This can even cause new user pods to
fail completely, because the notebook server process gets no CPU to complete the auth
handshake with the server, and even trivial cells like print("hello world") may not run.
Unlike memory guarantees, CPU guarantees are actually enforced by the Linux kernel
(see https://medium.com/@betz.mark/understanding-resource-limits-in-kubernetes-cpu-time-9eff74d3161b).
By giving each user a 5% CPU guarantee (represented by 0.05), we ensure that:

  1. Simple cells will always execute
  2. Notebook server processes will always start, so users won't see server spawn failures
  3. We don't accidentally set just a high limit for a particular hub without setting a
    guarantee, at which point Kubernetes treats the limit as the guarantee! That causes
    far more nodes to be scaled up than needed, making everything super slow (like in
    [Incident] PaleoHack User sessions won't start #790)
  4. Most of our workloads are still memory bound, and we want scaling to happen only
    when a node is full on its memory guarantees. A 0.05 guarantee means an n1-highmem-8
    node can fit 160 user pods, and since Kubernetes already caps us at 100 pods per node,
    this guarantee doesn't actually change our scheduling.

Ref #790
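As a concrete sketch of what this amounts to in chart config, assuming z2jh-style singleuser resource keys (key paths and nesting here are illustrative, not copied from this PR's diff):

  jupyterhub:
    singleuser:
      cpu:
        # 5% of a core guaranteed to every user pod, enforced by the kernel via
        # cgroup CPU shares; enough to start the notebook server and run trivial
        # cells even on a heavily loaded node.
        guarantee: 0.05
        # No cpu limit is set, so users can still burst into idle CPU on the node.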

When a limit is set but no guarantee, the guarantee is set to match the limit! For most
use cases, when we set a limit but not a guarantee, we want to offer no guarantee and
only a limit. This doesn't seem to be possible at all (need to investigate why). In the
meantime, setting a super low guarantee here means we can guard against issues like
2i2c-org#790, where setting the limit
but not the guarantee just gave users a huge guarantee, causing node spin-ups to fail

Ref 2i2c-org#790
@yuvipanda (Member, Author)

@consideRatio how would kubespawner currently represent 'no guarantee but a limit'? Is that even possible?

@consideRatio (Member)

From the k8s docs at https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits

Note: If a Container specifies its own memory limit, but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit. Similarly, if a Container specifies its own CPU limit, but does not specify a CPU request, Kubernetes automatically assigns a CPU request that matches the limit.

I assume kubespawner would represent no cpu/memory guarantee as an unset resource requests, which then makes the k8s api-server automatically set the resource requests to the explicit limit as documented.
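For illustration, here is roughly how that defaulting plays out on a container's resources block (values are made up):

  resources:
    limits:
      cpu: "2"
      memory: 4Gi
    # requests left unset in the manifest; the api-server defaults them to match
    # the limits, so the stored pod spec effectively contains:
    #   requests:
    #     cpu: "2"
    #     memory: 4Gi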

Comment on lines 153 to 160
cpu:
# When limit is set but not guarantee, guarantee is set to match limit! For most
# use cases, when we set limit but not guarantee, we want to offer no guarantee and
# a limit. This doesn't seem to be possible at all (need to investigate why). In the
# meantime, setting a super low guarantee here means we can guard against issues like
# https://github.com/2i2c-org/infrastructure/issues/790, where setting the limit
# but not guarantee just gave users a huge guarantee, causing node spin ups to fail
guarantee: 0.01
Review comment from a Member:

This can make sense in a way, but it also forces everyone to learn about 2i2c-specific configuration.

It's a trade-off between users being able to assume Kubernetes-native behavior, versus them not being able to but potentially avoiding mistakes caused by that behavior.

I'm conflicted about setting various defaults in the base chart overall due to this, but since we already have defaults about resource requests/limits I don't see it as an escalation of such complexity.

@yuvipanda (Member, Author) Nov 1, 2021

I'm conflicted about setting various defaults in the base chart overall due to this, but since we already have defaults about resource requests/limits I don't see it as an escalation of such complexity.

+1. Specifically, we already set default memory requests and limits here, and I don't think of these as anything 2i2c specific.

@yuvipanda (Member, Author)

@consideRatio right - so this means that 'limit but no guarantee' is unrepresentable in kubernetes?

@consideRatio (Member)

so this means that 'limit but no guarantee' is unrepresentable in kubernetes?

You can set resource requests to zero explicitly to represent 'limit but no guarantee', but leaving them unset implies something other than 'no guarantee'.
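So 'limit but no guarantee' would have to be spelled out explicitly, roughly like this (a sketch, not config from this repo):

  resources:
    requests:
      cpu: "0"    # explicitly zero: no guarantee, only whatever CPU is left over
    limits:
      cpu: "2"    # hard cap, enforced by CFS throttling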

@damianavila (Contributor)

Since we are here... do we also want to have a limit on the CPU?

@choldgraf (Member) commented Nov 1, 2021

@damianavila I believe this PR is setting a guarantee on the CPU - one already exists for RAM, I believe. EDIT: I see you're talking about limits, not guarantees; ignore my comment, sorry for the noise!

Also, it feels like this is something that any community would run into themselves if they ran their own Z2JH setup, so to me this feels non-2i2c-specific.

@consideRatio (Member)

@damianavila for very efficient CPU utilization and to avoid starvation of CPU resources, I think a good strategy is to set a CPU guarantee but leave the CPU limit unbounded.

By setting a CPU limit, you constrain pods' CPU use without really accomplishing anything, I'd say - unless there is a container being starved of CPU. A pod/container can be starved of CPU if it is guaranteed/requested something very, very small (0.002 is the minimum, I think - you always get at least that much), while another container goes full throttle with a much larger request/guarantee, like 2 - they then share the CPU capacity in proportion to their requests. So, in the 2 vs 0.002 example, on a 4-core machine the pod with 2 in CPU requests/guarantee would make use of 3.992 of the CPU, and the other would be using 0.008 of the CPU, which could be too little to function.
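A sketch of the two situations described above, using the same 2 vs 0.002 numbers (illustrative pod fragments, not config from this repo):

  containers:
    - name: busy            # requests 2 CPUs, no limit set
      resources:
        requests:
          cpu: "2"
    - name: tiny            # requests the ~0.002 minimum, no limit set
      resources:
        requests:
          cpu: "2m"         # 2 millicores = 0.002 CPU
  # Under full contention on a 4-core node, CPU is divided in proportion to the
  # requests, so "busy" gets nearly everything while "tiny" gets only a few
  # thousandths of a core, which may be too little to function.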

@damianavila
Copy link
Contributor

So, given that all the user pods have the same CPU request/guarantee value, you are not expecting a starvation scenario like the one you described above, and therefore there are no additional benefits to using a CPU limit - am I correct?

@sgibson91 (Member)

I see there's some discussion in this PR but I'm not quite sure how/if it's blocking an approval?

@consideRatio (Member) commented Nov 4, 2021

I suggest we close this PR.

This PR is trying to mitigate a common mistake we make, but by doing so, it also makes us more prone to actually making that mistake in other contexts where this mitigation isn't applied behind the scenes.

I think the best bet we have is to embrace the complexity without trying to mitigate it. Practically, I suggest we never set a guarantee or a limit without setting both, whenever we set either for CPU or memory in a chart - unless it's the final user config for some chart.

From the k8s docs at https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits

Note: If a Container specifies its own memory limit, but does not specify a memory request, Kubernetes automatically assigns a memory request that matches the limit. Similarly, if a Container specifies its own CPU limit, but does not specify a CPU request, Kubernetes automatically assigns a CPU request that matches the limit.


I'm fine with it being merged, but I find that this kind of complexity has made my work on the JMTE config a bit harder overall, as the effective configuration is harder to understand without manually inspecting it.

Most Helm charts leave all resource specifications unset, for both CPU and memory, so it generally breaks expectations if a Helm chart already specifies one specific part of the resource requests/limits.
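For what that practice would look like in a final hub config, a rough sketch (values are illustrative, not taken from any 2i2c hub):

  jupyterhub:
    singleuser:
      cpu:
        guarantee: 0.05   # set both guarantee and limit, or neither, per resource
        limit: 2
      memory:
        guarantee: 1G
        limit: 2G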

@damianavila (Contributor)

I think the best bet we have is to embrace the complexity without trying to mitigate it. Practically, I suggest we never set a guarantee or a limit without setting both, whenever we set either for CPU or memory in a chart - unless it's the final user config for some chart.

If we are going with this approach, we need a unified direction... I guess it should be all or nothing... Closing this PR without further modifications leaves us somewhere in the middle, where we have limits/guarantees for RAM but not for CPU.

In addition, if we go with the above approach, we need to make sure we set limits and guarantees in all the final/end hub configs and, maybe, establish some "checks" so we are sure we do not miss adding those values, right?

@consideRatio (Member)

I think the practice is worth thinking about, but I don't feel confident championing any one specific choice because there are trade-offs.

I don't want to hold back any kind of action, or invest much more of my own time thinking about it, since I think every option is acceptable, even though I have a preference towards either of the closing options:

  • merging this
  • closing this and doing nothing
  • closing this and systematically making basehub not request anything, requiring the end configuration to set it all

@yuvipanda (Member, Author) commented Nov 5, 2021

I think our alternatives are:

  1. Ensure somehow that each hub in our config has explicit CPU and memory set, and validate this on CI
  2. Merge this PR as is, and open an issue for (1)

My current preference is for (2), mostly because I think the advantage we get out of it is worth it. (1) is the more complete solution, but I don't want it to block this good-enough PR.

@sgibson91 (Member)

@yuvipanda I agree with your assessment. Let's merge this and open a new issue to iterate on implementation improvements!

@damianavila (Contributor) left a comment

Yep, let's merge this one as is and open a follow-up issue for option 1.

@yuvipanda changed the title from "Set a tiny CPU guarantee for all user pods" to "Set 0.5 CPU guarantee for all user pods" on Nov 9, 2021
@yuvipanda (Member, Author)

So I ended up encountering a related problem on the Berkeley DataHub a few days ago (berkeley-dsep-infra/datahub#2966), where a few users running expensive simulations meant that basically the entire hub was 'down' for most people - even print("hello world") wouldn't work. In investigating and fixing that, I've actually changed my approach to this PR and have increased the CPU guarantee as well. I've laid out my new rationale in the PR description now, although I believe this is something that should actually make it into the z2jh docs...

@damianavila (Contributor) left a comment

The new rationale is sensible to me so re-approving this one.

@consideRatio changed the title from "Set 0.5 CPU guarantee for all user pods" to "Set 0.05 CPU guarantee for all user pods" on Nov 9, 2021
@yuvipanda yuvipanda merged commit f859c86 into 2i2c-org:master Nov 9, 2021