
Limit resources based on workspace class #11374

Merged 7 commits into main from fo/class-daemon on Aug 5, 2022

Conversation

@Furisto (Member) commented Jul 14, 2022

Description

This PR adds support for CPU limiting based on the workspace class of the workspace. The configuration for each workspace class contains a CPU limiting section that describes the minimum resources the workspace gets and the burst limit. These values are set as annotations on the workspace pod; ws-daemon then uses these annotations to distribute CPU resources.
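For orientation, here is a minimal sketch of how a daemon-side consumer of these annotations might parse them. The annotation keys are the ones this PR introduces; the helper, types, and millicpu representation are illustrative assumptions, not the PR's actual code.

```go
// Sketch only: the annotation keys come from this PR; the rest is illustrative.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

const (
	annotationCPUMinLimit   = "gitpod.io/cpuMinLimit"
	annotationCPUBurstLimit = "gitpod.io/cpuBurstLimit"
)

// cpuLimits holds per-workspace limits in millicpus (assumed representation).
type cpuLimits struct {
	MinMilli   int64
	BurstMilli int64
}

// limitsFromAnnotations parses the two annotations set on the workspace pod.
// ok is false when either annotation is absent, so the caller can fall back
// to the static limits from the ws-daemon configmap (the feature-flag-off path).
func limitsFromAnnotations(annotations map[string]string) (limits cpuLimits, ok bool, err error) {
	minVal, okMin := annotations[annotationCPUMinLimit]
	burstVal, okBurst := annotations[annotationCPUBurstLimit]
	if !okMin || !okBurst {
		return cpuLimits{}, false, nil
	}

	minQ, err := resource.ParseQuantity(minVal)
	if err != nil {
		return cpuLimits{}, false, fmt.Errorf("invalid %s: %w", annotationCPUMinLimit, err)
	}
	burstQ, err := resource.ParseQuantity(burstVal)
	if err != nil {
		return cpuLimits{}, false, fmt.Errorf("invalid %s: %w", annotationCPUBurstLimit, err)
	}
	return cpuLimits{MinMilli: minQ.MilliValue(), BurstMilli: burstQ.MilliValue()}, true, nil
}

func main() {
	limits, ok, err := limitsFromAnnotations(map[string]string{
		annotationCPUMinLimit:   "3",
		annotationCPUBurstLimit: "12",
	})
	fmt.Println(limits, ok, err) // {3000 12000} true <nil>
}
```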

Related Issue(s)

Fixes #10981

How to test

  • Open workspace
  • Check that the workspace pod contains two annotations: gitpod.io/cpuMinLimit and gitpod.io/cpuBurstLimit
  • SSH into the Harvester VM and find the cgroup for the workspace pod
  • Check that the min limit has been applied (see the verification sketch after this list)
  • Install stress-ng in the workspace
  • Run stress-ng --cpu 4
  • After a few seconds you should see that the burst limit has been applied to the cgroup of the workspace
  • Use kubectl annotate to change the annotations
  • The cgroup of the workspace should have been updated with the new values
  • Deactivate feature flag in configcat
  • Start workspace
  • Check that no limit annotations are present
  • Find the cgroup of the workspace
  • Check that the limits configured in the ws-daemon configmap are applied
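For the cgroup steps above, a small verification sketch (mine, not part of the PR), assuming cgroup v2 and that you pass the workspace cgroup directory you located on the node:

```go
// Sketch: verify an applied CPU limit by reading cgroup v2's cpu.max.
// Usage: go run main.go /sys/fs/cgroup/<path-to-workspace-cgroup>
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

func main() {
	raw, err := os.ReadFile(filepath.Join(os.Args[1], "cpu.max"))
	if err != nil {
		panic(err)
	}
	fields := strings.Fields(string(raw)) // "<quota> <period>" or "max <period>"
	if len(fields) != 2 {
		panic(fmt.Sprintf("unexpected cpu.max content: %q", raw))
	}
	period, err := strconv.ParseInt(fields[1], 10, 64)
	if err != nil {
		panic(err)
	}
	if fields[0] == "max" {
		fmt.Println("no quota set (unlimited)")
		return
	}
	quota, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		panic(err)
	}
	// While stress-ng is running, this value should move from the min to the burst limit.
	fmt.Printf("quota=%dµs period=%dµs (~%d mCPU)\n", quota, period, quota*1000/period)
}
```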

Release Notes

None

Note

This relies on a change in the werft files, which is why the main build failed. The custom build with the updated werft files (below) is successful, however.

Werft options:

  • /werft with-preview

@werft-gitpod-dev-com

started the job as gitpod-build-fo-class-daemon.1 because the annotations in the pull request description changed
(with .werft/ from main)

@Furisto Furisto force-pushed the fo/class-daemon branch 6 times, most recently from 81a3b15 to 491cd95 on July 18, 2022 11:18
@Furisto Furisto marked this pull request as ready for review July 18, 2022 14:01
@Furisto Furisto requested a review from a team July 18, 2022 14:01
@github-actions github-actions bot added the labels team: delivery (Issue belongs to the self-hosted team), team: webapp (Issue belongs to the WebApp team), and team: workspace (Issue belongs to the Workspace team) on Jul 18, 2022
@kylos101 (Contributor)

@Furisto what are your thoughts about limiting disk IO bandwidth? I ask because on XL (large) nodes, we "could" grant the workspaces higher bandwidth because there are fewer workspaces and more available bandwidth.

For now, I think we should keep this as a 🛹, and continue to impose the same disk IO bandwidth limit like we do now (regardless of the node type). But, I am interested in your thoughts. 🤔 😄

Further, even when we go to PVC, we could initially impose the same disk IO bandwidth limit. And later, experiment with removing the disk IO bandwidth limit for /workspace, and assuming it doesn't introduce any performance degradation - ship it.

@Furisto (Member, Author) commented Jul 18, 2022

> @Furisto what are your thoughts about limiting disk IO bandwidth? I ask because on XL (large) nodes, we "could" grant the workspaces higher bandwidth because there are fewer workspaces and more available bandwidth.
>
> For now, I think we should keep this as a 🛹, and continue to impose the same disk IO bandwidth limit like we do now (regardless of the node type). But, I am interested in your thoughts. 🤔 😄

Yes, that is the plan. For the initial implementation I only included CPU, but in the future we can make other limits, like disk IO or egress, dependent on the workspace class (see the sketch after this comment).

> Further, even when we go to PVC, we could initially impose the same disk IO bandwidth limit. And later, experiment with removing the disk IO bandwidth limit for /workspace, and assuming it doesn't introduce any performance degradation - ship it.

👍
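To make that direction concrete, here is one hypothetical shape such an extension could take, building on the ResourceLimitConfiguration that appears in this PR's diff; the CPU field names are assumed, and the IO/egress fields are invented for illustration only.

```go
// Hypothetical sketch: only the CPU limit is part of this PR. IO and egress
// are shown to illustrate how other limits could become class-dependent.
type CpuResourceLimit struct {
	MinLimit   int `json:"min"`   // field names assumed
	BurstLimit int `json:"burst"` // field names assumed
}

type IoResourceLimit struct {
	WriteBandwidthPerSecond string `json:"writeBandwidthPerSecond"` // hypothetical, e.g. "40Mi"
	ReadBandwidthPerSecond  string `json:"readBandwidthPerSecond"`  // hypothetical, e.g. "300Mi"
}

type ResourceLimitConfiguration struct {
	CPU    *CpuResourceLimit `json:"cpu"`
	IO     *IoResourceLimit  `json:"io,omitempty"`     // hypothetical
	Egress string            `json:"egress,omitempty"` // hypothetical
}
```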

@kylos101 (Contributor)

/hold

Why? This is not yet protected by a feature flag, and it would be ideal to have one.

@Furisto is starting this work, and has socialized it with @easyCZ.

cc: @sagor999 this is something we'll want to include as part of gen55, because usage-based pricing depends on workspaces having the ability to impose CPU limits that vary by node type / workspace class. It might delay the day we ship until Wednesday or Thursday; before then, we should try to test other things in an ephemeral cluster.

@kylos101 (Contributor)

@aledbf may we ask for your review on this PR? This way @Furisto can react to feedback from you. The file filter view (only files owned by you) may help.

@Furisto I think it would be good to leave the hold, at least until we have gen56 stable. @aledbf wdyt?

@kylos101 (Contributor)

@Furisto as a heads up, there's a conflict with components/server/src/workspace/workspace-starter.ts

@kylos101 (Contributor) commented Aug 3, 2022

@aledbf can you help with this review? This is needed for gen60 and Usage based pricing. cc: @geropl

@Furisto (Member, Author) commented Aug 4, 2022

/werft run

👍 started the job as gitpod-build-fo-class-daemon.26
(with .werft/ from main)

```go
}

func (a annotationLimiter) Limit(wsh *WorkspaceHistory) (Bandwidth, error) {
	value, ok := wsh.LastUpdate.Pod.Annotations[a.Annotation]
```
Reviewer (Member):

Why are we storing the Pod and not only the annotations?

@Furisto (Member, Author):

It was just convenient, and having the Pod object around can be useful when we want to limit based on other attributes besides annotations.

Reviewer (Member):

Let's do that in the future; please store only what we need now. The Pod object can be expensive to keep around (annotations, status, env vars, etc.).
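A minimal sketch of the suggested slimming, assuming the history entry only needs the annotations map; the Bandwidth representation and field names here are placeholders, not the PR's final code.

```go
package cpulimit

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

// Bandwidth stands in for the package's CPU bandwidth type (assumed millicpus here).
type Bandwidth int64

// workspaceUpdate keeps only what the limiter reads, instead of the whole Pod.
type workspaceUpdate struct {
	Annotations map[string]string
}

type WorkspaceHistory struct {
	LastUpdate workspaceUpdate
}

type annotationLimiter struct {
	Annotation string
}

// Limit reads from the copied annotations map rather than a retained Pod object.
func (a annotationLimiter) Limit(wsh *WorkspaceHistory) (Bandwidth, error) {
	value, ok := wsh.LastUpdate.Annotations[a.Annotation]
	if !ok {
		return 0, fmt.Errorf("annotation %s not found", a.Annotation)
	}
	q, err := resource.ParseQuantity(value)
	if err != nil {
		return 0, err
	}
	return Bandwidth(q.MilliValue()), nil
}
```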

```go
	CPU              string `json:"cpu"`
	Memory           string `json:"memory"`
	EphemeralStorage string `json:"ephemeral-storage"`
	Storage          string `json:"storage,omitempty"`
}

type ResourceLimitConfiguration struct {
	CPU *CpuResourceLimit `json:"cpu"`
```
Reviewer (Member):

Any reason why we don't set omitempty to all fields?

@Furisto (Member, Author):

I just kept it consistent with the previous behavior. It is also nice not to have to type out the fields when editing the configmap in, e.g., preview environments.
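For illustration, standard encoding/json behavior (not code from this PR): without omitempty, a marshaled default config still spells out every field, which is handy as a template when hand-editing the configmap; with omitempty, zero-valued fields vanish.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type withOmit struct {
	CPU string `json:"cpu,omitempty"`
}

type withoutOmit struct {
	CPU string `json:"cpu"`
}

func main() {
	a, _ := json.Marshal(withOmit{})
	b, _ := json.Marshal(withoutOmit{})
	fmt.Println(string(a)) // {}
	fmt.Println(string(b)) // {"cpu":""}
}
```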

@Furisto Furisto requested a review from a team August 5, 2022 13:38
@aledbf aledbf self-requested a review August 5, 2022 13:43
@Furisto (Member, Author) commented Aug 5, 2022

/werft run

👍 started the job as gitpod-build-fo-class-daemon.28
(with .werft/ from main)

@Furisto (Member, Author) commented Aug 5, 2022

/unhold

@roboquat roboquat merged commit 225344b into main Aug 5, 2022
@roboquat roboquat deleted the fo/class-daemon branch August 5, 2022 14:42
@roboquat roboquat added the labels deployed: webapp (Meta team change is running in production) and deployed: workspace (Workspace team change is running in production) on Aug 9, 2022
Labels
deployed: webapp (Meta team change is running in production), deployed: workspace (Workspace team change is running in production), feature: workspace classes, release-note-none, size/XXL, team: delivery (Issue belongs to the self-hosted team), team: webapp (Issue belongs to the WebApp team), team: workspace (Issue belongs to the Workspace team)
Development

Successfully merging this pull request may close these issues.

Make ws-daemon workspace class aware
10 participants