This repository has been archived by the owner on Oct 23, 2023. It is now read-only.

Add default limit to Task Resources #396

Closed · wants to merge 2 commits

Conversation

Contributor

@wild-endeavor wild-endeavor commented Apr 18, 2023

TL;DR

Add a default limit TaskResourceSpec section to TaskResourceAttributes.

There's been a lot of confusion around Admin's existing behavior of setting a task's limits to the same values as its requests. Explicitly adding a default limit section to Admin will address this. Going forward, limits will only be used to check that the default requests and default limits conform to the system-level limit.
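
For context, a rough sketch of what TaskResourceAttributes would look like with the new field (field names and numbers are taken from the diff quoted in the review threads below; the comments here paraphrase the intended semantics rather than the exact upstream doc strings):

message TaskResourceAttributes {
    // Default resource requests applied to tasks that do not specify requests.
    TaskResourceSpec defaults = 1;

    // System-level limits, used to validate that defaults and default_limits conform.
    TaskResourceSpec limits = 2;

    // If set, these limits will be used as the default limits for tasks that do not specify limits.
    // If not set, but defaults (field 1) are set, then the defaults will be used as the limits.
    TaskResourceSpec default_limits = 3;
}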

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

NA

Tracking Issue

flyteorg/flyte#3574
flyteorg/flyte#3467

Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>
Signed-off-by: Yee Hing Tong <wild-endeavor@users.noreply.github.com>

codecov bot commented Apr 18, 2023

Codecov Report

Merging #396 (cbfd849) into master (7b19eb3) will increase coverage by 2.37%.
The diff coverage is n/a.

❗ Current head cbfd849 differs from pull request most recent head 76b4d6e. Consider uploading reports for the commit 76b4d6e to get more accurate results

@@            Coverage Diff             @@
##           master     #396      +/-   ##
==========================================
+ Coverage   76.11%   78.49%   +2.37%     
==========================================
  Files          18       18              
  Lines        1390     1195     -195     
==========================================
- Hits         1058      938     -120     
+ Misses        280      205      -75     
  Partials       52       52              
Flag        Coverage Δ
unittests   ?

Flags with carried forward coverage won't be shown.

see 18 files with indirect coverage changes


@wild-endeavor wild-endeavor changed the title from "Adddefaultlimit" to "Add default limit to Task Resources" on Apr 18, 2023

// If set, these limits will be used as the default limits for tasks that do not specify limits. If not set, but
// defaults (field 1) are set, then the defaults will be used as the limits.
TaskResourceSpec default_limits = 3;
Contributor

Because this naming has tripped people up somewhat already, what if we differentiated between defaults and limits like so:

message TaskResourceAttributes {
    TaskResourceSpec defaults = 1 [deprecated = true];

    TaskResourceSpec limits = 2;

    TaskResourceDefaults default_values = 1;
}

message TaskResourceDefaults {
    TaskResourceSpec requests = 1;

    TaskResourceSpec limits = 2;
}

So it's clear exactly what is a default and what is a limit.

Contributor

@hamersaw hamersaw May 4, 2023

+1 to changing this terminology somehow. Also, didn't realize until now that changing the name of a proto field is easy if you use the same id number?!?! Can we maybe specify something like:

message TaskResourceAttributes {
    TaskResourceSpec defaults = 1 [deprecated = true];
    TaskResourceSpec default_requests = 1;  // same id, new name
    
    TaskResourceSpec limits = 2; // IIUC these are the hard limits imposed by the platform
    
    TaskResourceSpec default_limits = 3;
}


// If set, these limits will be used as the default limits for tasks that do not specify limits. If not set, but
// defaults (field 1) are set, then the defaults will be used as the limits.
TaskResourceSpec default_limits = 3;
Contributor

My concern with having different requests and limits is that the Kubernetes scheduler will schedule multiple pods on the same node (oversubscription), and this will lead to pod evictions. Pod evictions may cause deletion of the pod, in some cases even ignoring finalizers, and certainly when inject-finalizer is False. This is bad: it will cause Propeller to think that the pod was externally deleted and relaunch it.

If the pod stays but is evicted, then either demystify pending or demystify error should take care of it.

You can specify pod priority to control the order of evictions.

https://kubernetes.io/docs/concepts/scheduling-eviction/node-pressure-eviction/

Contributor

Before we merge this, cc @hamersaw and @EngHabu: can you please confirm that evicted pods are handled correctly?

Contributor Author

So from that document: "Guaranteed pods are guaranteed only when requests and limits are specified for all the containers and they are equal". I didn't know this. But eviction can still happen, right? Was this Guaranteed behavior the reason we set limits to requests in the first place?

We've had numerous requests to allow limits to grow unbounded. We could also make this change and add an option to Admin to continue the current behavior.

Contributor

@hamersaw hamersaw May 4, 2023

The Guaranteed pods section in the docs is VERY interesting @yee. Maybe it's worth running some benchmarks (e.g. large fan-out map tasks) with resource requests != limits and with requests == limits, to see whether non-Guaranteed QoS Pods are evicted at a high rate in practice. Would be very interested in seeing how this compares on EKS vs GKE.

Contributor Author

Yes, agreed. One day we will do this.


Just keep in mind that this is handled very differently for CPU and memory!

@wild-endeavor
Contributor Author

Closing this entirely. We will address this in the future, once we have a better handle on evictions. The concern raised was that by setting a default limit, we open ourselves up to over-subscription on the cluster, which reduces reproducibility because of evictions and makes things unclear, since users won't know what level should be used. By setting limits to the defaults, we fall into the Guaranteed bucket of K8s pods, which get evicted less often, and OOMs become reproducible. The Admin PR in progress will continue: we will support pods that do not have limits by requiring users to expressly set them to 0.


flixr commented May 5, 2023

Just to clarify: all the oversubscription and evictions due to pressure do NOT apply to CPU.
