
Fix deletion of elastic task resource requests #379

Merged
hamersaw merged 1 commit into master from fg91/fix/empty-elastic-task-resources on Aug 8, 2023

Conversation

fg91
Member

@fg91 fg91 commented Jul 19, 2023

TL;DR

Currently (as of the 1.8 release), the pods of the following elastic task will not have any resource requests:

@task(
    task_config=Elastic(
        nnodes=2,
        nproc_per_node=2,
    ),
    requests=Resources(cpu="1", mem="1Gi", gpu="1"),
)
def train():
    ...

This PR fixes this.

Type

  • Bug Fix
  • Feature
  • Plugin

Are all requirements met?

  • Code completed
  • Smoke tested
  • Unit tests added
  • Code documentation added
  • Any pending items have an associated Issue

Complete description

The kubeflow plugin was recently refactored in flytekit (flyteorg/flytekit#1627) as well as in flyteplugins (#345).

After the refactoring, users can set e.g. different resources for master and worker replicas of PyTorch tasks:

@task(
    task_config=PyTorch(
        worker=Worker(
            requests=Resources(cpu="2", mem="2Gi"),
            limits=Resources(cpu="4", mem="2Gi"),
        ),
        master=...,
    ),
    requests=Resources(cpu="1", mem="1Gi", gpu="1"),
)
def train():
    ...

If the user overrides one of the resource requests in the task_config, in the backend we override the resources configured on the task level here.

Now, in a @task(task_config=PyTorch()..) task, when the user doesn't override the resources, we send task_models.Resources(requests=[], limits=[]) (which is different from None/nil, see here) to the backend as part of the worker spec. We go into the outer if statement but not into the nested one within it. As a consequence, in this case we don't overwrite the resource request set in @task(requests=Resources()). This makes sense.

However, the Elastic task config in flytekitplugins currently doesn't allow overriding the resource requests since elastic pytorch training jobs don't differentiate between master and worker replicas. There are only workers, which means we can simply use the normal @task(requests=Resources()) to specify their resources.

This, however, means that in this case we send None/nil to the backend as part of the worker spec's resource request. Here we go into the else case and simply delete what the user set via @task(requests=Resources()).

This seems wrong, and this PR changes it.
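
To make the three cases concrete, here is a minimal, self-contained sketch of the behaviour described above. This is not the actual flyteplugins code; the Resources type, the overrideResources helper, and the string-based resource entries are simplified stand-ins for illustration only:

// Simplified illustration of the override behaviour described in this PR.
// Not the real flyteplugins implementation: types and names are assumptions.
package main

import "fmt"

// Resources is a stand-in for the replica-level resource override message.
type Resources struct {
	Requests []string
	Limits   []string
}

// overrideResources mimics the backend logic discussed above: the container
// starts out with the resources from @task(requests=...), and the replica
// spec may carry nil, empty, or non-empty override resources.
func overrideResources(container *Resources, override *Resources) {
	if override != nil {
		if len(override.Requests) > 0 || len(override.Limits) > 0 {
			// Case "list": explicit replica-level overrides win.
			*container = *override
		}
		// Case "empty list": keep what the task already set.
	} else {
		// Case "nil": the pre-fix behaviour wipes the task-level resources.
		// This is the branch elastic tasks fall into and what this PR removes.
		*container = Resources{}
	}
}

func main() {
	taskLevel := Resources{Requests: []string{"cpu=1", "mem=1Gi", "gpu=1"}}

	elastic := taskLevel // the elastic plugin sends no override at all (nil)
	overrideResources(&elastic, nil)
	fmt.Println(elastic) // {[] []} -- requests are gone (the bug)

	pytorch := taskLevel // the pytorch plugin sends an empty, non-nil override
	overrideResources(&pytorch, &Resources{})
	fmt.Println(pytorch) // {[cpu=1 mem=1Gi gpu=1] []} -- requests preserved
}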

Tracking Issue

NA

fixes https://github.com/flyteorg/flyte/issues/

Follow-up issue

NA

			}
			podSpec.Containers[idx].Resources = *resources
		}
	} else {
Member Author

This is the problematic line.

@@ -720,7 +720,7 @@ func TestBuildResourcePytorchV1WithZeroWorker(t *testing.T) {
assert.Error(t, err)
}

-func TestParasElasticConfig(t *testing.T) {
+func TestParseElasticConfig(t *testing.T) {
Member Author

Scope creep: fix typo

Signed-off-by: Fabio Graetz <fabiograetz@googlemail.com>
@fg91 fg91 force-pushed the fg91/fix/empty-elastic-task-resources branch from 5f9842a to 331a60c on July 19, 2023 14:03
@fg91 fg91 requested a review from hamersaw July 19, 2023 14:04
@fg91 fg91 self-assigned this Jul 19, 2023
@fg91 fg91 added the bug Something isn't working label Jul 19, 2023
@fg91
Member Author

fg91 commented Jul 19, 2023

@yubofredwang I somehow can't tag you as a reviewer, but could you please sanity check this? Thanks

@codecov

codecov bot commented Jul 19, 2023

Codecov Report

Merging #379 (5f9842a) into master (13ef7d6) will decrease coverage by 0.03%.
The diff coverage is 0.00%.

❗ Current head 5f9842a differs from the pull request's most recent head 331a60c. Consider uploading reports for commit 331a60c to get more accurate results.

@@            Coverage Diff             @@
##           master     #379      +/-   ##
==========================================
- Coverage   62.96%   62.94%   -0.03%     
==========================================
  Files         154      154              
  Lines       13019    13016       -3     
==========================================
- Hits         8198     8193       -5     
- Misses       4208     4210       +2     
  Partials      613      613              
Flag Coverage Δ
unittests 62.94% <0.00%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
.../plugins/k8s/kfoperators/common/common_operator.go 64.32% <0.00%> (-0.57%) ⬇️

... and 1 file with indirect coverage changes

@hamersaw
Contributor

@fg91 just want to make sure I understand the logic here. Basically, for the pytorch, MPI, and tensorflow plugins there are three states the resources for each replica can be in:

  • nil: this means that they were not provided and we need to set them to empty lists
  • empty list: we do nothing, this essentially means they were not set in flytekit
  • list: if provided then we override

This introduces issues because in the elastic plugin the resources will always be nil because they are not set. So this PR is changing the handling of nil resources. My concern is whether there is a good reason that nil resources are set to empty lists rather than leaving them as is.

Setting the resources for each replica in flytekit to empty lists for the elastic plugin would achieve the same result right?

@fg91
Member Author

fg91 commented Jul 25, 2023

@fg91 just want to make sure I understand the logic here. Basically, for the pytorch, MPI, and tensorflow plugins there are three states the resources for each replica can be in:

  • nil: this means that they were not provided and we need to set them to empty lists
  • empty list: we do nothing, this essentially means they were not set in flytekit
  • list: if provided then we override

Your understanding is correct that there are these three possible states for resources for each replica and that this is the current behaviour.

Setting the resources for each replica in flytekit to empty lists for the elastic plugin would achieve the same result right?

It would, yes. I first considered doing this, but I personally find the current logic here odd:

Is there a scenario where we would ever want to end up with a pod without any resource requests despite having set them in the task decorator's requests arg? I could imagine that one sets resources in the task decorator and then chooses lower/higher resources for one specific replica. But no resources at all?

From the perspective of the elastic plugin, I find it unintuitive that, in order for the resources set in @task(requests=...) not to be deleted, I have to include an empty list in the worker spec's resources, and that if I completely omit the resources field there (which is the default), the ones I specified in @task(requests=...) are deleted.

Considering the function OverrideContainerSpec, which is passed a pod spec with certain resources and override resources, I find it unintuitive that when I pass empty resources, the original ones in the pod spec are preserved, but if I pass resources=nil, the resources in the pod spec are set to empty resources.

This is why I personally think we should address this in the backend and not in flytekit, but of course I'm happy to discuss!

I'd be interested in your thoughts on this @yubofredwang

@yubofredwang
Contributor

The reason for the code logic is that I assume that when the user does not specify resources at all, they want to use the default resources the user set for that namespace; see the link here.

How about having an extra check like this:

if resources != nil {
  ...
} else if podSpec.Containers[idx].Resources == nil {
  podSpec.Containers[idx].Resources = v1.ResourceRequirements{}
}

Does this fix this bug?
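
A note on the snippet above: in the Kubernetes API, Container.Resources is a v1.ResourceRequirements struct rather than a pointer, so it cannot be compared to nil directly; the closest compilable equivalent is to test whether its Requests/Limits maps are empty. The following is only an illustrative sketch of that suggestion; the applyOverride helper and its simplified signature are assumptions, not the code merged in this PR:

// Illustrative, compilable version of the suggested extra check.
// The helper name and simplified signature are assumptions for this sketch.
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func applyOverride(podSpec *v1.PodSpec, idx int, resources *v1.ResourceRequirements) {
	if resources != nil {
		// Existing handling: apply the explicit override
		// (details elided in the original suggestion).
		podSpec.Containers[idx].Resources = *resources
	} else if len(podSpec.Containers[idx].Resources.Requests) == 0 &&
		len(podSpec.Containers[idx].Resources.Limits) == 0 {
		// Only reset to an empty spec when the task set nothing either.
		podSpec.Containers[idx].Resources = v1.ResourceRequirements{}
	}
}

func main() {
	// A container that already carries a task-level request from @task(requests=...).
	podSpec := &v1.PodSpec{Containers: []v1.Container{{
		Name: "trainer",
		Resources: v1.ResourceRequirements{
			Requests: v1.ResourceList{v1.ResourceCPU: resource.MustParse("1")},
		},
	}}}

	// nil override (the elastic case): the existing request is preserved.
	applyOverride(podSpec, 0, nil)
	fmt.Println(podSpec.Containers[0].Resources.Requests)
}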

@fg91
Member Author

fg91 commented Jul 27, 2023

The reason for the code logic is that I assume that when the user does not specify resources at all, they want to use the default resources the user set for that namespace; see the link here.

When not specifying any resources at all in a task, e.g.

@task
def train() -> str:
    print("Trained")

the respective pod should have the task defaults configured on the Flyte platform side, not defaults configured e.g. via a kind: LimitRange.

In our case, the task above runs with the platform defaults we configured in the Helm values:

    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: "1"
        memory: 2Gi

I would expect the same resources for this task:

@task(
    task_config=Elastic(
        nnodes=2,
        nproc_per_node=2,
    )
)
def train() -> str:
    print("Trained")

When using Flyte 1.8.0 (Helm chart, flytekit, flytekitplugins-kfpytorch), however, the task doesn't get the default resources configured on the platform side:

resources: {}

In my opinion this is not the expected behaviour.

@hamersaw hamersaw merged commit c528bb8 into master Aug 8, 2023
5 of 7 checks passed
eapolinario pushed a commit that referenced this pull request Sep 6, 2023
Signed-off-by: Fabio Graetz <fabiograetz@googlemail.com>