
koord-scheduler: DeviceShare supports preempting devices #1146

Merged

Conversation

@eahydra (Member) commented Mar 27, 2023

Ⅰ. Describe what this PR does

Enhanced the DeviceShare scheduling plugin:

  • support preempting devices
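
For context, the DeviceShare plugin allocates GPUs from a per-node Device custom resource reported by the node agent. Assuming the Device CRD from koordinator is installed (the Device object shares its node's name), the scheduler's view of a node's GPUs can be inspected with something like:

$ kubectl get devices.scheduling.koordinator.sh cn-beijing.10.0.3.245 -o yaml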

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

  1. Create a Pod that requests all GPUs on a specified node. In the scenario I tested, the node has two GPU instances, so koordinator.sh/gpu: "200" claims both of them.
$ cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: test-reserve-gpu
  name: test-gpu-deploy
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: test-reserve-gpu
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: test-reserve-gpu
        koordinator.sh/qosClass: LS
    spec:
      containers:
      - args:
        - "3600"
        command:
        - sleep
        image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
        name: test
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
            koordinator.sh/gpu: "200"
      schedulerName: koord-scheduler
EOF
deployment.apps/test-gpu-deploy created
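
Optionally, confirm up front that the node really exposes two GPUs' worth of capacity. The exact resource names depend on the device-reporting setup, so simply dumping the node's allocatable resources is the safest check:

$ kubectl get node cn-beijing.10.0.3.245 -o jsonpath='{.status.allocatable}'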
  2. Check the Pod status
$ kubectl get pod -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP         NODE                    NOMINATED NODE   READINESS GATES
test-gpu-deploy-849874876f-cfvrr   1/1     Running   0          10s   10.0.3.7   cn-beijing.10.0.3.245   <none>           <none>
  3. Create a Pod with higher priority than the pre-created Pod
$ cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: high-test-reserve-gpu
  name: high-test-gpu-deploy
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: high-test-reserve-gpu
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: high-test-reserve-gpu
        koordinator.sh/qosClass: LS
    spec:
      containers:
      - args:
        - "3600"
        command:
        - sleep
        image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda10.2
        name: test
        resources:
          limits:
            cpu: "1"
            memory: 1Gi
            koordinator.sh/gpu: "100"
      priorityClassName: system-cluster-critical
      schedulerName: koord-scheduler
EOF
deployment.apps/high-test-gpu-deploy created
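
Note that this example reuses the built-in system-cluster-critical PriorityClass only for convenience; any PriorityClass whose value is higher than the victim Pod's would trigger the same preemption. A hypothetical custom class would look like:

$ cat << EOF | kubectl apply -f -
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: gpu-high-priority
value: 1000000
globalDefault: false
description: "For Pods that may preempt lower-priority GPU consumers."
EOF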
  4. Watch the Pod status
$ kubectl get pod -o wide
NAME                                   READY   STATUS        RESTARTS   AGE   IP         NODE                    NOMINATED NODE          READINESS GATES
high-test-gpu-deploy-dcd465c44-mq9bv   0/1     Pending       0          4s    <none>     <none>                  cn-beijing.10.0.3.245   <none>
test-gpu-deploy-849874876f-8fpsk       0/1     Pending       0          4s    <none>     <none>                  <none>                  <none>
test-gpu-deploy-849874876f-cfvrr       1/1     Terminating   0          21s   10.0.3.4   cn-beijing.10.0.3.245   <none>                  <none>
  5. Get the events of the preempted Pod
$ kubectl get event | grep cfvrr
38s         Normal    Preempted           pod/test-gpu-deploy-849874876f-cfvrr        Preempted by default/high-test-gpu-deploy-dcd465c44-mq9bv on node cn-beijing.10.0.3.245

As the result shows, Pod test-gpu-deploy-849874876f-cfvrr was preempted by high-test-gpu-deploy-dcd465c44-mq9bv. Afterwards, the high-priority Pod is Running while the replacement low-priority Pod stays Pending because no GPUs are left:

$ kubectl get pod 
NAME                                   READY   STATUS    RESTARTS   AGE
high-test-gpu-deploy-dcd465c44-mq9bv   1/1     Running   0          6m39s
test-gpu-deploy-849874876f-8fpsk       0/1     Pending   0          6m38s
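
To double-check that the preemptor actually received the freed devices, inspect the allocation that koord-scheduler records on the Pod. Assuming the allocation is stored in the scheduling.koordinator.sh/device-allocated annotation (the exact key may vary by version), dumping the annotations is enough:

$ kubectl get pod high-test-gpu-deploy-dcd465c44-mq9bv -o jsonpath='{.metadata.annotations}'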

Ⅳ. Special notes for reviews

Ⅴ. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

Signed-off-by: Joseph <joseph.t.lee@outlook.com>
@koordinator-bot koordinator-bot bot requested a review from buptcozy March 27, 2023 11:30
@eahydra eahydra changed the title koord-scheduler: DeviceShare support preempting devices koord-scheduler: DeviceShare supports preempting devices Mar 27, 2023
@eahydra eahydra added this to the v1.2 milestone Mar 27, 2023
@eahydra eahydra added the enhancement New feature or request label Mar 27, 2023
@codecov bot commented Mar 27, 2023

Codecov Report

Patch coverage: 66.66% and no project coverage change.

Comparison is base (b7d7a45) 66.77% compared to head (e7bc64a) 66.77%.

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #1146    +/-   ##
========================================
  Coverage   66.77%   66.77%            
========================================
  Files         271      271            
  Lines       29603    29751   +148     
========================================
+ Hits        19766    19865    +99     
- Misses       8425     8464    +39     
- Partials     1412     1422    +10     
Flag Coverage Δ
unittests 66.77% <66.66%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/scheduler/plugins/deviceshare/plugin.go 71.42% <52.04%> (-16.37%) ⬇️
pkg/scheduler/plugins/deviceshare/utils.go 92.19% <57.89%> (-7.81%) ⬇️
pkg/scheduler/plugins/deviceshare/device_cache.go 89.68% <96.49%> (+3.45%) ⬆️
pkg/scheduler/plugins/deviceshare/allocator.go 87.50% <100.00%> (ø)

... and 2 files with indirect coverage changes


@jasonliu747 (Member) left a comment:

/lgtm

@hormes (Member) commented Mar 29, 2023

/approve

@koordinator-bot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes, jasonliu747

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@koordinator-bot koordinator-bot bot merged commit bda1e74 into koordinator-sh:main Mar 29, 2023
9 checks passed