
GPUMounter-worker error in k8s v1.23.1 #19

Open
cool9203 opened this issue Jan 16, 2022 · 10 comments
Labels
bug Something isn't working

Comments

@cool9203

cool9203 commented Jan 16, 2022

GPUMounter-master.log:
2022-01-16T11:24:14.610Z INFO GPUMounter-master/main.go:25 access add gpu service
2022-01-16T11:24:14.610Z INFO GPUMounter-master/main.go:30 Pod: test Namespace: default GPU Num: 1 Is entire mount: false
2022-01-16T11:24:14.627Z INFO GPUMounter-master/main.go:66 Found Pod: test in Namespace: default on Node: rtxws
2022-01-16T11:24:14.634Z INFO GPUMounter-master/main.go:265 Worker: gpu-mounter-workers-7dsdf Node: rtxws
2022-01-16T11:24:19.648Z ERROR GPUMounter-master/main.go:98 Failed to call add gpu service
2022-01-16T11:24:19.648Z ERROR GPUMounter-master/main.go:99 rpc error: code = Unknown desc = Service Internal Error


GPUMounter-worker.log:
2022-01-16T11:24:14.635Z INFO gpu-mount/server.go:35 AddGPU Service Called
2022-01-16T11:24:14.635Z INFO gpu-mount/server.go:36 request: pod_name:"test" namespace:"default" gpu_num:1
2022-01-16T11:24:14.645Z INFO gpu-mount/server.go:55 Successfully get Pod: default in cluster
2022-01-16T11:24:14.645Z INFO allocator/allocator.go:159 Get pod default/test mount type
2022-01-16T11:24:14.645Z INFO collector/collector.go:91 Updating GPU status
2022-01-16T11:24:14.646Z INFO collector/collector.go:136 GPU status update successfully
2022-01-16T11:24:14.657Z INFO allocator/allocator.go:59 Creating GPU Slave Pod: test-slave-pod-2f66ed for Owner Pod: test
2022-01-16T11:24:14.657Z INFO allocator/allocator.go:238 Checking Pods: test-slave-pod-2f66ed state
2022-01-16T11:24:14.661Z INFO allocator/allocator.go:264 Pod: test-slave-pod-2f66ed creating
2022-01-16T11:24:19.442Z INFO allocator/allocator.go:277 Pods: test-slave-pod-2f66ed are running
2022-01-16T11:24:19.442Z INFO allocator/allocator.go:84 Successfully create Slave Pod: %s, for Owner Pod: %s test-slave-pod-2f66edtest
2022-01-16T11:24:19.442Z INFO collector/collector.go:91 Updating GPU status
2022-01-16T11:24:19.444Z DEBUG collector/collector.go:130 GPU: /dev/nvidia0 allocated to Pod: test-slave-pod-2f66ed in Namespace gpu-pool
2022-01-16T11:24:19.444Z INFO collector/collector.go:136 GPU status update successfully
2022-01-16T11:24:19.444Z INFO gpu-mount/server.go:81 Start mounting, Total: 1 Current: 1
2022-01-16T11:24:19.444Z INFO util/util.go:19 Start mount GPU: {"MinorNumber":0,"DeviceFilePath":"/dev/nvidia0","UUID":"GPU-7fe47fc1-b21e-e675-f6ff-edd91910f8a7","State":"GPU_ALLOCATED_STATE","PodName":"test-slave-pod-2f66ed","Namespace":"gpu-pool"} to Pod: test
2022-01-16T11:24:19.444Z INFO util/util.go:24 Pod :test container ID: e317ca7f5eb5e3c523fab9f0744a065cd69013a7c09522318d4bbf98ad0bb1c3
2022-01-16T11:24:19.444Z INFO util/util.go:30 Successfully get cgroup path: /kubepods/burstable/podc815ee4b-bea0-44ed-8ef4-239e69516ba2/e317ca7f5eb5e3c523fab9f0744a065cd69013a7c09522318d4bbf98ad0bb1c3 for Pod: test
2022-01-16T11:24:19.445Z ERROR cgroup/cgroup.go:140 Exec "echo 'c 195:0 rw' > /sys/fs/cgroup/devices/kubepods/burstable/podc815ee4b-bea0-44ed-8ef4-239e69516ba2/e317ca7f5eb5e3c523fab9f0744a065cd69013a7c09522318d4bbf98ad0bb1c3/devices.allow" failed
2022-01-16T11:24:19.445Z ERROR cgroup/cgroup.go:141 Output: sh: 1: cannot create /sys/fs/cgroup/devices/kubepods/burstable/podc815ee4b-bea0-44ed-8ef4-239e69516ba2/e317ca7f5eb5e3c523fab9f0744a065cd69013a7c09522318d4bbf98ad0bb1c3/devices.allow: Directory nonexistent

2022-01-16T11:24:19.445Z ERROR cgroup/cgroup.go:142 exit status 2
2022-01-16T11:24:19.445Z ERROR util/util.go:33 Add GPU {"MinorNumber":0,"DeviceFilePath":"/dev/nvidia0","UUID":"GPU-7fe47fc1-b21e-e675-f6ff-edd91910f8a7","State":"GPU_ALLOCATED_STATE","PodName":"test-slave-pod-2f66ed","Namespace":"gpu-pool"}failed
2022-01-16T11:24:19.445Z ERROR gpu-mount/server.go:84 Mount GPU: {"MinorNumber":0,"DeviceFilePath":"/dev/nvidia0","UUID":"GPU-7fe47fc1-b21e-e675-f6ff-edd91910f8a7","State":"GPU_ALLOCATED_STATE","PodName":"test-slave-pod-2f66ed","Namespace":"gpu-pool"} to Pod: test in Namespace: default failed
2022-01-16T11:24:19.445Z ERROR gpu-mount/server.go:85 exit status 2


Environment and versions

  • k8s version: v1.23
  • docker-client version: 19.03.13
  • docker-server version: 20.10.12

In k8s v1.23, "/sys/fs/cgroup/devices/kubepods/burstable/pod[pod-id]/[container-id]/devices.allow" has changed to "/sys/fs/cgroup/devices/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod[pod-id]/docker-[container-id].scope/devices.allow".

So the current GPUMounter cannot work properly on k8s v1.23.

Could it be updated to support k8s v1.23? Thank you.
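For comparison, here is a minimal sketch of the two path layouts (not GPUMounter's actual code; the helper name is illustrative, and the exact systemd name escaping may vary between kubelet/runtime versions):

    package main

    import "fmt"

    // devicesAllowPath mirrors the two devices.allow locations quoted above.
    // Note: with the systemd driver the pod UID usually has "-" replaced by "_"
    // in the cgroup name; escaping details vary, so treat this as a sketch.
    func devicesAllowPath(cgroupDriver, podUID, containerID string) string {
        const root = "/sys/fs/cgroup/devices"
        if cgroupDriver == "systemd" {
            // Layout seen on k8s v1.23 with the systemd cgroup driver and docker.
            return fmt.Sprintf("%s/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod%s/docker-%s.scope/devices.allow",
                root, podUID, containerID)
        }
        // Layout GPUMounter currently assumes (cgroupfs driver).
        return fmt.Sprintf("%s/kubepods/burstable/pod%s/%s/devices.allow", root, podUID, containerID)
    }

    func main() {
        podUID := "c815ee4b-bea0-44ed-8ef4-239e69516ba2" // pod UID from the worker log above
        containerID := "e317ca7f5eb5e3c523fab9f0744a065cd69013a7c09522318d4bbf98ad0bb1c3"
        fmt.Println(devicesAllowPath("cgroupfs", podUID, containerID))
        fmt.Println(devicesAllowPath("systemd", podUID, containerID))
    }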

@pokerfaceSad
Owner

Thanks for your feedback. I will try to fix it.
PRs are also very welcome!

pokerfaceSad added the bug label Jan 16, 2022
@cool9203
Author

cool9203 commented Jan 20, 2022

OK, this bug is solved.

Environment and versions used:

  • OS: Ubuntu 20.04.1
  • k8s version: v1.23.1
  • docker-client version: 19.03.13
  • docker-server version: 20.10.12
  • CRI: docker
  • cgroup driver: systemd

I use the NVIDIA k8s-device-plugin,
and I set the "/etc/docker/daemon.json" content to:

{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m"
  },
  "storage-driver": "overlay2",
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}

In "/pkg/util/util.go", "cgroupfs" is always passed as cgroupDriver when calling GetCgroupName, which then causes the error.
So this bug is not a k8s version problem!
Also, the pod id contains _ in k8s v1.23.1, so the _ character should not be checked in NewCgroupName.

So we need to detect which cgroup driver is currently in use; a rough sketch of the idea is below.
But I'm a rookie in Golang, so I need more time for coding; I will send a PR in a few days.

Also, does the title need editing? If so, I can edit it directly.
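A rough sketch of the detection idea (not the actual GPUMounter code; the CGROUP_DRIVER variable name is only an example):

    package main

    import (
        "fmt"
        "os"
    )

    // detectCgroupDriver reads the cgroup driver from the worker's environment
    // instead of hard-coding "cgroupfs".
    func detectCgroupDriver() string {
        driver := os.Getenv("CGROUP_DRIVER")
        if driver != "systemd" {
            // Keep the old behaviour when nothing (or something unexpected) is set.
            driver = "cgroupfs"
        }
        return driver
    }

    func main() {
        // The call site in /pkg/util/util.go would then become something like:
        //   cgroupPath, err := cgroup.GetCgroupName(detectCgroupDriver(), pod, containerID)
        fmt.Println("using cgroup driver:", detectCgroupDriver())
    }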

@pokerfaceSad
Owner

@cool9203 Happy Spring Festival!

Thanks for your efforts. Sorry for making you wait so long.

  • The check for _ is there to handle the systemd cgroup driver. But if _ can appear in the pod id, it may be complex to handle. Can you show me some k8s documentation describing _ in the pod id?

        for _, component := range components {
            // Forbit using "_" in internal names. When remapping internal
            // names to systemd cgroup driver, we want to remap "-" => "_",
            // so we forbid "_" so that we can always reverse the mapping.
            if strings.Contains(component, "/") || strings.Contains(component, "_") {
                panic(fmt.Errorf("invalid character in component [%q] of CgroupName", component))
            }
        }
  • Passing the constant cgroupfs is really a bug! It should be configurable.

        cgroupPath, err := cgroup.GetCgroupName("cgroupfs", pod, containerID)

@cool9203
Author

cool9203 commented Feb 7, 2022

@pokerfaceSad Happy Spring Festival!!
Thanks for your reply.

  •     for _, component := range components {
            // Forbit using "_" in internal names. When remapping internal
            // names to systemd cgroup driver, we want to remap "-" => "_",
            // so we forbid "_" so that we can always reverse the mapping.
            if strings.Contains(component, "/") || strings.Contains(component, "_") {
                panic(fmt.Errorf("invalid character in component [%q] of CgroupName", component))
            }
        }

    You're right; I finished testing today and this code works as-is.
    That edit is not necessary; it was just part of my testing at the beginning.
    My bug is passing cgroupfs in:

        cgroupPath, err := cgroup.GetCgroupName("cgroupfs", pod, containerID)

@cool9203
Author

cool9203 commented Feb 9, 2022

I ran into another problem.

  •     removeGPUs, err := gpuMountImpl.GetRemoveGPU(targetPod, request.Uuids)
        if err != nil {
            Logger.Error("Failed to get remove gpu of Pod: ", targetPod.Name)
            Logger.Error(err)
            return nil, err
        }
        if len(removeGPUs) == 0 {
            Logger.Error("Invalid UUIDs: ", request.Uuids)
            return &gpu_mount.RemoveGPUResponse{
                RemoveGpuResult: gpu_mount.RemoveGPUResponse_GPUNotFound,
            }, nil
        }

    When calling RemoveGPU, I sometimes get the Invalid UUIDs error.
    I tracked it down and found that the slave pod status is Terminating, after which the pod is deleted.
    Example:
    kubectl get pod --all-namespaces
    
    NAMESPACE     NAME                                       READY   STATUS        RESTARTS        AGE
    gpu-pool      test460c04d4-slave-pod-bca118              1/1     Terminating   0               30s
    
    func (gpuCollector *GPUCollector) UpdateGPUStatus() error {

    So UpdateGPUStatus does not find any slave pod, and then no GPU resource is found for the pod with the mounted GPU.
    Does this error occur only in k8s v1.23.1, or does it happen in other versions too?

@cool9203
Author

cool9203 commented Feb 9, 2022

#19 (comment)
Maybe I have solved this.

https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/
It seems that from k8s v1.20+, the owner pod and the slave pod need to be in the same namespace.
If they are not in the same namespace, the slave pod status will be Terminating.
So the slave pod namespace needs to be set to the same as the owner pod's namespace.
I still need to test this more and will report the results. A sketch of the idea is below.
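A sketch of what this would mean for the slave pod object (not the actual allocator code; client-go types and the helper name are assumptions):

    package allocator

    import (
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    )

    // newSlavePod creates the slave pod in the owner pod's namespace so that its
    // ownerReference is valid. Cross-namespace ownerReferences are not allowed,
    // which is why the garbage collector otherwise marks the slave pod Terminating.
    func newSlavePod(owner *corev1.Pod, name string) *corev1.Pod {
        return &corev1.Pod{
            ObjectMeta: metav1.ObjectMeta{
                Name:      name,
                Namespace: owner.Namespace, // must match the owner, not a fixed "gpu-pool"
                OwnerReferences: []metav1.OwnerReference{{
                    APIVersion: "v1",
                    Kind:       "Pod",
                    Name:       owner.Name,
                    UID:        owner.UID,
                }},
            },
            // Container spec (image, GPU request, etc.) omitted in this sketch.
        }
    }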


Update

My test results:

kubectl get pod -n gpu-pool

NAME                    READY   STATUS    RESTARTS   AGE
test                    1/1     Running   0          3m12s
test-slave-pod-d34ea2   1/1     Running   0          19s
pod/test.yaml

apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: gpu-pool
  labels:
    app: test
spec:
  containers:
  - name: test
    image: [docker-image]
    resources:
      requests:
        memory: "1024M"
        cpu: "1"
    env:
      - name: NVIDIA_VISIBLE_DEVICES
        value: "none"
kubectl describe pod test-slave-pod-d34ea2 -n gpu-pool

Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  4s    default-scheduler  Successfully assigned gpu-pool/test-slave-pod-290964 to rtxws
  Normal  Pulling    3s    kubelet            Pulling image "alpine:latest"
  Normal  Pulled     1s    kubelet            Successfully pulled image "alpine:latest" in 2.563965249s
  Normal  Created    1s    kubelet            Created container gpu-container
  Normal  Started    1s    kubelet            Started container gpu-container
Pod events when the owner pod and slave pod are not in the same namespace:

Events:
  Type     Reason                    Age   From                          Message
  ----     ------                    ----  ----                          -------
  Normal   Scheduled                 4s    default-scheduler             Successfully assigned gpu-pool/test460c04d4-slave-pod-22d29a to rtxws
  Warning  OwnerRefInvalidNamespace  5s    garbage-collector-controller  ownerRef [v1/Pod, namespace: gpu-pool, name: test460c04d4, uid: a55bc88b-60d1-460f-a7c7-4072fe6a9a2c] does not exist in namespace "gpu-pool"
  Normal   Pulling                   4s    kubelet                       Pulling image "alpine:latest"
  Normal   Pulled                    1s    kubelet                       Successfully pulled image "alpine:latest" in 2.568386225s
  Normal   Created                   1s    kubelet                       Created container gpu-container
  Normal   Started                   1s    kubelet                       Started container gpu-container
  Normal   Killing                   0s    kubelet                       Stopping container gpu-container

As you can see in this test, the namespace in pod/test.yaml was changed to gpu-pool.
Now the slave pod status is Running, not Terminating.
Checking the pod events also shows the pod running, not being stopped.
I tested leaving it idle for 15 minutes; the slave pod stays Running and is not deleted.
And in the event log for the not-same-namespace case, the message shows that the owner does not exist in namespace "gpu-pool".
So in k8s v1.20+, the slave pod and owner pod must be in the same namespace.
If they are not, the slave pod status will be Terminating,
and calling the RemoveGPU service will show the Invalid UUIDs error.

Maybe the gpu-pool namespace should not be used in k8s v1.20+,
and the slave pod should always use the owner pod's namespace instead of gpu-pool.
Is this a good idea or not? Please give me your advice, thanks!!

@pokerfaceSad
Owner

@cool9203
Thank you for revealing this!
The reason why the slave pod can't be created in the owner pod's namespace is #3.
Maybe some modifications are needed to adapt to k8s v1.20+.

pokerfaceSad added a commit that referenced this issue Feb 17, 2022
* add environment variable `CGROUP_DRIVER`(default: cgroupfs) to set cgroup driver
* fix log format error in allocator.go L84
@pokerfaceSad
Owner

@cool9203
The bug of the constant cgroup driver has been fixed in 163ef7b.
The cgroup driver can now be set in /deploy/gpu-mounter-workers.yaml via the environment variable CGROUP_DRIVER.
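For example, an env entry in the worker container spec would look roughly like this (illustrative fragment; the "systemd" value assumes the driver names used elsewhere in this thread):

    # Illustrative fragment for the GPUMounter-worker container in
    # /deploy/gpu-mounter-workers.yaml; the default is "cgroupfs" per the commit note.
    env:
      - name: CGROUP_DRIVER
        value: "systemd"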

cool9203 added a commit to cool9203/GPUMounter that referenced this issue Feb 21, 2022
…Sad#19#issuecomment-1033637663

* add environment variable `GPU_POOL_NAMESPACE`(not have default value, must set this env var) to set slave pod namespace on create on worker
cool9203 added a commit to cool9203/GPUMounter that referenced this issue Feb 21, 2022
…Sad#19 (comment)

* add environment variable `GPU_POOL_NAMESPACE`(not have default value, must set this env var) to set slave pod namespace on create on worker
@cool9203
Author

@pokerfaceSad
Sorry for the late reply.

@cool9203 The bug of the constant cgroup driver has been fixed in 163ef7b. The cgroup driver can now be set in /deploy/gpu-mounter-workers.yaml via the environment variable CGROUP_DRIVER.

Thanks for your fix; passing an environment variable in worker.yaml is a good idea!


@cool9203 Thank you for revealing this! The reason why the slave pod can't be created in the owner pod's namespace is #3. Maybe some modifications are needed to adapt to k8s v1.20+.

I showed one solution in #19 (comment).
In that solution, the owner pod and slave pod must be in the same namespace, whether that is gpu-pool, default, kube-system, or another namespace.
And I did not set any resource quota.
So, as that solution shows, I think the owner pod and slave pod must be in the same namespace in k8s v1.20+.
What do you think?

@pokerfaceSad
Owner

@cool9203
In fact, slave pods were created in the owner pod's namespace before a378e39.

However, in a multi-tenant cluster scenario, the cluster administrator may use the resource quota feature to limit users' resource usage.

If GPUMounter created the slave pods in the owner pod's namespaces, the slave pods would consume the users' resource quota.
