GPU Manager

GPU Manager manages NVIDIA GPU devices in a Kubernetes cluster. It implements the Kubernetes DevicePlugin interface, so it is compatible with Kubernetes 1.9 and later.

Compared with the combination of nvidia-docker and nvidia-k8s-plugin, GPU Manager works with native, unmodified runc, whereas the NVIDIA solution requires a modified runtime. It also supports metrics reporting without deploying additional components.

To schedule GPU workloads correctly, GPU Manager should be used together with gpu-quota-admission, a Kubernetes scheduler plugin.

GPU Manager also supports workloads that request a fraction of a GPU device, such as 0.1 of a card or 100 MiB of GPU memory. If you want this feature, please refer to the vcuda-controller project.
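As an illustration, a pod requesting fractional GPU resources might look like the following sketch. The resource names `tencent.com/vcuda-core` and `tencent.com/vcuda-memory` and their units (100 core units per card, memory counted in 256 MiB blocks) are assumptions here; verify them against the project documentation.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: vcuda-test
spec:
  containers:
  - name: cuda
    image: nvidia/cuda:10.0-base      # any CUDA-capable image
    resources:
      limits:
        # assumption: 100 vcuda-core units = 1 full card, so 10 ~ 0.1 card
        tencent.com/vcuda-core: 10
        # assumption: vcuda-memory is counted in 256 MiB blocks
        tencent.com/vcuda-memory: 1
```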

How to deploy GPU Manager

GPU Manager runs as a DaemonSet. Because of RBAC restrictions and hybrid clusters, you need to perform the following steps to make the DaemonSet run correctly.

  • service account and clusterrole
kubectl create sa gpu-manager -n kube-system
kubectl create clusterrolebinding gpu-manager-role --clusterrole=cluster-admin --serviceaccount=kube-system:gpu-manager
  • label node with nvidia-device-enable=enable
kubectl label node <node> nvidia-device-enable=enable
  • edit gpu-manager.yaml and submit it

Change --incluster-mode from false to true, change the image field to <your repository>/public/gpu-manager:latest, and set the serviceAccount field to gpu-manager (the service account created above, bound to cluster-admin by gpu-manager-role).
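The resulting fragment of gpu-manager.yaml might look roughly like this. Field placement is a sketch of the edits described above, not the full manifest; follow the actual file shipped with the project.

```yaml
# excerpt of the gpu-manager.yaml DaemonSet after the edits above
spec:
  template:
    spec:
      serviceAccount: gpu-manager              # the service account created in step 1
      containers:
      - name: gpu-manager
        image: <your repository>/public/gpu-manager:latest
        args:
        - --incluster-mode=true                # changed from false
```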
