Add the vGPUScheduler to support Alnair Virtual GPUs (#104)
* add our vGPU scheduler to the main
* clean against code
Showing 15 changed files with 2,023 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
FROM debian:stretch-slim | ||
|
||
WORKDIR / | ||
|
||
COPY kube-scheduler /usr/local/bin | ||
|
||
CMD ["kube-scheduler"] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
all: local | ||
|
||
local: | ||
GOOS=linux GOARCH=amd64 go build -o=kube-scheduler ./cmd/scheduler | ||
|
||
build: | ||
|
||
sudo docker build --no-cache . -t centaurusinfra/vgpu-scheduler:0.3.0 | ||
|
||
push: | ||
sudo docker push centaurusinfra/vgpu-scheduler:0.3.0 | ||
|
||
# Run go fmt against code | ||
fmt: | ||
sudo gofmt -l -w . | ||
|
||
clean: fmt | ||
sudo rm -f kube-scheduler |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
# vGPUScheduler: A customized scheduler for virtual GPUs | ||
|
||
vGPUScheduler is a customized Kubernetes scheduler built on the [scheduling framework](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/20180409-scheduling-framework.md). The framework's APIs allow most scheduling features to be implemented as plugins while keeping the scheduling core maintainable. As shown in the diagram, the framework defines several extension points in both the scheduling cycle and the binding cycle. Our plugin is registered and invoked at the Filter and Score extension points, where it influences which nodes are filtered out and how the remaining nodes are ranked, respectively. | ||
|
||
#### Diagram of K8S scheduling-framework and our design | ||
![framework](./img/framework.png) | ||
|
||
#### Get Started | ||
- Make sure your Kubernetes cluster version is 1.17 or later; earlier versions may not fully support the K8S scheduling framework. | ||
- Clone the Alnair repo: | ||
```shell | ||
git clone git@github.com:CentaurusInfra/alnair.git | ||
``` | ||
- In the vGPUScheduler folder, compile: | ||
```shell | ||
make local | ||
``` | ||
- Build the Docker image and push it to your Docker Hub repository: | ||
```shell | ||
make build | ||
make push | ||
make clean | ||
``` | ||
- Back up your `kube-scheduler.yaml`, usually located in `/etc/kubernetes/manifests/`. Then copy our `manifests/kube-scheduler.yaml` and `manifests/vGPUScheduler-config.yaml` to `/etc/kubernetes/manifests/`. Change the image link in `kube-scheduler.yaml` accordingly. | ||
- Check if the new scheduler is running in your cluster: | ||
```shell | ||
kubectl get pod -n kube-system | grep "scheduler" | ||
``` | ||
#### Deploy a Pod using vGPUScheduler | ||
- Create a pod that needs 4 GB of GPU memory: | ||
```shell | ||
kubectl create -f pod.yaml | ||
``` | ||
- Then create a pod that needs 10 GB of GPU memory: | ||
```shell | ||
kubectl create -f pod2.yaml | ||
``` | ||
- Check the pods' status and use the pod annotations to see which node and GPU each pod was bound to: | ||
```shell | ||
kubectl get pod gpu-pod -o yaml | grep -A 8 annotations | ||
``` | ||
|
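The Filter and Score steps described in the README above can be sketched in plain Go. This is a minimal, standard-library-only illustration and not the plugin added in this commit: the `gpu` and `node` types, the `filter`, `score`, and `pick` functions, and the tightest-fit scoring rule are all assumptions made for the example. The real plugin implements the scheduling framework's plugin interfaces instead.

```go
package main

import "fmt"

// gpu and node are hypothetical types for this sketch; the real plugin
// reads node state through the scheduling framework.
type gpu struct {
	freeMemGB int
}

type node struct {
	name string
	gpus []gpu
}

// filter reports whether at least one GPU on the node has enough free
// memory for the request. This mirrors what a Filter step decides:
// keep the node or drop it from the candidate list.
func filter(n node, requestGB int) bool {
	for _, g := range n.gpus {
		if g.freeMemGB >= requestGB {
			return true
		}
	}
	return false
}

// score ranks a node that passed filter. The rule here is an assumption:
// prefer the tightest fit, so the less memory left on the best-fitting GPU
// after placement, the higher the score (0..100 when no GPU has more free
// memory than maxGPUMemGB). It returns -1 when no GPU fits.
func score(n node, requestGB, maxGPUMemGB int) int {
	best := -1
	for _, g := range n.gpus {
		if g.freeMemGB < requestGB {
			continue
		}
		left := g.freeMemGB - requestGB
		if s := 100 - 100*left/maxGPUMemGB; s > best {
			best = s
		}
	}
	return best
}

// pick runs filter and then score over all nodes and returns the name of
// the highest-scoring node, or "" when no node fits.
func pick(nodes []node, requestGB, maxGPUMemGB int) string {
	bestName, bestScore := "", -1
	for _, n := range nodes {
		if !filter(n, requestGB) {
			continue
		}
		if s := score(n, requestGB, maxGPUMemGB); s > bestScore {
			bestName, bestScore = n.name, s
		}
	}
	return bestName
}

func main() {
	nodes := []node{
		{name: "node-a", gpus: []gpu{{freeMemGB: 16}}},
		{name: "node-b", gpus: []gpu{{freeMemGB: 6}, {freeMemGB: 2}}},
	}
	fmt.Println(pick(nodes, 4, 16))  // a 4 GB request
	fmt.Println(pick(nodes, 10, 16)) // a 10 GB request
}
```

With these sample nodes, the 4 GB request lands on `node-b` (its 6 GB GPU is the tightest fit) and the 10 GB request lands on `node-a`, the only node that passes the filter.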
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
package main | ||
|
||
import ( | ||
"fmt" | ||
"math/rand" | ||
"os" | ||
"time" | ||
|
||
"k8s.io/component-base/logs" | ||
"k8s.io/kubernetes/cmd/kube-scheduler/app" | ||
"vGPUScheduler/pkg/vGPUScheduler" | ||
|
||
// Ensure the scheduler-plugins config scheme package is initialized. | ||
_ "sigs.k8s.io/scheduler-plugins/pkg/apis/config/scheme" | ||
) | ||
|
||
func main() { | ||
rand.Seed(time.Now().UTC().UnixNano()) | ||
logs.InitLogs() | ||
defer logs.FlushLogs() | ||
|
||
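// Build the stock kube-scheduler command and register the vGPUScheduler | ||
// plugin so a scheduler profile can enable it by name. | ||
cmd := app.NewSchedulerCommand( | ||
app.WithPlugin(vGPUScheduler.Name, vGPUScheduler.New), | ||
) | ||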
|
||
if err := cmd.Execute(); err != nil { | ||
_, _ = fmt.Fprintf(os.Stderr, "%v\n", err) | ||
os.Exit(1) | ||
} | ||
} |
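`app.WithPlugin` in the diff above pairs a plugin name with a factory function so that the scheduler can instantiate the plugin when a profile enables it. The general name-to-factory registry pattern can be sketched with the standard library alone. Everything below (`plugin`, `factory`, `registry`, `register`, `build`, and the `echoPlugin` type) is hypothetical and is not the kube-scheduler API; it only illustrates the idea.

```go
package main

import (
	"errors"
	"fmt"
)

// plugin is a hypothetical stand-in for a scheduler plugin.
type plugin interface {
	Name() string
}

// factory creates a plugin instance on demand.
type factory func() (plugin, error)

// registry maps plugin names to the factories that build them.
type registry map[string]factory

// register adds a factory under a name and rejects duplicates.
func (r registry) register(name string, f factory) error {
	if _, ok := r[name]; ok {
		return fmt.Errorf("plugin %q already registered", name)
	}
	r[name] = f
	return nil
}

// build looks up a factory by name and invokes it.
func (r registry) build(name string) (plugin, error) {
	f, ok := r[name]
	if !ok {
		return nil, errors.New("unknown plugin: " + name)
	}
	return f()
}

// echoPlugin is a trivial example plugin.
type echoPlugin struct{}

func (echoPlugin) Name() string { return "echo" }

func newEcho() (plugin, error) { return echoPlugin{}, nil }

func main() {
	r := registry{}
	if err := r.register("echo", newEcho); err != nil {
		fmt.Println("register failed:", err)
		return
	}
	p, err := r.build("echo")
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	fmt.Println("built plugin:", p.Name())
}
```

Keeping construction behind a factory means the plugin is only created when it is actually needed, which is why `main` in the diff passes `vGPUScheduler.New` rather than an already-built instance.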
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,139 @@ | ||
module vGPUScheduler | ||
|
||
go 1.17 | ||
|
||
require ( | ||
k8s.io/api v0.22.6 | ||
k8s.io/apimachinery v0.22.6 | ||
k8s.io/client-go v0.22.6 | ||
k8s.io/component-base v0.22.6 | ||
k8s.io/klog/v2 v2.9.0 | ||
k8s.io/kubernetes v1.22.6 | ||
sigs.k8s.io/scheduler-plugins v0.22.6 | ||
) | ||
|
||
require ( | ||
github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1 // indirect | ||
github.com/NYTimes/gziphandler v1.1.1 // indirect | ||
github.com/PuerkitoBio/purell v1.1.1 // indirect | ||
github.com/PuerkitoBio/urlesc v0.0.0-20170810143723-de5bf2ad4578 // indirect | ||
github.com/beorn7/perks v1.0.1 // indirect | ||
github.com/bits-and-blooms/bitset v1.2.0 // indirect | ||
github.com/blang/semver v3.5.1+incompatible // indirect | ||
github.com/cespare/xxhash/v2 v2.1.1 // indirect | ||
github.com/coreos/go-semver v0.3.0 // indirect | ||
github.com/coreos/go-systemd/v22 v22.3.2 // indirect | ||
github.com/cyphar/filepath-securejoin v0.2.2 // indirect | ||
github.com/davecgh/go-spew v1.1.1 // indirect | ||
github.com/docker/distribution v2.7.1+incompatible // indirect | ||
github.com/emicklei/go-restful v2.9.5+incompatible // indirect | ||
github.com/evanphx/json-patch v4.11.0+incompatible // indirect | ||
github.com/felixge/httpsnoop v1.0.1 // indirect | ||
github.com/go-logr/logr v0.4.0 // indirect | ||
github.com/go-openapi/jsonpointer v0.19.5 // indirect | ||
github.com/go-openapi/jsonreference v0.19.5 // indirect | ||
github.com/go-openapi/swag v0.19.14 // indirect | ||
github.com/gogo/protobuf v1.3.2 // indirect | ||
github.com/golang/groupcache v0.0.0-20210331224755-41bb18bfe9da // indirect | ||
github.com/golang/protobuf v1.5.2 // indirect | ||
github.com/google/go-cmp v0.5.5 // indirect | ||
github.com/google/gofuzz v1.1.0 // indirect | ||
github.com/google/uuid v1.1.2 // indirect | ||
github.com/googleapis/gnostic v0.5.5 // indirect | ||
github.com/grpc-ecosystem/go-grpc-prometheus v1.2.0 // indirect | ||
github.com/grpc-ecosystem/grpc-gateway v1.16.0 // indirect | ||
github.com/imdario/mergo v0.3.5 // indirect | ||
github.com/inconshreveable/mousetrap v1.0.0 // indirect | ||
github.com/josharian/intern v1.0.0 // indirect | ||
github.com/json-iterator/go v1.1.11 // indirect | ||
github.com/mailru/easyjson v0.7.6 // indirect | ||
github.com/matttproud/golang_protobuf_extensions v1.0.2-0.20181231171920-c182affec369 // indirect | ||
github.com/moby/term v0.0.0-20210610120745-9d4ed1856297 // indirect | ||
github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect | ||
github.com/modern-go/reflect2 v1.0.1 // indirect | ||
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect | ||
github.com/opencontainers/go-digest v1.0.0 // indirect | ||
github.com/opencontainers/runc v1.0.2 // indirect | ||
github.com/opencontainers/selinux v1.8.2 // indirect | ||
github.com/pkg/errors v0.9.1 // indirect | ||
github.com/prometheus/client_golang v1.11.0 // indirect | ||
github.com/prometheus/client_model v0.2.0 // indirect | ||
github.com/prometheus/common v0.26.0 // indirect | ||
github.com/prometheus/procfs v0.6.0 // indirect | ||
github.com/spf13/cobra v1.1.3 // indirect | ||
github.com/spf13/pflag v1.0.5 // indirect | ||
go.etcd.io/etcd/api/v3 v3.5.0 // indirect | ||
go.etcd.io/etcd/client/pkg/v3 v3.5.0 // indirect | ||
go.etcd.io/etcd/client/v3 v3.5.0 // indirect | ||
go.opentelemetry.io/contrib v0.20.0 // indirect | ||
go.opentelemetry.io/contrib/instrumentation/google.golang.org/grpc/otelgrpc v0.20.0 // indirect | ||
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.20.0 // indirect | ||
go.opentelemetry.io/otel v0.20.0 // indirect | ||
go.opentelemetry.io/otel/exporters/otlp v0.20.0 // indirect | ||
go.opentelemetry.io/otel/metric v0.20.0 // indirect | ||
go.opentelemetry.io/otel/sdk v0.20.0 // indirect | ||
go.opentelemetry.io/otel/sdk/export/metric v0.20.0 // indirect | ||
go.opentelemetry.io/otel/sdk/metric v0.20.0 // indirect | ||
go.opentelemetry.io/otel/trace v0.20.0 // indirect | ||
go.opentelemetry.io/proto/otlp v0.7.0 // indirect | ||
go.uber.org/atomic v1.7.0 // indirect | ||
go.uber.org/multierr v1.6.0 // indirect | ||
go.uber.org/zap v1.17.0 // indirect | ||
golang.org/x/crypto v0.0.0-20210220033148-5ea612d1eb83 // indirect | ||
golang.org/x/net v0.0.0-20211209124913-491a49abca63 // indirect | ||
golang.org/x/oauth2 v0.0.0-20200107190931-bf48bf16ab8d // indirect | ||
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c // indirect | ||
golang.org/x/sys v0.0.0-20210616094352-59db8d763f22 // indirect | ||
golang.org/x/term v0.0.0-20210220032956-6a3ed077a48d // indirect | ||
golang.org/x/text v0.3.6 // indirect | ||
golang.org/x/time v0.0.0-20210723032227-1f47c861a9ac // indirect | ||
golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 // indirect | ||
google.golang.org/appengine v1.6.5 // indirect | ||
google.golang.org/genproto v0.0.0-20210602131652-f16073e35f0c // indirect | ||
google.golang.org/grpc v1.38.0 // indirect | ||
google.golang.org/protobuf v1.26.0 // indirect | ||
gopkg.in/inf.v0 v0.9.1 // indirect | ||
gopkg.in/natefinch/lumberjack.v2 v2.0.0 // indirect | ||
gopkg.in/yaml.v2 v2.4.0 // indirect | ||
gopkg.in/yaml.v3 v3.0.0-20210107192922-496545a6307b // indirect | ||
k8s.io/apiserver v0.22.6 // indirect | ||
k8s.io/cloud-provider v0.22.6 // indirect | ||
k8s.io/component-helpers v0.22.6 // indirect | ||
k8s.io/csi-translation-lib v0.22.6 // indirect | ||
k8s.io/kube-openapi v0.0.0-20211109043538-20434351676c // indirect | ||
k8s.io/kube-scheduler v0.22.6 // indirect | ||
k8s.io/mount-utils v0.22.6 // indirect | ||
k8s.io/utils v0.0.0-20210819203725-bdf08cb9a70a // indirect | ||
sigs.k8s.io/apiserver-network-proxy/konnectivity-client v0.0.27 // indirect | ||
sigs.k8s.io/structured-merge-diff/v4 v4.2.1 // indirect | ||
sigs.k8s.io/yaml v1.2.0 // indirect | ||
) | ||
|
||
replace ( | ||
k8s.io/api => k8s.io/api v0.22.6 | ||
k8s.io/apiextensions-apiserver => k8s.io/apiextensions-apiserver v0.22.6 | ||
k8s.io/apimachinery => k8s.io/apimachinery v0.22.6 | ||
k8s.io/apiserver => k8s.io/apiserver v0.22.6 | ||
k8s.io/cli-runtime => k8s.io/cli-runtime v0.22.6 | ||
k8s.io/client-go => k8s.io/client-go v0.22.6 | ||
k8s.io/cloud-provider => k8s.io/cloud-provider v0.22.6 | ||
k8s.io/cluster-bootstrap => k8s.io/cluster-bootstrap v0.22.6 | ||
k8s.io/code-generator => k8s.io/code-generator v0.22.6 | ||
k8s.io/component-base => k8s.io/component-base v0.22.6 | ||
k8s.io/component-helpers => k8s.io/component-helpers v0.22.6 | ||
k8s.io/controller-manager => k8s.io/controller-manager v0.22.6 | ||
k8s.io/cri-api => k8s.io/cri-api v0.22.6 | ||
k8s.io/csi-translation-lib => k8s.io/csi-translation-lib v0.22.6 | ||
k8s.io/kube-aggregator => k8s.io/kube-aggregator v0.22.6 | ||
k8s.io/kube-controller-manager => k8s.io/kube-controller-manager v0.22.6 | ||
k8s.io/kube-proxy => k8s.io/kube-proxy v0.22.6 | ||
k8s.io/kube-scheduler => k8s.io/kube-scheduler v0.22.6 | ||
k8s.io/kubectl => k8s.io/kubectl v0.22.6 | ||
k8s.io/kubelet => k8s.io/kubelet v0.22.6 | ||
k8s.io/kubernetes => k8s.io/kubernetes v1.22.6 | ||
k8s.io/legacy-cloud-providers => k8s.io/legacy-cloud-providers v0.22.6 | ||
k8s.io/metrics => k8s.io/metrics v0.22.6 | ||
k8s.io/mount-utils => k8s.io/mount-utils v0.22.6 | ||
k8s.io/pod-security-admission => k8s.io/pod-security-admission v0.22.6 | ||
k8s.io/sample-apiserver => k8s.io/sample-apiserver v0.22.6 | ||
) |
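The `replace` block above pins every `k8s.io/*` staging module to the release that matches `k8s.io/kubernetes`. This is needed because `k8s.io/kubernetes` itself requires those modules at the placeholder version `v0.0.0`. When bumping the Kubernetes version, the block can be regenerated rather than edited by hand. A minimal sketch, assuming a helper named `replaceLines` and a short hand-picked module list (both are illustrative; the full module list is in the diff):

```go
package main

import (
	"fmt"
	"strings"
)

// replaceLines renders one go.mod replace directive per staging module,
// followed by one for k8s.io/kubernetes itself. A Kubernetes release
// v1.X.Y corresponds to staging modules tagged v0.X.Y.
func replaceLines(k8sVersion string, modules []string) ([]string, error) {
	if !strings.HasPrefix(k8sVersion, "v1.") {
		return nil, fmt.Errorf("expected a v1.X.Y version, got %q", k8sVersion)
	}
	staging := "v0." + strings.TrimPrefix(k8sVersion, "v1.")
	lines := make([]string, 0, len(modules)+1)
	for _, m := range modules {
		lines = append(lines, fmt.Sprintf("%s => %s %s", m, m, staging))
	}
	lines = append(lines, fmt.Sprintf("k8s.io/kubernetes => k8s.io/kubernetes %s", k8sVersion))
	return lines, nil
}

func main() {
	// Hypothetical subset of the staging modules listed in the diff.
	modules := []string{"k8s.io/api", "k8s.io/apimachinery", "k8s.io/client-go"}
	lines, err := replaceLines("v1.22.6", modules)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("replace (")
	for _, l := range lines {
		fmt.Println("\t" + l)
	}
	fmt.Println(")")
}
```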