DRA: Integrates with DRA and CDI #3329

Merged
merged 13 commits into from
Apr 29, 2024
1 change: 1 addition & 0 deletions .github/codespell-ignorewords
Original file line number Diff line number Diff line change
Expand Up @@ -5,3 +5,4 @@ cyclinder
shouldnot
Requestor
passt
ro
1 change: 1 addition & 0 deletions .github/workflows/e2e-init.yaml
Expand Up @@ -175,6 +175,7 @@ jobs:
RESULT=0
make ${{ matrix.e2e_test_mode }} -e E2E_CLUSTER_NAME=${{ env.E2E_CLUSTER_NAME }} \
-e E2E_GINKGO_LABELS=${E2E_LABELS} \
-e E2E_KIND_IMAGE_TAG=${{ inputs.k8s_version }} \
-e E2E_TIMEOUT=${{ env.E2E_TIME_OUT }} \
-e E2E_IP_FAMILY=${{ inputs.ip_family }} || RESULT=1
if ((RESULT==0)) ; then
Expand Down
8 changes: 5 additions & 3 deletions Makefile
Expand Up @@ -326,7 +326,8 @@ e2e_init_cilium_with_ebpf:

.PHONY: e2e_init_calico
e2e_init_calico:
$(QUIET) make e2e_init -e INSTALL_OVERLAY_CNI=true -e INSTALL_CALICO=true -e INSTALL_CILIUM=false -e E2E_SPIDERPOOL_ENABLE_SUBNET=false
$(QUIET) make e2e_init -e INSTALL_OVERLAY_CNI=true -e INSTALL_CALICO=true -e INSTALL_CILIUM=false -e E2E_SPIDERPOOL_ENABLE_SUBNET=false \
E2E_SPIDERPOOL_ENABLE_DRA=true

.PHONY: e2e_init_cilium
e2e_init_cilium:
Expand All @@ -339,11 +340,12 @@ e2e_test:

.PHONY: e2e_test_underlay
e2e_test_underlay:
$(QUIET) make e2e_test -e INSTALL_OVERLAY_CNI=false -e E2E_SPIDERPOOL_ENABLE_SUBNET=true
$(QUIET) make e2e_test -e INSTALL_OVERLAY_CNI=false -e E2E_SPIDERPOOL_ENABLE_SUBNET=true

.PHONY: e2e_test_calico
e2e_test_calico:
$(QUIET) make e2e_test -e INSTALL_OVERLAY_CNI=true -e INSTALL_CALICO=true -e INSTALL_CILIUM=false -e E2E_GINKGO_LABELS=overlay
$(QUIET) make e2e_test -e INSTALL_OVERLAY_CNI=true -e INSTALL_CALICO=true -e INSTALL_CILIUM=false -e E2E_GINKGO_LABELS=overlay,dra \
E2E_SPIDERPOOL_ENABLE_DRA=true

.PHONY: e2e_test_cilium
e2e_test_cilium:
Expand Down
8 changes: 8 additions & 0 deletions charts/spiderpool/README.md
Expand Up @@ -212,6 +212,14 @@ helm install spiderpool spiderpool/spiderpool --wait --namespace kube-system \
| `multus.multusCNI.log.logLevel` | the multus-CNI daemonset pod log level | `debug` |
| `multus.multusCNI.log.logFile` | the multus-CNI daemonset pod log file | `/var/log/multus.log` |

### dra parameters

| Name                 | Description                                          | Value          |
| -------------------- | ---------------------------------------------------- | -------------- |
| `dra.enabled`        | enable the DRA (Dynamic Resource Allocation) feature | `false`        |
| `dra.cdiRootPath`    | the CDI root directory                               | `/var/run/cdi` |
| `dra.hostDevicePath` | the host directory path of the device                | `""`           |
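
A minimal values override that turns the feature on might look like the sketch below. The three keys are the ones listed in the table above; the file name `dra-values.yaml` and the `hostDevicePath` value are hypothetical placeholders, not defaults from the chart.

```yaml
# dra-values.yaml (hypothetical file name)
dra:
  # Renders the ResourceClass and runs spiderpool-agent privileged
  # with the kubelet plugin and CDI hostPath mounts.
  enabled: true
  # Chart default.
  cdiRootPath: "/var/run/cdi"
  # Hypothetical path; when non-empty it is mounted into
  # spiderpool-agent through a hostPath volume.
  hostDevicePath: "/usr/lib/example-device"
```

Such a file would be passed with Helm's `-f dra-values.yaml` flag. As far as this diff shows, the DaemonSet template mounts `/var/run/cdi` literally, so a non-default `cdiRootPath` only changes the value written to the ConfigMap.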

### plugins parameters

| Name | Description | Value |
Expand Down
@@ -0,0 +1,65 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: (unknown)
name: spiderclaimparameters.spiderpool.spidernet.io
spec:
group: spiderpool.spidernet.io
names:
categories:
- spiderpool
kind: SpiderClaimParameter
listKind: SpiderClaimParameterList
plural: spiderclaimparameters
shortNames:
- scp
singular: spiderclaimparameter
scope: Namespaced
versions:
- name: v2beta1
schema:
openAPIV3Schema:
description: SpiderClaimParameter is the Schema for the spiderclaimparameters
API.
properties:
apiVersion:
description: 'APIVersion defines the versioned schema of this representation
of an object. Servers should convert recognized schemas to the latest
internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
type: string
kind:
description: 'Kind is a string value representing the REST resource this
object represents. Servers may infer this from the endpoint the client
submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
type: string
metadata:
type: object
spec:
description: ClaimParameterSpec defines the desired state of SpiderClaimParameter.
properties:
ippools:
items:
type: string
type: array
multusNames:
items:
type: string
type: array
rdma:
default: false
type: boolean
resources:
additionalProperties:
anyOf:
- type: integer
- type: string
pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
x-kubernetes-int-or-string: true
description: ResourceList is a set of (resource name, quantity) pairs.
type: object
type: object
type: object
served: true
storage: true
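
For illustration, an object accepted by this schema could look like the following sketch. The field names come from the `spec` properties above; the object name, namespace, Multus configuration name, pool name, and resource key are hypothetical, and how the driver resolves these names is not shown in this diff.

```yaml
apiVersion: spiderpool.spidernet.io/v2beta1
kind: SpiderClaimParameter
metadata:
  name: demo-claim-params        # hypothetical name
  namespace: default             # the CRD is namespace-scoped
spec:
  rdma: true                     # schema default is false
  multusNames:
    - macvlan-conf               # hypothetical Multus configuration name
  ippools:
    - demo-v4-pool               # hypothetical pool name
  resources:
    example.com/rdma-device: 1   # hypothetical key; value is an int-or-string quantity
```

Once the CRD is installed, `kubectl get scp -n default` should list such objects through the `scp` short name.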
4 changes: 4 additions & 0 deletions charts/spiderpool/templates/configmap.yaml
Expand Up @@ -26,6 +26,10 @@ data:
{{- else}}
clusterSubnetDefaultFlexibleIPNumber: 0
{{- end }}
dra:
enabled: {{ .Values.dra.enabled }}
cdiRootPath: {{ .Values.dra.cdiRootPath }}
hostDevicePath: {{ .Values.dra.hostDevicePath }}
{{- if .Values.multus.multusCNI.install }}
---
kind: ConfigMap
Expand Down
36 changes: 35 additions & 1 deletion charts/spiderpool/templates/daemonset.yaml
Expand Up @@ -243,10 +243,15 @@ spec:
{{- with .Values.spiderpoolAgent.extraEnv }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- with .Values.spiderpoolAgent.securityContext }}
{{- if or .Values.dra.enabled .Values.spiderpoolAgent.securityContext }}
securityContext:
{{- if .Values.dra.enabled }}
privileged: true
{{- end }}
{{- with .Values.spiderpoolAgent.securityContext }}
{{- toYaml . | nindent 8 }}
{{- end }}
{{- end }}
volumeMounts:
- name: config-path
mountPath: /tmp/spiderpool/config-map
Expand All @@ -259,6 +264,19 @@ spec:
- name: cni
mountPath: /host/etc/cni/net.d
{{- end }}
{{- if .Values.dra.enabled }}
- name: plugins-registry
mountPath: /var/lib/kubelet/plugins_registry
- name: plugins
mountPath: /var/lib/kubelet/plugins
mountPropagation: Bidirectional
- name: cdi
mountPath: /var/run/cdi
{{- if .Values.dra.hostDevicePath }}
- name: library
mountPath: {{ .Values.dra.hostDevicePath }}
{{- end }}
{{- end }}
{{- if .Values.spiderpoolAgent.extraVolumes }}
{{- include "tplvalues.render" ( dict "value" .Values.spiderpoolAgent.extraVolumeMounts "context" $ ) | nindent 8 }}
{{- end }}
Expand Down Expand Up @@ -289,6 +307,22 @@ spec:
- key: cni-conf.json
path: 00-multus.conf
{{- end }}
{{- if .Values.dra.enabled }}
- name: plugins-registry
hostPath:
path: /var/lib/kubelet/plugins_registry
- name: plugins
hostPath:
path: /var/lib/kubelet/plugins
- name: cdi
hostPath:
path: /var/run/cdi
{{- if .Values.dra.hostDevicePath }}
- name: library
hostPath:
path: {{ .Values.dra.hostDevicePath }}
{{- end }}
{{- end }}
{{- if .Values.spiderpoolAgent.extraVolumeMounts }}
{{- include "tplvalues.render" ( dict "value" .Values.spiderpoolAgent.extraVolumeMounts "context" $ ) | nindent 6 }}
{{- end }}
Expand Down
7 changes: 7 additions & 0 deletions charts/spiderpool/templates/resourceclass.yaml
@@ -0,0 +1,7 @@
{{- if .Values.dra.enabled }}
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClass
metadata:
name: netresources.spidernet.io
driverName: netresources.spidernet.io
{{- end }}
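
A workload would normally consume this class through the upstream `resource.k8s.io/v1alpha2` claim API. The sketch below assumes the driver accepts a `SpiderClaimParameter` through `parametersRef`, which the RBAC rules in `role.yaml` suggest but this diff does not show. All object names and the container image are hypothetical.

```yaml
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceClaimTemplate
metadata:
  name: demo-net-claim               # hypothetical
  namespace: default
spec:
  spec:
    # Matches the ResourceClass rendered by this template.
    resourceClassName: netresources.spidernet.io
    parametersRef:                   # assumption: the driver reads these parameters
      apiGroup: spiderpool.spidernet.io
      kind: SpiderClaimParameter
      name: demo-claim-params        # hypothetical
---
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod                     # hypothetical
  namespace: default
spec:
  resourceClaims:
    - name: net
      source:
        resourceClaimTemplateName: demo-net-claim
  containers:
    - name: app
      image: registry.example.com/app:latest   # hypothetical image
      resources:
        claims:
          - name: net                # refers to spec.resourceClaims[].name
```

The `v1alpha2` API is only served when the cluster enables the `DynamicResourceAllocation` feature gate and the `resource.k8s.io/v1alpha2` API group.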
27 changes: 27 additions & 0 deletions charts/spiderpool/templates/role.yaml
Expand Up @@ -122,6 +122,33 @@ rules:
verbs:
- get
- list
- apiGroups:
- resource.k8s.io
resources:
- podschedulingcontexts
- podschedulingcontexts/status
- resourceclaims
- resourceclaims/status
- resourceclaimtemplates
- resourceclasses
verbs:
- get
- list
- patch
- update
- watch
- apiGroups:
- spiderpool.spidernet.io
resources:
- spiderclaimparameters
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
- apiGroups:
- spiderpool.spidernet.io
resources:
Expand Down
10 changes: 10 additions & 0 deletions charts/spiderpool/values.yaml
Expand Up @@ -269,6 +269,16 @@ multus:
## @param multus.multusCNI.log.logFile the multus-CNI daemonset pod log file
logFile: "/var/log/multus.log"

## @section dra parameters
##
dra:
## @param dra.enabled enable the DRA (Dynamic Resource Allocation) feature
enabled: false
## @param dra.cdiRootPath the CDI root directory
cdiRootPath: "/var/run/cdi"
## @param dra.hostDevicePath the host directory path of the device
hostDevicePath: ""

## @section plugins parameters
##
plugins:
Expand Down
4 changes: 4 additions & 0 deletions cmd/spiderpool-agent/cmd/config.go
Expand Up @@ -31,6 +31,7 @@ import (
"github.com/spidernet-io/spiderpool/pkg/subnetmanager"
spiderpooltypes "github.com/spidernet-io/spiderpool/pkg/types"
"github.com/spidernet-io/spiderpool/pkg/workloadendpointmanager"
"k8s.io/dynamic-resource-allocation/kubeletplugin"
)

var agentContext = new(AgentContext)
Expand Down Expand Up @@ -127,6 +128,9 @@ type AgentContext struct {
UnixServer *server.Server
MetricsHttpServer *http.Server

// dra
DraPlugin kubeletplugin.DRAPlugin

// client
unixClient *client.SpiderpoolAgentAPI

Expand Down
17 changes: 17 additions & 0 deletions cmd/spiderpool-agent/cmd/daemon.go
Expand Up @@ -17,11 +17,13 @@ import (

"github.com/google/gops/agent"
"github.com/grafana/pyroscope-go"
"go.uber.org/zap"
apiruntime "k8s.io/apimachinery/pkg/runtime"
"k8s.io/client-go/rest"
"k8s.io/utils/ptr"
ctrl "sigs.k8s.io/controller-runtime"

draplugin "github.com/spidernet-io/spiderpool/pkg/dra/dra-plugin"
"github.com/spidernet-io/spiderpool/pkg/ipam"
"github.com/spidernet-io/spiderpool/pkg/ippoolmanager"
"github.com/spidernet-io/spiderpool/pkg/kubevirtmanager"
Expand Down Expand Up @@ -234,6 +236,16 @@ func DaemonMain() {
}
agentContext.unixClient = spiderpoolAgentAPI

if agentContext.Cfg.DraEnabled {
logger.Info("Begin to start dra-plugin Server")
agentContext.DraPlugin, err = draplugin.StartDRAPlugin(logger, agentContext.Cfg.DraCdiRootPath, agentContext.Cfg.DraHostDevicePath)
if err != nil {
logger.Fatal("failed to start dra-plugin server", zap.Error(err))
}
} else {
logger.Info("DRA feature is disabled.")
}

logger.Info("Set spiderpool-agent startup probe ready")
agentContext.IsStartupProbe.Store(true)

Expand Down Expand Up @@ -267,6 +279,11 @@ func WatchSignal(sigCh chan os.Signal) {
}
}

if agentContext.DraPlugin != nil {
logger.Debug("Stopping the dra-plugin server")
agentContext.DraPlugin.Stop()
}

// others...

}
Expand Down
15 changes: 15 additions & 0 deletions cmd/spiderpool-controller/cmd/daemon.go
Expand Up @@ -16,13 +16,15 @@ import (
"github.com/google/gops/agent"
"github.com/grafana/pyroscope-go"
"k8s.io/client-go/dynamic"
"k8s.io/client-go/informers"
"k8s.io/client-go/kubernetes"
ctrl "sigs.k8s.io/controller-runtime"

"github.com/spidernet-io/spiderpool/pkg/applicationcontroller"
"github.com/spidernet-io/spiderpool/pkg/applicationcontroller/applicationinformers"
"github.com/spidernet-io/spiderpool/pkg/constant"
"github.com/spidernet-io/spiderpool/pkg/coordinatormanager"
dracontroller "github.com/spidernet-io/spiderpool/pkg/dra/dra-controller"
"github.com/spidernet-io/spiderpool/pkg/election"
"github.com/spidernet-io/spiderpool/pkg/event"
"github.com/spidernet-io/spiderpool/pkg/gcmanager"
Expand Down Expand Up @@ -562,6 +564,19 @@ func setupInformers(k8sClient *kubernetes.Clientset) {
logger.Fatal(err.Error())
}
}

if controllerContext.Cfg.DraEnabled {
logger.Info("Begin to start DRA-Controller")
informerFactory := informers.NewSharedInformerFactory(k8sClient, 0 /* resync period */)
if err = dracontroller.StartController(controllerContext.InnerCtx,
time.Duration(controllerContext.Cfg.LeaseRetryGap)*time.Second,
crdClient, k8sClient, informerFactory,
controllerContext.Leader); err != nil {
logger.Fatal(err.Error())
}
} else {
logger.Info("the dra feature is disabled.")
}
}

func checkWebhookReady() {
Expand Down
6 changes: 6 additions & 0 deletions docs/develop/roadmap.md
Expand Up @@ -69,3 +69,9 @@
| | support ipoib CNI for infiniband device | v0.9.0 | | |
| | support ib-sriov CNI for infiniband device | v0.9.0 | | |
| EgressGateway | egressGateway | v0.8.0 | | |
| Dynamic-Resource-Allocation | implement dra framework | v1.0.0 | | |
| | support for SpiderClaimParameter's rdmaAcc feature | v1.0.0 | | |
| | support scheduling pods by SpiderMultusConfig or SpiderIPPool | Todo | | |
| | unify the way device-plugin declares resources | Todo | | |


2 changes: 2 additions & 0 deletions docs/mkdocs.yml
Expand Up @@ -96,6 +96,7 @@ nav:
- Node-based Topology: usage/network-topology.md
- RDMA with RoCE: usage/rdma-roce.md
- RDMA with Infiniband: usage/rdma-ib.md
- Dynamic-Resource-Allocation: usage/dra.md
- Multi-Cluster Networking: usage/submariner.md
- Access Service for Underlay CNI: usage/underlay_cni_service.md
- Bandwidth Manage for IPVlan CNI: usage/ipvlan_bandwidth.md
Expand All @@ -113,6 +114,7 @@ nav:
- CRD Spidercoordinator: reference/crd-spidercoordinator.md
- CRD SpiderEndpoint: reference/crd-spiderendpoint.md
- CRD SpiderReservedIP: reference/crd-spiderreservedip.md
- CRD SpiderClaimParameter: reference/crd-spiderclaimparameter.md
- Ifacer plugin: reference/plugin-ifacer.md
- IPAM plugin: reference/plugin-ipam.md
- Development:
Expand Down