RFC: KEP-4381: DRA: network-attached resources #4612

Closed · wants to merge 1 commit
39 changes: 36 additions & 3 deletions keps/sig-node/4381-dra-structured-parameters/README.md
@@ -366,6 +366,8 @@ inside a container.

- Support node-local resources

- Support network-attached resources when using a single scheduler

- Support claim parameters that are specified in a vendor CRD as
an alternative to letting users directly specify parameters with
the in-tree type. This provides a user experience that is similar to
@@ -391,7 +393,7 @@ inside a container.
Proposal](https://docs.google.com/document/d/1qKiIVs9AMh2Ua5thhtvWqOqW0MSle_RV3lfriO1Aj6U/edit#heading=h.jzfmfdca34kj)
included such an approach.

* Support network-attached resources
* Support network-attached resources when using multiple schedulers

## Proposal

@@ -546,6 +548,12 @@
the DRA drivers providing content for those objects. It might be possible to
support version skew (= keeping kubelet at an older version than the control
plane and the DRA drivers) in the future, but currently this is out of scope.

For network-attached resources, the DRA driver is responsible for discovering
available resources and publishing them in ResourceSlice object(s) where
`nodeSelector` is set. That selector determines which nodes have access to the
resources in each object. An empty selector can be used for resources that are
available in the entire cluster.

> **Review comment:** Currently, ResourceSlice is generated with the name
> `<node_name>-<driver_name>-<random_string>`, but if setting the NodeSelector
> of ResourceSlice, would it be `<driver_name>-<random_string>`?
> https://github.com/kubernetes/kubernetes/blob/v1.30.0/pkg/kubelet/cm/dra/plugin/noderesources.go#L470
>
> **Contributor Author:** How to name those ResourceSlices would be entirely up
> to the driver. What matters isn't the name, only the content.
>
> I am moving the ResourceSlice controller out of kubelet into the
> k8s.io/dynamic-resource-allocation package as part of
> kubernetes/kubernetes#124274, so drivers could reuse that (eventually - right
> now in that PR it doesn't support network-attached resources yet).
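A ResourceSlice published for network-attached resources might look roughly
like the following. This is a hypothetical sketch: the API version, driver
name, object name, and label key are made-up placeholders, with field names
taken from the Go type shown later in this diff.

```yaml
# Hypothetical ResourceSlice for resources reachable from all nodes
# in "fabric-a". All names here are placeholders.
apiVersion: resource.k8s.io/v1alpha2
kind: ResourceSlice
metadata:
  name: fabric-a-net.example.com-x7k2p   # naming is entirely up to the driver
driverName: net.example.com
nodeSelector:
  nodeSelectorTerms:
  - matchExpressions:
    - key: example.com/fabric
      operator: In
      values: ["fabric-a"]
# With an empty nodeSelector instead, the resources would be available
# in the entire cluster.
```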

Embedded inside each `ResourceSlice` is the representation of the resources
managed by a driver according to a specific "structured model". In the example
seen below, the structured model in use is called `namedResources`:
@@ -1198,19 +1206,28 @@
needed and there is a single owner (kubelet).

```go
// ResourceSlice provides information about available
// resources on individual nodes.
// resources on individual nodes or in the cluster.
type ResourceSlice struct {
	metav1.TypeMeta
	// Standard object metadata
	metav1.ObjectMeta

	// NodeName identifies the node which provides the resources
	// if they are local to a node. NodeName and NodeSelector
	// are mutually exclusive. One of them must be set.
	//
	// A field selector can be used to list only ResourceSlice
	// objects with a certain node name.
	// +optional
	NodeName string

	// NodeSelector identifies all nodes that the resources
	// could be accessed from if they are not local to a node.
	// NodeName and NodeSelector are mutually exclusive. One of
	// them must be set.
	// +optional
	NodeSelector *v1.NodeSelector

	// DriverName identifies the DRA driver providing the capacity information.
	// A field selector can be used to list only ResourceSlice
	// objects with a certain driver name.
	DriverName string
}
```

@@ -1684,6 +1701,22 @@

```go
type AllocationResultModel struct {
}
```
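The mutual-exclusivity rule documented on `NodeName` and `NodeSelector` could
be enforced with validation along these lines. This is an illustrative sketch
with simplified stand-in types, not the actual apiserver validation code.

```go
package main

import (
	"errors"
	"fmt"
)

// NodeSelector is a simplified stand-in for v1.NodeSelector.
type NodeSelector struct {
	NodeSelectorTerms []string
}

// ResourceSlice is trimmed down to the two fields relevant here.
type ResourceSlice struct {
	NodeName     string
	NodeSelector *NodeSelector
}

// validateSlice enforces that exactly one of NodeName and NodeSelector
// is set, as the API comments require.
func validateSlice(s ResourceSlice) error {
	hasName := s.NodeName != ""
	hasSelector := s.NodeSelector != nil
	if hasName && hasSelector {
		return errors.New("nodeName and nodeSelector are mutually exclusive")
	}
	if !hasName && !hasSelector {
		return errors.New("one of nodeName or nodeSelector must be set")
	}
	return nil
}

func main() {
	// Node-local resources: only NodeName is set.
	fmt.Println(validateSlice(ResourceSlice{NodeName: "worker-1"}) == nil)
	// Network-attached resources: only NodeSelector is set. An empty
	// selector means the resources are available in the entire cluster.
	fmt.Println(validateSlice(ResourceSlice{NodeSelector: &NodeSelector{}}) == nil)
	// Setting neither field is rejected.
	fmt.Println(validateSlice(ResourceSlice{}) != nil)
}
```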

At the moment, the content of the AllocationResult is the source of truth for
which resources in the cluster are allocated. The assumption is that
kube-scheduler owns those resources and thus can allocate them for a claim
without further coordination during a pod scheduling cycle.

For node-local resources, this assumption is reasonable. Running multiple
schedulers that schedule to the same node is already problematic because each
of those schedulers also assumes that it owns the traditional resources of
that node (CPU, RAM, extended resources).

For network-attached resources, this assumption is more problematic. There may
be valid reasons to run multiple schedulers for disjoint sets of nodes. A
mechanism that lets such schedulers safely allocate network-attached resources
would be a useful future extension. For now, a cluster with network-attached
resources and multiple schedulers is not a supported use case.

##### ResourceClaimTemplate

```go