# RFC: KEP-4381: DRA: avoid kubelet API version dependency with REST proxy (#4615)
@@ -110,9 +110,11 @@ SIG Architecture for cross-cutting KEPs).
     - [PreBind](#prebind)
     - [Unreserve](#unreserve)
   - [kubelet](#kubelet)
-    - [Managing resources](#managing-resources)
     - [Communication between kubelet and resource kubelet plugin](#communication-between-kubelet-and-resource-kubelet-plugin)
-      - [NodeListAndWatchResources](#nodelistandwatchresources)
+    - [REST proxy](#rest-proxy)
+      - [Security](#security)
+      - [gRPC API](#grpc-api)
+    - [Managing resources](#managing-resources)
     - [NodePrepareResource](#nodeprepareresource)
     - [NodeUnprepareResources](#nodeunprepareresources)
   - [Simulation with CA](#simulation-with-ca)
@@ -531,20 +533,15 @@ the kubelet, as described below. However, the source of this data may vary; for
 example, a cloud provider controller could populate this based upon information
 from the cloud provider API.
 
-In the kubelet case, each kubelet publishes a set of
-`ResourceSlice` objects to the API server with content provided by the
-corresponding DRA drivers running on its node. Access control through the node
-authorizer ensures that the kubelet running on one node is not allowed to
-create or modify `ResourceSlices` belonging to another node. A `nodeName`
-field in each `ResourceSlice` object is used to determine which objects are
-managed by which kubelet.
-
-**NOTE:** `ResourceSlices` are published separately for each driver, using
-whatever version of the `resource.k8s.io` API is supported by the kubelet. That
-same version is then also used in the gRPC interface between the kubelet and
-the DRA drivers providing content for those objects. It might be possible to
-support version skew (= keeping kubelet at an older version than the control
-plane and the DRA drivers) in the future, but currently this is out of scope.
+In the kubelet case, each driver running on a node publishes a set of
+`ResourceSlice` objects to the API server for its own resources.
+These requests are [proxied by kubelet](#kubelet-rest-proxy)
+and thus seen as coming from the kubelet by the apiserver.
+Access control through the node
+authorizer ensures that the drivers running on one node are not allowed to
+create or modify `ResourceSlices` belonging to another node. The `nodeName`
+and `driverName` fields in each `ResourceSlice` object are used to determine
+which objects are managed by which driver instance.
 
 Embedded inside each `ResourceSlice` is the representation of the resources
 managed by a driver according to a specific "structured model". In the example
@@ -931,7 +928,7 @@ Several components must be implemented or modified in Kubernetes:
   ResourceClaim (directly or through a template) and ensure that the
   resource is allocated before the Pod gets scheduled, similar to
   https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/volume/scheduling/scheduler_binder.go
-- Kubelet must be extended to retrieve information from ResourceClaims
+- Kubelet must be extended to manage ResourceClaims
   and to call a resource kubelet plugin. That plugin returns CDI device ID(s)
   which then must be passed to the container runtime.
@@ -1188,13 +1185,13 @@ drivers are expected to be written for Kubernetes.
 
 ##### ResourceSlice
 
-For each node, one or more ResourceSlice objects get created. The kubelet
-publishes them with the node as the owner, so they get deleted when a node goes
+For each node, one or more ResourceSlice objects get created. The drivers
+on a node publish them with the node as the owner, so they get deleted when a node goes
 down and then gets removed.
 
 All list types are atomic because that makes tracking the owner for
 server-side-apply (SSA) simpler. Patching individual list elements is not
-needed and there is a single owner (kubelet).
+needed and there is a single owner.
 
 ```go
 // ResourceSlice provides information about available
@@ -2049,6 +2046,247 @@ Unreserve is called in two scenarios:

### kubelet

#### Communication between kubelet and resource kubelet plugin

Resource kubelet plugins are discovered through the [kubelet plugin registration
mechanism](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#device-plugin-registration). A
new "ResourcePlugin" type will be used in the Type field of the
[PluginInfo](https://pkg.go.dev/k8s.io/kubelet/pkg/apis/pluginregistration/v1#PluginInfo)
response to distinguish the plugin from device and CSI plugins.
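
For illustration, a registration response from a DRA kubelet plugin might look
like the following sketch. The driver name, socket path, and supported version
are made-up examples, and the "ResourcePlugin" constant is the new type
proposed above:

```go
package main

import (
	"context"

	registerapi "k8s.io/kubelet/pkg/apis/pluginregistration/v1"
)

type examplePlugin struct{}

// GetInfo implements the registration API; kubelet calls it after
// discovering the registration socket of this plugin.
func (p *examplePlugin) GetInfo(ctx context.Context, req *registerapi.InfoRequest) (*registerapi.PluginInfo, error) {
	return &registerapi.PluginInfo{
		Type:              "ResourcePlugin",   // new type proposed above
		Name:              "gpu.example.com",  // hypothetical driver name
		Endpoint:          "/var/lib/kubelet/plugins/gpu.example.com/dra.sock",
		SupportedVersions: []string{"v1alpha3"}, // example version
	}, nil
}
```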

Under the advertised Unix Domain socket the kubelet plugin provides the
k8s.io/kubelet/pkg/apis/dra gRPC interface. It was inspired by
[CSI](https://github.com/container-storage-interface/spec/blob/master/spec.md),
with “volume” replaced by “resource” and volume-specific parts removed.

#### REST proxy

Previously, kubelet retrieved ResourceClaims and published ResourceSlices on
behalf of DRA drivers on the node. The information included in those objects
got passed between API server, kubelet, and kubelet plugin using the version of
the resource.k8s.io API used by the kubelet. Combining a kubelet using some
older API version with a plugin using a newer version was not possible because
conversion of the resource.k8s.io types is only supported in the API server and
an old kubelet wouldn't know about a new version anyway.

Keeping kubelet at some old release while upgrading the control plane and DRA
drivers is desirable and officially supported by Kubernetes. To support the
same when using DRA, the kubelet now provides a REST proxy via gRPC that can be
used by drivers to execute HTTP requests against the API server. The drivers
determine the version of the resource.k8s.io API. In those few cases where the
kubelet needs to access the resource.k8s.io API itself, it does so with a
dynamic client.

> **Review comment:** This seems generally useful for a range of version skew
> scenarios between on-node agents and the apiserver, as well as security
> controls. I'm slightly concerned as to the scalability of this approach
> (kubelet acting as gateway), but the upsides involved for managing the impact
> of per-node agents are very interesting.

> **Reply:** OTOH, kube-apiserver will be rate-limiting requests anyway, so I
> doubt that it would really be kubelet that would be a gatekeeper here - in
> such a case those would probably be throttled server side anyway.
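
As a sketch of the dynamic-client access mentioned above (the group/version and
field selector below are illustrative assumptions, discovered at runtime rather
than compiled in):

```go
package main

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// listNodeSlices lists the ResourceSlices of one node without compiling in
// typed resource.k8s.io structs, so the kubelet does not depend on a
// particular API version at build time.
func listNodeSlices(ctx context.Context, client dynamic.Interface, version, nodeName string) error {
	gvr := schema.GroupVersionResource{
		Group:    "resource.k8s.io",
		Version:  version, // e.g. "v1alpha2", chosen at runtime
		Resource: "resourceslices",
	}
	// The result is unstructured; only the fields the kubelet actually
	// needs have to be extracted from it.
	_, err := client.Resource(gvr).List(ctx, metav1.ListOptions{
		FieldSelector: "nodeName=" + nodeName,
	})
	return err
}
```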

The alternative to implementing a generic REST proxy would have been to create
dedicated gRPC APIs for all operations that are needed by a driver. That would
have been a larger API and would have prevented using the normal client-go
APIs. The way it is implemented now, client-go in the plugin is instantiated
with an HTTP roundtripper provided by
k8s.io/dynamic-resource-allocation/restproxy. Implementations of that helper
code in languages other than Go are possible and may provide similar benefits.
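
A minimal sketch of how a Go plugin could wire such a roundtripper into
client-go; the roundtripper constructor itself is assumed to come from the
restproxy helper package and is not shown here:

```go
package main

import (
	"net/http"

	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newProxiedClientset builds an ordinary client-go clientset whose HTTP
// traffic goes through the given round tripper, i.e. through the kubelet
// REST proxy, instead of a direct connection to the API server.
func newProxiedClientset(rt http.RoundTripper) (kubernetes.Interface, error) {
	cfg := &rest.Config{
		// The host is never dialed directly; the round tripper
		// forwards each request over the plugin's gRPC stream.
		Host: "http://kubelet-rest-proxy",
		WrapTransport: func(http.RoundTripper) http.RoundTripper {
			return rt
		},
	}
	return kubernetes.NewForConfig(cfg)
}
```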

##### Security

Requests originating from a driver will be sent to the API server with the
credentials of the kubelet. This has the advantage that the node authorizer can
be used to limit access to objects belonging to the node. The downside is that
a driver might abuse this to access some other resources that kubelet has
access to, like pods. To prevent this, the REST proxy filters requests by path
and method and only passes through requests which the driver is meant to have
access to (ResourceSlice and ResourceClaim). Access to ResourceClaims is further
limited to read-only access by the node authorizer.
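
The path/method filtering could look roughly like the following sketch. The
exact rules and API versions are illustrative assumptions, not the real allow
list (note that in the design above, read-only access to claims is enforced by
the node authorizer rather than the proxy):

```go
package main

import (
	"net/http"
	"regexp"
)

type rule struct {
	methods map[string]bool
	path    *regexp.Regexp
}

// Hypothetical allow list: read/write access to ResourceSlices,
// read-only access to ResourceClaims; everything else is rejected.
var allowedRules = []rule{
	{
		methods: map[string]bool{
			http.MethodGet: true, http.MethodPost: true, http.MethodPut: true,
			http.MethodPatch: true, http.MethodDelete: true,
		},
		path: regexp.MustCompile(`^/apis/resource\.k8s\.io/v1alpha[0-9]+/resourceslices(/|$)`),
	},
	{
		methods: map[string]bool{http.MethodGet: true},
		path:    regexp.MustCompile(`^/apis/resource\.k8s\.io/v1alpha[0-9]+/namespaces/[^/]+/resourceclaims(/|$)`),
	},
}

// requestAllowed decides whether the proxy forwards a request.
func requestAllowed(method, path string) bool {
	for _, r := range allowedRules {
		if r.methods[method] && r.path.MatchString(path) {
			return true
		}
	}
	return false
}
```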

In addition, the REST proxy adds `nodeName` and `driverName` field selectors to
the header of requests that list ResourceSlice objects. This is mostly for
convenience, because for a PUT of a ResourceSlice the driver has to be trusted
not to create an object for some other driver on the node. This cannot be
checked by the REST proxy because it would have to decode the opaque request
body.
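
A sketch of how the proxy might enforce those selectors on list requests, using
the `rawQuery` field of the `Request` message defined later in this section
(the helper is hypothetical):

```go
package main

import "net/url"

// forceResourceSliceSelector overwrites whatever field selector the driver
// chose so that a list can only return the slices owned by this driver on
// this node.
func forceResourceSliceSelector(rawQuery, nodeName, driverName string) (string, error) {
	q, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	q.Set("fieldSelector", "nodeName="+nodeName+",driverName="+driverName)
	return q.Encode(), nil
}
```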

> **Review comment (on lines +2100 to +2101):** This is a downside of the proxy
> path, right? The unstructured / dynamic client aggregation approach could
> have checked the driver association field, right?

> **Reply:** Correct. However, the "unstructured / dynamic client" then has
> other downsides. Right now, it can check this because we assume that all
> future ResourceSlices will be of the form "nodeName + driverName + other
> fields". Without knowing anything about "other fields", it's hard to write a
> controller that synchronizes the content in the ResourceSlice objects with
> the desired content - not impossible, but harder.
>
> The other downside, if we consider apiserver traffic, is the usage of JSON
> for the request bodies (both ways) instead of protobuf.

> **Reply:** Why do we want kubelet in the middle here if drivers are going to
> be doing more than write-only status reporting? If they are watching /
> getting / reading / reconciling status, that seems more like a normal client
> to me. Is the only reason for the node proxy so they can piggy-back on
> kubelet credentials and get authorized?

> **Reply:** The advantage is indeed that the node authorizer continues to
> work.

> **Reply:** but... not really the way we want, since it lets devices stomp
> each others' status?

> **Reply:** Drivers on the same node have to trust each other. They often run
> with elevated privileges, so they can already do much harm locally even
> without this shared access to the ResourceSlice objects of the node. That's
> still better than having to trust all drivers anywhere in the cluster.

> **Reply:** I feel like this is the ideal scenario for us to build out the
> authz capabilities of node-bound SA tokens (possibly combined with #4600). We
> have a good story for node agents that want to perform write requests via
> VAP+SA node ref. Having something similar to that for reads seems broadly
> useful. Building an entire gRPC proxy as a one-off for DRA seems like the
> wrong approach.

> **Reply:** The REST proxy could be reused. But I agree, figuring out how to
> do this through normal auth mechanisms is the better approach. That would be
> out-of-scope for this KEP, though. Instead, I'll put in some wording along
> the lines of:
>
> Does that sound okay? Then I'll close this PR and create a different one with
> that approach.

> **Reply:** @pohly I added this to the SIG Auth meeting for May 22nd.
> Hopefully we can hash out a concrete path forward.

> **Reply:** It gets even more interesting with some use cases that people have
> brought up for DRA: a DRA driver for a NIC runs on the node and must publish
> network-related information, ideally to the ResourceClaim status. It must do
> that itself because kubelet might not know what that information is due to
> version skew. The node authorizer can limit write access to ResourceClaims
> that are actually in use on the node because it builds the graph (node ->
> pod -> claim). In contrast to ResourceSlice, there isn't a specific field in
> a ResourceClaim that can be checked.

##### gRPC API

What makes implementation of the REST proxy complicated is that the kubelet
acts as gRPC client and the plugins as gRPC server. gRPC itself doesn't support
requests from a server to a client. The REST proxy emulates that change of
direction by creating a gRPC stream. Each message sent through that stream by
the plugin represents one REST request. The entire request body is included
directly. This works because requests are small enough. Each request has a
unique ID.

The response from the API server is delivered through multiple gRPC calls which
contain that same unique ID and incrementally provide the response headers and
the response body data as it comes in from the API server. These calls block if
the API server delivers data faster than the plugin consumes it, which in turn
slows down reading from the API server. Delivery of the data continues until
either side closes their end of the stream. Long-running requests, like
watching a resource, are supported.
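
On the kubelet side, the resulting flow might be sketched as follows, assuming
Go stubs generated from the service definition shown below; the generated names
are assumptions and may differ from the real code:

```go
package main

import (
	"context"

	restproxy "k8s.io/dynamic-resource-allocation/apis/restproxy"
)

// runProxyStream opens the stream on the plugin's gRPC server and treats
// every message received on it as one REST request to execute against the
// API server.
func runProxyStream(ctx context.Context, client restproxy.RESTClient, handle func(*restproxy.Request)) error {
	stream, err := client.Proxy(ctx, &restproxy.ProxyMessage{})
	if err != nil {
		return err
	}
	for {
		req, err := stream.Recv()
		if err != nil {
			return err // io.EOF when the plugin closes its end
		}
		// Handle concurrently: the response flows back through separate
		// Reply calls that carry the same request ID.
		go handle(req)
	}
}
```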

This sequence diagram shows the initialization of the different components and
the execution of one REST request:

```mermaid
sequenceDiagram
    participant apiserver
    box kubelet
    participant plugins as plugin manager
    participant manager as DRA manager
    participant proxy as REST proxy
    participant proxyreader as REST proxy reader
    end
    box DRA driver plugin
    participant grpc as gRPC server
    participant roundtripper as REST roundtripper
    participant restclient as REST client
    end
    Note over grpc, roundtripper: handle gRPC via registration socket<br>and plugin socket
    plugins -->> grpc: pluginregistration.Registration/GetInfo
    grpc -->> plugins: GetInfo response: PluginInfo{Type:DRAPlugin}
    plugins -->> grpc: pluginregistration.Registration/NotifyRegistrationStatus
    grpc -->> plugins: 
    plugins ->> manager: add plugin
    Note over roundtripper,restclient: The REST client can<br>be used immediately,<br>but the roundtripper blocks until<br>it has a stream.
    loop while plugin is registered
        manager -->> grpc: REST/NodeObject{Name, UID}
        grpc -->> manager: 
        manager ->>+ proxy: start
        %% Strictly speaking, the proxy gets created here (one for each plugin).
        %% Mermaid cannot model that when using boxes (https://github.com/mermaid-js/mermaid/issues/5023)
        Note right of proxy: one proxy per plugin
        proxy -->> roundtripper: REST/Proxy
        restclient ->> roundtripper: GET /api/...
        roundtripper -->> proxy: Request{Id:1,Method:GET, ...}
        Note right of proxy: checks allow list for methods+path,<br>adds field filter (driverName, nodeName)
        proxy ->>+ proxyreader: start
        Note right of proxyreader: one per request
        proxy ->> apiserver: HTTP GET
        par Response delivery
            loop while there is response data
                apiserver ->> proxyreader: write response
                proxyreader -->> roundtripper: ReplyMessage{ID:1,Header:...,Body:...}
                roundtripper -->> proxyreader: 
            end
        and
            loop while there is response data
                roundtripper ->> restclient: Read
            end
        end
        deactivate proxyreader
        roundtripper -->> proxy: close REST/Proxy stream
        deactivate proxy
    end
```

The corresponding gRPC API is defined in
`k8s.io/dynamic-resource-allocation/apis/restproxy`:

```
// REST is the gRPC service on the side which issues REST requests.
//
// Because a gRPC server cannot send requests to its client,
// a stream gets established by the client where each response
// is a REST request, which then gets handled by the client.
service REST {
    // Proxy is called by the REST proxy to enable sending
    // REST requests. It gets called again after errors.
    //
    // Each stream response is a single REST request. The response
    // is returned by the proxy through one or more Reply
    // calls.
    rpc Proxy (ProxyMessage)
        returns (stream Request) {}

    // Reply provides part of the response for a REST request.
    rpc Reply (ReplyMessage)
        returns (ReplyResponse) {}

    // NodeObject is called as soon as kubelet has information
    // about its node object. It's not called when used elsewhere.
    rpc NodeObject(NodeObjectRequest)
        returns (NodeObjectResponse) {}
}

message ProxyMessage {
    // Intentionally empty.
}

message Request {
    // Id is used as identifier for all response messages for this
    // request. It is included in all ReplyMessages for this Request.
    int64 id = 1;

    string method = 2;
    string path = 3;
    string rawQuery = 4;
    map<string, RESTHeader> header = 5;

    // Body contains the entire request body data.
    bytes body = 6;
}

message RESTHeader {
    repeated string values = 1;
}

// ReplyMessage is one of many replies that are sent
// by the proxy for each Request. If the error and/or close are set,
// then the request has failed and no further replies are going to
// be sent.
//
// The proxy waits for the ReplyResponse before sending the next
// ReplyMessage. This ensures that the gRPC server receives
// the body chunks in the right order.
message ReplyMessage {
    // Id matches the Id in the Request that this reply belongs to.
    int64 id = 1;

    // Error is set if and only if executing the request encountered a problem.
    string error = 2;

    // Close indicates that the end of the body has been reached.
    bool close = 3;

    // Header contains the response from the REST server. It is
    // set in all reply messages.
    ResponseHeader header = 4;

    // BodyOffset is the index of the body data in the overall response body.
    int64 body_offset = 5;

    // Body contains some of the response body data.
    // The entire data is provided in chunks in multiple
    // replies. A reply may provide an error, indicate the
    // end of the response data, and contain some more data.
    bytes body = 6;
}

message ResponseHeader {
    string status = 1;       // e.g. "200 OK"
    int32 status_code = 2;   // e.g. 200
    string proto = 3;        // e.g. "HTTP/1.0"
    int32 proto_major = 4;   // e.g. 1
    int32 proto_minor = 5;   // e.g. 0
    map<string, RESTHeader> header = 6;

    // ContentLength is the total expected length of the response body.
    int64 content_length = 7;
}

message ReplyResponse {
    // Close is true if the client is not interested in receiving more reply data.
    bool close = 1;
}

message NodeObjectRequest {
    string name = 1;
    string uid = 2;
}

message NodeObjectResponse {
}
```
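
To make the mapping to HTTP concrete, here is a sketch of how the kubelet side
might execute one `Request` message. The `Request` struct below hand-mirrors
the proto message above rather than using the real generated type:

```go
package main

import (
	"bytes"
	"context"
	"net/http"
	"net/url"
)

// Request mirrors the proto message above; the generated Go type is assumed.
type Request struct {
	Id       int64
	Method   string
	Path     string
	RawQuery string
	Header   map[string][]string
	Body     []byte
}

// execute issues the proxied request against the API server. The caller then
// streams resp.Body back to the plugin in ReplyMessage chunks, waiting for
// each ReplyResponse so that a slow consumer throttles further reads.
func execute(ctx context.Context, client *http.Client, host string, r *Request) (*http.Response, error) {
	u := url.URL{Scheme: "https", Host: host, Path: r.Path, RawQuery: r.RawQuery}
	req, err := http.NewRequestWithContext(ctx, r.Method, u.String(), bytes.NewReader(r.Body))
	if err != nil {
		return nil, err
	}
	for name, values := range r.Header {
		for _, v := range values {
			req.Header.Add(name, v)
		}
	}
	return client.Do(req)
}
```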

Adding `NodeObject` to this gRPC interface was done because it was
convenient. The information about the node is used by the ResourceSlice
controller, not by the REST proxy itself.

#### Managing resources

kubelet must ensure that resources are ready for use on the node before running

@@ -2068,38 +2306,13 @@ successfully before allowing the pod to be deleted. This ensures that network-at
 for other Pods, including those that might get scheduled to other nodes. It
 also signals that it is safe to deallocate and delete the ResourceClaim.
 
+The kubelet uses a specific version of the resource.k8s.io API for these
+checks. Version skew between kubelet and the control plane is supported as long
+as the apiserver still provides ResourceClaim objects with the version needed
+by the kubelet.
+
 ![kubelet](./kubelet.png)
 
-#### Communication between kubelet and resource kubelet plugin
-
-Resource kubelet plugins are discovered through the [kubelet plugin registration
-mechanism](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#device-plugin-registration). A
-new "ResourcePlugin" type will be used in the Type field of the
-[PluginInfo](https://pkg.go.dev/k8s.io/kubelet/pkg/apis/pluginregistration/v1#PluginInfo)
-response to distinguish the plugin from device and CSI plugins.
-
-Under the advertised Unix Domain socket the kubelet plugin provides the
-k8s.io/kubelet/pkg/apis/dra gRPC interface. It was inspired by
-[CSI](https://github.com/container-storage-interface/spec/blob/master/spec.md),
-with “volume” replaced by “resource” and volume specific parts removed.
-
-##### NodeListAndWatchResources
-
-NodeListAndWatchResources returns a stream of NodeResourcesResponse objects.
-At the start and whenever resource availability changes, the
-plugin must send one such object with all information to the kubelet. The
-kubelet then syncs that information with ResourceSlice objects.
-
-```
-message NodeListAndWatchResourcesRequest {
-}
-
-message NodeListAndWatchResourcesResponse {
-  repeated k8s.io.api.resource.v1alpha2.ResourceModel resources = 1;
-}
-```
-
 ##### NodePrepareResource
 
 This RPC is called by the kubelet when a Pod that wants to use the specified

@@ -2155,20 +2368,16 @@ message Claim {
     // The name of the Resource claim (ResourceClaim.meta.Name)
     // This field is REQUIRED.
     string name = 3;
-    // Resource handle (AllocationResult.ResourceHandles[*].Data)
-    // This field is OPTIONAL.
-    string resource_handle = 4;
-    // Structured parameter resource handle (AllocationResult.ResourceHandles[*].StructuredData).
-    // This field is OPTIONAL. If present, it needs to be used
-    // instead of resource_handle. It will only have a single entry.
-    //
-    // Using "repeated" instead of "optional" is a workaround for https://github.com/gogo/protobuf/issues/713.
-    repeated k8s.io.api.resource.v1alpha2.StructuredResourceHandle structured_resource_handle = 5;
 }
 ```
 
-`resource_handle` and `structured_resource_handle` will be set depending on how
-the claim was allocated. See also KEP #3063.
+The allocation result is intentionally not included here. The content of that
+field is version-dependent. The kubelet would need to discover in which version
+each plugin wants the data, then potentially get the claim multiple times
+because only the apiserver can convert between versions. Instead, each plugin
+is required to get the claim itself using the REST proxy. In the most common
+case of one plugin per claim, that doubles the number of GETs for each claim
+(once by the kubelet, once by the plugin).
 
 ```
 message NodePrepareResourcesResponse {

> **Review comment:** /sig api-machinery auth
> /cc @jpbetz @deads2k @enj
>
> For visibility to a `kube-apiserver <-- kubelet <-- client` proxied API
> surface proposal: the read part of this is similar in some ways to what was
> proposed in
> https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4188-kubelet-pod-readiness-api#proposed-api.
> A lot of the discussion around that proposal seems relevant to this one
> (sig-arch 2023-08-24,
> https://docs.google.com/document/d/1BlmHq5uPyBUDlppYqAAzslVbAO8hilgjqZUTaNXUhKM/edit#bookmark=id.vud1o04xj4iv,
> https://www.youtube.com/watch?v=toN7t_y4zCk&t=22m45s).

> **Reply:** One key difference is that I am not proposing to add a new kubelet
> socket. I was worried about security implications around that. Instead,
> kubelet connects to plugins the same way as before and takes requests through
> one stream per plugin which is kept open by each plugin.

> **Reply:** Given a desire to read/write effectively from the kube-apiserver
> and the development of validatingadmissionpolicy + serviceaccount node claims
> + downward API for node name that can be used to self-restrict reads (for
> convenience) and enforce write requirements (must be from the same node), why
> is the proxy a better and more secure choice?

> **Reply:** I'm not familiar enough with that alternative to do a comparison.
> Can you point me to documentation?
>
> The ask is to ensure that a DRA driver deployed as a daemonset gets RBAC
> permissions which allow it to read/create/update ResourceSlice objects where
> the `nodeName` field is the same as the node on which the pod is running.

> **Reply:** For ResourceClaim, the ask is to ensure that the pod only gets
> read permission for ResourceClaims referenced by pods which have been
> scheduled to the node.

> **Reply:** @deads2k: I looked through
> https://kubernetes.io/docs/reference/access-authn-authz/validating-admission-policy
> and
> https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/.
> It's not clear to me how I can connect the identity of the driver pod with
> the node that it is running on, and that identity with rules that restrict
> what that pod can access.
>
> The node authorizer uses hand-crafted Go code for this.

> **Reply:** See kubernetes/kubernetes#124711 for the VAP+SA example David is
> referring to. Note that admission only covers writes, so reads cannot be
> restricted in this way.

> **Reply:** This seems to be the key part:
>
> `object.metadata.name == request.userInfo.extra["authentication.kubernetes.io/node-name"][0])`
>
> Does `object` provide access to any field? Is it possible to extend what gets
> added to the credentials? If there was a `dra.kubernetes.io/driver-name`,
> then the same check could be done for the `driverName` field in
> ResourceSlice. That would be better than what the REST proxy can do.
>
> I'm fine with dropping the read check on ResourceClaim. The node authorizer
> also has a "loop hole" there because it does not limit what data gets
> returned by watches.