Add pros/cons for status. Switch to http local endpoint
dashpole committed Nov 1, 2018
1 parent 7194695 commit e5f056f
Showing 1 changed file with 22 additions and 30 deletions.
52 changes: 22 additions & 30 deletions keps/sig-node/compute-device-assignment.md
@@ -55,45 +55,33 @@ In this document we will discuss the motivation and code changes required for in

## Changes

Add a v1alpha1 Kubelet GRPC service, at `/var/lib/kubelet/pod-resources/kubelet.sock`, which returns information about the kubelet's assignment of devices to containers. It obtains this information from the internal state of the kubelet's Device Manager. The GRPC Service returns a single PodResourcesResponse, which is shown in proto below:
```protobuf
// PodResources is a service provided by the kubelet that provides information about the
// node resources consumed by pods and containers on the node
service PodResources {
rpc List(ListPodResourcesRequest) returns (ListPodResourcesResponse) {}
}
// ListPodResourcesRequest is the request made to the PodResources service
message ListPodResourcesRequest {}
// ListPodResourcesResponse is the response returned by List function
message ListPodResourcesResponse {
repeated PodResources pod_resources = 1;
}
// PodResources contains information about the node resources assigned to a pod
message PodResources {
string name = 1;
string namespace = 2;
repeated ContainerResources containers = 3;
Add a v1alpha1 Kubelet HTTP service, at `unix:///var/lib/kubelet/read-only/kubelet.sock` (for unix architectures; different for other architectures), `http:/podresources`, which returns information about the kubelet's assignment of devices to containers. It obtains this information from the internal state of the kubelet's Device Manager. The HTTP Service returns a single JSON PodResourcesList, which is shown in go code below:
```golang
// PodResourcesList is a list of PodResources
type PodResourcesList []PodResources

// PodResources contains information about the resources assigned to a pod
type PodResources struct {
Name string `json:"name"`
Namespace string `json:"namespace"`
Containers []ContainerResources `json:"containers,omitempty" patchStrategy:"merge" patchMergeKey:"name"`
}

// ContainerResources contains information about the resources assigned to a container
message ContainerResources {
string name = 1;
repeated ContainerDevices devices = 2;
type ContainerResources struct {
Name string `json:"name"`
Devices []ContainerDevices `json:"devices,omitempty" patchStrategy:"merge" patchMergeKey:"resourceName"`
}

// ContainerDevices contains information about the devices assigned to a container
message ContainerDevices {
string resource_name = 1;
repeated string device_ids = 2;
// ContainerDevices contains information about the resources assigned to a container
type ContainerDevices struct {
ResourceName string `json:"resourceName"`
DeviceIds []string `json:"deviceIds"`
}
```
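
As a rough illustration of how a monitoring agent might consume the proposed endpoint, the sketch below dials the unix socket, issues an HTTP GET for `/podresources`, and decodes the JSON body into the types shown above. This is only a sketch under assumptions: the host name `kubelet` in the URL is a placeholder (it is ignored when dialing a unix socket), the helper names `fetchPodResources` and `parsePodResources` are hypothetical, the patch-strategy struct tags are omitted for brevity, and the endpoint itself is a proposal that may not exist on a given kubelet.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"io"
	"net"
	"net/http"
	"time"
)

// Types mirror the structs in the proposal (patch-strategy tags omitted).
type ContainerDevices struct {
	ResourceName string   `json:"resourceName"`
	DeviceIds    []string `json:"deviceIds"`
}

type ContainerResources struct {
	Name    string             `json:"name"`
	Devices []ContainerDevices `json:"devices,omitempty"`
}

type PodResources struct {
	Name       string               `json:"name"`
	Namespace  string               `json:"namespace"`
	Containers []ContainerResources `json:"containers,omitempty"`
}

type PodResourcesList []PodResources

// parsePodResources decodes a JSON PodResourcesList (hypothetical helper).
func parsePodResources(data []byte) (PodResourcesList, error) {
	var list PodResourcesList
	if err := json.Unmarshal(data, &list); err != nil {
		return nil, fmt.Errorf("decoding PodResourcesList: %w", err)
	}
	return list, nil
}

// fetchPodResources queries the proposed endpoint over a unix socket
// (hypothetical helper; the URL host is a placeholder).
func fetchPodResources(socketPath string) (PodResourcesList, error) {
	client := &http.Client{
		Timeout: 5 * time.Second,
		Transport: &http.Transport{
			// Ignore the network and address from the URL; always dial the socket.
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
	resp, err := client.Get("http://kubelet/podresources")
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return parsePodResources(body)
}

func main() {
	list, err := fetchPodResources("/var/lib/kubelet/read-only/kubelet.sock")
	if err != nil {
		fmt.Println("could not query kubelet:", err)
		return
	}
	// Print one line per device assignment.
	for _, pod := range list {
		for _, c := range pod.Containers {
			for _, d := range c.Devices {
				fmt.Printf("%s/%s %s: %s %v\n", pod.Namespace, pod.Name, c.Name, d.ResourceName, d.DeviceIds)
			}
		}
	}
}
```

Because the endpoint returns the full list in one response, an agent written this way would need to poll to notice changes.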

### Potential Future Improvements

* Add `ListAndWatch()` function to the GRPC endpoint so monitoring agents don't need to poll.
* Add identifiers for other resources used by pods to the `PodResources` message.
* For example, persistent volume location on disk

@@ -109,7 +97,11 @@ message ContainerDevices {
* Does not include any reference to resource names. Monitoring agents must identify devices by the device or environment variables passed to the pod or container.

### Add a field to Pod Status.
* Allows for observation of container to device bindings local to the node through the `/pods` endpoint
* Pros:
* Allows for observation of container to device bindings local to the node through the `/pods` endpoint
* Cons:
* Only consumed locally, which doesn't justify an API change
* Device Bindings are immutable after allocation, and are _debatably_ observable (they can be "observed" from the local checkpoint file). Device bindings are generally a poor fit for status.

### Use the Kubelet Device Manager Checkpoint file
* Allows for observability of device to container bindings through what exists in the checkpoint file