Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create configuration level device plugin #565

Conversation

johnsonshih
Copy link
Contributor

@johnsonshih johnsonshih commented Mar 3, 2023

What this PR does / why we need it:
This PR is to implement the proposal of exposing resource at Configuration level. https://github.com/project-akri/akri-docs/blob/main/proposals/configuration-level-resources.md#configuration-level-resources
With Configuration level resource, users can select resource to use at configuration level without knowing the instance id beforehand.

Special notes for your reviewer:
Summary of changes in this PR
- CL resource and IL resource share the capacity pool. i.e. (# of allocated CL virtual devices + # of allocated IL virtual devices) <= capacity.
- The name of CL device plugin is the Akri Configuration name and follows the same convention of IL device plugin, i.e., replace ['.', '/'] with "-".
- The CL device plugin uses the same name schema as IL device plugin for virtual devices. The virtual device id reported by CL device plugin looks like _.
- ConfigurationDevicePlugin represent the behavior of CL device plugin.
- DevicePluginService contains a list_and_watch_message_sender to notify refreshing list_and_watch, used by the DevicePluginService internally, a copy of list_and_watch_message_sender is stored in the associated InstanceInfo, used by external entity to refresh the DPS's list_and_watch.
- Configuration DevicePluginService contains a list_and_watch_message_sender to notify refreshing list_and_watch, used by the Configuration DevicePluginService internally, a copy of list_and_watch_message_sender is store in the InstanceConfig, used by external entity to refresh the Configuration DPS's list_and_watch
- When IL DPS allocate a virtual device, it notify CL DPS to refresh list_and_watch ,and vice versa, CL DPS notify IL DPS to refresh list_and_watch when it allocates a virtual device.

If applicable:

  • this PR has an associated PR with documentation in akri-docs
  • this PR contains unit tests
  • added code adheres to standard Rust formatting (cargo fmt)
  • code builds properly (cargo build)
  • code is free of common mistakes (cargo clippy)
  • all Akri tests succeed (cargo test)
  • inline documentation builds (cargo doc)
  • all commits pass the DCO bot check by being signed off -- see the failing DCO check for instructions on how to retroactively sign commits

@rpieczon
Copy link

rpieczon commented Apr 4, 2023

Is there any chance to have it as a part of incoming release?

Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
…ge change

Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
@johnsonshih johnsonshih force-pushed the user/jshih/configuration-device-plugin branch from 5e3ed62 to bcc848e Compare April 4, 2023 16:21
@kate-goldenring
Copy link
Contributor

kate-goldenring commented Apr 7, 2023

@johnsonshih I haven't given this a thorough look through yet, but i spec-ed out this work last April to a working POC, so i wanted to share the diff with you: main...kate-goldenring:akri:config-level-resources-poc.

Rather than creating a separate ConfigurationDevicePluginService I added a internal_allocate_for_config call that is called from allocate depending on whether instance_name = config_name. This may be a simpler approach and result in less code duplication:

        if self.instance_name == self.config_name {
            match self
                .internal_allocate_for_config(requests, kube_interface)
                .await
            {
                Ok(resp) => Ok(resp),
                Err(e) => Err(e),
            }
        } else {
            match self.internal_allocate(requests, kube_interface).await {
                Ok(resp) => Ok(resp),
                Err(e) => Err(e),
            }
        }

Copy link
Contributor

@kate-goldenring kate-goldenring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for diving into this @johnsonshih! I think we can simplify this and still use one DevicePluginService per #565 (comment). You are also welcome to pull in anything from the POC I put together last year: main...kate-goldenring:akri:config-level-resources-poc. Sorry for not sharing it earlier 😵‍💫

Signed-off-by: Johnson Shih <jshih@microsoft.com>
@johnsonshih
Copy link
Contributor Author

Thanks for diving into this @johnsonshih! I think we can simplify this and still use one DevicePluginService per #565 (comment). You are also welcome to pull in anything from the POC I put together last year: main...kate-goldenring:akri:config-level-resources-poc. Sorry for not sharing it earlier face_with_spiral_eyes

Thanks for the information. I on purposely separate out the instance plugin service and configuration plugin service into two different struct as most of the fields used in DevicePluginService is specific to instance (e.g., instance_name), and the behavior of instance device plugin and configuration device plugin are different, too. Put both behaviors in the same DevicePluginService struct make it difficult to maintain.

I do have some follow up refactoring for CL resource after this PR committed. 'll definitely check the poc branch to see anything that we can leverage to improve the design and eliminate code duplication. Given the schedule, is it ok we focus on committing this PR? As long as the behavior is correct, we can improve our code base with follow up changes. Thanks!

Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Copy link
Contributor

@kate-goldenring kate-goldenring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on this @johnsonshih. I still do not see the need for the ConfigurationDevicePluginService and the duplication of code. Can we aim to reduce the size of changes in this PR and maintain one DevicePluginService with conditional extra functionality if it is a Configuration level DP? I am concerned about maintainability and debugability and reviewability. I wonder if one approach would be to start with one DevicePluginService to reduce the size of this PR and then in a subsequent one if we feel like we want different services to add that. It is hard here to see what is changed verses relocated. Or maybe we can move the shared code of list_and_watch into shared methods or it could be argued that that consolidation could come in a separate PR, so I am talking in circles.

agent/src/util/device_plugin_service.rs Show resolved Hide resolved
agent/src/util/device_plugin_service.rs Outdated Show resolved Hide resolved
agent/src/util/device_plugin_service.rs Show resolved Hide resolved
agent/src/util/device_plugin_service.rs Outdated Show resolved Hide resolved
agent/src/util/device_plugin_service.rs Outdated Show resolved Hide resolved
agent/src/util/device_plugin_service.rs Outdated Show resolved Hide resolved
@johnsonshih
Copy link
Contributor Author

In this PR, each Configuration virtual devices are considered different, that's why you see the same instance is allocated to a pod multiple times. It's based on the proposal https://github.com/project-akri/akri-docs/blob/main/proposals/configuration-level-resources.md
The device plugin created for each Configuration will contain capacity * number of instances slots. Each slot will map to a "real" slot of an Instance device plugin. For example, after deploying a onvif Configuration with a capacity of 2, the NodeSpec of a node that could see both cameras would be:
Capacity:
akri.sh/akri-onvif: 4
akri.sh/akri-onvif-8120fe: 2
akri.sh/akri-onvif-a19705: 2

I'm going to create a switch in Configuration to support different behaviors for the Configuration virtual devices, i.e., introduce a switch in Configuration (switch name "configurationResourceFrom", possible values are "instance" and "deviceUsage". Default value of configurationResourceFrom : "instance")
configurationResourceFrom: instance => on a node, only one instance allocated.
configurationResourceFrom: deviceUsage => on a node, slots from an instance are considered different, so it's possible to have multiple slots (from the same instance) allocated as Configuration device plugin.

@kate-goldenring
Copy link
Contributor

I'm going to create a switch in Configuration to support different behaviors for the Configuration virtual devices, i.e., introduce a switch in Configuration (switch name "configurationResourceFrom", possible values are "instance" and "deviceUsage". Default value of configurationResourceFrom : "instance")

@johnsonshih I like the idea of making this configurable but i'd rather not expose users to the nuance of "deviceUsage" and IMO only contributors are familiar with that term. Maybe the switch can be uniqueDevices: true? I don't love that either but maybe we can brainstorm naming more

@kate-goldenring
Copy link
Contributor

@johnsonshih would you be able to demo this at the community meeting tomorrow?

@johnsonshih
Copy link
Contributor Author

I'm going to create a switch in Configuration to support different behaviors for the Configuration virtual devices, i.e., introduce a switch in Configuration (switch name "configurationResourceFrom", possible values are "instance" and "deviceUsage". Default value of configurationResourceFrom : "instance")

@johnsonshih I like the idea of making this configurable but i'd rather not expose users to the nuance of "deviceUsage" and IMO only contributors are familiar with that term. Maybe the switch can be uniqueDevices: true? I don't love that either but maybe we can brainstorm naming more

I'll change to use uniqueDevices.

@johnsonshih johnsonshih closed this Jun 7, 2023
@johnsonshih johnsonshih reopened this Jun 7, 2023
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
@johnsonshih johnsonshih marked this pull request as draft June 15, 2023 17:07
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
@johnsonshih johnsonshih marked this pull request as ready for review June 15, 2023 22:08
Copy link
Contributor

@kate-goldenring kate-goldenring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for continuing to iterate on this @johnsonshih. I have concerns about potential race cases and how it looks like the Configuration Device Plugin is reporting devices.

Comment on lines 75 to 79
#[derive(Clone, Debug)]
pub struct InstanceConfig {
pub usage_update_message_sender: Option<broadcast::Sender<ListAndWatchMessageKind>>,
pub instances: HashMap<String, InstanceInfo>,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add documentation around all public structs and fields (see InstanceInfo and InstanceConnectivityStatus)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add document comments

// Create a device plugin for the Configuration
let config_dp_name = get_device_configuration_name(&config_name);
trace!(
"start_discovery - create configuration device plugin {}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"start_discovery - create configuration device plugin {}",
"internal_start_discovery - create configuration device plugin {}",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update message

}
Err(e) => {
error!(
"start_discovery - error {} building configuration device plugin",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"start_discovery - error {} building configuration device plugin",
"internal_start_discovery - error {} building configuration device plugin",

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update message

Comment on lines 81 to 88
impl InstanceConfig {
pub fn new() -> Self {
InstanceConfig {
usage_update_message_sender: None,
instances: HashMap::new(),
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the equivalent of what the InstanceConfig's fields default values would be. If you tag the struct with Default then you don't need this and can change all instantiations to InstanceConfig::default()

Suggested change
impl InstanceConfig {
pub fn new() -> Self {
InstanceConfig {
usage_update_message_sender: None,
instances: HashMap::new(),
}
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use Default trait

pub configuration_usage_slots: HashSet<String>,
}

#[derive(Clone, Debug)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tag with default instead of having constructor behave as default would (per comment below)

Suggested change
#[derive(Clone, Debug)]
#[derive(Clone, Debug, Default)]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add Default trait

.unwrap();
discovered_devices.insert(instance_name, virtual_devices);
}
// construct virtual device info list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: consistency with comments (Capitalize the first letter of the comment)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update comments

);
Err(Status::new(
fn build_virtual_devices_list_for_instance(
device_map: HashMap<String, Vec<v1beta1::Device>>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be able to take in a reference to avoid unnecessary cloning

Suggested change
device_map: HashMap<String, Vec<v1beta1::Device>>,
device_map: &HashMap<String, Vec<v1beta1::Device>>,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pass reference instead of value

},
)
.await
.unwrap();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please propegate errors instead of unwrapping

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove unwrap()

device_usage_id
);
Err(Status::new(
fn build_virtual_devices_list_for_instance(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function could use a different name. it isn't building virtual devices rather if i am reading it correctly, checking if there is a healthy slot. How about instances_with_healthy_devices and maybe a comment explaining the function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change function name to build_virtual_device_health_state_for_instance and add comments.

Err(Status::new(
fn build_virtual_devices_list_for_instance(
device_map: HashMap<String, Vec<v1beta1::Device>>,
) -> HashMap<String, String> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is behaving differently than i would expect. If I have a configuration with a capacity of 4 and 3 instances are discovered, i would expect 12 virtual devices to be reported. Why only 3 here? Kubelet keeps track of its allocate calls. It will not call allocate again on a device it has already allocated, therefor the other 3 capacity slots will never be used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a capacity of 4 and 3 instances are discovered, the Instance.deviceUsage is 12. These deviceUsage slots are shared by all IL and CL virtual devices from all nodes.
For CL resources on a node, every instance can only be allocated as CL virtual device once, hence CL DP reports 3, Instance-0, Instance-1, Instance-2. the CL virtual device-x is Healthy as long as the Instance-x has a free slot available or allocated as a CL virtual device.

If CL DP reports 12 virtual devices, when allocate one CL virtual device, CL DP need to refresh list_and_watch to flip the other virtual devices from the instance being allocated to Unhealthy, otherwise Kubernetes will call allocate (max) 12 times since Kubernetes think all virtual devices are healthy and can be allocated.

IMO, expose CL virtual devices from usage slot level is more difficult to track and easier to get out of sync since the list_and_watch is running in a different thread. For user's perspective, it's more intuitive to see the IL/CL resource availability if CL virtual devices are reported based on Instance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you successfully allocated N+1 CL devices when N instances are discovered with a capacity of 2? I do not believe it is possible to do this and for kubelet to be healthy.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran this branch locally to test this out. You cannot use the full capacity of devices with the current implementation. For example, when deploying this helm chart (with the agent and controller running locally):

helm install akri akri-helm-charts/akri     $AKRI_HELM_CRICTL_CONFIGURATION     --set agent.enabled=false     --set controller.enabled=false    --set debugEcho.configuration.enabled=true     --set debugEcho.configuration.shared=false --set debugEcho.configuration.capcity=2

These devices are discovered

kagold@kagold-ThinkPad-X1-Carbon-6th:~$ kubectl get akrii
NAME                     CONFIG            SHARED   NODES        AGE
akri-debug-echo-a19705   akri-debug-echo   true     ["myNode"]   10m
akri-debug-echo-8120fe   akri-debug-echo   true     ["myNode"]   10m

The node is labeled with the following resources. Note how only 2 akri.sh/akri-debug-echo devices are "allocatable" despite each instance having a capacity of 2, so 4 should be available to pods:

Capacity:
  akri.sh/akri-debug-echo:         2
  akri.sh/akri-debug-echo-8120fe:  2
  akri.sh/akri-debug-echo-a19705:  2
...
Allocatable:
  akri.sh/akri-debug-echo:         2
  akri.sh/akri-debug-echo-8120fe:  2
  akri.sh/akri-debug-echo-a19705:  2

To illustrate the point, deploy a replicaset with 3 replicas:

 
apiVersion: apps/v1
kind: Deployment
metadata:
  name: debug-echo-deployment
  labels:
    app: debug-echo-broker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: debug-echo-broker
  template:
    metadata:
      labels:
        app: debug-echo-broker
    spec:
      containers:
      - name: debug-echo-broker
        image: nginx
        resources:
          limits:                        
            akri.sh/akri-debug-echo: "1"
          requests:
            akri.sh/akri-debug-echo: "1"

Only two pods run with the third left pending due to no resource being available:

NAME                                     READY   STATUS    RESTARTS   AGE
debug-echo-deployment-57ccdc95d9-24726   0/1     Pending   0          3m59s
debug-echo-deployment-57ccdc95d9-2wl6b   1/1     Running   0          3m59s
debug-echo-deployment-57ccdc95d9-5zkvb   1/1     Running   0          3m59s
$ kubectl describe pod debug-echo-deployment-57ccdc95d9-24726
...
Events:
  Type     Reason            Age    From               Message
  ----     ------            ----   ----               -------
  Warning  FailedScheduling  4m18s  default-scheduler  0/1 nodes are available: 1 Insufficient akri.sh/akri-debug-echo. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod..

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a key reason why i did a direct mapping of CL slots to IL slots in the original implementation main...kate-goldenring:akri:config-level-resources-poc

@kate-goldenring kate-goldenring removed the version/patch Patch version change is needed label Jun 16, 2023
Signed-off-by: Johnson Shih <jshih@microsoft.com>
Signed-off-by: Johnson Shih <jshih@microsoft.com>
@johnsonshih johnsonshih marked this pull request as draft June 25, 2023 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants