Orderly kernel module version upgrade #263
Comments
/assign @yevgeny-shnaidman
As mentioned above, one of the main challenges of supporting a seamless kernel module upgrade arises when a workload is using the kernel module. In that case, the command
Potential Solution
The device plugin exports specific resources to the cluster. To use any of these resources, a pod has to claim them in the YAML used to create it, so it might be possible to drain only the pods that claim resources related to the driver module we want to upgrade. After that, it is safe to rmmod the kernel module and then insmod the new module to upgrade the driver. We can start from the "kubectl drain" command and figure out whether the resource claim can be added as a parameter that selects which pods to drain.
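To make the selection step concrete, here is a minimal sketch of how the pods to evict could be picked from a pod list, assuming the list comes from something like `kubectl get pods -A -o json`. The function names and the resource name `example.com/device` are hypothetical and only illustrate the idea; this is not something KMM or `kubectl drain` provides today.

```python
from typing import Any


def claims_resource(pod: dict[str, Any], resource_name: str) -> bool:
    """Return True if any container in the pod requests or limits resource_name."""
    spec = pod.get("spec") or {}
    containers = (spec.get("containers") or []) + (spec.get("initContainers") or [])
    for container in containers:
        resources = container.get("resources") or {}
        # Extended resources are normally declared under "limits";
        # "requests" is checked as well for completeness.
        for section in ("limits", "requests"):
            if resource_name in (resources.get(section) or {}):
                return True
    return False


def pods_to_evict(
    pods: list[dict[str, Any]], resource_name: str, node_name: str
) -> list[str]:
    """Names of pods on node_name that claim resource_name."""
    return [
        pod["metadata"]["name"]
        for pod in pods
        if (pod.get("spec") or {}).get("nodeName") == node_name
        and claims_resource(pod, resource_name)
    ]


if __name__ == "__main__":
    # Hypothetical pod list, shaped like the "items" array of `kubectl get pods -o json`.
    sample = [
        {
            "metadata": {"name": "device-job"},
            "spec": {
                "nodeName": "node-1",
                "containers": [
                    {"name": "main", "resources": {"limits": {"example.com/device": "1"}}}
                ],
            },
        },
        {
            "metadata": {"name": "web"},
            "spec": {"nodeName": "node-1", "containers": [{"name": "main"}]},
        },
    ]
    print(pods_to_evict(sample, "example.com/device", "node-1"))  # ['device-job']
```

Only the pods returned by `pods_to_evict` would need to be removed before the rmmod/insmod step; unrelated workloads on the same node stay untouched.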
@hershpa @uMartinXu KMM's responsibility is to deal only with kernel modules and device plugins. It does not deal with any workloads that run after the kernel modules are loaded; it is up to the customer's other operators to manage those workloads. So it seems to me that those operators, and not KMM, should also decide when and how a workload is removed from the node. In addition, allowing KMM to actually drain nodes is very problematic: KMM is not a core operator, and it does not know what workloads are running on the nodes. Draining a node removes ALL workloads from it, including those that have no dependency on KMM. In most cases, upgrading a kernel module does not require removing all the workloads from a node.
Issue summary
A kernel module version upgrade must let the user control the order in which pods (nodes) are upgraded and the timing of the upgrade for each specific pod (node).
Current upgrade process
The current upgrade process is as follows:
The current flow causes the following difficulty:
Proposed Solution
Upgrade Flow example
Initiating Module
Upgrading Module Version