Life-cycle management for external features #857
A K8s admin needing to create a tmpfs on every node to make sure host-mounted things eventually get cleaned out is quite ugly. One way to improve this would be kubelet supporting some kind of node-local / named "emptyDir" volume which could be shared between pods running on the same node:
Kubelet would create a directory somewhere when the first pod/container requests it, and share it with every pod on the same node declaring a "nodeDir" volume with the same "storageClassName". Kubelet would ref-count it, and remove it when the last pod referring to it goes away. The main advantage over what needs to be done now, i.e. sharing explicit host paths between pods, would be kubelet managing the directory access and life-cycle instead of each pod needing to do that. This sharing is similar to what PV / PVC do cluster-wide, except that the content would be neither persistent nor shared cluster-wide. Does that sound like a reasonable KEP subject?
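In a pod spec, the proposed kubelet-managed, node-local shared volume might look roughly like this. Note this is an entirely hypothetical API: `nodeDir` and its fields do not exist in Kubernetes; the sketch only illustrates the idea above.

```yaml
# Hypothetical sketch -- "nodeDir" is NOT a real Kubernetes volume type.
# Any pod on the same node declaring a nodeDir volume with the same
# storageClassName would share the same kubelet-managed directory;
# kubelet ref-counts it and removes it when the last referring pod exits.
apiVersion: v1
kind: Pod
metadata:
  name: feature-provider
spec:
  containers:
  - name: installer
    image: example.com/feature-installer   # hypothetical image
    volumeMounts:
    - name: shared-features
      mountPath: /features
  volumes:
  - name: shared-features
    nodeDir:                               # hypothetical volume type
      storageClassName: nfd-features       # shared key across pods on the node
```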
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
@marquiz Any comments on this proposal?
I think the "solution" is to use NodeFeature and NodeFeatureRule objects. Hooks are going away and we might want to do that for feature files, too. Patches are of course welcome if you want to fix this. Documentation is something that would be good to improve.
@marquiz: Guidelines
Please ensure that the issue body includes answers to the following questions. For more details on the requirements of such an issue, please see here and ensure that they are met. If this request no longer meets these requirements, the label can be removed.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Are you saying that because:
NFD itself does not need additional life-cycle management for them, nor "best-before" timestamps? I think label conflict detection would still be needed. And is there an option for specifying a rules re-evaluation period?
PS. NodeFeature being limited to a single node seems too constrained: https://kubernetes-sigs.github.io/node-feature-discovery/v0.12/usage/customization-guide#nodefeature-custom-resource IMHO it should be at least a list, so that one does not need to specify a separate NodeFeature file for each node.
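For reference, a third-party NodeFeature object (per the v0.12 customization guide linked above) is bound to exactly one node via a label, which is the constraint being criticized. This is a sketch from the linked docs; field names should be verified against the customization guide:

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeature
metadata:
  name: vendor-features-node-1
  labels:
    # Binds the object to exactly one node -- the limitation discussed above
    nfd.node.kubernetes.io/node-name: node-1
spec:
  # Labels to be created on that node
  labels:
    vendor.example.com/my-feature: "true"
```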
Yes, "label/feature lifecycle" is basically managed by Kubernetes. Also, with NodeFeature objects, features are gone when you delete the namespace where the object is located.
I'm not so sure about this. There are cases where people want to override existing labels.
There is no need for such a thing, as NodeFeatureRule and NodeFeature objects are immediately evaluated when they are changed.
Yeah, I've thought about that, supporting some sort of node groups for example, but haven't come up with any particular design yet. And there hasn't been much feedback on this experimental, still-disabled-by-default API.
This needs to be carefully thought out. Possibly some node-group label. But then something needs to create those node-group labels before NodeFeature rules are evaluated...
Shouldn't that be done by updating the existing NodeFeature/NodeFeatureRule object, rather than creating a new one with a different name but specifying the same labels? How do you then define which one takes precedence? [1]
[1] I did not mean parsing the objects, but checking whether matches for their rules have changed due to system run-time changes. E.g. kernel modules and PCI devices can both be added and removed at run-time (if they are not in use).
To put it differently, I think it's just about ordering. We have the init.d-style ordering where the last one prevails. Changing the ordering doesn't resolve any problems in my opinion.
That's handled by nfd-worker just like before (or whatever mechanism you have in case of 3rd-party extensions creating NodeFeature objects).
Ok, so NodeFeatureRules are evaluated by nfd-workers on every sleep interval? Overriding a label with a different object still sounds more like an error to me, though. Having multiple rules producing a more generic label or taint could be valid, though. I.e. while an error about overlaps is definitely too strict, a (single) warning might still be warranted.
NFD-Master is the one processing NodeFeatureRules. In practice, currently, it's like you described, as nfd-worker sends the features to nfd-master over gRPC every sleep-interval. BUT, with the NodeFeature API enabled (…)
Agree, generally it really should not happen (unless you really know what you're doing, what you want, and why).
Agree, a warning or at least some log message would probably be a good idea.
@marquiz what about adding an attribute for this, e.g. `overlap: override | error`?
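The suggested overlap attribute could sit on a NodeFeatureRule rule entry, roughly as below. The `overlap` field is hypothetical (it does not exist in NFD); the surrounding rule shape follows the NFD customization guide but should be checked against the docs:

```yaml
apiVersion: nfd.k8s-sigs.io/v1alpha1
kind: NodeFeatureRule
metadata:
  name: my-rule
spec:
  rules:
  - name: "my feature rule"
    overlap: error            # hypothetical field: "override" or "error"
    labels:
      vendor.example.com/my-feature: "true"
    matchFeatures:
    - feature: kernel.loadedmodule
      matchExpressions:
        example_mod: {op: Exists}
```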
/remove-lifecycle stale |
Maybe we should act on this one. The documentation can be improved anytime by anyone (@eero-t? 😉) Support for an expiry-date field in the feature files (as suggested in the issue description) makes sense to me. We could try to get it in v0.15.
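A feature file with an expiry field could look something like the sketch below. The directive name and syntax are illustrative only, not a settled design; the path is the conventional NFD feature-file location:

```
# /etc/kubernetes/node-feature-discovery/features.d/my-features
#
# Hypothetical expiry directive: after this timestamp nfd-worker would
# drop the labels below even if the file is still present on the node.
# +expiry-time=2024-07-28T11:22:33Z
vendor.example.com/my-feature=true
vendor.example.com/version=1.2.3
```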
/milestone v0.15 |
/assign |
What would you like to be added:
Implementation and documentation for life-cycle management of external features (split from #855).
Why is this needed:
Issues / missing:
Proposed solution:
=> If a pod changes the name of a feature file it installs, without changing all the label names, it must remove the old feature file before installing the new one.
Note: The above ignores life-cycle management for hooks, as they're going away (#856).