Document supported k8s versions #1257
Hi @Agalin. NFD really should work on practically any Kubernetes version that is still in use, as we're only using stable APIs. Bumping the patch version of the k8s deps really shouldn't break anything. Could you share more details about your deployment and the failure?
It seems to be some kind of an incompatibility between the latest version of the Kubernetes API library and K8s 1.23 (1.23.10 to be precise). Master deployed from 0.13.2 crashloops with the following error in logs:
Same config works just fine on k8s 1.27. Looking at logs, it breaks somewhere between line 689 and line 710. It reaches single node update function but doesn't print NodeFeature retrieval error nor
Hmm, strange, it segfaults in nfd-master on line 637 🤔 What exact instructions do you use for deploying nfd? Do you have the NodeFeature API enabled?
/kind bug
I'm using a Helm chart as a subchart deployed through Skaffold (it's the only subchart, it's done like that so we can add our CI and possibly additional subcharts in the future) with the following values applied:
node-feature-discovery:
  nameOverride: nfd
  master:
    replicaCount: 3
    podSecurityContext: &podSecurityContext
      runAsUser: 10000
      runAsGroup: 10000
      fsGroup: 10000
    securityContext: &securityContext
      runAsUser: 10000
      runAsGroup: 10000
    resources: &resources
      requests:
        memory: 256Mi
        cpu: 1
      limits:
        memory: 256Mi
        cpu: 1
    tolerations:
      - key: "node-role.kubernetes.io/etcd"
        operator: "Equal"
        value: "true"
        effect: "NoExecute"
      - key: "node-role.kubernetes.io/controlplane"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
  worker:
    podSecurityContext: *podSecurityContext
    securityContext: *securityContext
    resources: *resources
    config:
      core:
        featureSources:
          - cpu
      sources:
        cpu:
          cpuid:
            attributeBlacklist: []
            attributeWhitelist: []
and nodeSelector defined in a cluster-specific file.
I'm not sure if it's worth debugging: k8s 1.23 is deprecated and even 1.24 will reach EOL next week.
There is some bug in the codebase that v1.23 reveals. I'd like to understand what
Thanks! Hopefully I'll be able to test it next week.
What would you like to be added:
Document the currently compatible versions of k8s in the readme or docs, along with the last version of NFD that works on a particular k8s version.
Why is this needed:
I'm currently working on deploying NFD in our environment. We have multiple clusters using k8s 1.23 and 1.27. To my surprise, a Helm deployment that works just fine on a 1.27 cluster failed on 1.23 clusters, with the master pod crashlooping. It seems that the recent Kubernetes API library version bump is the problem: master works fine on 0.13.1.
Ensuring compatibility with old k8s versions makes no sense (I'm aware that 1.23 is not supported any more), but a compatibility matrix would be a welcome addition and would have saved me a few hours of debugging. 😄
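To illustrate the kind of information such a matrix might carry, here is a minimal sketch that records only the combinations reported in this issue. It is not an official support matrix: the `OBSERVED` table and the helper names `k8s_minor` and `observed_status` are hypothetical, and any pair not mentioned here is deliberately reported as unknown rather than guessed.

```python
# Observations reported in this issue only; NOT an official NFD support matrix.
# Keys are (NFD version, Kubernetes minor release); values are what was observed.
OBSERVED = {
    ("0.13.1", "1.23"): "works",
    ("0.13.2", "1.23"): "nfd-master crashloops",
    ("0.13.2", "1.27"): "works",
}


def k8s_minor(version: str) -> str:
    """Reduce a version such as 'v1.23.10' to its minor release, '1.23'."""
    major, minor = version.lstrip("v").split(".")[:2]
    return f"{major}.{minor}"


def observed_status(nfd_version: str, k8s_version: str) -> str:
    """Return the reported result, or 'unknown' when there is no data point."""
    key = (nfd_version.lstrip("v"), k8s_minor(k8s_version))
    return OBSERVED.get(key, "unknown")


if __name__ == "__main__":
    print(observed_status("0.13.2", "1.23.10"))
```

Keying on the minor release rather than the full patch version is a simplification chosen for this sketch; a real matrix might need patch-level detail.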