node-problem-detector does not discover OOMs in newer Linux kernels #96746
Labels
kind/bug
Categorizes issue or PR as related to a bug.
needs-triage
Indicates an issue or PR lacks a `triage/foo` label and requires one.
sig/node
Categorizes an issue or PR as relevant to SIG Node.
sig/scalability
Categorizes an issue or PR as relevant to SIG Scalability.
What happened:
In a cluster with the NPD addon installed and nodes running Linux kernel 5.1 or newer, OOMs are not captured by NPD.
What you expected to happen:
NPD should report OOMs by emitting the appropriate K8s events.
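A minimal sketch of how one might check whether such events were emitted, assuming `kubectl` is on the PATH and configured for the cluster in question. The helper names `build_event_query` and `list_oom_events` are hypothetical and exist only for illustration; the `OOMKilling` reason is the one named in the reproduction steps.

```python
import subprocess


def build_event_query(reason="OOMKilling", all_namespaces=True):
    """Return the kubectl argv that lists events with the given reason.

    Hypothetical helper for illustration; it only builds the command.
    """
    cmd = ["kubectl", "get", "events", "--field-selector", f"reason={reason}"]
    if all_namespaces:
        cmd.append("--all-namespaces")
    return cmd


def list_oom_events():
    """Run the query and return kubectl's standard output as text.

    Assumes kubectl is installed and points at the affected cluster.
    """
    result = subprocess.run(
        build_event_query(), capture_output=True, text=True, check=True
    )
    return result.stdout
```

On an affected cluster, `list_oom_events()` would be expected to return no matching events even after a pod has been OOM-killed.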
How to reproduce it (as minimally and precisely as possible):
Create a K8s cluster with the NPD addon enabled and Linux 5.1+ on the nodes. Schedule a pod that leaks memory. After an OOM occurs, list events with the `reason: OOMKilling` field; none are found.
Anything else we need to know?:
The fix is already merged and included in the newest NPD release (v0.8.5): kubernetes/node-problem-detector#481.
We're currently waiting for the Docker image of NPD v0.8.5 to be released; after that we can proceed with the fix in k/k. Once it is merged to the master branch, it should be cherry-picked to older releases as well.
Environment:
Any Kubernetes version with the NPD addon installed and Linux nodes running kernel version 5.1 or newer.