Expose cgroup v2 memory.events as Prometheus metrics #3870
921423a to 1a98acf
@dims could you take a look at it?
@sohankunkerkar no tests at all? :(
1a98acf to 9d965e7
@dims I addressed your comments. Could you take a look at it again? Thanks!
Kubernetes KEP-2570 (MemoryQoS) uses cgroup v2 memory.high for throttling and memory.min/memory.low for memory protection. To observe the effect of these settings, operators need visibility into memory pressure events. cadvisor currently does not read the memory.events cgroup file. The existing container_oom_events_total metric comes from kernel log parsing, not cgroup counters.

Read memory.events on cgroup v2 and expose two new Prometheus counter metrics:
- container_memory_events_high_total: times the container was throttled for breaching memory.high
- container_memory_events_max_total: times the container's usage hit memory.max

Signed-off-by: Sohan Kunkerkar <sohank2602@gmail.com>
9d965e7 to cc66235
thanks @sohankunkerkar
@dims Thanks for reviewing this PR. I had one quick question: do we have any plans to cut a new release of cAdvisor anytime soon? We might need it for testing the MemoryQoS feature in Kubernetes.
@sohankunkerkar I try to do at least one release of cadvisor to support k8s. Will take stock soon-ish. Do you need this in short order? (weeks? days?)
Thanks for the update! It would be ideal if we could get that once the v1.37 branch opens, or before the feature freeze maybe?
Kubernetes KEP-2570 (MemoryQoS) uses cgroup v2 memory.high for throttling and memory.min/memory.low for memory protection. To observe the effect of these settings, operators need visibility into memory pressure events. cadvisor currently does not read the memory.events cgroup file. The existing container_oom_events_total metric comes from kernel log parsing, not cgroup counters.

Read memory.events on cgroup v2 and expose two new Prometheus counter metrics:
- container_memory_events_high_total: times the container was throttled for breaching memory.high
- container_memory_events_max_total: times the container's usage hit memory.max
xref: kubernetes/enhancements#2570