node level memory state from root cgroup is different from /proc/meminfo #2042
Comments
I also see the same thing and would also like to understand the relation between the root cgroup memory metrics and those of /proc/meminfo.
Any update on this? I have also noticed a huge difference between memory usage from the cgroup and memory usage of … Memory usage from the cgroup is showing 83%, …
Also super curious about this.
Still seeing this kind of huge difference in Azure AKS 1.16.13. There is a difference of around 33% between the metric …
I think the difference exists because the "memory usage" in the cgroup and in /proc/meminfo that you mentioned are two different metrics. The one taken from … Please correct me if I'm wrong, since I'm facing a similar thing and am not sure about the cause.
Did anyone figure this out? Specifically, how to monitor the root cgroup memory usage in Prometheus/Grafana...
I also have the same problem. Reading memory via … Moreover, which is the "correct" one?
AFAIK elastic/apm-agent-java#1197 (comment) says the memory reported via /proc/meminfo (which is read by Prometheus) is wrong.
So, to get something closer to the …
I guess you can try this: node_memory_MemTotal_bytes - node_memory_MemFree_bytes - (node_memory_Buffers_bytes + node_memory_Cached_bytes - node_memory_Shmem_bytes). It gives me results close to kubectl top nodes.
maybe you can try this |
We are having the same problem. Prometheus metrics are accurate for us.
Also the same for me. I can't progress a card around this because I literally can't give accurate numbers... I also notice these values are available from the k8s API, so it is potentially the owners of Prometheus not forwarding the correct data; however, there is no clear query that would fetch this data, which should exist.
In Kubernetes 1.9, the node-level memory is taken from the root cgroup:
rootStats, networkStats, err := sp.provider.GetCgroupStats("/", updateStats)
However, I found that the node-level memory from /proc/meminfo is different:
cat /proc/meminfo
MemTotal: 263774064 kB
MemFree: 59887272 kB
MemAvailable: 221831940 kB
Buffers: 1097668 kB
Cached: 154110176 kB
SwapCached: 0 kB
Active: 79424372 kB
Inactive: 98926152 kB
Active(anon): 23219964 kB
Inactive(anon): 171352 kB
Active(file): 56204408 kB
Inactive(file): 98754800 kB
Unevictable: 0 kB
Mlocked: 0 kB
SwapTotal: 0 kB
SwapFree: 0 kB
Dirty: 219692 kB
Writeback: 0 kB
AnonPages: 23143444 kB
Mapped: 2724656 kB
Shmem: 248384 kB
Slab: 11127012 kB
SReclaimable: 7229012 kB
SUnreclaim: 3898000 kB
KernelStack: 64432 kB
PageTables: 101444 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 131887032 kB
Committed_AS: 63167748 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 1520832 kB
VmallocChunk: 34223804980 kB
HardwareCorrupted: 0 kB
AnonHugePages: 0 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 15480156 kB
DirectMap2M: 202510336 kB
DirectMap1G: 52428800 kB
The node-level memory used:
used = MemTotal - MemFree = 263774064 kB - 59887272 kB = 203886792 kB
The real used:
real used = MemTotal - MemFree - (Buffers + Cached - Shmem)
          = 263774064 kB - 59887272 kB - (1097668 kB + 154110176 kB - 248384 kB)
          = 48927332 kB
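The meminfo arithmetic above can be sketched as follows (a minimal illustration using the values from the dump above; the dict stands in for parsing /proc/meminfo and is not part of any kubelet code):

```python
# Values from the /proc/meminfo dump above, in kB. A real tool would
# parse /proc/meminfo instead of hard-coding these numbers.
meminfo = {
    "MemTotal": 263774064,
    "MemFree": 59887272,
    "Buffers": 1097668,
    "Cached": 154110176,
    "Shmem": 248384,
}

# Naive "used": counts the page cache as used memory.
used = meminfo["MemTotal"] - meminfo["MemFree"]

# "Real" used: subtract reclaimable cache (Buffers + Cached), but add
# back Shmem, which sits in the page cache yet cannot be reclaimed.
real_used = used - (meminfo["Buffers"] + meminfo["Cached"] - meminfo["Shmem"])

print(used)       # 203886792 (kB)
print(real_used)  # 48927332 (kB)
```

This reproduces the 203886792 kB vs. 48927332 kB gap: most of the difference is page cache that the kernel can drop on demand.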
While from the root cgroup:
cat /sys/fs/cgroup/memory/memory.usage_in_bytes
183140081664
cat /sys/fs/cgroup/memory/memory.stat
cache 39511924736
rss 267431936
rss_huge 0
mapped_file 424710144
swap 0
pgpgin 18514489
pgpgout 8802732
pgfault 12272911
pgmajfault 188
inactive_anon 651264
active_anon 268210176
inactive_file 33488736256
active_file 6021758976
unevictable 0
hierarchical_memory_limit 9223372036854775807
hierarchical_memsw_limit 9223372036854775807
total_cache 158941466624
total_rss 24187408384
total_rss_huge 0
total_mapped_file 1903247360
total_swap 0
total_pgpgin 5077787329
total_pgpgout 5033078131
total_pgfault 12592699800
total_pgmajfault 85677
total_inactive_anon 175456256
total_active_anon 24265732096
total_inactive_file 101087215616
total_active_file 57599930368
total_unevictable 0
The node-level memory used:
used = memory.usage_in_bytes = 183140081664 bytes = 178847736 kB
The real used:
real used = memory.usage_in_bytes - total_inactive_file
          = 183140081664 - 101087215616
          = 82052866048 bytes
          = 80129752 kB
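The cgroup-side calculation can be checked the same way (values taken from the memory.usage_in_bytes and memory.stat dumps above; subtracting inactive file pages is what cadvisor's working-set metric does, as far as I can tell):

```python
# Values from memory.usage_in_bytes and memory.stat above, in bytes.
usage_in_bytes = 183140081664
total_inactive_file = 101087215616

# Working set: total usage minus inactive file pages, which the kernel
# can reclaim under memory pressure without swapping anything out.
working_set = usage_in_bytes - total_inactive_file

print(working_set)          # 82052866048 (bytes)
print(working_set // 1024)  # 80129752 (kB)
```

Note that the two "real used" numbers still disagree (48927332 kB vs. 80129752 kB), largely because the cgroup figure only subtracts inactive file pages, while the meminfo figure subtracts all of Buffers and Cached.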