add support for monitoring thp, ballooning, zswap, ksm cow by ktsaou · Pull Request #15000 · netdata/netdata

ktsaou · 2023-05-02T22:07:29Z

Modified the vmstat module of the proc plugin to monitor:

transparent huge pages (THP; many charts, including allocations, compaction, splitting, and swapout)
memory ballooning (when Linux runs as a guest VM on a host that supports memory ballooning)
zswap operations
KSM copy-on-write (COW) operations

THP is especially important, since it seems to be enabled by default on most Linux distros.
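
For readers unfamiliar with the data source, here is a minimal Python sketch of how such counters could be read and grouped. It is not the PR's implementation (the collector itself is not shown in this conversation). The group names and the prefix table are hypothetical, and apart from thp_swpout and thp_swpout_fallback, which are quoted further down, the counter names in the sample are assumptions that depend on kernel version and configuration.

```python
# Hypothetical grouping of /proc/vmstat counters by name prefix.
# The group names and prefixes are illustrative, not taken from the PR.
GROUPS = {
    "thp": ("thp_",),
    "balloon": ("balloon_",),
    "zswap": ("zswpin", "zswpout"),
    "ksm_cow": ("cow_ksm",),
}


def parse_vmstat(text):
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def group_counters(counters):
    """Assign each counter to the first group whose prefix matches its name."""
    grouped = {group: {} for group in GROUPS}
    for name, value in counters.items():
        for group, prefixes in GROUPS.items():
            if name.startswith(prefixes):  # str.startswith accepts a tuple
                grouped[group][name] = value
                break
    return grouped


# Sample text in /proc/vmstat format; the values are made up.
SAMPLE = """\
nr_free_pages 12345
thp_fault_alloc 10
thp_swpout 3
thp_swpout_fallback 1
balloon_inflate 0
zswpin 7
zswpout 9
cow_ksm 2
"""

grouped = group_counters(parse_vmstat(SAMPLE))
```

On a Linux host the same functions could be fed the real file, for example `parse_vmstat(open("/proc/vmstat").read())`; counters the kernel does not expose simply do not appear in the result.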

ilyam8

lgtm, but a small note:

Usually having "events/s" as units is a red flag because it means we are aggregating (chart level) non-aggregatable metrics - grouping by host/system w/o selecting a dimension (the Cloud UI) will give useless values. E.g.

thp_swpout is incremented every time a huge page is swapout in one piece without splitting.

thp_swpout_fallback is incremented if a huge page has to be split before swapout. Usually because failed to allocate some continuous swap space for the huge page.

The sum of these events doesn't seem useful/correct to me.
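
As background on the "events/s" unit: the two counters quoted above only ever increase, so a per-second figure is presumably derived from the difference between two samples. A minimal Python sketch of that calculation follows; the function name `rates`, the sample values, and the 10-second interval are all made up for illustration.

```python
def rates(prev, curr, seconds):
    """Per-second rate of each counter present in both samples."""
    return {
        name: (curr[name] - prev[name]) / seconds
        for name in curr
        if name in prev
    }


# Two hypothetical samples taken 10 seconds apart.
prev = {"thp_swpout": 100, "thp_swpout_fallback": 40}
curr = {"thp_swpout": 130, "thp_swpout_fallback": 50}

per_dimension = rates(prev, curr, 10.0)
```

Each resulting value is a rate for one named event, which is why the unit alone says nothing about whether two dimensions can be meaningfully added.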

And this PR is the 15000th contribution to this repo 🎉

ktsaou · 2023-05-03T09:39:13Z

Usually having "events/s" as units is a red flag because it means we are aggregating (chart level) non-aggregatable metrics - grouping by host/system w/o selecting a dimension (the Cloud UI) will give useless values

How do you suggest doing it?
I am not sure why they are useless when grouped. Why they are?

ilyam8 · 2023-05-03T09:52:45Z

I propose to go with the current implementation, which is why I approved the PR. We do such aggregations pretty often. By "such aggregations" I mean correct when grouped by dimension, wrong when grouped by anything else without selecting a dimension (some obvious examples: system load average, system pressure).

cc @ralphm

Why they are?

I don't find them useful because the sum of:

huge page swapout events
huge page split events

doesn't make sense to me; they look like two completely different metrics. My understanding may be wrong, though, because it is based only on the metric descriptions. I am not a specialist.

I am not a specialist

If thp_swpout_fallback means a huge page swapout event, not only a split, and that swapout is not also counted in thp_swpout, then all good.
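
To make the grouping concern concrete, here is a small Python sketch with made-up numbers. The host names, the rates, and the helper `group_by` are hypothetical; the sketch only illustrates the difference between grouping by dimension and grouping by host without selecting a dimension, and does not reflect how the Cloud UI is implemented.

```python
from collections import defaultdict

# (host, dimension, events per second); all values are invented.
samples = [
    ("host-a", "thp_swpout", 3.0),
    ("host-a", "thp_swpout_fallback", 1.0),
    ("host-b", "thp_swpout", 5.0),
    ("host-b", "thp_swpout_fallback", 0.0),
]


def group_by(rows, key_index):
    """Sum the rate column over all rows that share the chosen key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


# Grouped by dimension: each total counts one kind of event across hosts.
by_dimension = group_by(samples, 1)

# Grouped by host with no dimension selected: each total mixes both
# event types into a single number.
by_host = group_by(samples, 0)
```

Here `by_dimension` keeps whole-page swapouts and fallbacks apart, while `by_host` adds them together. Whether that per-host sum is meaningful depends on the point raised above: it reads as "total huge page swapouts" only if the two counters are disjoint swapout events.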


ktsaou · 2023-05-03T13:36:08Z

@cakrit this needs to go into the release notes. The description is in the document I shared with you about hugepages.

add support for monitoring thp, ballooning, zswap, ksm cow

107fbe7

ktsaou requested a review from thiagoftsm as a code owner May 2, 2023 22:07

github-actions Bot added area/collectors Everything related to data collection collectors/proc labels May 2, 2023

update proc metrics.csv

1226ae7

ilyam8 previously approved these changes May 3, 2023


ktsaou added 2 commits May 3, 2023 12:56

updated metrics.csv

54b9464

Merge branch 'vmstat-thp' of github.com:ktsaou/netdata into vmstat-thp

5c1d94e

ktsaou dismissed ilyam8’s stale review via 5c1d94e May 3, 2023 09:57

ktsaou added 2 commits May 3, 2023 12:58

Revert "updated metrics.csv"

43826cc

This reverts commit 54b9464.

replaced prog.plugin with proc.plugin

1eacb97

ilyam8 approved these changes May 3, 2023


thiagoftsm approved these changes May 3, 2023


ktsaou merged commit 50183dc into netdata:master May 3, 2023

ktsaou merged 6 commits into netdata:master from ktsaou:vmstat-thp
