
Use Hawkular memory tag working-set instead of usage #305

Merged
merged 2 commits on Nov 20, 2018

Conversation

yaacov
Member

@yaacov yaacov commented Nov 18, 2018

Description

ManageIQ includes cached memory when showing used memory in the dashboard and reports; some users want to see only the working-set value as memory usage in the dashboard.

Definitions of memory tags used in Hawkular:
memory/usage - Total memory usage, including cache.
memory/working_set - Total working set usage. Working set is the memory being used and not easily dropped by the kernel.
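The relationship between the two tags can be checked against the sample numbers in the Testing section below. A toy sketch (the cache value here is simply the difference of the two sampled tags, shown for illustration only):

```ruby
# Toy illustration of the two Hawkular memory tags, using the byte values
# from the "Before" sample in the Testing section.
usage       = 7_406_342_144.0 # memory/usage: working set + page cache
working_set = 5_029_109_760.0 # memory/working_set: not easily reclaimable
cache       = usage - working_set # the part the kernel can drop

# The working set is what users expect to see as "memory usage".
```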

This PR replaces the memory value used in reports and dashboards.

Testing
In rails c:

Before:

2.4.0 :001 > prov = ExtManagementSystem.find(6);node = prov.container_nodes.last; context = ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture::HawkularCaptureContext.new(node, 10.minutes.ago, 5.minutes.ago, 60);
2.4.0 :002 > pp context.collect_metrics

...
2018-11-18 14:14:37 UTC=>
  {"cpu/usage_rate"=>142.0,
   "memory/usage"=>7406342144.0,
   "memory/working_set"=>5029109760.0,
   "network/rx_rate"=>2128.1669921875,
   "network/tx_rate"=>12143.740234375,
   "cpu_usage_rate_average"=>7.1,
   "mem_usage_absolute_average"=>38.09846781893004,
   "net_usage_rate_average"=>14.2719072265625}}
...

After:

2.4.0 :001 > prov = ExtManagementSystem.find(6);node = prov.container_nodes.last; context = ManageIQ::Providers::Kubernetes::ContainerManager::MetricsCapture::HawkularCaptureContext.new(node, 10.minutes.ago, 5.minutes.ago, 60);
2.4.0 :002 > pp context.collect_metrics

...
2018-11-18 14:30:30 UTC=>
  {"cpu/usage_rate"=>135.0,
   "memory/usage"=>7422861312.0,
   "memory/working_set"=>5045604352.0,
   "network/rx_rate"=>2566.94287109375,
   "network/tx_rate"=>10321.9169921875,
   "cpu_usage_rate_average"=>6.75,
   "mem_usage_absolute_average"=>25.95475489711934,
   "net_usage_rate_average"=>12.88885986328125}}
...

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1650351
BZ (old): https://bugzilla.redhat.com/show_bug.cgi?id=1456856

@yaacov
Member Author

yaacov commented Nov 18, 2018

@cben @Ladas @agrare @oourfali please review

@yaacov yaacov force-pushed the optional-working-set-as-memory-tag branch from 6d53366 to edd145c on November 18, 2018 15:03
@cben
Contributor

cben commented Nov 18, 2018

Technically I'd prefer adding a new metric column, adding a new dashboard widget using it, and letting the customer choose which widget they want to see.

  • One concrete reason overloading a single metric is bad: flipping this setting mid-way would make rollups a meaningless mix of old and new meaning.

But I don't know what it takes to add a new metric, and I'm not the one that'd implement it :-)

@yaacov yaacov force-pushed the optional-working-set-as-memory-tag branch 2 times, most recently from aeb74d4 to 3f63a90 on November 18, 2018 15:44
@agrare
Member

agrare commented Nov 18, 2018

Hey @yaacov, I realize you're trying to get this done in a way that can be backported but I'm also 👎 on overloading an existing column.

@yaacov yaacov force-pushed the optional-working-set-as-memory-tag branch from 3f63a90 to 23ec957 on November 18, 2018 15:53
@yaacov
Member Author

yaacov commented Nov 18, 2018

@agrare @cben Hi,

👎 on overloading an existing column.

As @agrare mentioned in the BZ, the "correct" way would be to add a new column to the metrics table. I agree with that observation.

My issues with the "correct" solution are:

  • It's not what users asked for in the BZ; they want the memory usage to be the portion of memory occupied by a process that is held in main memory (RAM). To them the current behaviour (usage = working_set + cache) is a bug.
  • OpenShift shows memory usage using working-set, so to be consistent with OpenShift we need to change the value used for memory-usage rather than add a new field.
  • We already have ~40 fields in the metrics table, 7 of which are dedicated to real-time memory and none to RSS or working-set, so we would have to add a new field called mem_working_set_absolute_average; this means a schema change plus rollup and aggregation changes in ManageIQ core.
  • If we decide to add a new metrics field, we will also need to change / add reports.
  • We will also need to update the dashboards to show the new metrics and document what this new values mean.

p.s.
This PR's purpose is to [ add an option to ] show in the ManageIQ dashboard / reports what users expect to see as "memory-usage"; it may not be the "correct" value, but it is what users expect.

EDIT NOV-19: usage = working_set + cache not ram + swap

@Ladas
Contributor

Ladas commented Nov 19, 2018

@yaacov I get what you are saying, but as @cben says, we can't do this since it would mess up rollups, historic reports, etc. @agrare @Fryguy did we deal with something like this before?

We already have a field for swap: mem_swapped_absolute_average.

For memory usage it depends: for OpenStack infra the customer expected the used memory to be minus cache and minus buffers. Normally it would have RSS in there, but since we didn't put it there, it'll be very hard (impossible) to migrate the existing data.

@yaacov
Member Author

yaacov commented Nov 19, 2018

we can't do this since it would mess up rollups, historic reports

Correct; as @cben and @Ladas mentioned, overloading _usage_ is wrong, we all agree about that, but we need to give the user a fix. Adding a new field is the best solution, but it also has its cons.

@Ladas, how fast do you think you or @adam can push the necessary schema changes and the necessary aggregation, roll-up, and report changes?

Depending on the state of the BZ, I am for waiting with this PR until schema changes get merged.

My two cents on waiting for schema changes:

  • If we close the BZ as "not a bug" because we show what we say we show (e.g. "usage"), then we do not need this PR at all.
  • If we decide this BZ can wait for schema changes, then we can wait with this PR until then.
  • Otherwise, we need to get creative; I think this PR is a good compromise for this case.

@Ladas
Contributor

Ladas commented Nov 19, 2018

@yaacov so, most important question, how does this align to https://github.com/Ladas/manageiq-providers-kubernetes/blob/5565f3738fdbd49979eb2da202ee52dedf23a805/app/models/manageiq/providers/kubernetes/container_manager/refresh_parser.rb#L727

Does that align with memory/working_set or memory/usage? Since we show a % value of the ^ inventory value, right? So is it correct now? Or are you saying that having all pods at 100% memory will again be more than 100% of the node capacity? (That would be a bug.)

(there is also unused code that should be aligned https://github.com/Ladas/manageiq-providers-kubernetes/blob/833e43eaaa0938f3e60d7ec8952a5ed7ecde0176/app/models/manageiq/providers/kubernetes/container_manager/metrics_capture/hawkular_capture_context.rb#L264 ? )

@yaacov
Member Author

yaacov commented Nov 19, 2018

Hi all, to sum up discussions on and off line -

Common wisdom now is to add a new field to the schema; @agrare will try to push this forward.

I will change this PR to push data to a not-yet-existent field called mem_working_set_absolute_average; if needed we will push data to one of the unused fields in the metrics table.

@Ladas @agrare @cben WDYT ?


Unused fields in the metrics table:

 mem_vmmemctl_absolute_average: nil,
 mem_vmmemctltarget_absolute_average: nil,
 mem_swapin_absolute_average: nil,
 mem_swapout_absolute_average: nil,
 mem_swapped_absolute_average: nil,
 mem_swaptarget_absolute_average: nil,

We will also try to see if we can add more memory-related fields along the way.

The metrics we can get from Hawkular in OpenShift 3.7.27 are:

memory/limit                    Memory hard limit in bytes.
memory/major_page_faults        Number of major page faults.
memory/major_page_faults_rate   Number of major page faults per second.
memory/node_capacity            Memory capacity of a node.
memory/node_allocatable         Memory allocatable of a node.
memory/node_reservation         Share of memory that is reserved on the node allocatable.
memory/node_utilization         Memory utilization as a share of memory allocatable.
memory/page_faults              Number of page faults.
memory/page_faults_rate         Number of page faults per second.
memory/request                  Memory request (the guaranteed amount of resources) in bytes.
memory/usage                    Total memory usage.
memory/working_set              Total working set usage. Working set is the memory being used and not easily dropped by the kernel.

@Fryguy
Member

Fryguy commented Nov 19, 2018

I am -1 on the option... more options are bad for the product, and in this case it would be a confusing option for customers, as they a) would not understand why they would switch it and b) probably would never switch it (because I would choose the setting's default to be the more accurate working_set).

I'm also -1 on a new column to solve this issue. In my opinion, the original choice of "usage" was just a bug/oversight, and the new choice of "working set" is the correct value and is just a bug fix.

@yaacov
Member Author

yaacov commented Nov 19, 2018

@Fryguy

👍 on just reading working_set instead of usage

Do we need to support pre-3.7.27 OpenShifts that do not have the working_set value?

@agrare
Member

agrare commented Nov 19, 2018

@yaacov so you mentioned that this was missing on 3.7.0-alpha.1, can we find out if it was available on 3.7.0 GA?

@yaacov
Member Author

yaacov commented Nov 19, 2018

@yaacov so you mentioned that this was missing on 3.7.0-alpha.1, can we find out if it was available on 3.7.0 GA?

I will check ... it may take a little time :-)

@Ladas
Contributor

Ladas commented Nov 19, 2018

@yaacov please respond to #305 (comment)

You are changing the used memory, but you are keeping mem_node_capacity, which means one of the two options for used memory must be wrong. Which one is it?

@yaacov
Member Author

yaacov commented Nov 19, 2018

@agrare it looks like it was in 3.7.0 GA (*); it was in Heapster 1.3, which is used in OpenShift 3.7.

(*) kubernetes-retired/heapster#1074

@yaacov
Member Author

yaacov commented Nov 19, 2018

please respond to #305 (comment)

@Ladas I do not understand your question.

Why should fixing a wrong metric have an effect on mem_node_capacity?

We are changing the value from usage (i.e. working_set + cache) to just working_set; it's a smaller number and should have no effect on anything other than fixing the value shown in dashboards and reports.

@Ladas
Contributor

Ladas commented Nov 19, 2018

@yaacov well, no, you are computing the mem_usage_absolute_average as https://github.com/ManageIQ/manageiq-providers-kubernetes/pull/305/files#diff-8e531dadfa2ffca2566fdd1332c47531R174

So it uses used/total, and if you lower the used value, you will get a lower mem_usage_absolute_average.

So should the mem_usage_absolute_average be lower? Meaning we were doing the wrong computation before, and you would see you are using over 100% of your memory if all your resources are consumed?

Or was it correct, and now we'll never see 100% even if all your resources are consumed?

@yaacov
Member Author

yaacov commented Nov 19, 2018

@Ladas I see.

So should the mem_usage_absolute_average be lower?

Sometimes:
For example, if we have 100 units of RAM total:
a.

50 used  (working set + cache)
50 working set
0   cache
mem_usage_absolute_average - will stay the same.

b.

50 used  (working set + cache)
40 working set
10   cache
mem_usage_absolute_average - will be a bit lower

c.

50 used  (working set + cache)
0 working set
50   cache
mem_usage_absolute_average - will be zero
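The three cases above can be sketched numerically; mem_usage_percent is an illustrative helper for this thread, not the actual ManageIQ HawkularCaptureContext computation:

```ruby
# Minimal sketch of the three cases above, assuming 100 units of RAM.
# mem_usage_percent is a hypothetical helper: percent = used / total.
def mem_usage_percent(used, total)
  100.0 * used / total
end

total = 100

# a. usage 50, working set 50, cache 0 -> value stays the same
a_before = mem_usage_percent(50, total) # => 50.0
a_after  = mem_usage_percent(50, total) # => 50.0

# b. usage 50, working set 40, cache 10 -> a bit lower
b_before = mem_usage_percent(50, total) # => 50.0
b_after  = mem_usage_percent(40, total) # => 40.0

# c. usage 50, working set 0, cache 50 -> drops to zero
c_before = mem_usage_percent(50, total) # => 50.0
c_after  = mem_usage_percent(0, total)  # => 0.0
```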

Meaning we were doing the wrong computation before, and you would see you are using over 100% of your memory if all your resources are consumed?

No, we used usage, which is a correct value to use; working set is better.
You cannot get over 100% with usage.
You cannot get over 100% with working set.

mem_usage_absolute_average will never be more than 100%.
mem_usage_absolute_average can be 100% if all available RAM is used for the working set.
mem_usage_absolute_average will never be less than 0%.

Or was it correct and now we'll never see 100%, even if all your resources are consumed?

If all usage is because of working_set, cache is zero, and working_set is all of the total memory, we will get 100% (and the kernel will kill the responsible process before we can measure it :-) )

Using working set instead of usage should have no effect on the top and bottom limits of the mem_usage_absolute_average value.

Does that make sense to you?

@Ladas
Contributor

Ladas commented Nov 19, 2018

Can containers share cache (is this without shared/buffers)? If they can, the sum could go over 100%.

Otherwise, does this mean we can never reach 100% utilization if there is any cache now?

@yaacov
Member Author

yaacov commented Nov 19, 2018

Can containers share cache (is this without shared/buffer)? If they would it could go over 100%, when summed.

AFAIK containers should not share any used memory, cache or working set. But it's a good question.

Otherwise this mean we can't never reach 100% util, if there is a cache now?

We get our numbers from cAdvisor, which is used by many projects; most use usage, and some use working_set, to calculate utilization (e.g. used / total). If the ones using usage for utilization got above 100%, I think we would find some open issues about that in some project ...

@agrare
Member

agrare commented Nov 19, 2018

Okay @yaacov, so let's drop the config option. Since we support versions older than when this was added, we'll need to either add a version check on the provider or do something like working_set || usage.
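The fallback idea can be sketched as follows; effective_memory_metric is a hypothetical helper, not the actual HawkularCaptureContext API, shown only to illustrate the working_set || usage behaviour with Hawkular-style tag keys:

```ruby
# Hypothetical sketch of the fallback discussed above: prefer
# memory/working_set (available on OpenShift >= 3.7) and fall back to
# memory/usage on older versions where working_set is not reported.
def effective_memory_metric(metrics)
  # Hash#[] returns nil for a missing key, so || falls through to usage.
  metrics["memory/working_set"] || metrics["memory/usage"]
end
```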

@agrare agrare self-assigned this Nov 19, 2018
@cben
Contributor

cben commented Nov 19, 2018 via email

@agrare
Member

agrare commented Nov 19, 2018

How will users know which of the 2 they're seeing?

We'll need to document that on OpenShift 3.6 you get memory/usage and on 3.7+ you get memory/working_set.

@agrare
Member

agrare commented Nov 19, 2018

@yaacov actually, since we are dropping support for pre-3.7 on master and back to hammer, you can just switch to working_set and we can deal with the version check when we backport to gaprindashvili.

@yaacov yaacov force-pushed the optional-working-set-as-memory-tag branch from 23ec957 to ae07ba5 on November 20, 2018 08:23
Contributor

@Ladas Ladas left a comment


Looks good 👍

@cben
Contributor

cben commented Nov 20, 2018

Code LGTM 👍

What about Prometheus?

@oourfali

@cben Prometheus isn't supported.
We can change that, but it's not mandatory.

@agrare agrare merged commit aeda847 into ManageIQ:master Nov 20, 2018
@agrare agrare added this to the Sprint 99 Ending Nov 19, 2018 milestone Nov 20, 2018
@agrare
Member

agrare commented Nov 20, 2018

@yaacov @simaishi we can backport this cleanly to hammer, but we will need to modify it for gaprindashvili to handle older versions. @yaacov can you start preparing that PR?

@yaacov
Member Author

yaacov commented Nov 20, 2018

gaprindashvili to handle older versions. @yaacov can you start preparing that PR ?

yes :-)

@yaacov
Member Author

yaacov commented Nov 20, 2018

@yaacov can you start preparing that PR ?

@agrare @simaishi, submitted a PR on gaprindashvili.

That PR relies on falling back to the legacy code that still uses usage: #306

simaishi pushed a commit that referenced this pull request Nov 20, 2018
Use Hawkular memory tag working-set instead of usage

(cherry picked from commit aeda847)

https://bugzilla.redhat.com/show_bug.cgi?id=1650351
@simaishi

Hammer backport details:

$ git log -1
commit 746ae7d41cb2f634bd09b548373eea965b07f7be
Author: Adam Grare <agrare@redhat.com>
Date:   Tue Nov 20 06:59:20 2018 -0500

    Merge pull request #305 from yaacov/optional-working-set-as-memory-tag
    
    Use Hawkular memory tag working-set instead of usage
    
    (cherry picked from commit aeda8479fe7d8caa9e17e325a14be4022d77b213)
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1650351

@agrare agrare modified the milestones: Sprint 99 Ending Nov 19, 2018, Sprint 100 Ending Dec 3, 2018 Nov 20, 2018
@simaishi

Backported to Gaprindashvili via #306

agrare added a commit to agrare/manageiq-providers-kubernetes that referenced this pull request Jan 3, 2019
…emory-tag

Use Hawkular memory tag working-set instead of usage

(cherry picked from commit aeda847)
simaishi added a commit that referenced this pull request Jan 4, 2019
…-set

[GAPRINDASHVILI] Revert #306 in favor of clean backport of #305
simaishi pushed a commit that referenced this pull request Jan 4, 2019
Use Hawkular memory tag working-set instead of usage

(cherry picked from commit aeda847)

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1663520
@simaishi

simaishi commented Jan 4, 2019

Gaprindashvili backport details:

$ git log -1
commit 06f9d3c70fe1252eb0adf9b213a15c6d41a0de72
Author: Adam Grare <agrare@redhat.com>
Date:   Tue Nov 20 06:59:20 2018 -0500

    Merge pull request #305 from yaacov/optional-working-set-as-memory-tag
    
    Use Hawkular memory tag working-set instead of usage
    
    (cherry picked from commit aeda8479fe7d8caa9e17e325a14be4022d77b213)
    
    Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1663520

8 participants