Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metricsmap: fix Prometheus exporter #14220

Merged
merged 2 commits into from
Dec 4, 2020

Conversation

jibi
Copy link
Member

@jibi jibi commented Nov 30, 2020

Fix the Prometheus exporter as it was not aggregating correctly the
multiple entries that make up a value.

While at it, simplify the logic used to iterate the map by replacing it
with a call to Metrics.DumpWithCallback().

Signed-off-by: Gilberto Bertin gilberto@isovalent.com

@jibi jibi requested a review from a team November 30, 2020 15:51
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 30, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to In progress in 1.10.0 Nov 30, 2020
@jibi jibi added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Nov 30, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Nov 30, 2020
@jibi jibi added sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. area/metrics Impacts statistics / metrics gathering, eg via Prometheus. needs-backport/1.9 labels Nov 30, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.9.1 Nov 30, 2020
@jibi jibi force-pushed the pr/jibi/fix-prometheus-bpf-metrics branch from 89317b5 to 0a53df6 Compare November 30, 2020 15:56
@jibi
Copy link
Member Author

jibi commented Nov 30, 2020

test-me-please

@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.8.6 Nov 30, 2020
@jibi jibi force-pushed the pr/jibi/fix-prometheus-bpf-metrics branch from 0a53df6 to d5e8c5f Compare December 1, 2020 09:29
@jibi
Copy link
Member Author

jibi commented Dec 1, 2020

test-me-please

@pchaigno pchaigno removed the request for review from nathanjsweet December 3, 2020 10:37
@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Dec 3, 2020
Simplify the logic used to iterate the metrics map by replacing it with
a call to Metrics.DumpWithCallback().

Signed-off-by: Gilberto Bertin <gilberto@isovalent.com>
Fix the Prometheus exporter as it was not aggregating correctly the
multiple entries that make up a value.

This commit changes the logic from:

    func updatePrometheusMetrics(value) {
      updateMetric(value.Packets)
      updateMetric(value.Bytes)
    }

    func SyncMetrics() {
      for _, value := range values {
	updatePrometheusMetrics(value)
      }
    }

to:

    func updatePrometheusMetrics(values) {
      for _, value := range values {
	packets += value.Packets
	bytes   += value.Bytes
      }

      updateMetrics(packets)
      updateMetrics(bytes)
    }

    func SyncMetrics() {
      updatePrometheusMetrics(values)
    }

What updateMetric does is:

* diff the packets/bytes value it receive with the one stored in
  Prometheus
* increment the Prometheus counter by the delta it just calculated

Since the logic in updateMetric is not linear, it will work correctly
only if the packets/bytes counter it receives is the sum of all the
per-CPU values for a given key (rather than what we are doing now, which
is calling updateMetric() on every single per-CPU value for a given key)

Fixes: a591053 ("pkg/maps/metricsmap: userspace aggregation of BPF_PER_CPU_HASH_MAP metrics map.")

Signed-off-by: Gilberto Bertin gilberto@isovalent.com
@jibi jibi force-pushed the pr/jibi/fix-prometheus-bpf-metrics branch from d5e8c5f to 657c41f Compare December 3, 2020 10:53
@jibi
Copy link
Member Author

jibi commented Dec 3, 2020

No need to re-run the CI as I just amended a commit message to include the Fixes: .. mention

@joestringer joestringer merged commit d5d34ab into master Dec 4, 2020
@joestringer joestringer deleted the pr/jibi/fix-prometheus-bpf-metrics branch December 4, 2020 07:05
@aanm aanm added this to Needs backport from master in 1.8.7 Dec 4, 2020
@aanm aanm removed this from Needs backport from master in 1.8.6 Dec 4, 2020
@aanm aanm added this to Needs backport from master in 1.8.6 Dec 4, 2020
@aanm aanm removed this from Needs backport from master in 1.8.6 Dec 4, 2020
@aanm aanm added this to Needs backport from master in 1.8.7 Dec 4, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport done to v1.9 in 1.9.1 Dec 4, 2020
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport done to v1.9 in 1.9.1 Dec 4, 2020
@aanm aanm mentioned this pull request Dec 4, 2020
@pchaigno pchaigno removed their assignment Dec 14, 2020
@joestringer joestringer moved this from Needs backport from master to Backport done to v1.8 in 1.8.7 Jan 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/metrics Impacts statistics / metrics gathering, eg via Prometheus. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/bug This PR fixes an issue in a previous release of Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
No open projects
1.8.7
Backport done to v1.8
1.9.1
Backport done to v1.9
Development

Successfully merging this pull request may close these issues.

None yet

5 participants