jotak
Member

@jotak jotak commented Nov 14, 2023

Preparing blog post, and storing some examples

Preview available here: https://github.com/jotak/netobserv-documents/blob/acm/blogs/acm/leverage-metrics-in-acm.md

@jotak jotak changed the title ACM & netobserv metrics NETOBSERV-1322: ACM & netobserv metrics Nov 21, 2023
@jotak jotak marked this pull request as ready for review November 22, 2023 09:35
@jotak jotak requested a review from skrthomas November 22, 2023 09:35
Contributor

@skrthomas skrthomas left a comment

Nice work, @jotak ! I added a few edits and questions :) Thanks!

acm.md Outdated
Comment on lines 9 to 11
1. Create 2 clusters (or more)
2. Choose one for being the main one / hub: install ACM operator on it; Create a default MultiClusterHub
3. In console top bar, select "all cluster" then start procedure to import an existing cluster
Contributor

Suggested change
- 1. Create 2 clusters (or more)
- 2. Choose one for being the main one / hub: install ACM operator on it; Create a default MultiClusterHub
- 3. In console top bar, select "all cluster" then start procedure to import an existing cluster
+ 1. Create 2 clusters (or more).
+ 2. Choose one cluster as the main one or hub, and install the ACM operator on it.
+ 3. Create a default MultiClusterHub.
+ 4. In the console top bar, select "all cluster" then start the procedure to import an existing cluster.

Contributor

A few comments about these steps:

  • I think "create a default MultiClusterHub" should be its own step, or a nested step, rather than a continuation of step 2 after a semicolon.
  • Is this a MultiClusterHub custom resource or Operator? I think specifying would be good.
  • I'm wondering, should ACM be spelled out? Or is it an approved acronym that all readers would be familiar with? If it's new, I would suggest spelling it out and putting ACM in parentheses for this first mention; elsewhere you can just use ACM.
  • Periods are needed at the end of these steps.
  • I also added some "the".

acm.md Outdated
3. In console top bar, select "all cluster" then start procedure to import an existing cluster

On each cluster:
1. Install netobserv downstream (user workload prometheus won't work)
Contributor

Suggested change
- 1. Install netobserv downstream (user workload prometheus won't work)
+ 1. Install network observability operator downstream (user workload Prometheus won't work).

@jotak
Member Author

jotak commented Nov 22, 2023

Oops, I'm sorry @skrthomas, I haven't been clear about that: the acm.md file is an internal draft recipe, so I think you can ignore it. The actual blog is just what is in the blogs directory, mainly the file named leverage-metrics-in-acm.md.

@@ -0,0 +1,62 @@
## Setup ACM with NetObserv metrics
Member Author

for reviewers: this file is just a recipe for internal purpose, not the blog post; for the blog, look at blogs/acm/leverage-metrics-in-acm.md

@skrthomas
Contributor

skrthomas commented Nov 22, 2023

@jotak no worries at all; thanks for the context. You can disregard my ACM acronym comment in this case :)

Contributor

@jpinsonneau jpinsonneau left a comment

Sounds good as is! Don't forget your TODOs 😉
Thanks!

Co-authored-by: Julien Pinsonneau <91894519+jpinsonneau@users.noreply.github.com>
Contributor

@skrthomas skrthomas left a comment

@jotak I added a few more comments and suggestions. I think this looks great, and feel free to leave the suggestions if you'd rather not implement them.


### What is NetObserv?

Network Observability (NetObserv) is a Red Hat operator providing observability over all the network traffic on a cluster by installing eBPF agents per-node which generate flow logs. These flows are collected, stored, converted into metrics, queried from dashboards and so on. More observability blog posts [here](https://cloud.redhat.com/blog/tag/observability), and NetObserv documentation [there](https://docs.openshift.com/container-platform/4.14/network_observability/network-observability-overview.html).
Contributor

Suggested change
- Network Observability (NetObserv) is a Red Hat operator providing observability over all the network traffic on a cluster by installing eBPF agents per-node which generate flow logs. These flows are collected, stored, converted into metrics, queried from dashboards and so on. More observability blog posts [here](https://cloud.redhat.com/blog/tag/observability), and NetObserv documentation [there](https://docs.openshift.com/container-platform/4.14/network_observability/network-observability-overview.html).
+ Network Observability (NetObserv) is a Red Hat Operator providing observability over all the network traffic on a cluster by installing eBPF agents per-node which generate flow logs. These flows are collected, stored, converted into metrics, queried from dashboards and so on. More observability blog posts [here](https://cloud.redhat.com/blog/tag/observability), and NetObserv documentation [there](https://docs.openshift.com/container-platform/4.14/network_observability/network-observability-overview.html).

- By declaring metric names to pull
- Or by declaring such recording rules

The former is easier to configure but in many cases, this is probably not what you want. When pulling metrics from many sources, the key concept to have in mind is [metrics cardinality](https://www.robustperception.io/cardinality-is-key/). The more metrics you configure, the bigger is the impact on Prometheus and Thanos resource usage and performance. "Cardinality" here does not refer to the number of record rules or names that we declare in this configuration - these are called _metric families_ - after all, if you look closely, we only mention four distinct metric families in this config, which isn't a lot. No, what really matters with cardinality is the distinct count of all metric families _and all their combinations of label keys and values_.
Contributor

Suggested change
- The former is easier to configure but in many cases, this is probably not what you want. When pulling metrics from many sources, the key concept to have in mind is [metrics cardinality](https://www.robustperception.io/cardinality-is-key/). The more metrics you configure, the bigger is the impact on Prometheus and Thanos resource usage and performance. "Cardinality" here does not refer to the number of record rules or names that we declare in this configuration - these are called _metric families_ - after all, if you look closely, we only mention four distinct metric families in this config, which isn't a lot. No, what really matters with cardinality is the distinct count of all metric families _and all their combinations of label keys and values_.
+ The former is easier to configure but in many cases, this is probably not what you want. When pulling metrics from many sources, the key concept to have in mind is [metrics cardinality](https://www.robustperception.io/cardinality-is-key/). The more metrics you configure, the bigger the impact on Prometheus and Thanos resource usage and performance. "Cardinality" here does not refer to the number of record rules or names that we declare in this configuration - these are called _metric families_ - after all, if you look closely, we only mention four distinct metric families in this config, which isn't a lot. What really matters with cardinality is the distinct count of all metric families _and all their combinations of label keys and values_.
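
To make the distinction between metric families and cardinality concrete, here is a minimal sketch that counts both over a handful of series. The metric names, labels, and the two helper functions are hypothetical and chosen for illustration only; they are not taken from the blog post or from NetObserv.

```shell
#!/bin/sh
# Hypothetical sample: one series per line, as "family{labels}".
# These names and labels are made up for the example.
sample='example_flows_total{cluster="a",namespace="x"}
example_flows_total{cluster="a",namespace="y"}
example_flows_total{cluster="b",namespace="x"}
example_bytes_total{cluster="a",namespace="x"}'

# Number of distinct metric families: strip the label set, then deduplicate.
count_families() {
  printf '%s\n' "$1" | sed 's/{.*//' | sort -u | wc -l | tr -d ' '
}

# Cardinality: every distinct family + label-value combination counts once.
count_series() {
  printf '%s\n' "$1" | sort -u | wc -l | tr -d ' '
}

echo "families: $(count_families "$sample")"
echo "series:   $(count_series "$sample")"
```

With this sample there are only two families but four series. Each extra label value multiplies the series count without adding a new family.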


Proceed until you have created a `MultiClusterObservability` resource.

Before going further, makes sure the observability stack is up and running:
Contributor

Change to "make".

@@ -0,0 +1,24 @@
#!/bin/bash

if [[ "$#" -lt 1 || "$1" = "--help" ]]; then
Contributor

Since two arguments are required, it should be -lt 2.
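
A minimal sketch of what the corrected check could look like, assuming the script really takes exactly two positional arguments. The function name `check_args` and the placeholder names in the usage text are hypothetical, and the test is written with POSIX `[ ]` rather than the script's `[[ ]]` so it also runs under plain `sh`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only when at least two arguments are given
# and the first one is not --help. Otherwise prints usage and returns 1.
check_args() {
  if [ "$#" -lt 2 ] || [ "$1" = "--help" ]; then
    # <arg1> and <arg2> are placeholders, not the script's real parameters.
    echo "Usage: $0 <arg1> <arg2>" >&2
    return 1
  fi
  return 0
}

# Typical use at the top of a script:
#   check_args "$@" || exit 1
```

With `-lt 1`, a call with a single argument would pass the check and fail later; `-lt 2` rejects it up front.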

@jotak jotak merged commit 85cdc6c into netobserv:main Dec 5, 2023