NETOBSERV-1322: ACM & netobserv metrics #61
Merged
15 commits
561c963 ACM & netobserv metrics (jotak)
0f5a04e Update ACM blog (jotak)
986da51 Add an example of diy (jotak)
ac48b12 note on user workload metrics (jotak)
d03e47a acm 2.9, more on cardinality, .... (jotak)
cbb6c47 typo (jotak)
4f6f7b8 mention other metrics (jotak)
fdf5346 Update blog & yamls to cover install piloted from acm (jotak)
53795f0 do not use default namespace (jotak)
8bd4bbc update (jotak)
5de9967 Apply suggestions from code review (jotak)
a72ebb1 Apply suggestions from code review (jotak)
0cb4517 typos (jotak)
98a39b5 remove TODOs, add credits (jotak)
bf9b066 fix thanos arg validation (jotak)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,78 @@ | ||
## Setup ACM with NetObserv metrics | ||
|
||
See also the [blog post](./blogs/acm/leverage-metrics-in-acm.md). | ||
|
||
This is mainly a quick guide for the development teams. | ||
|
||
Quick guide: | ||
|
||
1. Create two clusters (or more). | ||
2. Choose one to be the hub: install the ACM operator on it, then create a default MultiClusterHub. | ||
3. In the console top bar, select "All Clusters", then start the procedure to import an existing cluster. You may set the label "netobserv=true" during import. | ||
|
||
You have two options: either use ACM policies to automate the install, or install NetObserv manually on each cluster. | ||
|
||
### Option 1: with ACM policies | ||
|
||
Note that this doesn't cover the Loki installation, so in this mode Loki and the Console plugin are disabled. It is of course possible to automate the Loki installation as well, by creating new policy objects. Feel free to contribute! | ||
|
||
```bash | ||
oc apply -f ./examples/ACM/acm-policy-netobserv-1.4.yaml | ||
oc apply -f ./examples/ACM/policies/acm-policy-flowcollector-v1beta1-noloki.yaml | ||
oc apply -f ./examples/ACM/acm-bindings.yaml | ||
``` | ||
|
||
Then, on each cluster you want to include, add the label "netobserv=true" if you haven't already done so. This enables the policies for that cluster and triggers the automated install. You can do it from the console under Infrastructure > Clusters > Edit labels (in the kebab menu of each row). | ||
|
||
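The labeling step can also be scripted from the hub instead of going through the console. The following is a minimal sketch, assuming each imported cluster shows up on the hub as a `ManagedCluster` resource; the function name `label_netobserv_clusters` and the `OC` override are illustrative and not part of this repository.

```bash
# Command used to talk to the hub; override for a dry run, e.g. OC=echo
OC="${OC:-oc}"

# Label each named ManagedCluster so that the netobserv=true selector
# of the placement picks it up. Run this against the hub cluster.
label_netobserv_clusters() {
  local cluster
  for cluster in "$@"; do
    "$OC" label managedcluster "$cluster" netobserv=true --overwrite
  done
}
```

For example, `label_netobserv_clusters cluster-a cluster-b` labels two clusters (the names are placeholders); setting `OC=echo` first prints the commands without running them.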
### Option 2: manual install | ||
|
||
On each cluster: | ||
|
||
1. Install NetObserv downstream (user workload Prometheus won't work the same way) | ||
2. Create a FlowCollector with these metrics enabled (`spec.processor.metrics.includeList`): | ||
|
||
|
||
```yaml | ||
includeList: | ||
- namespace_flows_total | ||
- node_ingress_bytes_total | ||
- workload_ingress_bytes_total | ||
- workload_egress_bytes_total | ||
- workload_egress_packets_total | ||
- workload_ingress_packets_total | ||
``` | ||
|
||
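For an existing FlowCollector, the same list can be applied with a merge patch rather than by editing the resource. This is only a sketch: it assumes the resource is named `cluster` (as in the policy example of this PR) and that `spec.processor.metrics.includeList` exists in the installed API version. `build_include_list_patch`, `enable_acm_metrics` and the `OC` override are hypothetical helper names.

```bash
# Command used to talk to the cluster; override for a dry run, e.g. OC=echo
OC="${OC:-oc}"

# Build a JSON merge patch that sets spec.processor.metrics.includeList
# to the metric names given as arguments.
build_include_list_patch() {
  local items="" name
  for name in "$@"; do
    # Append a comma only when the list already has an entry
    items="${items}${items:+,}\"${name}\""
  done
  printf '{"spec":{"processor":{"metrics":{"includeList":[%s]}}}}' "$items"
}

# Apply the patch to the FlowCollector named "cluster" (assumed name).
enable_acm_metrics() {
  "$OC" patch flowcollector cluster --type=merge \
    -p "$(build_include_list_patch "$@")"
}
```

A call such as `enable_acm_metrics namespace_flows_total node_ingress_bytes_total` replaces the whole list, so pass every metric you want to keep.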
Then enable observability, following the steps at https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.8/html/observability/observing-environments-intro#enabling-observability: | ||
|
||
```bash | ||
oc create namespace open-cluster-management-observability | ||
DOCKER_CONFIG_JSON=$(oc extract secret/pull-secret -n openshift-config --to=-) | ||
oc create secret generic multiclusterhub-operator-pull-secret \ | ||
-n open-cluster-management-observability \ | ||
--from-literal=.dockerconfigjson="$DOCKER_CONFIG_JSON" \ | ||
--type=kubernetes.io/dockerconfigjson | ||
``` | ||
|
||
Set up the S3 bucket, the Thanos secret and ACM observability: | ||
|
||
```bash | ||
./examples/ACM/thanos-s3.sh yourname-thanos us-east-2 | ||
oc apply -f examples/ACM/acm-observability.yaml | ||
oc get pods -n open-cluster-management-observability -w | ||
oc apply -f examples/ACM/netobserv-metrics.yaml | ||
``` | ||
|
||
To debug the above config, check logs here: | ||
|
||
|
||
```bash | ||
oc logs -n open-cluster-management-addon-observability -l component=metrics-collector | ||
``` | ||
|
||
Deploying dashboards: | ||
|
||
|
||
```bash | ||
oc apply -f examples/ACM/dashboards | ||
``` | ||
|
||
The metrics resolution is 5 minutes. | ||
|
||
Designing dashboards: https://access.redhat.com/documentation/en-us/red_hat_advanced_cluster_management_for_kubernetes/2.8/html/observability/using-grafana-dashboards#setting-up-the-grafana-developer-instance | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
apiVersion: observability.open-cluster-management.io/v1beta2 | ||
kind: MultiClusterObservability | ||
metadata: | ||
  name: observability | ||
spec: | ||
  observabilityAddonSpec: {} | ||
  storageConfig: | ||
    metricObjectStorage: | ||
      name: thanos-object-storage | ||
      key: thanos.yaml |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,33 @@ | ||
kind: ConfigMap | ||
apiVersion: v1 | ||
metadata: | ||
  name: observability-metrics-custom-allowlist | ||
  namespace: open-cluster-management-observability | ||
data: | ||
  metrics_list.yaml: | | ||
    rules: | ||
    # Namespaces | ||
    - record: namespace:netobserv_workload_egress_bytes_total:src:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_egress_bytes_total[5m]),\"namespace\",\"$1\",\"SrcK8S_Namespace\",\"(.*)\")) by (namespace) | ||
    - record: namespace:netobserv_workload_ingress_bytes_total:dst:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_ingress_bytes_total[5m]),\"namespace\",\"$1\",\"DstK8S_Namespace\",\"(.*)\")) by (namespace) | ||
    - record: namespace:netobserv_workload_egress_packets_total:src:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_egress_packets_total[5m]),\"namespace\",\"$1\",\"SrcK8S_Namespace\",\"(.*)\")) by (namespace) | ||
    - record: namespace:netobserv_workload_ingress_packets_total:dst:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_ingress_packets_total[5m]),\"namespace\",\"$1\",\"DstK8S_Namespace\",\"(.*)\")) by (namespace) | ||
|
||
    # Namespaces / cluster ingress|egress | ||
    - record: namespace:netobserv_workload_egress_bytes_total:src:unknown_dst:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_egress_bytes_total{DstK8S_OwnerType=\"\"}[5m]),\"namespace\",\"$1\",\"SrcK8S_Namespace\",\"(.*)\")) by (namespace) | ||
    - record: namespace:netobserv_workload_ingress_bytes_total:dst:unknown_src:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_ingress_bytes_total{SrcK8S_OwnerType=\"\"}[5m]),\"namespace\",\"$1\",\"DstK8S_Namespace\",\"(.*)\")) by (namespace) | ||
    - record: namespace:netobserv_workload_egress_packets_total:src:unknown_dst:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_egress_packets_total{DstK8S_OwnerType=\"\"}[5m]),\"namespace\",\"$1\",\"SrcK8S_Namespace\",\"(.*)\")) by (namespace) | ||
    - record: namespace:netobserv_workload_ingress_packets_total:dst:unknown_src:rate5m | ||
      expr: sum(label_replace(rate(netobserv_workload_ingress_packets_total{SrcK8S_OwnerType=\"\"}[5m]),\"namespace\",\"$1\",\"DstK8S_Namespace\",\"(.*)\")) by (namespace) | ||
|
||
    # Workloads | ||
    - record: workload:netobserv_workload_egress_bytes_total:src:rate5m | ||
      expr: sum(label_replace(label_replace(label_replace(rate(netobserv_workload_egress_bytes_total[5m]),\"namespace\",\"$1\",\"SrcK8S_Namespace\",\"(.*)\"),\"workload\",\"$1\",\"SrcK8S_OwnerName\",\"(.*)\"),\"kind\",\"$1\",\"SrcK8S_OwnerType\",\"(.*)\")) by (namespace,workload,kind) | ||
    - record: workload:netobserv_workload_ingress_bytes_total:dst:rate5m | ||
      expr: sum(label_replace(label_replace(label_replace(rate(netobserv_workload_ingress_bytes_total[5m]),\"namespace\",\"$1\",\"DstK8S_Namespace\",\"(.*)\"),\"workload\",\"$1\",\"DstK8S_OwnerName\",\"(.*)\"),\"kind\",\"$1\",\"DstK8S_OwnerType\",\"(.*)\")) by (namespace,workload,kind) |
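The namespace rules above all follow one pattern: take the 5-minute rate of a NetObserv metric, copy the source or destination namespace label into a plain `namespace` label with `label_replace`, then sum by that label. The generator below makes the pattern explicit. It is a sketch for drafting additional rules, not a tool shipped with this PR, and `namespace_rule` is a hypothetical name.

```bash
# Print one namespace-level recording rule in the style used above.
#   $1: NetObserv metric name, e.g. netobserv_workload_egress_bytes_total
#   $2: flow side to aggregate on: "src" or "dst"
namespace_rule() {
  local metric="$1" side="$2" label
  case "$side" in
    src) label="SrcK8S_Namespace" ;;
    dst) label="DstK8S_Namespace" ;;
    *) echo "side must be 'src' or 'dst'" >&2; return 1 ;;
  esac
  printf -- '- record: namespace:%s:%s:rate5m\n' "$metric" "$side"
  # Quotes stay backslash-escaped, as in the allowlist ConfigMap above
  printf '  expr: sum(label_replace(rate(%s[5m]),\\"namespace\\",\\"$1\\",\\"%s\\",\\"(.*)\\")) by (namespace)\n' "$metric" "$label"
}
```

Running `namespace_rule netobserv_workload_egress_bytes_total src` prints the first rule of the list. Keeping the aggregation to `namespace` only is what keeps the cardinality sent to the hub low.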
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
--- | ||
apiVersion: apps.open-cluster-management.io/v1 | ||
kind: PlacementRule | ||
metadata: | ||
  name: placement-policy-netobserv | ||
spec: | ||
  clusterConditions: | ||
    - status: "True" | ||
      type: ManagedClusterConditionAvailable | ||
  clusterSelector: | ||
    matchExpressions: | ||
      - {key: netobserv, operator: In, values: ["true"]} | ||
--- | ||
apiVersion: policy.open-cluster-management.io/v1 | ||
kind: PlacementBinding | ||
metadata: | ||
  name: binding-policy-netobserv | ||
placementRef: | ||
  name: placement-policy-netobserv | ||
  kind: PlacementRule | ||
  apiGroup: apps.open-cluster-management.io | ||
subjects: | ||
  - name: netobserv | ||
    kind: Policy | ||
    apiGroup: policy.open-cluster-management.io | ||
  - name: netobserv-flowcollector | ||
    kind: Policy | ||
    apiGroup: policy.open-cluster-management.io |
36 changes: 36 additions & 0 deletions
examples/ACM/policies/acm-policy-flowcollector-v1beta1-noloki.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
apiVersion: policy.open-cluster-management.io/v1 | ||
kind: Policy | ||
metadata: | ||
  name: netobserv-flowcollector | ||
spec: | ||
  disabled: false | ||
  dependencies: | ||
    - apiVersion: policy.open-cluster-management.io/v1 | ||
      kind: Policy | ||
      name: netobserv | ||
      compliance: Compliant | ||
  policy-templates: | ||
    - objectDefinition: | ||
        apiVersion: policy.open-cluster-management.io/v1 | ||
        kind: ConfigurationPolicy | ||
        metadata: | ||
          name: netobserv-flowcollector | ||
        spec: | ||
          remediationAction: enforce | ||
          severity: medium | ||
          object-templates: | ||
            - complianceType: musthave | ||
              objectDefinition: | ||
                apiVersion: flows.netobserv.io/v1beta1 | ||
                kind: FlowCollector | ||
                metadata: | ||
                  name: cluster | ||
                spec: | ||
                  processor: | ||
                    metrics: | ||
                      ignoreTags: | ||
                        - nodes-flows | ||
                        - workloads-flows | ||
                        - namespaces | ||
                  loki: | ||
                    enable: false |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,101 @@ | ||
apiVersion: policy.open-cluster-management.io/v1 | ||
kind: Policy | ||
metadata: | ||
  name: netobserv | ||
spec: | ||
  disabled: false | ||
  policy-templates: | ||
    - objectDefinition: | ||
        apiVersion: policy.open-cluster-management.io/v1 | ||
        kind: ConfigurationPolicy | ||
        metadata: | ||
          name: netobserv-operator-namespace | ||
        spec: | ||
          remediationAction: enforce | ||
          severity: medium | ||
          object-templates: | ||
            - complianceType: musthave | ||
              objectDefinition: | ||
                apiVersion: v1 | ||
                kind: Namespace | ||
                metadata: | ||
                  name: openshift-netobserv-operator | ||
    - extraDependencies: | ||
        - apiVersion: policy.open-cluster-management.io/v1 | ||
          kind: ConfigurationPolicy | ||
          name: netobserv-operator-namespace | ||
          namespace: "" | ||
          compliance: Compliant | ||
      objectDefinition: | ||
        apiVersion: policy.open-cluster-management.io/v1 | ||
        kind: ConfigurationPolicy | ||
        metadata: | ||
          name: netobserv-operatorgroup | ||
        spec: | ||
          remediationAction: enforce | ||
          severity: medium | ||
          object-templates: | ||
            - complianceType: musthave | ||
              objectDefinition: | ||
                apiVersion: operators.coreos.com/v1 | ||
                kind: OperatorGroup | ||
                metadata: | ||
                  name: netobserv | ||
                  namespace: openshift-netobserv-operator | ||
                spec: | ||
                  upgradeStrategy: Default | ||
    - extraDependencies: | ||
        - apiVersion: policy.open-cluster-management.io/v1 | ||
          kind: ConfigurationPolicy | ||
          name: netobserv-operatorgroup | ||
          namespace: "" | ||
          compliance: Compliant | ||
      objectDefinition: | ||
        apiVersion: policy.open-cluster-management.io/v1 | ||
        kind: ConfigurationPolicy | ||
        metadata: | ||
          name: netobserv-subscription | ||
        spec: | ||
          remediationAction: enforce | ||
          severity: medium | ||
          object-templates: | ||
            - complianceType: musthave | ||
              objectDefinition: | ||
                apiVersion: operators.coreos.com/v1alpha1 | ||
                kind: Subscription | ||
                metadata: | ||
                  name: netobserv-operator | ||
                  namespace: openshift-netobserv-operator | ||
                spec: | ||
                  channel: stable | ||
                  installPlanApproval: Automatic | ||
                  name: netobserv-operator | ||
                  source: redhat-operators | ||
                  sourceNamespace: openshift-marketplace | ||
                  startingCSV: network-observability-operator.v1.4.2 | ||
    - extraDependencies: | ||
        - apiVersion: policy.open-cluster-management.io/v1 | ||
          kind: ConfigurationPolicy | ||
          name: netobserv-subscription | ||
          namespace: "" | ||
          compliance: Compliant | ||
      objectDefinition: | ||
        apiVersion: policy.open-cluster-management.io/v1 | ||
        kind: ConfigurationPolicy | ||
        metadata: | ||
          name: netobserv-csv-check | ||
        spec: | ||
          remediationAction: inform | ||
          severity: medium | ||
          object-templates: | ||
            - complianceType: musthave | ||
              objectDefinition: | ||
                apiVersion: operators.coreos.com/v1alpha1 | ||
                kind: ClusterServiceVersion | ||
                metadata: | ||
                  namespace: openshift-netobserv-operator | ||
                spec: | ||
                  displayName: Network Observability | ||
                status: | ||
                  phase: Succeeded | ||
                  reason: InstallSucceeded |
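Because the templates are chained through `extraDependencies`, the install only progresses as each step becomes compliant. One way to follow that from the hub is to read the compliance state ACM reports on the policy. The sketch below assumes the policy was created in a namespace of your choosing (the namespace argument is a placeholder, since this file does not set one); `policy_compliance` and the `OC` override are hypothetical names.

```bash
# Command used to talk to the hub; override for a dry run, e.g. OC=echo
OC="${OC:-oc}"

# Print the compliance state reported by ACM for a policy on the hub.
#   $1: policy name, e.g. netobserv or netobserv-flowcollector
#   $2: namespace the policy was created in (placeholder, use your own)
policy_compliance() {
  "$OC" get policies.policy.open-cluster-management.io "$1" -n "$2" \
    -o jsonpath='{.status.compliant}'
}
```

For example, `policy_compliance netobserv my-policies` would print the state once the operator install has been evaluated, where `my-policies` stands for whatever namespace you applied the policies in.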
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
#!/bin/bash | ||
|
||
if [[ "$#" -lt 2 || "$1" = "--help" ]]; then | ||
  echo "Syntax: $0 S3_NAME AWS_REGION" | ||
  echo "" | ||
  echo "Create S3 bucket and the related secret to use with Thanos" | ||
  echo "You need to have the AWS CLI installed and configured." | ||
  echo "" | ||
  echo " e.g: $0 yourname-thanos eu-west-1" | ||
  echo "" | ||
  exit | ||
fi | ||
|
||
export YOUR_S3_BUCKET="$1" | ||
export YOUR_S3_REGION="$2" | ||
export YOUR_ACCESS_KEY=$(aws configure get aws_access_key_id) | ||
export YOUR_SECRET_KEY=$(aws configure get aws_secret_access_key) | ||
export YOUR_S3_ENDPOINT="s3.${YOUR_S3_REGION}.amazonaws.com" | ||
|
||
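# us-east-1 is the default region and rejects an explicit LocationConstraint | ||
if [[ "$YOUR_S3_REGION" = "us-east-1" ]]; then | ||
  aws s3api create-bucket --bucket "$YOUR_S3_BUCKET" --region "$YOUR_S3_REGION" | ||
else | ||
  aws s3api create-bucket --bucket "$YOUR_S3_BUCKET" --region "$YOUR_S3_REGION" --create-bucket-configuration LocationConstraint="$YOUR_S3_REGION" | ||
fi | ||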
|
||
curl -s -L "https://raw.githubusercontent.com/netobserv/documents/main/examples/ACM/thanos-secret.yaml" | envsubst | kubectl apply -f - |
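The script depends on several external commands (`aws`, `curl`, `envsubst`, `kubectl`) and fails partway through if one is missing, possibly after the bucket has already been created. A pre-flight check along the following lines could run before anything else. It is a sketch, and `missing_tools` is a hypothetical helper, not part of the script above.

```bash
# Print the names of the given commands that are not found on PATH,
# one per line; print nothing when all of them are available.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

A caller could then stop early, for instance by storing the output of `missing_tools aws curl envsubst kubectl` in a variable and exiting with an error message when it is not empty.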
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
apiVersion: v1 | ||
kind: Secret | ||
metadata: | ||
  name: thanos-object-storage | ||
  namespace: open-cluster-management-observability | ||
type: Opaque | ||
stringData: | ||
  thanos.yaml: | | ||
    type: s3 | ||
    config: | ||
      bucket: $YOUR_S3_BUCKET | ||
      endpoint: $YOUR_S3_ENDPOINT | ||
      insecure: true | ||
      access_key: $YOUR_ACCESS_KEY | ||
      secret_key: $YOUR_SECRET_KEY |
For reviewers: this file is just a recipe for internal use, not the blog post. For the blog, see blogs/acm/leverage-metrics-in-acm.md