Bug: velero metrics stripped #769
Comments
@felicianmv Thank you for reaching out and raising these concerns. We added a ServiceMonitor that filters the metrics exposed by Velero because most of them did not have bounded cardinality, and we were concerned that this might adversely affect OpenShift's in-cluster monitoring stack.
@shubham-pampattiwar Can we please have a |
Then you can have your own Prometheus instance record everything.
…the Velero deployment. First change to remove cluster monitoring of OADP, which will be replaced by user workload monitoring (UWM). Related Issues: openshift#769 https://issues.redhat.com/browse/OADP-1887 https://issues.redhat.com/browse/OADP-661 The openshift-adp-velero-metrics-svc is left in place to ease the process of enabling UWM. Once UWM is enabled, it will require setting up a ServiceMonitor and configuring the alerts or dashboards that are crucial for a particular use case. Enablement of user workload monitoring with additional documentation is not part of this PR, so that cluster monitoring can easily be re-added in the future by reverting this change. Signed-off-by: Michal Pryc <mpryc@redhat.com>
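For readers following along, the UWM setup described in this commit message might look roughly like the sketch below. It is not taken from the PR itself. The `cluster-monitoring-config` ConfigMap with `enableUserWorkload: true` is the standard OpenShift switch for user workload monitoring; the ServiceMonitor name, the `openshift-adp` namespace, the `app.kubernetes.io/name: velero` selector label, and the scrape interval are assumptions to check against your own cluster. Port 8085 is Velero's default metrics port.

```yaml
# Sketch only: enable user workload monitoring cluster-wide.
# If this ConfigMap already exists, merge the key instead of replacing it.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
---
# Sketch only: a user-owned ServiceMonitor that scrapes the Velero metrics
# service without a metricRelabelings filter, so every exposed metric is kept.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: oadp-service-monitor        # hypothetical name
  namespace: openshift-adp          # assumed OADP namespace
spec:
  endpoints:
  - interval: 30s                   # assumed scrape interval
    path: /metrics
    scheme: http
    targetPort: 8085                # Velero's default metrics port
  selector:
    matchLabels:
      app.kubernetes.io/name: velero  # assumed label on openshift-adp-velero-metrics-svc
```

Because this ServiceMonitor is scraped by the user workload Prometheus rather than the platform one, unbounded-cardinality metrics should not load the in-cluster monitoring stack that the maintainers were concerned about.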
Documentation for the OADP to use User Workload Monitoring and sample Alerting Rule. Depends-On: openshift#1081 Fixes: openshift#769 https://issues.redhat.com/browse/OADP-1887 https://issues.redhat.com/browse/OADP-661 Signed-off-by: Michal Pryc <mpryc@redhat.com>
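As an illustration of the kind of sample alerting rule this commit refers to, a PrometheusRule could be sketched as follows. This is not the rule from the linked PR. The rule and alert names, the two-hour window, the severity, and the `job` label value (assumed to default to the service name `openshift-adp-velero-metrics-svc`) are all assumptions; `velero_backup_failure_total` is one of the counters Velero exposes on `/metrics`.

```yaml
# Sketch only: warn when Velero reports backup failures.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: sample-oadp-alert             # hypothetical name
  namespace: openshift-adp            # assumed OADP namespace
spec:
  groups:
  - name: sample-oadp-backup-alert    # hypothetical group name
    rules:
    - alert: OADPBackupFailing        # hypothetical alert name
      annotations:
        summary: OADP has issues creating backups
        description: 'OADP had {{ $value | humanize }} backup failures over the last 2 hours.'
      # Assumes the job label equals the metrics service name; adjust or
      # drop the matcher if your ServiceMonitor sets a different job label.
      expr: |
        increase(velero_backup_failure_total{job="openshift-adp-velero-metrics-svc"}[2h]) > 0
      for: 5m
      labels:
        severity: warning
```

The rule only fires if the user workload Prometheus is actually scraping `velero_backup_failure_total`, so it depends on a ServiceMonitor that does not filter that metric out.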
…the Velero deployment. (#1081) First change to remove cluster monitoring of OADP, which will be replaced by user workload monitoring (UWM). Related Issues: #769 https://issues.redhat.com/browse/OADP-1887 https://issues.redhat.com/browse/OADP-661 The openshift-adp-velero-metrics-svc is left in place to ease the process of enabling UWM. Once UWM is enabled, it will require setting up a ServiceMonitor and configuring the alerts or dashboards that are crucial for a particular use case. Enablement of user workload monitoring with additional documentation is not part of this PR, so that cluster monitoring can easily be re-added in the future by reverting this change. Signed-off-by: Michal Pryc <mpryc@redhat.com>
…the Velero deployment. (#1092) First change to remove cluster monitoring of OADP, which will be replaced by user workload monitoring (UWM). Related Issues: #769 https://issues.redhat.com/browse/OADP-1887 https://issues.redhat.com/browse/OADP-661 The openshift-adp-velero-metrics-svc is left in place to ease the process of enabling UWM. Once UWM is enabled, it will require setting up a ServiceMonitor and configuring the alerts or dashboards that are crucial for a particular use case. Enablement of user workload monitoring with additional documentation is not part of this PR, so that cluster monitoring can easily be re-added in the future by reverting this change. Signed-off-by: Michal Pryc <mpryc@redhat.com> Co-authored-by: Michal Pryc <mpryc@redhat.com>
Contact Details
felician.moldovan@flex.com
Describe bug
Hi guys,
I'm trying to set up alerting and reporting for our OADP instance (1.0.3), and Prometheus/Grafana is the logical solution, but from what I see only two Velero metrics are recorded: velero_backup_total and velero_restore_total.
The upstream Velero docs and the /metrics URL show that Velero actually exposes a lot more data (vanilla_metrics.txt attached), but Prometheus doesn't pick it up.
vanilla_metrics.txt
In the OADP namespace I found a ServiceMonitor object (openshift-adp-velero-metrics-sm) which seems to be filtering the data recorded by Prometheus, specifically this section:
metricRelabelings:
- action: keep
  regex: velero_backup|velero_restore
  sourceLabels:
  - __name__
I changed the above section to
metricRelabelings:
- action: keep
  regex: velero_backup.*|velero_restore.*
  sourceLabels:
  - __name__
and now I have all the data I need to set up monitoring.
What happened?
Is there a specific reason for that filter in the ServiceMonitor?
Is it fine to use my solution above for production?
Any plans to remove the filter in a future version of OADP?
OADP Version
1.0.3 (Stable)
OpenShift Version
4.8
Velero pod logs
No response
Restic pod logs
No response
Operator pod logs
No response