Not able to fetch snapscheduler metrics #116

prasanjit-enginprogam · 2021-03-30T03:19:04Z

Describe the bug
I wanted to scrape the snapscheduler metrics with Prometheus, but the metrics endpoint does not seem to be working.

Steps to reproduce
I have a vanilla install of snapscheduler.

$ kubectl describe service/snapscheduler-metrics -n abcns
Name:              snapscheduler-metrics
Namespace:         cloudops
Labels:            name=snapscheduler
Annotations:       <none>
Selector:          name=snapscheduler
Type:              ClusterIP
IP Families:       <none>
IP:                172.20.219.85
IPs:               <none>
Port:              http-metrics  8383/TCP
TargetPort:        8383/TCP
Endpoints:         <none>
Port:              cr-metrics  8686/TCP
TargetPort:        8686/TCP
Endpoints:         <none>
Session Affinity:  None
Events:            <none>
$
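
Both ports in the output above report `Endpoints: <none>`, which generally means that no ready pod currently carries every label listed in the Service's `Selector`. The thread does not state what the actual cause was here, so the following is only a minimal Python sketch of that equality-based matching rule. The pod names and pod labels are hypothetical and are not taken from the snapscheduler chart.

```python
def selector_matches(selector, pod_labels):
    """Return True if the pod carries every key/value pair in the selector."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def matching_pods(selector, pods):
    """Names of pods whose labels satisfy the Service selector."""
    return [name for name, labels in pods.items()
            if selector_matches(selector, labels)]


# Selector copied from the `kubectl describe` output above.
service_selector = {"name": "snapscheduler"}

# Hypothetical pod labels, for illustration only.
pods = {
    "snapscheduler-abc": {"app.kubernetes.io/name": "snapscheduler"},
    "other-pod": {"name": "something-else"},
}

print(matching_pods(service_selector, pods))  # → []
```

With these made-up labels no pod matches, so a Service using this selector would have nothing to route to on either port.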

Using port-forwarding, it's giving me a timeout error.

kubectl --kubeconfig ABC.config port-forward svc/snapscheduler-metrics -n cloudops 9100:8686
kubectl --kubeconfig ABC.config port-forward svc/snapscheduler-metrics -n cloudops 9100:8383

Expected behavior
I should be able to see metrics

Actual results
I get the error below:

error: timed out waiting for the condition

Please help


JohnStrunk · 2021-03-30T14:05:15Z

I can confirm the issue, at least w/ the Helm chart. Is that how you deployed?

prasanjit-enginprogam · 2021-03-30T17:31:28Z

@JohnStrunk: Yes, I used Helm to deploy.

prasanjit-enginprogam · 2021-03-30T19:04:47Z

@JohnStrunk: Let me know which version/tag I should use after your fix.

JohnStrunk · 2021-03-30T21:03:37Z

Ok. I'll put this into a release in the next couple days.

JohnStrunk · 2021-04-05T19:03:17Z

A new release has been published: https://artifacthub.io/packages/helm/backube-helm-charts/snapscheduler
The metrics port should be accessible.

prasanjit-enginprogam · 2021-04-05T20:42:44Z

@JohnStrunk: Are you going to add a new tag here: https://quay.io/repository/cloudops/snapscheduler?tab=tags
Currently I can only see 1.1.1.

JohnStrunk · 2021-04-05T20:51:37Z

cloudops isn't an official source. I don't know what those images are.
The snapscheduler container repo is in the backube org on quay:

$ skopeo list-tags docker://quay.io/backube/snapscheduler
{
    "Repository": "quay.io/backube/snapscheduler",
    "Tags": [
        "1.0.0",
        "1.1.0",
        "1.1.1",
        "latest",
        "1.2.0"
    ]
}

v1.2.0 is from today, as is the helm chart v1.3.0 on artifacthub.

prasanjit-enginprogam · 2021-04-05T21:00:00Z

Great thanks

prasanjit-enginprogam · 2021-04-05T21:04:22Z

@JohnStrunk: I was wondering whether there is a way to add backup status to the metrics endpoint, so that we can hook it up to a Grafana dashboard and drive PagerDuty (on-call) alerts from it. What are your thoughts?

If you feel this makes sense and is doable, I can create a new feature request for it.

JohnStrunk · 2021-04-05T21:14:01Z

If there are particular metrics you're looking for, please open an issue describing them (or a discussion thread).

Right now, snapscheduler doesn't monitor the snapshots that it creates. It just creates the VolumeSnapshot object and walks away. I'm guessing you'd want something to ensure it becomes readyToUse within some timeframe, but I have no idea how to bound that (AWS can take a looong time for big volumes).

Simple metrics about how many snapshots were created are much easier, but I'm not sure how useful that is.

prasanjit-enginprogam · 2021-04-05T23:04:45Z

@JohnStrunk: Yes, correct: some sort of watcher pod that keeps track of the backups as they happen, checks the readyToUse flag, and exposes it as metrics from the endpoint, which can then be scraped by Prometheus or other observability tools.
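
As a rough illustration of the watcher idea discussed above, the following Python sketch renders per-snapshot readiness and age in the Prometheus text exposition format. It is not something snapscheduler provides: the metric names `volumesnapshot_ready` and `volumesnapshot_age_seconds`, the function names, and the sample snapshot objects are all assumptions made for this example. The input is assumed to be a list of VolumeSnapshot objects already fetched from the API and decoded into dictionaries.

```python
from datetime import datetime, timezone


def parse_k8s_time(value):
    """Parse a timestamp such as '2021-04-05T20:00:00Z' as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def render_metrics(snapshots, now):
    """Render one readiness gauge and one age gauge per VolumeSnapshot."""
    # Hypothetical metric names; samples of each metric are kept together.
    ready_lines = ["# TYPE volumesnapshot_ready gauge"]
    age_lines = ["# TYPE volumesnapshot_age_seconds gauge"]
    for snap in snapshots:
        meta = snap["metadata"]
        labels = 'namespace="{}",name="{}"'.format(meta["namespace"], meta["name"])
        # status.readyToUse may be absent until the storage driver reports back.
        ready = bool((snap.get("status") or {}).get("readyToUse", False))
        age = (now - parse_k8s_time(meta["creationTimestamp"])).total_seconds()
        ready_lines.append("volumesnapshot_ready{{{}}} {}".format(labels, int(ready)))
        age_lines.append("volumesnapshot_age_seconds{{{}}} {}".format(labels, int(age)))
    return "\n".join(ready_lines + age_lines) + "\n"


# Made-up sample objects, for illustration only.
snapshots = [
    {"metadata": {"namespace": "cloudops", "name": "data-1",
                  "creationTimestamp": "2021-04-05T20:00:00Z"},
     "status": {"readyToUse": True}},
    {"metadata": {"namespace": "cloudops", "name": "data-2",
                  "creationTimestamp": "2021-04-05T20:30:00Z"}},
]
now = datetime(2021, 4, 5, 21, 0, 0, tzinfo=timezone.utc)
print(render_metrics(snapshots, now), end="")
```

Exporting the age next to the readiness flag leaves the "how long is too long" decision to the alert rule rather than the exporter, so each operator can choose a threshold that suits their volume sizes and storage backend.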

prasanjit-enginprogam added the bug label Mar 30, 2021

JohnStrunk self-assigned this Mar 30, 2021

JohnStrunk mentioned this issue Mar 30, 2021: Fix metrics service #117 (Merged)

JohnStrunk closed this as completed in #117 Mar 30, 2021
