Seeing lot of drops in metrics #850

@shreyaspandey

Hi,

I have one gnmic instance with around 150 devices configured as gNMI targets.
Of those, around 70 are sending metrics.
For the working devices I am seeing a lot of dropped metrics, and Grafana dashboard panels randomly show no data for different metrics.

The pipeline is gnmic --> Prometheus --> Grafana.

A gNMI target in my config looks like this:

"hqa-7060x6-4":
  address: "10.2.33.119:6030"
  insecure: true
  username: "xxxx"
  password: "xxxxx"
  subscriptions: ["fec-ber", "cpu", "memory", "storage", "intf-counters", "crc-errors", "temperature", "fan-status", "intf-oper-status", "intf-admin-status", "bgp-neighbors", "bgp-session-state", "xcvrs-input-power", "xcvrs-output-power", "xcvrs-bias-current", "psu-status", "qos-queues", "mac-pause", "eos-system-state", "eos-ecn-buffer"]

I am looking for a setup that keeps metric drops to a minimum.

Is having so many individual subscriptions per device the problem?
Would grouping them solve it?
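
To make the grouping question concrete, here is a rough sketch of what I mean: one subscription carrying several paths instead of one subscription per path. The subscription name `intf-state`, the OpenConfig paths, and the 30s interval are illustrative assumptions, not my actual definitions.

```yaml
subscriptions:
  # Hypothetical grouped subscription: would replace the separate
  # "intf-counters", "intf-oper-status" and "intf-admin-status" entries.
  intf-state:
    paths:
      - /interfaces/interface/state/counters
      - /interfaces/interface/state/oper-status
      - /interfaces/interface/state/admin-status
    mode: stream
    stream-mode: sample
    sample-interval: 30s   # assumed value, not my real interval
```

The target would then list `intf-state` in its `subscriptions` instead of the three individual names. All grouped paths share one sample interval, so this only fits paths I am happy to poll at the same rate.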

Or should I be looking at gnmic clustering?
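
For reference, my understanding is that clustering would need a section along these lines on each instance. This is only a sketch: the cluster name, instance name, and Consul address are placeholders I made up, and a working setup needs more than this (a running Consul agent and the gnmic API server enabled, as far as I can tell).

```yaml
clustering:
  cluster-name: cluster1        # placeholder, same on every instance
  instance-name: gnmic1         # placeholder, unique per instance
  locker:
    type: consul
    address: consul-agent:8500  # placeholder Consul address
```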

We might scale to around 300-400 devices in the future.

Thanks!
