Add prometheus exporter. #139

Merged
merged 85 commits into from Nov 30, 2019
Conversation

Member

@davidnewhall davidnewhall commented Nov 12, 2019

Almost complete. Feedback required. Closes #88.

The initial contribution will include support for storing metrics in Prometheus. I'm not updating any repo documentation beyond the few config file changes and manual updates you see here. The feature will be mostly silent, but available for advanced users and beta testers.

Later I will build some simple dashboards to get people started. We can iterate on them in time. Once I'm satisfied this implementation doesn't suck and we have at least 3 or 4 dashboards with the basic data displayed, I'll rebrand the product as "including Prometheus support." This will take time, as there are a lot of things that need to be reworded.

Completed:

  • Client
  • Site
  • USW
  • USG
  • UAP
  • UDM

Todo:

  • Make more dashboards. Clients is done.
2019/11/29 01:09:06 [INFO] UniFi Measurements Exported. Sites: 1, Clients: 36, Wireless APs: 2, Gateways: 1, Switches: 1, Descs: 196, Metrics: 1448, Errors: 0, Zeros: 317, Elapsed: 41ms

labels := []string{"name", "mac", "site_name", "gw_mac", "gw_name", "sw_mac", "sw_name", "vlan", "ip", "oui", "network", "sw_port", "wired"}
labelWireless := append([]string{"ap_mac", "ap_name", "radio_name", "radio", "radio_proto", "channel", "essid", "bssid", "radio_desc"}, labels...)
return &uclient{
Anomalies: prometheus.NewDesc(ns+"anomalies_total", "Client Anomalies", labelWireless, nil),
Contributor

It looks like this isn't a counter.

Suggested change
- Anomalies: prometheus.NewDesc(ns+"anomalies_total", "Client Anomalies", labelWireless, nil),
+ Anomalies: prometheus.NewDesc(ns+"anomalies", "Client Anomalies", labelWireless, nil),

Member Author

It's a counter that resets. Very odd. Let's have a comment chain here, unless you're on Discord?

I just uploaded a new clients dashboard. Please grab it again. It'll be easier if you tell me what changes you make so I can make them on my dash. I'm editing a few things at once...

Member Author

I also just realized some labels are screwed up in UAP :( gonna have to fix that up.

Contributor

Ok, I wonder what triggers the resets. It doesn't seem like a normal counter to me.
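For context on why a resetting counter can still be queried with rate(): Prometheus treats any drop in a counter's value as a reset and counts the post-reset value as new increase. The sketch below is a simplified illustration of that idea in plain Go, not the Prometheus implementation (it skips extrapolation entirely). The function name `increase` and the sample values are made up for the example.

```go
package main

import "fmt"

// increase sums the growth of a counter across consecutive samples.
// A drop in value is treated as a reset to zero, so the new value
// itself is counted as the increase since the reset.
// Simplified: real Prometheus also extrapolates to the window edges.
func increase(samples []float64) float64 {
	if len(samples) < 2 {
		return 0 // need at least two points to measure any change
	}
	total := 0.0
	prev := samples[0]
	for _, v := range samples[1:] {
		if v < prev {
			total += v // counter reset detected
		} else {
			total += v - prev
		}
		prev = v
	}
	return total
}

func main() {
	// Hypothetical anomaly counts; the drop from 12 to 2 is a reset.
	samples := []float64{5, 9, 12, 2, 4}
	fmt.Println(increase(samples)) // 4 + 3 + 2 + 2 = 11
}
```

If the value behaved like a gauge instead (moving up and down freely), this logic would misread every decrease as a reset, which is why the counter-versus-gauge question matters for the `_total` suffix.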

My changes:

  • [$__interval] for rate, and setting a min step to 1m. This makes for nice zooming.
  • Graph null as null, not connected
  • Graph fill to 0
  • Multiply byte rates by 8 to make them bps, since network people like to see Mbps not MBps.
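The byte-to-bit conversion in the last item is plain arithmetic. In a dashboard it would normally live in the query itself (for example, appending `* 8` to the rate expression); the Go sketch below only spells out the math. The helper name `bytesPerSecToMbps` is hypothetical, and it assumes decimal megabits (1 Mbps = 1,000,000 bits per second).

```go
package main

import "fmt"

// bytesPerSecToMbps converts a byte rate into megabits per second,
// assuming decimal units (1 Mbps = 1e6 bits/s).
func bytesPerSecToMbps(bytesPerSec float64) float64 {
	return bytesPerSec * 8 / 1e6
}

func main() {
	// 1,250,000 bytes/s is 10,000,000 bits/s.
	fmt.Println(bytesPerSecToMbps(1250000)) // 10
}
```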

Member Author

Oh, also, you're probably going to want to run a 1m interval. Newer UniFi controllers don't update every 30s...

Member Author

All of those options sound good except the part where we do math. I'd like to avoid that and just present the data as received from the controller. I'm trying to keep this as similar to the existing dashboard as possible. Let me see if I can find all those thingies you tweaked..

Member Author

This doesn't work? rate(unifipoller_client_transmit_bytes_total[$__interval]) just produces no data points, but using [5m] does. The range vector is the window rate() uses to find two points to calculate the rate. 5m seems reasonable and will cover any points missed because the poller went down for a few minutes, no?
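The "two points" requirement can be made concrete with a small sketch. It assumes a 1m scrape interval and models the range window as half-open, (evalTime-window, evalTime]; exact boundary handling differs between Prometheus versions, so treat this as an illustration only. The function name and the timestamps are invented for the example.

```go
package main

import "fmt"

// samplesInWindow counts scrape timestamps (in seconds) that fall in
// the half-open window (evalTime-window, evalTime].
func samplesInWindow(scrapes []int, evalTime, window int) int {
	n := 0
	for _, ts := range scrapes {
		if ts > evalTime-window && ts <= evalTime {
			n++
		}
	}
	return n
}

func main() {
	// Assumed: one scrape every 60s, query evaluated at t=310s.
	scrapes := []int{0, 60, 120, 180, 240, 300}
	evalTime := 310
	for _, w := range []int{60, 120, 300} {
		n := samplesInWindow(scrapes, evalTime, w)
		// rate() needs at least two samples inside the window.
		fmt.Printf("window=%ds samples=%d rateComputable=%t\n", w, n, n >= 2)
	}
}
```

With these assumptions, a 60s window holds one sample (no rate), while 120s and 300s windows hold two and five.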

Contributor

You need to set a "min step" in order to make sure $__interval doesn't get smaller than your scrape interval.

image

It's too bad the UniFi data updates are so slow. I'm used to having much more data in Prometheus. 😁 I'm polling the APs every 15s via SNMP right now.

Member Author

I'm not sure; that really doesn't sound like the right thing to put into the range vector. Even 10m or 15m seems more reasonable than locking it to the artificial interval you're setting at 2m. 2m doesn't provide the best resolution; 1m does. And if you set the min step to 1m and try to use $__interval as the rate range vector, it doesn't work because it can't find two points in 1m.

Contributor

@SuperQ SuperQ Nov 29, 2019

Correct. If your scrape interval is 1m, then 2m is the smallest "min step" you can use. $__interval exists in Grafana for exactly this use case: auto-scaling the range vector based on the graph. The min step is a minimum, not a lock on the step.

If the graph is scaled out to a week, $__interval will be adjusted accordingly, and min step will no longer apply.
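The "minimum, not a lock" behavior can be sketched as follows. This is a rough approximation of how an auto interval might be derived (time range divided by the number of data points, floored at the min step); Grafana's real calculation also rounds to convenient values, which is omitted here. The function name, the 1000-point figure, and the use of whole seconds are all assumptions for illustration.

```go
package main

import "fmt"

// autoIntervalSeconds approximates an auto-scaled interval:
// rangeSec / maxDataPoints, but never smaller than minStepSec.
// Simplified: no rounding to "nice" values as Grafana does.
func autoIntervalSeconds(rangeSec, maxDataPoints, minStepSec int) int {
	if maxDataPoints <= 0 {
		return minStepSec // avoid dividing by zero
	}
	iv := rangeSec / maxDataPoints
	if iv < minStepSec {
		return minStepSec
	}
	return iv
}

func main() {
	minStep := 120 // 2m, i.e. twice an assumed 1m scrape interval

	// One-hour graph: 3600/1000 = 3s, so the 2m floor applies.
	fmt.Println(autoIntervalSeconds(3600, 1000, minStep)) // 120

	// One-week graph: 604800/1000 = 604s, well above the floor.
	fmt.Println(autoIntervalSeconds(604800, 1000, minStep)) // 604
}
```

On short time ranges the floor keeps the window wide enough to contain two scrapes; on long ranges the interval grows on its own and the floor no longer matters.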

Contributor

SuperQ commented Nov 29, 2019

I've set up the latest beta dashboard here. I made some changes to a few things.

@davidnewhall davidnewhall merged commit d052e69 into master Nov 30, 2019
@davidnewhall davidnewhall deleted the dn2_prometheus branch November 30, 2019 08:11
Successfully merging this pull request may close these issues.

Prometheus Exporter
2 participants