Add a prometheus collectior for api metrics; drive by fixed to reduce… #7487

Merged
merged 4 commits into from Jun 12, 2017

Conversation

Projects
None yet
3 participants
Owner

wallyworld commented Jun 12, 2017

Description of change

Add an prometheus collector for apiserver metrics:

  • current connection count
  • connection rate (per second)
  • connection pause time (ms)
  • concurrent login attempts

As a drive by, make logs quieter.
Also tweak the minimum connection pause to 0 - a pause
will only start after reaching the (default) lower threshold of 1000 conns/sec.

QA steps

bootstrap
ssh into controller
juju-introspect metrics

Documentation changes

Not really - the general introspection command is documented.

Bug reference

Fixes some bugs about log noise
https://bugs.launchpad.net/juju/+bug/1695853
https://bugs.launchpad.net/juju/+bug/1695851

apiserver/apiservermetrics.go
+ Name: "connection_count",
+ Help: "Current number of apiserver connections",
+ }),
+ connectionRateGauge: prometheus.NewGauge(prometheus.GaugeOpts{
@axw

axw Jun 12, 2017

Member

if we also exported a counter metric for connections, users could use prometheus's builtin rate() function for computing the per-second rate of change...

apiserver/apiservermetrics.go
+ }),
+ connectionPauseTimeGauge: prometheus.NewGauge(prometheus.GaugeOpts{
+ Namespace: apiserverMetricsNamespace,
+ Name: "connection_pause_time",
@axw

axw Jun 12, 2017

Member

according to https://prometheus.io/docs/practices/naming/, this should have the unit as a suffix, and use the base unit (seconds)

e.g. "connection_pause_seconds"

and then divide by 1000 when setting the value

apiserver/apiservermetrics.go
+ }),
+ concurrentLoginsGauge: prometheus.NewGauge(prometheus.GaugeOpts{
+ Namespace: apiserverMetricsNamespace,
+ Name: "concurrent_login_attempts",
@axw

axw Jun 12, 2017

Member

this name isn't very clear to me. how about "active" instead of "concurrent"? that's what it's measuring, right? logins that are underway, but not yet complete?

axw approved these changes Jun 12, 2017

LGTM, but I'd prefer if the rate metric were dropped. It's considered bad practice to record derived metrics, as opposed to recording the raw metrics and deriving within Prometheus.

apiserver/apiservermetrics.go
+ src: src,
+ connectionCounter: prometheus.NewCounter(prometheus.CounterOpts{
+ Namespace: apiserverMetricsNamespace,
+ Name: "connection_counter",
@axw

axw Jun 12, 2017

Member

connections_total

Owner

wallyworld commented Jun 12, 2017

Ok, I've dropped the rate metric

Owner

wallyworld commented Jun 12, 2017

$$merge$$

Contributor

jujubot commented Jun 12, 2017

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

@jujubot jujubot merged commit 4a640f2 into juju:2.2 Jun 12, 2017

1 check passed

github-check-merge-juju Ran tests against PR. Use !!.*!! to request another build. IE, !!build!!, !!retry!!
Details

wallyworld added a commit to wallyworld/juju that referenced this pull request Jun 12, 2017

Revert "Merge pull request #7487 from wallyworld/apiserver-metrics"
This reverts commit 4a640f2, reversing
changes made to f87ccfe.

wallyworld added a commit to wallyworld/juju that referenced this pull request Jun 13, 2017

Merge pull request #7487 from wallyworld/apiserver-metrics
Add a prometheus collectior for api metrics; drive by fixed to reduce…

Add an prometheus collector for apiserver metrics:
- current connection count
- connection rate (per second)
- connection pause time (ms)
- concurrent login attempts

As a drive by, make logs quieter.
Also tweak the minimum connection pause to 0 - a pause
will only start after reaching the (default) lower threshold of 1000 conns/sec.

bootstrap
ssh into controller
juju-introspect metrics

Not really - the general introspection command is documented.

Fixes some bugs about log noise
https://bugs.launchpad.net/juju/+bug/1695853
https://bugs.launchpad.net/juju/+bug/1695851

jujubot added a commit that referenced this pull request Jun 13, 2017

Merge pull request #7493 from wallyworld/apiserver-metrics-again
Add a prometheus collectior for api metrics

Merge pull request #7487 but with a fix for metrics registration when the api server is restarted if an error occurs.

Add a prometheus collectior for api metrics

Add an prometheus collector for apiserver metrics:
- current connection count
- connection rate (per second)
- connection pause time (ms)
- concurrent login attempts

bootstrap
ssh into controller
juju-introspect metrics
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment