Document metrics generated within OpenWhisk #3884
@vvraskin @mhenke1 would you review this? @chetanmeh this is very useful, thanks for doing this.
Codecov Report
@@            Coverage Diff             @@
##           master    #3884      +/-   ##
==========================================
- Coverage   75.78%   71.05%   -4.73%
==========================================
  Files         145      145
  Lines        6897     6921     +24
  Branches      418      410      -8
==========================================
- Hits         5227     4918    -309
- Misses       1670     2003    +333
Continue to review full report at Codecov.
Neatly done, thank you!
Just a few minor comments.
docs/metrics.md
Outdated
@@ -75,6 +75,183 @@ The docker image exposes StatsD via the (standard) port 8125 and a Grafana dashb

The address of your docker host has to be configured in the `metrics_kamon_statsd_host` configuration property.

### Metric Names

All metric names have to prefixed by a prefix that you specify and are subject to modification by graphite, datadog, or statsd. For example if prefix used is `openwhisk` then metric names would be like `openwhisk.counter.controller_activation_start`. This document assumes that metric name prefix is `openwhisk`
Just some typos; probably `have to be prefixed`.
docs/metrics.md
Outdated
#### Controller metrics

Metrics below are emitted from with a Controller instance.
`from within`, maybe?
docs/metrics.md
Outdated
##### Controller Startup

* `openwhisk.counter.controller_startup<controller_id>_count` (counter)
I think it's `openwhisk.counter.controller<controller_id>_startup_count`
Checked the stats in our setup and they are named like `openwhisk.counter.controller_startup0_count`. Also, in the code the prefix is `startup$id`.
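As a rough illustration of the naming settled on in this thread, the sketch below builds the startup counter name from a prefix and a controller id. The helper `controller_startup_metric` is hypothetical, and its layout is inferred only from the observed name quoted above, not taken from the OpenWhisk code.

```python
def controller_startup_metric(controller_id: int, prefix: str = "openwhisk") -> str:
    """Build the startup counter name for one controller (hypothetical helper)."""
    # Layout assumed from the observed name openwhisk.counter.controller_startup0_count:
    #   <prefix>.counter.controller_startup<controller_id>_count
    return f"{prefix}.counter.controller_startup{controller_id}_count"


print(controller_startup_metric(0))
```

Because the id is embedded in the name, each controller produces a distinct metric series.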
docs/metrics.md
Outdated
Metrics below are emitted per kafka topic.

* `openwhisk.histogram.kafka_<topic name>.delay_start` - Time delay between when a message was pushed to kafka and when it is read within a consumer.
Should we also mention that Delay is emitted for each pool by the Invoker, while the Queue metric is emitted every 10 seconds?
Missed that it was being done via a scheduled task. With the default config it should be emitted every 60 seconds.
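For readers unfamiliar with the delay metric discussed here, a minimal sketch of the idea follows: the delay is the difference between the time a message was produced and the time a consumer read it. The helper names, the clamping at zero, the placeholder topic `example_topic`, and the choice to format the sample as a StatsD timer (`<name>:<value>|ms`) are all assumptions for illustration, not the OpenWhisk implementation.

```python
import time


def delay_start_ms(sent_at_ms: int, read_at_ms: int) -> int:
    """Delay between producing a message and reading it, clamped at zero."""
    # Clock skew between hosts could make the difference negative; clamp it (assumption).
    return max(0, read_at_ms - sent_at_ms)


def statsd_timer_line(metric: str, value_ms: int) -> str:
    """Format one StatsD timer sample as <name>:<value>|ms."""
    return f"{metric}:{value_ms}|ms"


# "example_topic" is a placeholder, not a real OpenWhisk topic name.
metric = "openwhisk.histogram.kafka_example_topic.delay_start"
now_ms = int(time.time() * 1000)
print(statsd_timer_line(metric, delay_start_ms(now_ms - 42, now_ms)))
```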
@vvraskin Thanks for the review. One aspect I am struggling with in our monitoring is the variable metric names, i.e. names which are a function of the invoker or controller id. They make it hard to aggregate metrics across invokers or controllers. It would be better if we recorded the id as a Kamon tag and then used it on the observation side to segregate metrics per specific invoker/controller.
LGTM
@chetanmeh
@vvraskin We use Datadog, and it does not seem to support regex in metric names, which makes it tricky to aggregate metrics across invokers. Does Grafana+Influx support tags? I am looking for a common denominator here.
Our statsd integration with influx won't support tagging, but I think that we already have a tagging option for Kamon, introduced in this PR: 6f1a445
Opened #3917 to discuss possible options for supporting setups which do not support regex.
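To make the trade-off in this discussion concrete, here is a small sketch contrasting the two aggregation styles. All sample values are made up, and the fixed metric name and the `controller` tag key used in the tagged variant are hypothetical; the sketch only models the idea and does not use any Kamon, Datadog, or Influx API.

```python
import re
from collections import defaultdict

# Hypothetical samples (made-up values); the controller id is part of each name.
named = {
    "openwhisk.counter.controller_startup0_count": 1,
    "openwhisk.counter.controller_startup1_count": 1,
    "openwhisk.counter.controller_startup2_count": 2,
}

# With regex support, one pattern covers every controller id.
pattern = re.compile(r"^openwhisk\.counter\.controller_startup(\d+)_count$")
total_by_regex = sum(value for name, value in named.items() if pattern.match(name))

# With tags, the name is fixed (hypothetical name) and the id travels as a tag.
tagged = [
    ("openwhisk.counter.controller_startup_count", {"controller": "0"}, 1),
    ("openwhisk.counter.controller_startup_count", {"controller": "1"}, 1),
    ("openwhisk.counter.controller_startup_count", {"controller": "2"}, 2),
]
total_by_name = defaultdict(int)
per_controller = defaultdict(int)
for name, tags, value in tagged:
    total_by_name[name] += value                 # aggregate across controllers
    per_controller[tags["controller"]] += value  # or split per controller

print(total_by_regex, dict(total_by_name), dict(per_controller))
```

In the tagged form, a backend without regex support can still sum across controllers because the metric name no longer varies.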
* Document metrics
* Trim trailing space
* Update based on Vadim's review.
OpenWhisk generates quite a few metrics via Kamon. This PR aims to document all currently generated metrics so that system administrators can configure the right metrics for monitoring.
See here for the rendered HTML version of doc