NETOBSERV-579 prefixing all metrics, add new ones #314
- Allow prefixing operational metrics (this introduces global settings for metrics, which will probably be extended later)
- Unify metrics usage across stages: e.g. channel size was tracked only in the netflow ingester; now stages share a common struct that generates metrics
- Add new gauges for tracking channel sizes
- Some renaming
- Update metrics doc
Here are the details of the operational metrics:
- conntrack_input_records (unchanged)
- conntrack_memory_connections (unchanged)
- conntrack_output_records (unchanged)
- encode_prom_errors (unchanged)
- ingest_flows_processed (renamed from "ingest_collector_flow_logs_processed", added stage label)
- metrics_processed (renamed from "encode_prom_metrics_processed", added stage label)
- records_written (renamed from "loki_records_written", added stage label, now also works with the Kafka writer)
- stage_duration_ms (unchanged)
- stage_in_queue_size (renamed from "ingest_collector_queue_length", generalized for all stages)
- stage_out_queue_size (new metric)
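To illustrate what prefixing means for the names listed above, here is a minimal sketch. The `applyPrefix` helper and the `myprefix_` value are hypothetical and only for illustration; they are not taken from the PR, whose actual settings type is not shown in this conversation.

```go
package main

import "fmt"

// applyPrefix prepends a configurable prefix to each operational metric name.
// Hypothetical helper for illustration only; not the PR's implementation.
func applyPrefix(prefix string, names []string) []string {
	out := make([]string, 0, len(names))
	for _, n := range names {
		out = append(out, prefix+n)
	}
	return out
}

func main() {
	// A few of the metric names from the list above.
	names := []string{"ingest_flows_processed", "records_written", "stage_in_queue_size"}
	// "myprefix_" is an assumed example value.
	for _, n := range applyPrefix("myprefix_", names) {
		fmt.Println(n)
	}
}
```

With a single global prefix, dashboards and alerts only need to account for one configurable string in addition to the renames listed above.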
	}
}

// TODO / FIXME / FIGUREOUT: seems like we have 2 input channels for Loki? (this one, and see also pipeline_builder.go / getStageNode / StageWrite)
This channel seems to act as a buffer that keeps the previous pipeline stage from blocking. We would probably need to verify that removing the in channel doesn't affect performance.
Once we are able to use Go 1.18, maybe we could upgrade to an upstream version of the go-pipes library that allows defining the buffer length between pipeline stages.
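The buffering effect described in this comment can be sketched with a plain buffered Go channel. This is only an illustration under assumptions: `forward` is a hypothetical helper and does not reflect how flowlogs-pipeline or go-pipes wire their stages.

```go
package main

import "fmt"

// forward copies every record from in to a buffered channel with the given
// capacity and returns that channel. The buffer lets the upstream stage keep
// sending while the downstream stage is busy, until the buffer is full.
// Hypothetical helper for illustration only.
func forward(in <-chan string, capacity int) <-chan string {
	out := make(chan string, capacity)
	go func() {
		defer close(out) // signal downstream that no more records will arrive
		for rec := range in {
			out <- rec
		}
	}()
	return out
}

func main() {
	in := make(chan string)
	out := forward(in, 4)

	// Upstream stage: emits three records, then closes its output.
	go func() {
		defer close(in)
		for i := 0; i < 3; i++ {
			in <- fmt.Sprintf("record-%d", i)
		}
	}()

	// Downstream stage: consumes records in order.
	for rec := range out {
		fmt.Println(rec)
	}
}
```

For a buffered channel, the built-in `len(ch)` returns the number of queued elements and `cap(ch)` the buffer capacity, which is the kind of value a queue-size gauge could report.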
Codecov Report
@@            Coverage Diff             @@
##             main     #314      +/-   ##
==========================================
+ Coverage   69.65%   70.11%   +0.46%
==========================================
  Files          82       84       +2
  Lines        4729     4832     +103
==========================================
+ Hits         3294     3388      +94
+ Misses       1240     1229      -11
- Partials      195      215      +20
pkg/pipeline/encode/encode_prom.go
if val, found := flow[info.Filter.Key]; found {
	sVal, ok := val.(string)
	if !ok {
		sVal = fmt.Sprintf("%v", val)
Nit: maybe fmt.Sprint(val) is slightly faster.
done
type MetricDefinition struct {
	Name   string
	Help   string
	Type   metricType
	Labels []string
}
@jotak IIUC, Type metricType is only used by the auto-documentation.
So, theoretically, one could create a MetricDefinition of type Counter and pass it to NewGauge(), resulting in misleading documentation.
If I'm correct, is it worth adding a check in NewGauge()/NewCounter()/NewHistogram() to validate the type?
That's correct; sure, we can do what you suggest.
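A minimal sketch of the check discussed here, under assumptions: the `TypeCounter`/`TypeGauge`/`TypeHistogram` constants, the string underlying type of `metricType`, and the `verifyType` helper are hypothetical names for illustration, not the code that was merged.

```go
package main

import "fmt"

// metricType and its constants are assumed for this sketch.
type metricType string

const (
	TypeCounter   metricType = "counter"
	TypeGauge     metricType = "gauge"
	TypeHistogram metricType = "histogram"
)

// MetricDefinition mirrors the struct shown in the diff above.
type MetricDefinition struct {
	Name   string
	Help   string
	Type   metricType
	Labels []string
}

// verifyType returns an error when the declared type of def differs from the
// type the calling constructor is about to instantiate.
// Hypothetical helper: NewGauge could call verifyType(def, TypeGauge), etc.
func verifyType(def MetricDefinition, expected metricType) error {
	if def.Type != expected {
		return fmt.Errorf("metric %q is declared as %s but instantiated as %s",
			def.Name, def.Type, expected)
	}
	return nil
}

func main() {
	def := MetricDefinition{Name: "records_written", Type: TypeCounter}
	fmt.Println(verifyType(def, TypeCounter)) // matching type: nil error
	fmt.Println(verifyType(def, TypeGauge))   // mismatch: non-nil error
}
```

Whether a mismatch should return an error, log a warning, or panic at startup is a design choice left open in this conversation.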
- Allow prefixing operational metrics (this introduces global settings for metrics, which will probably be extended later).
- Unify metrics usage across stages: e.g. channel size was tracked only in the netflow ingester; now stages share a common struct that generates metrics. More metrics now include the "stage" label.
- Some refactoring to make doc generation work with metrics that are not defined as globals (this required decoupling a metric's definition from its instantiation in the Prometheus registry). The doc now includes labels and is sorted alphabetically.
- Also, I had to move health and operationalMetrics into the same package to avoid repetitions like operationalmetrics.Metrics, which also resulted in moving health_test into the pipeline package (to avoid an import cycle).

Breaking changes: the breaking changes only concern metric names (not the API); see the details of the operational metrics.