Processing Engine Metrics
Right now we serve little information on Processing Engine performance, so this ticket looks to add some basic metrics to track.
Update the /metrics endpoint to serve the following metrics:
New Processing Engine Metrics:
plugin_execution_duration_seconds_bucket: Amount of time spent executing a plugin, per plugin, per trigger, with trigger type, bucketed into 0.001, 0.0025, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 10, and +Inf seconds
--- The plugin name should be the plugin's file name with the .py extension stripped.
plugin_execution_duration_seconds_sum: Total amount of time spent executing a plugin, per plugin, per trigger, with trigger type
plugin_execution_duration_seconds_count: Total number of times a plugin is executed, per plugin, per trigger, with trigger type
processing_engine_memory_size_bytes: Total size of Processing Engine memory in bytes
--- If this can be broken down further into threads, that'd be great, but not required
processing_engine_plugin_errors: Total number of errors, per plugin, per trigger
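To make the shape of these series concrete, here is a minimal sketch of registering and recording them with the Rust `prometheus` crate. The crate choice, the `main` wrapper, and all label values are illustrative assumptions, not necessarily how influxdb3 wires up its metrics registry:

```rust
use std::time::Instant;

use prometheus::{
    register_histogram_vec, register_int_counter_vec, register_int_gauge,
    Encoder, HistogramVec, IntCounterVec, IntGauge, TextEncoder,
};

fn main() {
    // Histogram with the buckets listed above; the +Inf bucket is added
    // automatically, and the _sum/_count series come with every histogram.
    let exec_duration: HistogramVec = register_histogram_vec!(
        "plugin_execution_duration_seconds",
        "Time spent executing a plugin",
        &["plugin", "trigger", "trigger_type"],
        vec![0.001, 0.0025, 0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 10.0]
    )
    .unwrap();

    let plugin_errors: IntCounterVec = register_int_counter_vec!(
        "processing_engine_plugin_errors",
        "Total number of plugin execution errors",
        &["plugin", "trigger"]
    )
    .unwrap();

    let memory_size: IntGauge = register_int_gauge!(
        "processing_engine_memory_size_bytes",
        "Total size of Processing Engine memory in bytes"
    )
    .unwrap();

    // Hypothetical plugin run: time it and record the outcome.
    let start = Instant::now();
    let result: Result<(), ()> = Ok(()); // stand-in for the actual plugin call
    exec_duration
        // "my_plugin" = file name minus .py; trigger/type values are made up
        .with_label_values(&["my_plugin", "trigger_1", "scheduled"])
        .observe(start.elapsed().as_secs_f64());
    if result.is_err() {
        plugin_errors
            .with_label_values(&["my_plugin", "trigger_1"])
            .inc();
    }
    memory_size.set(42 * 1024 * 1024); // would come from a real memory probe

    // Render what the /metrics endpoint would serve.
    let mut buf = Vec::new();
    TextEncoder::new().encode(&prometheus::gather(), &mut buf).unwrap();
    println!("{}", String::from_utf8(buf).unwrap());
}
```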
Labelling the metrics per plugin and per trigger may cause too high a cardinality, especially for the duration histograms. Would you consider a db label to group them by database, as we have done for other metrics, as a starting point?
Trigger type is one we can use as a label because its cardinality is bounded (there are 5 or so types).
I guess cardinality would be N_triggers in this case, regardless of how many plugins, databases, or trigger types there are (each trigger has a single type).
There are 15 lines emitted by the /metrics API for each duration histogram (13 buckets plus the _sum and _count lines), so at 1,000 triggers there would be a worst case of 15,000 lines.
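For concreteness, those 15 lines for a single label set would look roughly like this in the Prometheus text exposition format (all label values and counts below are made up; the shared # HELP/# TYPE headers are emitted once per metric and not counted here):

```
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.001"} 0
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.0025"} 1
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.005"} 2
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.01"} 4
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.025"} 5
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.05"} 6
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.1"} 6
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.25"} 7
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="0.5"} 7
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="1"} 7
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="2.5"} 7
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="10"} 7
plugin_execution_duration_seconds_bucket{plugin="p",trigger="t1",trigger_type="scheduled",le="+Inf"} 7
plugin_execution_duration_seconds_sum{plugin="p",trigger="t1",trigger_type="scheduled"} 0.232
plugin_execution_duration_seconds_count{plugin="p",trigger="t1",trigger_type="scheduled"} 7
```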
I'm not very familiar with the limitations of Prometheus or what is considered high cardinality, only that it recommends against unbounded label cardinality. If this is acceptable then I won't block it.