Metrics support#159
Conversation
* Add num_connections metric as initial metric for testing
|
This patch adds a competing statistic for one that already exists. Going forward it would require a completely redundant set and every statistic would be counted twice. That's not desirable for the long run. Another approach might be to hook up the http-libwebsockets to the existing statistics and then to export them in the desired format. The existing management database has configuration data and runtime statistics together. See https://github.com/apache/qpid-dispatch/blob/master/python/qpid_dispatch/management/qdrouter.json.readme.txt This would be more work to hook up the the metric_manager as the code would have to hunt for all the values where they exist today. But in the end the resulting reports would have identical descriptions and values reported by qdstat and the other management tools. |
|
Thank you for your comments. My example metric was perhaps too simplicic, apologies for that. Here are more examples of the metrics I'd like to be able to provide:
Is this information that I can retrieve in the same way as qdstat or specify in the management.json? My impression was that qdstat only allowed for 'current state' type of metrics. If we can represent the above using management, I agree it could be better to use that. However, there is also the question of the impact of collecting metrics (tradeoff between quering for the metrics vs. the cost of reporting the metrcs). Reporting a metric explicitly may scale better, since a counter can be incremented and queried without locking in the common case. Correct me if I'm wrong, but my impression was that management operations either has to run in the core thread or do some locking? Would an alternative here be to have management/qdstat retrieve the metrics from the metrics manager? This would resolve the concerns of reporting metrics twice, and could also potentially enrich the qdstat output with values from histograms etc. For reference, here is the document describing the format: https://prometheus.io/docs/instrumenting/exposition_formats/ |
|
I share Chuck's concern that this PR is introducing an alternative and redundant management access mechanism. I would certainly prefer to add the desired metrics to the AMQP-based access mechanism that is already in place. |
|
At the risk of contradicting my previous post, I would add that having an HTTP front end on management data might be a worthwhile enhancement for some use cases as long as it used the same data sources as the AMQP front end. Mind would have to be paid to the security model for such HTTP accesses to management data. |
@ganeshmurthy @ted-ross This is a proposal of metrics in the router and expose them to a prometheus agent.
This work would allow anyone running the router on openshift to easily integrate it with hawkular or any other store that supports reading prometheus endpoints.
The PR consists of:
The libraries are written with the idea that incrementing a metric should be as cheap as possible (though there are several optimizations to be made for efficiently looking up labels). A metric is registered in the metric manager, which will store it in its internal database. A handle to the metric (num_connections for instance) is stored for efficient metric reporting. The handle is used whenever the metric should be incremented, decremented or set.
The API also contains a set of macros that makes passing labels less verbose.