feat(monitoring): add prometheus metrics support to the OpenAPI specs of plugins & API server #458
Labels: API_Server, dependencies, enhancement
Is your feature request related to a problem? Please describe.
Right now, if we were to run Cactus in production, operators would have to implement custom logging/metrics just to answer (seemingly) simple questions such as "What is the current, effective TPS on the Cactus node that I operate?" or "How many transactions have succeeded/failed in the last 24 hours?"
Describe the solution you'd like
A .../metrics endpoint exposed by all plugins and the API server as well.
Describe alternatives you've considered
The other big player in the monitoring/APM game is the ELK stack. I'm not opposed to baking that in as well, but what I definitely do not want is to make them mutually exclusive. People should be able to choose between them without much hassle or confusion.
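To make the proposal a bit more concrete, here is a rough sketch of what a .../metrics response body could look like. It is not an existing Cactus API: the `Counter` class, the `renderPrometheus` function, and the `cactus_transactions_total` metric name are all made up for illustration. The sketch keeps counting separate from Prometheus-specific rendering, so a different exporter (for example one feeding the ELK stack) could read the same counters without the two being mutually exclusive.

```typescript
// Hypothetical sketch: none of these names exist in Cactus today.
type LabelPair = [string, string];
type Series = { labels: LabelPair[]; value: number };

// Backend-agnostic counter: it only stores numbers per label combination.
class Counter {
  private readonly series = new Map<string, Series>();

  constructor(readonly name: string, readonly help: string) {}

  inc(labels: Record<string, string> = {}, amount = 1): void {
    // Sort label keys so {a, b} and {b, a} land in the same series.
    const pairs = Object.keys(labels)
      .sort()
      .map((k): LabelPair => [k, labels[k]]);
    const key = JSON.stringify(pairs);
    const entry = this.series.get(key) ?? { labels: pairs, value: 0 };
    entry.value += amount;
    this.series.set(key, entry);
  }

  collect(): Series[] {
    return Array.from(this.series.values());
  }
}

// Label values must escape backslash, double quote and newline.
function escapeLabelValue(v: string): string {
  return v.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Prometheus-specific part: renders counters in the text exposition format.
// (Escaping of the HELP text is omitted to keep the sketch short.)
function renderPrometheus(counters: Counter[]): string {
  const lines: string[] = [];
  for (const c of counters) {
    lines.push(`# HELP ${c.name} ${c.help}`);
    lines.push(`# TYPE ${c.name} counter`);
    for (const s of c.collect()) {
      const pairs = s.labels.map(([k, v]) => `${k}="${escapeLabelValue(v)}"`);
      const labelPart = pairs.length > 0 ? `{${pairs.join(",")}}` : "";
      lines.push(`${c.name}${labelPart} ${s.value}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Example: two successful transactions and one failed one.
const txTotal = new Counter(
  "cactus_transactions_total",
  "Transactions processed by this node.",
);
txTotal.inc({ status: "success" });
txTotal.inc({ status: "success" });
txTotal.inc({ status: "failure" });

const body = renderPrometheus([txTotal]);
console.log(body);
```

An HTTP handler for .../metrics would return this string with the content type `text/plain; version=0.0.4`. On the Prometheus side, a query along the lines of `rate(cactus_transactions_total[1m])` would then give an approximate TPS, and `increase(cactus_transactions_total[24h])` grouped by `status` would cover the succeeded/failed question. In a real implementation an established client library would likely be a better choice than hand-rolled rendering.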
Additional context
Kubernetes is (probably) the most popular container orchestration platform out there with a steady growth in adoption. [1] [2]
Because of this, in the longer term we should have ops capabilities that make life easier for people who run Cactus nodes on Kubernetes. Prometheus seems (anecdotally) to be the monitoring solution of choice for people using Kubernetes in production, which is why I figured we should eventually make some effort here. At the time of this writing, I'm not putting this down as a 1.0 milestone requirement but will put it in 1.2, because I believe it will become very important as soon as we have production deployments of Cactus out in the wild.
[1] https://enterprisersproject.com/article/2020/6/kubernetes-statistics-2020
[2] https://www.datadoghq.com/container-report/
cc: @takeutak @jonathan-m-hamilton @hartm @sfuji822