feat(monitoring): add prometheus metrics support to the OpenAPI specs of plugins & API server #458

Closed
petermetz opened this issue Jan 6, 2021 · 3 comments

petermetz commented Jan 6, 2021

Is your feature request related to a problem? Please describe.

Right now, if we were to run Cactus in production, custom logging/metrics would have to be implemented just to answer (seemingly) simple questions such as "What is the current, effective TPS on the Cactus node that I operate?" or "How many transactions have succeeded/failed in the last 24 hours?"

Describe the solution you'd like

  1. A Prometheus .../metrics endpoint exposed by all plugins and by the API server as well.
  2. It would also be nice to have a pre-built Grafana dashboard so that people who deploy Cactus can get a detailed, visual overview of what their node(s) are doing.
  3. It's important that we give each plugin the ability to host its own metrics, because it will be difficult to define a common set of metrics that works for all of them (I think?). We also want to be able to monitor the plugins individually: I don't want to see only the summed-up transactions-per-second value for a collection of plugins that talk to different ledgers, I want to see each plugin by itself as well. Granularity is important.
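
As a rough illustration of point 1, the body returned by a plugin's .../metrics endpoint could be built along these lines. This is only a sketch, not Cactus code: the class name `PluginTxCounter`, the metric name `cactus_plugin_transactions_total`, and the plugin ID `besu-1` are all made up for the example, and a real implementation would more likely use an existing Prometheus client library than hand-render the text exposition format.

```typescript
// Hypothetical sketch: none of these identifiers come from the Cactus codebase.
type TxStatus = "success" | "failure";

// Escapes a label value as required by the Prometheus text exposition format.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

class PluginTxCounter {
  private readonly metricName: string;
  private readonly pluginId: string;
  private readonly counts: Record<TxStatus, number> = { success: 0, failure: 0 };

  constructor(metricName: string, pluginId: string) {
    this.metricName = metricName; // assumed metric name, chosen for illustration
    this.pluginId = pluginId;
  }

  // Counters only ever go up; a monitoring system derives rates (e.g. TPS) from them.
  record(status: TxStatus): void {
    this.counts[status] += 1;
  }

  // Renders this plugin's counter as the text a /metrics response would contain.
  render(): string {
    const lines: string[] = [
      `# HELP ${this.metricName} Transactions processed by this plugin.`,
      `# TYPE ${this.metricName} counter`,
    ];
    for (const status of ["success", "failure"] as const) {
      const labels = `plugin="${escapeLabelValue(this.pluginId)}",status="${status}"`;
      lines.push(`${this.metricName}{${labels}} ${this.counts[status]}`);
    }
    return lines.join("\n") + "\n";
  }
}

// Usage: two successful transactions and one failed one on a single plugin.
const besu = new PluginTxCounter("cactus_plugin_transactions_total", "besu-1");
besu.record("success");
besu.record("success");
besu.record("failure");
const body = besu.render();
```

Splitting the count by a `status` label is one way to answer the "succeeded/failed in the last 24 hours" question from a single metric; the per-plugin `plugin` label keeps the series of different plugins distinguishable if they are ever scraped into the same Prometheus server.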

Describe alternatives you've considered

The other big player in the monitoring/APM game is the ELK stack. I'm not opposed to baking that in as well, but I definitely do not want to make the two mutually exclusive. People should be able to choose between them without much hassle or confusion.

Additional context

Kubernetes is (probably) the most popular container orchestration platform out there with a steady growth in adoption. [1] [2]
Because of this, in the longer term we should have ops capabilities that make life easier for people who run Cactus nodes on Kubernetes. Prometheus seems (anecdotally) to be the monitoring solution of choice for people using Kubernetes in production, which is why I figured we should eventually make some effort here. At the time of this writing, I'm not putting this down as a 1.0 milestone requirement, but I will put it in 1.2 because I believe it will become very important as soon as we have production deployments of Cactus out in the wild.

[1] https://enterprisersproject.com/article/2020/6/kubernetes-statistics-2020
[2] https://www.datadoghq.com/container-report/

cc: @takeutak @jonathan-m-hamilton @hartm @sfuji822

petermetz added the enhancement, API_Server and dependencies labels on Jan 6, 2021
petermetz added this to the v1.2.0 milestone on Jan 6, 2021
@jagpreetsinghsasan

I would like to work on this (I am currently working on it).

@jagpreetsinghsasan

Created issue #531 for the Fabric plugin.

jagpreetsinghsasan commented Mar 4, 2021

Acceptance Criteria: (open to modifications)

  1. A Prometheus ../metrics endpoint exposed by all the plugins.
  2. Each plugin should host its own set of metrics, and these shall not be aggregated across plugins. For example, the total transaction counts for Besu and Quorum should be shown separately and shall not be summed up.
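
The second criterion could be met by giving every plugin instance its own metrics object instead of a shared global counter. The following is a minimal sketch under that assumption; `PluginMetrics`, `metricsFor`, and the plugin IDs `besu` and `quorum` are illustrative names, not existing Cactus APIs.

```typescript
// Hypothetical sketch: names and plugin IDs are illustrative, not Cactus APIs.
class PluginMetrics {
  readonly pluginId: string;
  private txCount = 0;

  constructor(pluginId: string) {
    this.pluginId = pluginId;
  }

  recordTransaction(): void {
    this.txCount += 1;
  }

  totalTransactions(): number {
    return this.txCount;
  }
}

// One metrics object per plugin; there is deliberately no cross-plugin total.
const metricsByPlugin = new Map<string, PluginMetrics>();

// Returns the metrics object for a plugin, creating it on first use.
function metricsFor(pluginId: string): PluginMetrics {
  let metrics = metricsByPlugin.get(pluginId);
  if (metrics === undefined) {
    metrics = new PluginMetrics(pluginId);
    metricsByPlugin.set(pluginId, metrics);
  }
  return metrics;
}

// Usage: transactions on Besu do not affect the Quorum count, and vice versa.
metricsFor("besu").recordTransaction();
metricsFor("besu").recordTransaction();
metricsFor("quorum").recordTransaction();

const besuTotal = metricsFor("besu").totalTransactions();
const quorumTotal = metricsFor("quorum").totalTransactions();
```

If a summed view is ever wanted, it can still be computed at query time in the monitoring system, whereas a sum recorded at the source cannot be split back into per-plugin values.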
