Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus metrics service #522

Merged
merged 11 commits into from
Jan 12, 2024
Merged

Prometheus metrics service #522

merged 11 commits into from
Jan 12, 2024

Conversation

bacv
Copy link
Member

@bacv bacv commented Nov 9, 2023

Implementation of nomos metrics service using prometheus_client. Before adding more features and metrics the concept validation is needed.

Similar to the tracing crate, nomos metrics service does not own the "main" registry that is used by all other service to put their metrics in. The main registry is created before initializing the node and passed to services to register them selves.

Nomos metrics service is responsible for accessing the registry whenever remote prometheus server queries new metrics (via nomos-api).

Every service that wants to publish metrics need to register them selves during the initialization. After this point individual services are responsible how they collects the metrics. The easiest approach is to track service messages by type, but more advanced metrics can by added.

If this approach looks viable, then current MempoolMsg::Metrics message could be renamed and moved under "{cl, da}/stats".

This implementation is currently located in root directory/nomos-metrics but should eventually replace nomos-services/metrics. Old implementation is left for readablity and would be removed.

Example respose when querying localhost:8080/metrics:

curl localhost:8080/metrics
# HELP mempool-cl_messages Messages emitted by the Mempool.
# TYPE mempool-cl_messages counter
# HELP mempool-da_messages Messages emitted by the Mempool.
# TYPE mempool-da_messages counter
# EOF

@bacv bacv added the metrics Changes related to the metrics label Nov 9, 2023
@bacv bacv added this to the Nomos testnet (playground) milestone Nov 9, 2023
@bacv bacv self-assigned this Nov 9, 2023
Copy link
Contributor

@youngjoon-lee youngjoon-lee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 The overall concept looks great and will fit well with the libp2p metrics that I had tried before: #431 (I'll work on it again after this PR is merged).

@danielSanchezQ danielSanchezQ marked this pull request as draft November 10, 2023 08:36
Comment on lines 206 to 208
if let Some(metrics) = &self.metrics {
metrics.record(&msg);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all metrics stuff on services should be feature gated as well 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can feature gate it, everything could be contained inside the metrics.rs in each service, but I don't thing that it's necessary - just like in our previous approach, we didn't feature gate the individual messages, here we don't record anything and disable the service responsible for gathering what's collected.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I understand it. I just have an itch on having a if check for something it may never be true running on production. Let's see if someone else back up or we just leave it like this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I get it. I was betting on cpu branch prediction, but rust could just eliminate the problem, will add a feature gate.

Copy link
Collaborator

@danielSanchezQ danielSanchezQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a really good approach. Thanks for the effort!

@bacv bacv removed the DO NOT MERGE label Nov 10, 2023
@@ -317,4 +334,5 @@ where
pub struct Settings<B, N> {
pub backend: B,
pub network: N,
pub registry: Option<NomosRegistry>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing gate?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One approach would be to feature gate everything completely, another one would only target the functionality.
In this case I was targeting only unnecessary functionality, reasoning being that Settings can have everything in them, but the service would act on those only if the feature is enabled.

If we want complete feature gate, eliminate dependencies etc then bit more changes are needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, ok. I get the approach now. Uhm, I do not see the harm on having it in the settings, although it feels a bit of a leftover. We can go with this for now. We can harden it later if needed (probably good to have an issue for tracking this).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created here: nomos-node#523

@bacv bacv marked this pull request as ready for review November 13, 2023 10:13
@bacv bacv merged commit 46d5347 into master Jan 12, 2024
7 checks passed
@bacv bacv deleted the metrics-service branch January 12, 2024 14:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
metrics Changes related to the metrics
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants