feat: expose metrics for prometheus to scrape #67
Few quick questions
What is the approach you guys use to collect the metrics exposed by the different NFs (currently |
Our approach
In the namespace in which we have the control plane, we also deploy Grafana Agent, which scrapes all of the network functions that expose metrics. We then integrate Grafana Agent with our Observability stack (Grafana, Prometheus, Loki), which runs in a separate namespace. This allows us to centralise observability (logs, metrics, alert rules, and dashboards). Here's a crude visualisation:
Why we don't use metricsfunc
1. It prevents metrics being tied to their originators
It's important that metrics are tied to their originating network function, especially for system metrics (ex.
2. We don't benefit from it
Metricsfunc is an additional workload and we would not benefit from maintaining it. For us it'd be added effort for no benefit, as the same metrics are already exposed by the individual network functions.
I am fine with this. Please fix the conflict.
@gruyaume, please rebase this PR.
There will be another PR at some point to add the |
+1
Description
Here we expose a metrics endpoint for Prometheus to scrape the network function. Right now, we are only exposing the default Go metrics, which let users know whether the service is running and provide other valuable information (e.g. memory usage, number of goroutines).
Screenshot
For example, we can now have a dashboard that tells us the status of the network function:
Implementation
We take the same approach to metrics as the AMF: we create a metrics/telemetry.go file and instantiate the server during service startup.
Notes
If approved, we will make similar PRs in every network function.
Future Considerations
With this in place, it will be straightforward to add bespoke metrics to the network function.
Reference