[Feature Request] Support Rancher Server Internal Metrics #20341
Comments
By the way, the last part of the work needed to enable Rancher internal metrics has not been submitted as a PR yet.
Is it worth working on this?
@ukinau Yes, I am actively reviewing this PR. |
Make sure @ibuildthecloud reviews the associated PRs before merging |
I'm going to make my best attempt at getting this merged before v2.3 ships. We are getting close to our code freeze date, and it's kind of on the bubble, to be honest.
@daxmc99 will do a detailed review of the Rancher PR. To meet our timeline, it would be great if you could address his comments by the end of next week.
Unfortunately, this isn't going to make it into v2.3, but we are planning to invest a lot more effort into scaling post v2.3, and I want to take this up then, as such detailed monitoring will be critical for that effort.
PR is here #23181 |
We should merge the PR containing all of the metrics, and add documentation on how to enable them. We should set up a scrape config so that if monitoring is deployed into a 'local' cluster, Rancher can be scraped by that instance of Prometheus. A Grafana dashboard should be present in the Rancher-managed Grafana instance that shows:
This view should be documented so that users know what information they are looking at, and how it might be useful for understanding the behavior of Rancher.
@dramich please make sure there is a docs issue opened for this
Metrics have been added, opened new issue around work to be done for a dashboard #24393 |
I've covered this in today's Rancher version
Added setup and test steps details to the Internal Metrics test plan under the 2.4 test plans section. |
Metrics are present in 2.4 and QA'd with the coverage described above. Dashboard will be covered in a separate issue: #24393 |
What kind of request is this (question/bug/enhancement/feature request):
feature request
Idea
I want to have a /metrics endpoint in the Rancher server that exposes Rancher's internal state, to make it easier to operate Rancher across many clusters and nodes.
When a user operates Rancher for multiple Kubernetes clusters with many nodes (for example, more than 50 clusters and 1000 nodes), it is difficult for the operator to grasp Rancher's internal situation: whether all agents have established a websocket session, how frequently websocket sessions are disconnected, which Rancher server owns each cluster for the user controllers, and so on. To help operators monitor Rancher's detailed behavior, I hope we can have a metrics endpoint.
Feature: Endpoint
Add "https://<rancher-server>/metrics". This endpoint should return metrics information in prometheus format.
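To illustrate what such an endpoint could return, here is a minimal sketch that serves one counter in the Prometheus text exposition format using only the Go standard library. The metric name `example_session_add_total`, the variable `sessionAddTotal`, and the helper `scrape` are hypothetical and chosen for illustration; they are not taken from the Rancher or Norman PRs, which may well use the Prometheus client library instead.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// sessionAddTotal is a hypothetical counter; the name is illustrative only.
var sessionAddTotal atomic.Uint64

// metricsHandler writes the counter in the Prometheus text exposition format.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
	fmt.Fprintln(w, "# HELP example_session_add_total Total count of added websocket sessions.")
	fmt.Fprintln(w, "# TYPE example_session_add_total counter")
	fmt.Fprintf(w, "example_session_add_total %d\n", sessionAddTotal.Load())
}

// scrape issues an in-memory GET /metrics and returns the response body.
func scrape() string {
	mux := http.NewServeMux()
	mux.HandleFunc("/metrics", metricsHandler)
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	sessionAddTotal.Add(3)
	fmt.Print(scrape())
}
```

A Prometheus server scraping this path would see a `counter` metric with its `# HELP` and `# TYPE` metadata lines followed by one sample line.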
Feature: Support Metrics
The metric types I want to support are as follows:
Generic Controller Related in Norman (Already in Norman rancher/norman#202 )
Session Manager(remotedialer) Related in Norman (PR has been submitted rancher/norman#285)
=> Total count of websocket sessions added
=> Total count of websocket sessions removed
=> Total count of connections added
=> Total count of connections removed
=> Total bytes transmitted
=> Total bytes of transmit errors
=> Total bytes received
=> Total count of attempts to establish a websocket session to another rancher-server
=> Total count of websocket sessions connected to another rancher-server
=> Total count of websocket sessions disconnected from another rancher-server
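The "total count" items above are monotonically increasing counters, so a value such as the number of currently active sessions can be derived as added minus removed. The sketch below shows one way to keep such counters per client; the type `sessionCounters`, its methods, the metric names, and the `clientkey` label are all hypothetical and are not the remotedialer API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// sessionCounters is a hypothetical type holding per-client totals.
type sessionCounters struct {
	mu      sync.Mutex
	added   map[string]uint64 // sessions added, keyed by client key
	removed map[string]uint64 // sessions removed, keyed by client key
}

func newSessionCounters() *sessionCounters {
	return &sessionCounters{added: map[string]uint64{}, removed: map[string]uint64{}}
}

// OnAdd records that a websocket session was added for clientKey.
func (c *sessionCounters) OnAdd(clientKey string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.added[clientKey]++
}

// OnRemove records that a websocket session was removed for clientKey.
func (c *sessionCounters) OnRemove(clientKey string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.removed[clientKey]++
}

// Active derives the current session count as added minus removed.
func (c *sessionCounters) Active(clientKey string) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	a, r := c.added[clientKey], c.removed[clientKey]
	if r > a {
		return 0
	}
	return a - r
}

// Render emits one sample line per client, grouped by metric name
// and sorted by client key so the output is deterministic.
func (c *sessionCounters) Render() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	keys := make([]string, 0, len(c.added))
	for k := range c.added {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "example_session_add_total{clientkey=%q} %d\n", k, c.added[k])
	}
	for _, k := range keys {
		fmt.Fprintf(&b, "example_session_remove_total{clientkey=%q} %d\n", k, c.removed[k])
	}
	return b.String()
}

func main() {
	c := newSessionCounters()
	c.OnAdd("cluster-a")
	c.OnAdd("cluster-a")
	c.OnRemove("cluster-a")
	c.OnAdd("cluster-b")
	fmt.Print(c.Render())
	fmt.Println("active cluster-a:", c.Active("cluster-a"))
}
```

Exposing both totals rather than a single gauge also lets an operator see how often sessions drop, since a frequently reconnecting agent shows both counters climbing while the difference stays flat.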
ClusterOwner in Rancher
Feature: Provide a control (new setting) to enable or disable the metrics endpoint
Enabling the metrics endpoint causes some performance overhead and memory consumption. That's why it is better to give users the choice to enable or disable it.