-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug 2089716: Downstream fix for OVN-Kube node cardinality #1135
Bug 2089716: Downstream fix for OVN-Kube node cardinality #1135
Conversation
Remove time.Tick usages since "without a way to shut it down the underlying Ticker cannot be recovered by the garbage collector; it "leaks"." [https://pkg.go.dev/time#Tick] Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com> (cherry picked from commit d04852a)
1. change time.After() usage in loops to once-created NewTicker. time.After() will create a new timer instance on every iteration and is not efficient: "The underlying Timer is not recovered by the garbage collector until the timer fires. If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed." [https://pkg.go.dev/time#After] 2. make sure to Stop() all tickers to release associated resources. [https://pkg.go.dev/time#NewTicker] Signed-off-by: Nadia Pinaeva <npinaeva@redhat.com> (cherry picked from commit 0c4d3d9)
This metric produces a time series for each interface and this becomes problematic when the number of nodes in a cluster scales. For example, with a 50 node cluster and 200 pods per node, we are looking at 10000 time series. In order to protect the prometheus database, I want to remove this. Signed-off-by: Martin Kennelly <mkennell@redhat.com> (cherry picked from commit 145128e)
The cardinality of OVS interface metrics is undefined and will cause a huge increase in ovnkube-node memory and prom db size. This PR reduces this cardinality. It also removes interface metrics that have no obvious value but leaves the statistics and link_resets. Renames and reduces cardinality of: * ovs_vswitchd_interface_resets_total * ovs_vswitchd_interface_rx_dropped_total * ovs_vswitchd_interface_tx_dropped_total * ovs_vswitchd_interface_rx_errors_total * ovs_vswitchd_interface_tx_errors_total * ovs_vswitchd_interface_collisions_total Removes: * ovs_vswitchd_interface_driver_name * ovs_vswitchd_interface_driver_version * ovs_vswitchd_interface_firmware_version * ovs_vswitchd_interface_rx_packets * ovs_vswitchd_interface_tx_packets * ovs_vswitchd_interface_rx_bytes * ovs_vswitchd_interface_tx_bytes * ovs_vswitchd_interface_rx_frame_err * ovs_vswitchd_interface_rx_over_err * ovs_vswitchd_interface_rx_crc_err * ovs_vswitchd_interface_name * ovs_vswitchd_interface_duplex * ovs_vswitchd_interface_type * ovs_vswitchd_interface_admin_state * ovs_vswitchd_interface_link_state * ovs_vswitchd_interface_ifindex * ovs_vswitchd_interface_link_speed * ovs_vswitchd_interface_mtu * ovs_vswitchd_interface_ofport * ovs_vswitchd_interface_ingress_policing_burst * ovs_vswitchd_interface_ingress_policing_rate Signed-off-by: Martin Kennelly <mkennell@redhat.com> (cherry picked from commit cc90e34)
@martinkennelly: This pull request references Bugzilla bug 2089716, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: martinkennelly, trozet The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@martinkennelly: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@martinkennelly: All pull requests linked via external trackers have merged: Bugzilla bug 2089716 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Needed to
cherry-pick -x
two further commits (cc7aa38, f92dcbe) that are needed because the fix was developed & tested on said commits.Commits were cleanly cherry-picked. No changes versus upstream commits was needed.