Add dashboard for monitoring conntrack race failures. #6329

Merged

ScheererJ
Contributor

How to categorize this PR?

/area networking
/area monitoring
/kind enhancement

What this PR does / why we need it:
Add dashboard for monitoring conntrack race failures.

Connection tracking is an important optimization in the Linux kernel that reduces the overhead
of applying network filter rules to every network packet.
Unfortunately, the connection tracking code in the Linux kernel has some race conditions,
which can cause connection attempts to lose their connection tracking entry. This is
especially likely when network address translation is used. The result is a failed
connection attempt.
This change therefore introduces two dashboards for the corresponding metric.
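
To illustrate the kind of counter such a dashboard would plot, here is a minimal sketch. It is not taken from this PR, which does not name the metric. The sketch assumes the metric is derived from the kernel's per-CPU `insert_failed` column in `/proc/net/stat/nf_conntrack`, where values are hexadecimal and each row after the header belongs to one CPU. The function names and the shortened sample header are hypothetical.

```python
from typing import Dict


def sum_conntrack_stats(text: str) -> Dict[str, int]:
    """Sum per-CPU counters from /proc/net/stat/nf_conntrack-style text.

    Assumption: the first line is a header of column names and every
    following line holds one CPU's counters as hexadecimal numbers.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if not rows:
        return {}
    header, cpu_rows = rows[0], rows[1:]
    totals = {name: 0 for name in header}
    for row in cpu_rows:
        for name, value in zip(header, row):
            totals[name] += int(value, 16)  # counters are printed in hex
    return totals


def insert_failed_delta(before: str, after: str) -> int:
    """Number of failed conntrack insertions between two snapshots."""
    old = sum_conntrack_stats(before).get("insert_failed", 0)
    new = sum_conntrack_stats(after).get("insert_failed", 0)
    return new - old


# Hypothetical two-CPU snapshots with a shortened header (illustration only).
SAMPLE_BEFORE = (
    "entries insert insert_failed drop\n"
    "0000000a 00000010 00000001 00000000\n"
    "0000000a 00000020 00000002 00000000\n"
)
SAMPLE_AFTER = (
    "entries insert insert_failed drop\n"
    "0000000a 00000030 0000000a 00000000\n"
    "0000000a 00000040 00000003 00000000\n"
)
```

A steadily growing delta between snapshots would be consistent with the race described above, though insertions can fail for other reasons as well.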

Which issue(s) this PR fixes:
None.

Special notes for your reviewer:

Release note:

Added dashboards for monitoring conntrack insertion failures, which are most likely caused by conntrack races.

/cc @wyb1 @DockToFuture

@gardener-prow gardener-prow bot added area/networking Networking related area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension cla: yes Indicates the PR's author has signed the cla-assistant.io CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jul 14, 2022
@ScheererJ
Contributor Author

Node Details: screenshot (bottom graph)

Node Overview: screenshot

@wyb1 wyb1 left a comment

/lgtm

@gardener-prow gardener-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jul 14, 2022
@rfranzke rfranzke left a comment

/approve

@gardener-prow
gardener-prow bot commented Jul 14, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: rfranzke, wyb1


@gardener-prow gardener-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 14, 2022
@gardener-prow gardener-prow bot merged commit d31e21c into gardener:master Jul 14, 2022