K8s ‐ Verify network diagnostics disabled

Elyasaf Halle edited this page May 7, 2026 · 3 revisions

Description

This rule verifies that no pods exist in the openshift-network-diagnostics namespace on edge/SNO (Single Node OpenShift) clusters where network diagnostics has been disabled. It checks two conditions:

  1. The network operator's disableNetworkDiagnostics field is explicitly set to true
  2. No pods remain running in the openshift-network-diagnostics namespace

On edge clusters, network diagnostics should be disabled to reduce resource usage and minimize unnecessary workloads on resource-constrained nodes.

Prerequisites

  • Access to the OpenShift cluster with permissions to read the network operator configuration and list pods
  • The oc command-line tool configured and authenticated
  • The network operator must have disableNetworkDiagnostics explicitly set to true

If the network operator does not have disableNetworkDiagnostics: true (the field is unset or set to false), the rule is marked as Not Applicable, because network diagnostics is then treated as intentionally enabled.
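The decision logic described above can be sketched as a small shell function. This is a minimal illustration, not the rule's actual implementation: evaluate_rule is a hypothetical helper, and the PASS and FAIL labels are assumed names (only Not Applicable is named on this page).

```shell
#!/usr/bin/env bash
# Sketch of the rule's decision logic (hypothetical helper, assumed status names).
#   $1: value of .spec.disableNetworkDiagnostics (empty when the field is unset)
#   $2: number of pods found in the openshift-network-diagnostics namespace
evaluate_rule() {
  local setting="$1" pod_count="$2"
  if [ "$setting" != "true" ]; then
    # Diagnostics is not explicitly disabled, so the rule does not apply.
    echo "NOT_APPLICABLE"
  elif [ "$pod_count" -eq 0 ]; then
    echo "PASS"
  else
    echo "FAIL"
  fi
}

# Example wiring against a live cluster (needs an authenticated oc session):
# setting=$(oc get network.operator.openshift.io cluster \
#   -o jsonpath='{.spec.disableNetworkDiagnostics}')
# pods=$(oc get pods -n openshift-network-diagnostics --no-headers 2>/dev/null | wc -l)
# evaluate_rule "$setting" "$pods"
```

Keeping the decision separate from the oc calls makes it possible to check the logic without cluster access.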

Impact

If network diagnostics is supposed to be disabled but pods remain in the openshift-network-diagnostics namespace:

  • Resource waste: Network diagnostics pods (network-check-source, network-check-target) consume CPU and memory on resource-constrained edge/SNO nodes
  • Unnecessary network traffic: Diagnostic pods continuously generate network check traffic between nodes
  • Configuration drift: Indicates the network diagnostics disablement did not take effect properly

Root Cause

Common scenarios that may lead to leftover network diagnostics pods:

  • disableNetworkDiagnostics was set to true, but the cluster-network-operator failed to clean up the existing pods
  • A cluster upgrade or rollback reverted the network diagnostics state without cleaning up pods
  • The cluster-network-operator pod was restarted and the setting was overridden during reconciliation (known issue in OCP < 4.12)
  • Network or API server issues prevented the operator from completing the teardown

Diagnostics

1. Check network operator disableNetworkDiagnostics setting

oc get network.operator.openshift.io cluster -o jsonpath='{.spec.disableNetworkDiagnostics}'

Expected output for disabled diagnostics: true. Empty output means the field is not set.

2. Check for pods in the openshift-network-diagnostics namespace

oc get pods -n openshift-network-diagnostics

Expected output when diagnostics is properly disabled: No resources found in openshift-network-diagnostics namespace.
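For scripted checks, a pod count is easier to compare than the human-readable message. The following is a minimal sketch; count_diag_pods is a hypothetical helper name. It relies on --no-headers so that every non-empty output line corresponds to one pod, and it discards stderr, where oc prints the "No resources found" message.

```shell
#!/usr/bin/env bash
# Print the number of pods in openshift-network-diagnostics (hypothetical helper).
# Caveat: an oc failure (for example an expired login) also yields 0 here,
# so confirm that oc itself works before trusting the result.
count_diag_pods() {
  oc get pods -n openshift-network-diagnostics --no-headers 2>/dev/null \
    | awk 'NF { n++ } END { print n + 0 }'
}

# Usage:
# if [ "$(count_diag_pods)" -eq 0 ]; then echo "namespace is empty"; fi
```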

3. Check cluster-network-operator status

oc get pods -n openshift-network-operator
oc logs -n openshift-network-operator deployment/network-operator --tail=50

Look for:

  • Errors related to network diagnostics reconciliation
  • Events indicating failure to remove diagnostics resources
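To narrow the operator logs down to lines worth reading, a simple two-stage filter can help. This is a heuristic sketch only: filter_diag_log_lines is a hypothetical helper, and the keywords are assumptions about what relevant lines might contain, not a documented log format.

```shell
#!/usr/bin/env bash
# Read log lines on stdin and keep those that mention network diagnostics
# AND contain "error" or "fail" (case-insensitive). Keywords are assumptions.
filter_diag_log_lines() {
  grep -iE 'network-diagnostics|network-check|disableNetworkDiagnostics' \
    | grep -iE 'error|fail' || true   # no match is not treated as a failure
}

# Usage:
# oc logs -n openshift-network-operator deployment/network-operator --tail=500 \
#   | filter_diag_log_lines
```

An empty result does not prove the operator is healthy; it only means none of the assumed keywords matched.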

4. Check for other resources in the namespace

oc get all -n openshift-network-diagnostics
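Note that oc get all only covers resource types in the "all" category (pods, services, deployments, daemonsets, and similar). A sketch for listing some types it skips is shown below. list_extra_diag_resources is a hypothetical helper, and the chosen resource types are an assumption about what may be left behind; if the podnetworkconnectivitycheck resource type is not available on the cluster, the command reports an error and that type should be dropped from the list.

```shell
#!/usr/bin/env bash
# List resource types that "oc get all" does not include (hypothetical helper;
# the set of types is an assumption, adjust it to the cluster).
list_extra_diag_resources() {
  oc get configmap,serviceaccount,role,rolebinding,podnetworkconnectivitycheck \
    -n openshift-network-diagnostics
}

# Usage:
# list_extra_diag_resources
```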

Solution

If network diagnostics should be disabled and pods remain:

Verify the setting is correctly applied:

oc get network.operator.openshift.io cluster -o jsonpath='{.spec.disableNetworkDiagnostics}'

If the setting is not applied, disable network diagnostics:

oc patch network.operator.openshift.io cluster --type merge \
  -p '{"spec":{"disableNetworkDiagnostics":true}}'

If pods persist after disabling, force removal:

# Delete remaining pods
oc delete pods --all -n openshift-network-diagnostics

# Watch to verify pods are gone and don't reappear (press Ctrl+C to stop)
oc get pods -n openshift-network-diagnostics -w
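Because -w watches indefinitely, an automated check may prefer a bounded poll. The sketch below is one way to do that; wait_for_no_pods is a hypothetical helper, and the default timeout and interval are arbitrary values, not recommendations from this page.

```shell
#!/usr/bin/env bash
# Poll until the namespace has no pods, or give up after a timeout.
#   $1: namespace   $2: timeout in seconds (default 120)   $3: poll interval (default 5)
# Returns 0 when empty, 1 on timeout, 2 if the oc call itself fails.
wait_for_no_pods() {
  local ns="$1" timeout="${2:-120}" interval="${3:-5}" elapsed=0 out
  while [ "$elapsed" -le "$timeout" ]; do
    if ! out=$(oc get pods -n "$ns" --no-headers 2>/dev/null); then
      echo "oc get pods failed; check login and permissions" >&2
      return 2
    fi
    if [ -z "$out" ]; then
      echo "no pods in $ns after ${elapsed}s"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "pods still present in $ns after ${timeout}s" >&2
  return 1
}

# Usage:
# wait_for_no_pods openshift-network-diagnostics 180 10
```

A single empty result does not rule out pods being recreated later, so repeating the check after a few minutes is still worthwhile.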

If pods keep reappearing, check the network operator:

# Restart the cluster-network-operator to force reconciliation
oc delete pods -n openshift-network-operator -l name=network-operator

# Monitor the operator logs
oc logs -n openshift-network-operator deployment/network-operator -f

If network diagnostics should actually be enabled:

# Re-enable network diagnostics
oc patch network.operator.openshift.io cluster --type merge \
  -p '{"spec":{"disableNetworkDiagnostics":false}}'

# Verify diagnostics pods come up
oc get pods -n openshift-network-diagnostics -w

Verify the fix:

# Confirm no pods in openshift-network-diagnostics namespace
oc get pods -n openshift-network-diagnostics

# Confirm operator setting
oc get network.operator.openshift.io cluster -o jsonpath='{.spec.disableNetworkDiagnostics}'

Resources
