Network ‐ Verify all NodeNetworkConfigurationPolicies are Available

kchennag edited this page Jun 9, 2026 · 12 revisions

This rule validates the health of every NodeNetworkConfigurationPolicy (NNCP) resource managed by the NMState operator. An NNCP is considered healthy when all of the following hold:

  • Available condition must be True (with reason and message)
  • Degraded condition must NOT be True (with reason and message)
  • Progressing condition must NOT be True (with reason and message)
  • Upgradeable condition must NOT be False (with reason and message)
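
The four checks above can be approximated from the command line. The following is a minimal sketch, not the rule's actual implementation: the function names `nncp_conditions` and `check_nncp_conditions` are hypothetical, and treating a missing Available condition as a failure is an assumption.

```shell
# Hypothetical helper: print an NNCP's conditions as "Type=Status" lines.
nncp_conditions() {
  oc get nncp "$1" \
    -o jsonpath='{range .status.conditions[*]}{.type}={.status}{"\n"}{end}'
}

# Hypothetical helper: read "Type=Status" lines on stdin, print each violated
# rule, and return non-zero if any rule is violated.
check_nncp_conditions() {
  awk -F= '
    $1 == "Available"   { seen = 1; if ($2 != "True") { print "FAIL: Available=" $2; bad = 1 } }
    $1 == "Degraded"    && $2 == "True"  { print "FAIL: Degraded=True";     bad = 1 }
    $1 == "Progressing" && $2 == "True"  { print "FAIL: Progressing=True";  bad = 1 }
    $1 == "Upgradeable" && $2 == "False" { print "FAIL: Upgradeable=False"; bad = 1 }
    END {
      # Assumption: a missing Available condition counts as a failure.
      if (!seen) { print "FAIL: Available condition missing"; bad = 1 }
      exit bad + 0
    }
  '
}
```

Usage: `nncp_conditions <nncp-name> | check_nncp_conditions`. The parsing is kept separate from the `oc` call so it can be exercised against saved output as well as a live cluster.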

Run this check after node reboots or network configuration changes to confirm that network policies have been applied correctly across the cluster.

Severity: Fail (critical)

Prerequisites

  • NMState operator must be installed
  • At least one NodeNetworkConfigurationPolicy resource must exist
  • Cluster-level permission to list and read NNCP resources
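
The prerequisites above could be verified before running the rule. This is a rough sketch under assumptions: the function name `check_nncp_prereqs` is hypothetical, and the presence of the `nodenetworkconfigurationpolicies.nmstate.io` CRD is used as a stand-in for "NMState operator is installed".

```shell
# Hypothetical helper: return 0 only if all three prerequisites appear to hold.
check_nncp_prereqs() {
  # Assumption: the NNCP CRD exists only when NMState is installed.
  oc get crd nodenetworkconfigurationpolicies.nmstate.io >/dev/null 2>&1 \
    || { echo "NNCP CRD not found (is the NMState operator installed?)" >&2; return 1; }

  # "oc auth can-i" exits non-zero when the answer is "no".
  oc auth can-i list nodenetworkconfigurationpolicies.nmstate.io >/dev/null 2>&1 \
    || { echo "current user cannot list NNCP resources" >&2; return 1; }

  # At least one NNCP must exist for the rule to have something to check.
  [ -n "$(oc get nncp -o name 2>/dev/null)" ] \
    || { echo "no NNCP resources exist" >&2; return 1; }

  echo "prerequisites met"
}
```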

Impact

Unhealthy NNCPs can cause:

  • Network connectivity loss: Nodes unable to communicate if network configuration failed
  • Pod networking failures: Pods unable to reach cluster services or external networks
  • SR-IOV/bond failures: Advanced networking features not functioning correctly
  • Upgrade blocking: Cluster upgrades blocked if NNCPs are not upgradeable
  • Degraded cluster state: Partial or complete network outages affecting workloads

Root Cause

Common reasons for NNCP health issues:

  • Configuration errors: Invalid network interface configurations in NNCP spec
  • Hardware mismatch: Specified interfaces don't exist on target nodes
  • Conflicting policies: Multiple NNCPs targeting the same interface
  • NMState operator issues: Operator degraded or not running
  • Node reboot in progress: Network reconfiguration still being applied
  • Resource constraints: Insufficient memory or CPU affecting NMState daemon

Diagnostics

List all NodeNetworkConfigurationPolicies (NNCPs are cluster-scoped, so no namespace flag is needed):

oc get nodenetworkconfigurationpolicies

Check detailed NNCP status:

oc get nncp <nncp-name> -o yaml

View NNCP conditions with reason and message:

oc get nncp <nncp-name> -o json | jq '.status.conditions'

Check NMState operator status:

oc get pods -n openshift-nmstate
oc logs -n openshift-nmstate deployment/nmstate-operator

Check node network state:

oc get nodenetworkstate
oc get nns <node-name> -o yaml
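
When escalating an issue, it can help to capture the output of the diagnostic commands above in one place. The following sketch does that; the function name `collect_nncp_diagnostics` and the output file names are hypothetical choices, not part of any tool.

```shell
# Hypothetical helper: save the diagnostic outputs for one NNCP and one node.
# Usage: collect_nncp_diagnostics <nncp-name> <node-name> <output-dir>
collect_nncp_diagnostics() {
  nncp=$1
  node=$2
  outdir=$3
  mkdir -p "$outdir" || return 1

  # Each command writes stdout and stderr to its own file, so a failing
  # command leaves its error message behind instead of stopping the run.
  oc get nncp                      > "$outdir/nncp-list.txt"    2>&1
  oc get nncp "$nncp" -o yaml      > "$outdir/nncp.yaml"        2>&1
  oc get pods -n openshift-nmstate > "$outdir/nmstate-pods.txt" 2>&1
  oc logs -n openshift-nmstate deployment/nmstate-operator \
                                   > "$outdir/operator.log"     2>&1
  oc get nns "$node" -o yaml       > "$outdir/nns.yaml"         2>&1

  echo "diagnostics written to $outdir"
}
```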

Solution

Review NNCP condition details:

oc describe nncp <nncp-name>

Look at the reason and message fields for each condition to understand the root cause.

Fix configuration errors: Update the NNCP spec with correct interface names and settings:

oc edit nncp <nncp-name>

Verify NMState operator health:

oc get csv -n openshift-nmstate
oc get pods -n openshift-nmstate

Check node network interfaces:

oc debug node/<node-name> -- ip link show

Delete and recreate the problematic NNCP if the configuration is correct but the status remains unhealthy:

oc delete nncp <nncp-name>
oc apply -f <nncp-definition.yaml>

For Progressing state: Wait for the configuration to complete (this may take several minutes during a node reboot).

For Upgradeable=False: Check the reason field and resolve the blocking issues before attempting cluster upgrades.
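
The delete-and-recreate step and the wait for a Progressing policy can be combined. The sketch below is one possible way to do it, not a prescribed procedure: the function name `recreate_nncp`, the backup file name, and the 10-minute default timeout are all assumptions. It saves the live object first so that the previous definition is not lost.

```shell
# Hypothetical helper: back up, delete, re-apply, then wait for Available.
# Usage: recreate_nncp <nncp-name> <manifest-file> [timeout]
recreate_nncp() {
  name=$1
  manifest=$2
  timeout=${3:-10m}   # assumed default; adjust to the environment

  # Keep a copy of the live object before deleting it.
  oc get nncp "$name" -o yaml > "$name.backup.yaml" || return 1

  oc delete nncp "$name" || return 1
  oc apply -f "$manifest" || return 1

  # Block until the policy reports Available=True or the timeout expires.
  oc wait nncp "$name" --for=condition=Available --timeout="$timeout"
}
```

If only the wait is needed, for example after a node reboot, the final `oc wait` line can be run on its own against the existing policy.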

Resources