Network ‐ Verify all NodeNetworkConfigurationPolicies are Available

This rule validates the health of every NodeNetworkConfigurationPolicy (NNCP) resource managed by the NMState operator. An NNCP is considered healthy when all of the following hold:

Available condition must be True (with reason and message)
Degraded condition must NOT be True (with reason and message)
Progressing condition must NOT be True (with reason and message)
Upgradeable condition must NOT be False (with reason and message)
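
The four rules above can be sketched as a small shell helper. This is an illustrative sketch, not the rule's actual implementation: the function names `nncp_condition_ok` and `nncp_report_unhealthy` are hypothetical, and the helper only evaluates conditions that are present, so an NNCP that reports no Available condition at all would need a separate check.

```shell
# Return 0 if a condition type/status pair is healthy under the rules above,
# 1 otherwise. Unknown condition types are treated as healthy (an assumption).
nncp_condition_ok() {
  case "$1=$2" in
    Available=True) return 0 ;;                 # must be True
    Available=*) return 1 ;;
    Degraded=True|Progressing=True) return 1 ;; # must NOT be True
    Upgradeable=False) return 1 ;;              # must NOT be False
    *) return 0 ;;
  esac
}

# Read "name type status" lines on stdin, print the unhealthy ones,
# and return 1 if any were found.
nncp_report_unhealthy() {
  rc=0
  while read -r name ctype cstatus; do
    [ -n "$name" ] || continue
    if ! nncp_condition_ok "$ctype" "$cstatus"; then
      echo "UNHEALTHY $name $ctype=$cstatus"
      rc=1
    fi
  done
  return "$rc"
}

# Example (needs cluster access and jq):
#   oc get nncp -o json \
#     | jq -r '.items[] | .metadata.name as $n | .status.conditions[]? | "\($n) \(.type) \(.status)"' \
#     | nncp_report_unhealthy
```

Keeping the per-condition rule in its own function makes the pass/fail logic easy to review separately from the cluster query.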

This is a critical health check after node reboots or network configuration changes to ensure network policies have been properly applied across the cluster.

Severity: Fail (critical)

Prerequisites

NMState operator must be installed
At least one NodeNetworkConfigurationPolicy resource must exist
Cluster-level access to query NNCP resources
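
The three prerequisites can be checked before running the rule. The following is a minimal sketch, assuming the NMState operator installs the CRD under the `nmstate.io` API group; the function name `nncp_prereqs_ok` and its messages are hypothetical.

```shell
# Check the three prerequisites in order and stop at the first one that fails.
nncp_prereqs_ok() {
  # 1. The NNCP CRD exists, which implies the NMState operator is installed.
  oc get crd nodenetworkconfigurationpolicies.nmstate.io >/dev/null 2>&1 || {
    echo "NNCP CRD not found: is the NMState operator installed?"
    return 1
  }
  # 2. The current user may list NNCP resources.
  [ "$(oc auth can-i list nodenetworkconfigurationpolicies.nmstate.io 2>/dev/null)" = "yes" ] || {
    echo "current user cannot list NNCP resources"
    return 1
  }
  # 3. At least one NNCP exists.
  [ -n "$(oc get nncp -o name 2>/dev/null)" ] || {
    echo "no NodeNetworkConfigurationPolicy resources found"
    return 1
  }
  echo "prerequisites met"
}

# Example: nncp_prereqs_ok && echo "ready to run the health check"
```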

Impact

Unhealthy NNCPs can cause:

Network connectivity loss: Nodes unable to communicate if network configuration failed
Pod networking failures: Pods unable to reach cluster services or external networks
SR-IOV/bond failures: Advanced networking features not functioning correctly
Upgrade blocking: Cluster upgrades blocked if NNCPs are not upgradeable
Degraded cluster state: Partial or complete network outages affecting workloads

Root Cause

Common reasons for NNCP health issues:

Configuration errors: Invalid network interface configurations in NNCP spec
Hardware mismatch: Specified interfaces don't exist on target nodes
Conflicting policies: Multiple NNCPs targeting the same interface
NMState operator issues: Operator degraded or not running
Node reboot in progress: Network reconfiguration still being applied
Resource constraints: Insufficient memory or CPU affecting NMState daemon

Diagnostics

List all NodeNetworkConfigurationPolicies:

oc get nodenetworkconfigurationpolicies -A

Check detailed NNCP status:

oc get nncp <nncp-name> -o yaml

View NNCP conditions with reason and message:

oc get nncp <nncp-name> -o json | jq '.status.conditions'

Check NMState operator status:

oc get pods -n openshift-nmstate
oc logs -n openshift-nmstate deployment/nmstate-operator

Check node network state:

oc get nodenetworkstate -A
oc get nns <node-name> -o yaml

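When a policy is unhealthy, it often helps to see which node failed to apply it. kubernetes-nmstate records this per node and policy in NodeNetworkConfigurationEnactment (NNCE) resources, which the commands above do not cover. The sketch below assumes that the second column of `oc get nnce` is the enactment status, which may vary between versions; the function name `nnce_not_available` is hypothetical.

```shell
# Print the name and status of every enactment whose status is not "Available".
# Assumption: column 1 is the enactment name and column 2 is its status.
nnce_not_available() {
  oc get nnce --no-headers | awk '$2 != "Available" { print $1, $2 }'
}

# Enactments are named <node-name>.<nncp-name>; to inspect one in detail:
#   oc get nnce <node-name>.<nncp-name> -o yaml
```
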
Solution

Review NNCP condition details:

oc describe nncp <nncp-name>

Look at the reason and message fields for each condition to understand the root cause.

Fix configuration errors: Update the NNCP spec with correct interface names and settings:

oc edit nncp <nncp-name>

Verify NMState operator health:

oc get csv -n openshift-nmstate
oc get pods -n openshift-nmstate

Check node network interfaces:

oc debug node/<node-name> -- chroot /host ip link show

Delete and recreate a problematic NNCP if its configuration is correct but its status remains unhealthy:

oc delete nncp <nncp-name>
oc apply -f <nncp-definition.yaml>

Deleting an NNCP does not revert the configuration it already applied to the nodes.

For Progressing state: Wait for configuration to complete (may take several minutes during node reboot)

For Upgradeable=False: Check the reason field and resolve blocking issues before attempting cluster upgrades
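
For the Progressing case, `oc wait` can block until the Available condition becomes True instead of polling by hand. This is a minimal sketch; the function name `nncp_wait_available` and the default timeout of 10 minutes are assumptions, not values taken from the rule.

```shell
# Block until the given NNCP reports Available=True or the timeout expires.
# $1: NNCP name; $2: optional timeout (default 10m, an arbitrary choice).
nncp_wait_available() {
  oc wait nncp "$1" --for=condition=Available --timeout="${2:-10m}"
}

# Example: nncp_wait_available <nncp-name> 15m
```

`oc wait` exits non-zero on timeout, so the helper can gate later steps in a script.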

Resources

NMState Operator Documentation: https://docs.openshift.com/container-platform/latest/networking/k8s_nmstate/k8s-nmstate-about-the-k8s-nmstate-operator.html
NodeNetworkConfigurationPolicy Examples: https://nmstate.io/examples.html
Troubleshooting NMState: https://docs.openshift.com/container-platform/latest/networking/k8s_nmstate/k8s-nmstate-troubleshooting-node-network.html
