Race condition between rollback checks and Node state #474

qinqon · 2020-03-31T08:33:31Z

What happened:
While testing rollback with a policy that modifies the primary NIC, the policy reported a success state even though the enactment was still progressing and later ended in failure (as expected, since a rollback was intended). The issue is related to how nodes are counted for policy conditions: only Ready nodes are counted, and changing the primary NIC can leave a node temporarily NotReady, so the comparison between the number of nodes and the number of not-matching enactments passed.

To fix this, we have to add another probe, run after apply and after rollback, that checks that the Node is in the Ready state, so that we block there until the node is OK again.
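A minimal sketch of what such a blocking readiness probe could look like, in plain Go with no Kubernetes dependencies. The names `NodeCondition`, `isNodeReady`, `waitForNodeReady`, and the `fetch` callback are hypothetical and chosen for illustration; this is not the code that was merged for this issue. It assumes the caller can supply a function that returns the node's current conditions.

```go
package main

import (
	"errors"
	"time"
)

// NodeCondition is a hypothetical, simplified stand-in for a Kubernetes
// node condition (a type such as "Ready" plus a status such as "True").
type NodeCondition struct {
	Type   string
	Status string
}

// errNodeNotReady is returned when the node never reports Ready in time.
var errNodeNotReady = errors.New("node did not reach Ready before the timeout")

// isNodeReady reports whether a "Ready" condition is present with status "True".
func isNodeReady(conditions []NodeCondition) bool {
	for _, c := range conditions {
		if c.Type == "Ready" {
			return c.Status == "True"
		}
	}
	return false
}

// waitForNodeReady polls fetch until the node is Ready or the timeout elapses.
// Fetch errors are retried rather than treated as fatal, on the assumption that
// the API may be briefly unreachable while the primary NIC is reconfigured.
func waitForNodeReady(fetch func() ([]NodeCondition, error), interval, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conditions, err := fetch()
		if err == nil && isNodeReady(conditions) {
			return nil
		}
		if time.Now().After(deadline) {
			if err != nil {
				return err
			}
			return errNodeNotReady
		}
		time.Sleep(interval)
	}
}
```

Under these assumptions the probe would be called once after applying the desired state and once more after a rollback, so that policy conditions are only computed when the node is counted as Ready again.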

What you expected to happen:
The policy to be in a Failure state after a rollback from a bad primary NIC change.

How to reproduce it (as minimally and precisely as possible):
Apply a bad primary NIC policy in a multi-NIC environment; it has to be run multiple times until the race appears.

Anything else we need to know?:

Environment:

NodeNetworkState on affected nodes (use kubectl get nodenetworkstate <node_name> -o yaml):
Problematic NodeNetworkConfigurationPolicy:
kubernetes-nmstate image (use kubectl get pods --all-namespaces -l app=kubernetes-nmstate -o jsonpath='{.items[0].spec.containers[0].image}'):
NetworkManager version (use nmcli --version)
Kubernetes version (use kubectl version):
OS (e.g. from /etc/os-release):
Others:

qinqon added kind/bug priority/medium labels Mar 31, 2020

qinqon mentioned this issue Mar 31, 2020

Adapt to external k8s #465

Merged

qinqon added a commit to qinqon/kubernetes-nmstate that referenced this issue Mar 31, 2020

Add Node Readiness probe

a96a01f

Closes nmstate#474 Signed-off-by: Quique Llorente <ellorent@redhat.com>

qinqon mentioned this issue Mar 31, 2020

Add node ready state probe #480

Merged

kubevirt-bot closed this as completed in #480 Mar 31, 2020

kubevirt-bot pushed a commit that referenced this issue Mar 31, 2020

Add node ready state probe (#480)

ed8f206

* Move probes to their own module
  Signed-off-by: Quique Llorente <ellorent@redhat.com>
* Add Node Readiness probe
  Closes #474
  Signed-off-by: Quique Llorente <ellorent@redhat.com>

kubevirt-bot pushed a commit to kubevirt-bot/kubernetes-nmstate that referenced this issue Mar 31, 2020

Add Node Readiness probe

43e86a7

Closes nmstate#474 Signed-off-by: Quique Llorente <ellorent@redhat.com>
