Description
Is this a request for help?: Yes
Is this an ISSUE or FEATURE REQUEST? (choose one): ISSUE
Which release version?: v1.0.18
Which component (CNI/IPAM/CNM/CNS): NPM
Which Operating System (Linux/Windows): Linux
For Linux: Include Distro and kernel version using "uname -a"
aks-engine, Ubuntu 16.04 image
Linux k8s-master-64980839-0 4.15.0-1040-azure #44-Ubuntu SMP Thu Feb 21 14:24:01 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Which Orchestrator and version (e.g. Kubernetes, Docker)
Kubernetes v1.13.2
What happened:
After rebooting the cluster, I was no longer able to communicate between applications in a namespace. I have a "default-deny" policy in that namespace, which was working up until a reboot.
What you expected to happen:
After a cluster reboot, pods in the namespace should still be able to communicate as allowed by the network policies.
How to reproduce it (as minimally and precisely as possible):
- Add npm to a running Azure Kubernetes cluster (we're using aks-engine)
- Apply network policies, starting with a deny-all (Ingress and Egress) followed by explicit allows
- Test network traffic, and see that whitelisted traffic works as expected
- Reboot the cluster
- Test network traffic, and observe that most/all traffic does not work as expected
Anything else we need to know:
In #258, a LIFO ordering was applied to rules in the cluster. This works fine when policies are applied after NPM has started.
However, after a reboot, NPM re-fills its Informer with current cluster network policies. The current policies are not sent in the same order that they were created in, but instead are sent in an undefined order. I believe this then results in the iptables entries being in the wrong order, causing the deny-all policy to apply before the allow policies.
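One possible mitigation, sketched below rather than taken from NPM's code, is to put the re-listed policies into a deterministic order before turning them into iptables rules. The sketch sorts by creation time so that the deny-all policy, which was created first, is processed first again. The `policy` struct, `sortByCreation`, and `exampleRelist` are hypothetical names for illustration; `Created` stands in for `metadata.creationTimestamp`, which has one-second resolution, so namespace and name are used as a tie-breaker. Whether replaying policies in creation order fully restores the intended iptables ordering is an assumption I have not verified.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// policy is a hypothetical, simplified stand-in for a Kubernetes
// NetworkPolicy object; it is not a type from NPM or client-go.
type policy struct {
	Namespace string
	Name      string
	Created   time.Time // mirrors metadata.creationTimestamp
}

// sortByCreation orders policies oldest first. Policies with equal
// timestamps are ordered by namespace, then name, so the result does
// not depend on the order in which the policies arrived.
func sortByCreation(ps []policy) {
	sort.SliceStable(ps, func(i, j int) bool {
		if !ps[i].Created.Equal(ps[j].Created) {
			return ps[i].Created.Before(ps[j].Created)
		}
		if ps[i].Namespace != ps[j].Namespace {
			return ps[i].Namespace < ps[j].Namespace
		}
		return ps[i].Name < ps[j].Name
	})
}

// exampleRelist returns made-up policies in a scrambled order, imitating
// what an informer might deliver after a restart.
func exampleRelist() []policy {
	base := time.Date(2019, 3, 1, 12, 0, 0, 0, time.UTC)
	return []policy{
		{"apps", "allow-frontend", base.Add(2 * time.Minute)},
		{"apps", "default-deny", base},
		{"apps", "allow-dns", base.Add(1 * time.Minute)},
	}
}

func main() {
	relisted := exampleRelist()
	sortByCreation(relisted)
	// Print the policies in the order they would be handed to the
	// rule-generation step.
	for _, p := range relisted {
		fmt.Println(p.Namespace + "/" + p.Name)
	}
}
```

With the made-up sample data this prints `apps/default-deny`, `apps/allow-dns`, `apps/allow-frontend`, i.e. the original creation order. A more robust fix might make rule placement independent of arrival order altogether, for example by always keeping allow rules ahead of the deny-all rule.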