Issues with network policies after cluster reboot #313

@zachomedia

Description

Is this a request for help?: Yes


Is this an ISSUE or FEATURE REQUEST? (choose one): ISSUE


Which release version?: v1.0.18


Which component (CNI/IPAM/CNM/CNS): NPM


Which Operating System (Linux/Windows): Linux


For Linux: Include Distro and kernel version using "uname -a"

aks-engine, Ubuntu 16.04 image

Linux k8s-master-64980839-0 4.15.0-1040-azure #44-Ubuntu SMP Thu Feb 21 14:24:01 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux


Which Orchestrator and version (e.g. Kubernetes, Docker)

Kubernetes v1.13.2


What happened:

After rebooting the cluster, applications in a namespace could no longer communicate with each other. That namespace has a "default-deny" policy, which worked as expected until the reboot.


What you expected to happen:

After a cluster reboot, pods in the namespace should still be able to communicate as allowed by the network policies.


How to reproduce it (as minimally and precisely as possible):

  1. Add npm to a running Azure Kubernetes cluster (we're using aks-engine)
  2. Apply network policies, starting with a deny-all (Ingress and Egress) followed by explicit allows
  3. Test network traffic, and see that whitelisted traffic works as expected
  4. Reboot the cluster
  5. Test network traffic, and observe that most/all traffic does not work as expected

Anything else we need to know:

In #258, a LIFO ordering was applied to rules in the cluster. This works fine when policies are applied after NPM has started.

However, after a reboot, NPM repopulates its Informer with the network policies that currently exist in the cluster. These policies are not delivered in the order in which they were created; the order is undefined. I believe this leaves the iptables entries in the wrong order, so the deny-all policy is applied before the allow policies.

