[NPM] traffic break if pod IP is deallocated from previous pod #562

@mainred

Description

Is this an ISSUE or FEATURE REQUEST? (choose one):

ISSUE

Which release version?:

azure-npm:v1.1.0

Which component (CNI/IPAM/CNM/CNS):

NPM

Which Operating System (Linux/Windows):

Linux

For Linux: Include Distro and kernel version using "uname -a"

Linux aks-nodepool1-85368712-vmss000000 4.15.0-1071-azure #76-Ubuntu SMP Wed Feb 12 03:02:44 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Which Orchestrator and version (e.g. Kubernetes, Docker)

Kubernetes

What happened:

This is essentially a Kubernetes NetworkPolicy e2e test.
I have a server pod with ports 80 and 81 exposed through a service, and two ingress NetworkPolicies applied in namespace e2e: one denies all ingress traffic to pods in the namespace by default, and the other allows traffic from pods labeled app=client to pods labeled app=server.

Other NetworkPolicy e2e tests finish just before this one runs.

The client cannot reach the server pod through its service.

How to test:

kubectl run client --rm -it --image=busybox -l app=client -n e2e -- wget svc-access-server.e2e:80

What you expected to happen:

Background:
When a pod is deleted (gracefully, by default), two pod delete events are sent to the pod event watchers, such as Azure NPM, kubelet, etc.
The first event is generated when the initial delete request arrives; kubelet stops the container gracefully by having Docker send SIGTERM. Because our server pod's container does not handle SIGTERM, Docker sends SIGKILL to the container process after the grace period (30s). Kubelet then receives the container-died notification and asks the apiserver to force-delete the pod, which generates the second pod delete event.

Because the NetworkPolicy e2e tests run in sequence, chances are that the IP of the server pod used in this test (pod1) was just deallocated from a pod of a previous test (pod2), where pod1 and pod2 carry the same pod labels.
When the pod1 creation event arrives between the two pod2 deletion events, the second deletion event removes the IP address from the ipset determined by the pod label. Unfortunately, this state is never recovered automatically. I also applied the same scenario to Calico, and the test passed there.
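The event ordering described above can be sketched as a minimal simulation (hypothetical and illustrative only; the pod IP, label key, and handler names are assumptions, not NPM's actual code):

```python
# Hypothetical simulation of the ipset race described above.
# pod2 (from a previous test) and pod1 (this test) share the same
# label, and pod1 is allocated pod2's just-freed IP.

ipset = {"app=server": set()}  # label -> member IPs, as the policy ipset

def on_pod_create(ip, label):
    ipset[label].add(ip)

def on_pod_delete(ip, label):
    ipset[label].discard(ip)  # naive handler: no reference counting

SHARED_IP = "10.244.0.5"  # illustrative IP reused across pod2 and pod1

on_pod_delete(SHARED_IP, "app=server")  # pod2: first (graceful) delete event
on_pod_create(SHARED_IP, "app=server")  # pod1 created, reuses pod2's IP
on_pod_delete(SHARED_IP, "app=server")  # pod2: second (force) delete event

# pod1 is running, but its IP is missing from the ipset,
# so traffic matched by the allow policy is dropped.
print(SHARED_IP in ipset["app=server"])  # False
```

A reference count (or pod-UID tracking) per ipset member would make the second delete event a no-op for pod1's IP.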

Please feel free to reach out to me if anything above is confusing.

How to reproduce it (as minimally and precisely as possible):

apiVersion: v1
kind: Pod
metadata:
  labels:
    app: server
  name: access-server
  namespace: e2e
spec:
  containers:
  - args:
    - python -m SimpleHTTPServer 80
    command:
    - sh
    - -c
    image: python:2.7.11-alpine
    name: pod-container-80
    ports:
    - containerPort: 80
      name: serve-80
      protocol: TCP
  - args:
    - python -m SimpleHTTPServer 81
    command:
    - sh
    - -c
    image: python:2.7.11-alpine
    name: pod-container-81
    ports:
    - containerPort: 81
      name: serve-81
      protocol: TCP

---
apiVersion: v1
kind: Service
metadata:
  name: svc-access-server
  namespace: e2e
spec:
  ports:
  - name: serve-80
    port: 80
    protocol: TCP
    targetPort: 80
  - name: serve-81
    port: 81
    protocol: TCP
    targetPort: 81
  selector:
    app: server
---

# network policy that denies all ingress traffic to pods in namespace e2e by default
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: deny-all
  namespace: e2e
spec:
  podSelector: {}
  policyTypes:
  - Ingress


---
# network policy that allows traffic from pods with `app=client` to pods with `app=server` in namespace e2e
apiVersion: extensions/v1beta1
kind: NetworkPolicy
metadata:
  name: allow-from-client-pod-selector
  namespace: e2e
spec:
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: client
  podSelector:
    matchLabels:
      app: server
  policyTypes:
  - Ingress

Anything else we need to know:

