ServiceEntry has stale endpoints #35404

Closed
yingzhuivy opened this issue Sep 28, 2021 · 5 comments
Labels
area/networking, feature/Multi-cluster (issues related with multi-cluster support)

Comments

@yingzhuivy
Member

yingzhuivy commented Sep 28, 2021

Bug Description

We have a ServiceEntry in an external Istiod cluster that selects Kubernetes pods in a remote cluster. When the pods are restarted, there are sometimes stale endpoints left under the cluster corresponding to the ServiceEntry. Looking at Istiod's debug/endpointz, all of the istiod pods show the same stale state; restarting the istiod pods fixes the problem.

To reproduce more easily, I scale the deployment to 100 replicas and then back down to 1; sometimes some endpoints are not deleted correctly and more than one endpoint remains under the cluster corresponding to the ServiceEntry. It does appear to be easier to reproduce in more heavily loaded Kubernetes clusters.
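
(For reference, the stale state can be inspected by dumping istiod's endpoint registry, e.g. kubectl exec -n istio-system deploy/istiod -- curl -s localhost:8080/debug/endpointz; I'm assuming the default istio-system install and the default debug port 8080 here, adjust for the external-istiod setup as needed.)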

Here is the code for reproduction:
In the external Istiod cluster:

apiVersion: v1
kind: Namespace
metadata:
  name: airmesh-test-ying
---
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: fortioserver
  namespace: airmesh-test-ying
spec:
  addresses:
  - 100.124.99.206
  hosts:
  - fortioserver.airmesh-test-ying.svc.ea1.us.a
  location: MESH_INTERNAL
  ports:
    - name: http-port
      number: 8080
      protocol: HTTP
  resolution: STATIC
  workloadSelector:
    labels:
      app: fortioserver

In the remote cluster:

---
apiVersion: v1
kind: Namespace
metadata:
  name: airmesh-test-ying
  labels:
    istio-injection: enabled
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fortioclient
  name: fortioclient
  namespace: airmesh-test-ying
spec:
  selector:
    matchLabels:
      app: fortioclient
  template:
    metadata:
      labels:
        app: fortioclient
    spec:
      containers:
        - name: fortioclient
          image: fortio/fortio:latest_release
          args:
            - report
          volumeMounts:
            - name: shared-data
              mountPath: /var/lib/fortio
      volumes:
        - name: shared-data
          emptyDir: {}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: fortioserver
  name: fortioserver
  namespace: airmesh-test-ying
spec:
  replicas: 1
  selector:
    matchLabels:
      app: fortioserver
  template:
    metadata:
      labels:
        app: fortioserver
    spec:
      containers:
        - name: fortioserver
          image: fortio/fortio:latest_release
          volumeMounts:
            - name: shared-data
              mountPath: /var/lib/fortio
      volumes:
        - name: shared-data
          emptyDir: {}
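
With these applied, the flap can be driven against the remote cluster with plain kubectl, for example kubectl -n airmesh-test-ying scale deployment fortioserver --replicas=100 followed by kubectl -n airmesh-test-ying scale deployment fortioserver --replicas=1, repeated until debug/endpointz shows more than one endpoint for the ServiceEntry (the namespace and deployment name are the ones from the manifests above).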

Version

istio version: 1.10.3
k8s version: 1.17.14

Additional Information

If I turn on debug logging, I don't see the following line for the stale endpoints:

2021-09-28T18:02:19.160451Z debug Handle event delete for service instance (from 100.117.83.50) in namespace airmesh-test-ying

The log comes from here: https://github.com/istio/istio/blob/master/pilot/pkg/serviceregistry/serviceentry/servicediscovery.go#L444

So it looks like either the delete event is missing, or redundantEventForPod is true for it (which probably is not the case)?

@istio-policy-bot added the area/networking and feature/Multi-cluster labels on Sep 28, 2021
@yingzhuivy
Member Author

I added some debug logs and found that the issue is due to the informer delivering events out of order.

  1. The old pod shuts down and releases its IP.
  2. A new pod, from another service, starts up and gets the same IP that was just released.

However, istiod gets the events in this order:

  1. The new pod's event comes first; because the pod is not ready yet, it is treated as a delete event, which deletes the entry in workloadInstancesByIP.
  2. When the old pod's delete event then arrives, it is marked as a redundant event, so the delete never happens, leaving stale endpoints.

From my testing, the new pod's event comes at 2021-09-28T23:52:10.266077Z and the old pod's delete event comes at around 2021-09-28T23:52:25.650738Z, so there is roughly a 15s delay. I think we cannot rely on the pod informer to deliver events in strict order. One possible fix is to store workloadInstances by name instead of by IP, so that we delete based on namespace/name; a rough sketch is below. Thoughts? @hzxuzhonghu @howardjohn @iandyh
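
Roughly the shape I have in mind (a simplified sketch, not the actual ServiceEntryStore code; type and field names are approximate):

package main

import "fmt"

// workloadInstance is a simplified stand-in for the registry's WorkloadInstance:
// just enough fields to show the keying problem.
type workloadInstance struct {
    Namespace string
    Name      string
    IP        string
}

// store keys workload instances by namespace/name instead of by IP, so a late
// delete for an old pod cannot be swallowed just because a new pod has already
// reused its IP.
type store struct {
    byName map[string]workloadInstance
}

func newStore() *store {
    return &store{byName: map[string]workloadInstance{}}
}

func key(wi workloadInstance) string {
    return wi.Namespace + "/" + wi.Name
}

// upsert records or refreshes a workload instance under its own name.
func (s *store) upsert(wi workloadInstance) {
    s.byName[key(wi)] = wi
}

// remove deletes the instance for this pod only; another pod holding the same
// IP is untouched.
func (s *store) remove(wi workloadInstance) {
    delete(s.byName, key(wi))
}

// endpoints lists the IPs currently selected, e.g. to build the ServiceEntry's
// endpoints.
func (s *store) endpoints() []string {
    var ips []string
    for _, wi := range s.byName {
        ips = append(ips, wi.IP)
    }
    return ips
}

func main() {
    s := newStore()
    oldPod := workloadInstance{Namespace: "airmesh-test-ying", Name: "fortioserver-old", IP: "100.117.83.50"}
    // A pod from a different deployment reuses the IP before the old pod's
    // delete event is processed.
    newPod := workloadInstance{Namespace: "airmesh-test-ying", Name: "other-service-new", IP: "100.117.83.50"}

    s.upsert(oldPod)
    s.remove(newPod)           // not-ready new pod handled as a delete: touches only its own key
    s.remove(oldPod)           // the late delete for the old pod still removes its entry
    fmt.Println(s.endpoints()) // prints [] -- no stale endpoint left behind
}

With the IP-keyed map, the delete for the not-ready new pod removes the old pod's entry, and the old pod's later delete then looks redundant; keying by namespace/name keeps the two pods' lifecycles independent.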

@hzxuzhonghu
Member

I think we can not rely on the pod informer to have strict order.

In theory, the informer events are in order. Thanks for your investigation. I need to have a closer look.

@howardjohn
Member

In theory, the informer events are in order

I suspect they are in order. But the new pod legitimately starts with the new IP before the old pod is fully deleted.

Just a guess though

@yingzhuivy
Member Author

Actually, as soon as the pod starts terminating, it is marked as a delete event here and should be deleted from the service registry. But that event comes later than the new pod's event. I think the IP is released after the pod starts terminating, so the terminating event should come before the new pod's start-up event, which indicates to me that the informer events are out of order.

(By the way, to reproduce in a lightly loaded cluster, I ran another deployment with 300 pods that was constantly restarting to mimic load.)
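
For context on the "marked as a delete event" part above, this is roughly the check I mean (a simplified sketch, not the actual kube registry code):

package registry

import corev1 "k8s.io/api/core/v1"

// treatAsDelete mirrors the idea that a pod which has started terminating
// (DeletionTimestamp set) or is not Ready is handled as a delete by the
// workload-instance registry, even though the informer delivers it as an
// update event.
func treatAsDelete(pod *corev1.Pod) bool {
    if pod.DeletionTimestamp != nil {
        return true // terminating: its IP is about to be released
    }
    for _, c := range pod.Status.Conditions {
        if c.Type == corev1.PodReady {
            return c.Status != corev1.ConditionTrue
        }
    }
    return true // no Ready condition reported yet
}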

airbnb-gerrit pushed a commit to airbnb/istio that referenced this issue Oct 11, 2021
This CL fixes the problem described in this issue:
istio#35404

workloadInstancesByIP is replaced by workloadInstancesByName, i.e.
we don't assume one-to-one correspondence between wi and ip.
workloadInstancesIPsByName is removed since it's covered by the
new map.

There is a refactor going on for the ServiceEntryStore which should
also fix the problem: istio#35369. But
that probably won't get backported to 1.11, so I created this smaller
fix to be patched onto 1.11.

Change-Id: I62992082979cb12ff5fecef4d85f20e3c9fd435a
Reviewed-on: https://gerrit.musta.ch/c/public/istio/+/1690
Reviewed-by: Weibo He <weibo.he@airbnb.com>
yingzhuivy pushed a commit to airbnb/istio that referenced this issue Oct 12, 2021
istio-testing pushed a commit that referenced this issue Oct 12, 2021
istio: fix service entry stale endpoints
istio-testing pushed a commit to istio-testing/istio that referenced this issue Oct 12, 2021
@howardjohn
Member

Fix is merged

istio-testing added a commit that referenced this issue Oct 12, 2021
istio: fix service entry stale endpoints