
Cilium upstream master branch breaks nodeport connection from external client #17192

Closed
vincentmli opened this issue Aug 19, 2021 · 39 comments
Labels
kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps.

Comments

@vincentmli
Contributor

Bug report

Cilium built from the upstream master branch breaks NodePort connections from an external client.

General Information

  • Cilium version (run cilium version)
    Client: 1.10.90 c44ff1b 2021-08-03T00:35:29+05:30 go version go1.16.5 linux/amd64
    Daemon: 1.10.90 c44ff1b 2021-08-03T00:35:29+05:30 go version go1.16.5 linux/amd64

  • Kernel version (run uname -a)
    5.8.1-050801-generic #202008111432 SMP Tue Aug 11 14:34:42 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

  • Orchestration system version in use (e.g. kubectl version, ...)

Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:10:43Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:07Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}

  • Link to relevant artifacts (policies, deployments scripts, ...)
cat nginx_nodeport.yaml 
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginx
spec:
  replicas: 1 
  selector:
    app: nginx
  template:
    metadata:
      name: nginx
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---

apiVersion: v1
kind: Service
metadata:
  labels:
    name: nginxservice
  name: nginxservice
spec:
  ports:
    # The port that this service should serve on.
    - port: 80
      nodePort: 32506
  selector:
    app: nginx
  type: NodePort
  • Generate and upload a system zip:

can upload if needed

How to reproduce the issue

  1. Build the Docker image from the upstream master branch.

  2. Deploy Cilium with the attached Cilium YAML file.

  3. Deploy the NodePort service above.

  4. Access the NodePort service from an external client (see the example command below).

cilium-upstream.yaml.txt
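
For step 4, a minimal check from the external client looks like this (illustrative command; 10.169.72.19 is one of the node IPs seen in the captures below and 32506 is the nodePort from the Service above):

curl -v http://10.169.72.19:32506/

With the master build the connection times out (the SYN is dropped, see the datapath log below), while with v1.10.3 the request succeeds.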

@vincentmli vincentmli added the kind/bug This is a bug in the Cilium logic. label Aug 19, 2021
@vincentmli
Contributor Author

I did some debugging and the problem is that the pod IP is not synced into the cilium_ipcache map on the other nodes, so no encapsulation happens and the FIB lookup fails. This issue does not happen with the v1.10.3 release.

the cilium datapath log:

Ethernet        {Contents=[..14..] Payload=[..62..] SrcMAC=52:54:00:75:9d:fb DstMAC=52:54:00:3e:93:36 EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..40..] Version=4 IHL=5 TOS=0 Length=60 Id=27010 Flags=DF FragOffset=0 TTL=64 Protocol=TCP Checksum=11207 SrcIP=10.169.72.14 DstIP=10.169.72.19 Options=[] Padding=[]}
TCP     {Contents=[..40..] Payload=[] SrcPort=28069 DstPort=32506 Seq=1659897463 Ack=0 DataOffset=10 FIN=false SYN=true RST=false PSH=false ACK=false URG=false ECE=false CWR=false NS=false Window=29200 Checksum=64099 Urgent=0 Options=[..5..] Padding=[]}
CPU 03: MARK 0x0 FROM 2626 from-network: 74 bytes (74 captured), state new, interface ens7, orig-ip 0.0.0.0
CPU 03: MARK 0x0 FROM 2626 DEBUG: Successfully mapped addr=10.169.72.14 to identity=2
CPU 03: MARK 0x0 FROM 2626 DEBUG: Conntrack lookup 1/2: src=10.169.72.14:28069 dst=10.169.72.19:32506
CPU 03: MARK 0x0 FROM 2626 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=4
CPU 03: MARK 0x0 FROM 2626 DEBUG: CT verdict: New, revnat=0
CPU 03: MARK 0x0 FROM 2626 DEBUG: Service backend slot lookup: slot=1, dport=32506
CPU 03: MARK 0x0 FROM 2626 DEBUG: Conntrack create: proxy-port=0 revnat=10 src-identity=0 lb=0.0.0.0
CPU 03: MARK 0x0 FROM 2626 DEBUG: Conntrack lookup 1/2: src=10.169.72.14:28069 dst=10.0.2.37:80
CPU 03: MARK 0x0 FROM 2626 DEBUG: Conntrack lookup 2/2: nexthdr=6 flags=1
CPU 03: MARK 0x0 FROM 2626 DEBUG: CT verdict: New, revnat=0
CPU 03: MARK 0x0 FROM 2626 DEBUG: Conntrack create: proxy-port=0 revnat=10 src-identity=1 lb=10.0.2.37
------------------------------------------------------------------------------
Ethernet        {Contents=[..14..] Payload=[..62..] SrcMAC=52:54:00:75:9d:fb DstMAC=52:54:00:3e:93:36 EthernetType=IPv4 Length=0}
IPv4    {Contents=[..20..] Payload=[..40..] Version=4 IHL=5 TOS=0 Length=60 Id=27010 Flags=DF FragOffset=0 TTL=64 Protocol=TCP Checksum=29273 SrcIP=10.169.72.19 DstIP=10.0.2.37 Options=[] Padding=[]}
TCP     {Contents=[..40..] Payload=[] SrcPort=51397 DstPort=80(http) Seq=1659897463 Ack=0 DataOffset=10 FIN=false SYN=true RST=false PSH=false ACK=false URG=false ECE=false CWR=false NS=false Window=29200 Checksum=25728 Urgent=0 Options=[..5..] Padding=[]}
CPU 03: MARK 0x0 FROM 2626 DROP: 74 bytes, reason FIB lookup failed <===========FIB LOOKUP FAILURE
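
For reference, traces like the one above can be collected with cilium monitor from the agent pod on the node receiving the NodePort traffic (illustrative commands; the pod name is a placeholder, and the DEBUG lines only show up when datapath debugging is enabled):

kubectl -n kube-system exec <cilium-pod-on-the-ingress-node> -- cilium monitor --type drop
kubectl -n kube-system exec <cilium-pod-on-the-ingress-node> -- cilium monitor -v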

root@cilium-demo-1:/home/vincent# kubectl get po -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP          NODE            NOMINATED NODE   READINESS GATES
nginx-xvnbn   1/1     Running   0          17m   10.0.2.37   cilium-demo-2   <none>           <none>
root@cilium-demo-1:/home/vincent# ./bpftool-ipcache.sh 
=====================================
Cilium node cilium-demo-1
=====================================
cilium ipcache map id: 5957
IP PREFIX/ADDRESS   IDENTITY
10.169.72.19/32     6 0 0.0.0.0        
0.0.0.0/0           2 0 0.0.0.0        
10.0.0.164/32       1 0 0.0.0.0        
10.0.1.211/32       6 0 10.169.72.19   
10.0.2.128/32       6 0 10.169.72.17   
10.3.72.9/32        1 0 0.0.0.0        
10.169.72.9/32      1 0 0.0.0.0        
10.0.0.10/32        37238 0 0.0.0.0    
10.0.0.21/32        37238 0 0.0.0.0    
10.169.72.17/32     6 0 0.0.0.0        
=====================================
Cilium node cilium-demo-2
=====================================
cilium ipcache map id: 4224
IP PREFIX/ADDRESS   IDENTITY
10.0.1.211/32       6 0 10.169.72.19      
10.0.2.37/32        3 0 0.0.0.0         <===== not synced to the other two nodes
10.3.72.17/32       1 0 0.0.0.0           
10.169.72.17/32     1 0 0.0.0.0           
10.169.72.19/32     6 0 0.0.0.0           
10.0.0.164/32       6 0 10.169.72.9       
10.0.0.21/32        37238 0 10.169.72.9   
10.0.2.128/32       1 0 0.0.0.0           
10.169.72.9/32      6 0 0.0.0.0           
0.0.0.0/0           2 0 0.0.0.0           
10.0.0.10/32        37238 0 10.169.72.9   
=====================================
Cilium node cilium-demo-3
=====================================
cilium ipcache map id: 6818
IP PREFIX/ADDRESS   IDENTITY
10.0.0.21/32        37238 0 10.169.72.9   
10.3.72.19/32       1 0 0.0.0.0           
10.169.72.9/32      6 0 0.0.0.0           
10.169.72.19/32     1 0 0.0.0.0           
0.0.0.0/0           2 0 0.0.0.0           
10.0.0.10/32        37238 0 10.169.72.9   
10.0.0.164/32       6 0 10.169.72.9       
10.0.1.211/32       1 0 0.0.0.0           
10.0.2.128/32       6 0 10.169.72.17      
10.169.72.17/32     6 0 0.0.0.0           
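
The bpftool-ipcache.sh script itself is not attached; a rough equivalent for dumping a node's ipcache looks like this (sketch only, assuming the map is pinned at Cilium's default bpffs location):

# via the Cilium CLI inside the agent pod
cilium bpf ipcache list

# or directly with bpftool on the node
bpftool map show | grep cilium_ipcache
bpftool map dump pinned /sys/fs/bpf/tc/globals/cilium_ipcache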

@tklauser tklauser added kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Aug 19, 2021
@vincentmli
Contributor Author

Trying to learn how the ipcache map sync works; here is what I have found so far from the cilium-agent debug log.
For a remote Cilium node to sync the new pod IP 10.0.2.126 into its cilium_ipcache map, we should see log messages similar to:

level=debug msg="Upserting IP into ipcache layer" identity="{15120 custom-resource false}" ipAddr=10.0.2.126 k8sNamespace=default k8sPodName=nginx-hrsfd key=0 namedPorts="map[]" subsys=ipcache

level=debug msg="Daemon notified of IP-Identity cache state change" identity="{15120 custom-resource false}" ipAddr="{10.0.2.126 ffffffff}" modification=Upsert subsys=datapath-ipcache

I do not see the above log messages with the master branch image build.
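
One way to check for these messages on the remote nodes (illustrative; assumes the agents run with debug logging enabled):

kubectl -n kube-system logs <cilium-pod-on-remote-node> | grep -E 'Upserting IP into ipcache|IP-Identity cache state change'

With v1.10.3 both messages appear for the new pod IP; with the master build they do not.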

@vincentmli
Contributor Author

git bisect found this


#git bisect start --term-new=unfixed --term-old=fixed
#git bisect fixed v1.10.3
#git bisect unfixed master

[root@centos-dev cilium]# git bisect fixed
0681343309ef15677c9335802bd724500f1d663d is the first unfixed commit
commit 0681343309ef15677c9335802bd724500f1d663d
Author: Weilong Cui <cuiwl@google.com>
Date:   Fri Apr 9 15:33:13 2021 -0700

    Removes CEP subresource.
    
    This is part 2/2 of trimmming CEP subresource to improve scalability.
    Part 1/2 is PR #15230.
    
    This will bump cilium CRD schema version and is only backward-compatible
    with agent that has part 1/2.
    
    Signed-off-by: Weilong Cui <cuiwl@google.com>

 .../cilium.io/client/crds/v2/ciliumendpoints.yaml  |  3 +-
 pkg/k8s/apis/cilium.io/v2/register.go              |  2 +-
 pkg/k8s/apis/cilium.io/v2/types.go                 |  1 -
 pkg/k8s/watchers/endpointsynchronizer.go           | 42 +++-------------------
 4 files changed, 6 insertions(+), 42 deletions(-)



[root@centos-dev cilium]# git bisect log
git bisect start '--term-new=unfixed' '--term-old=fixed'
# fixed: [4145278ccc6e90739aa100c9ea8990a0f561ca95] Prepare for release v1.10.3
git bisect fixed 4145278ccc6e90739aa100c9ea8990a0f561ca95
# unfixed: [dd27883a7baee5bee8f53820b577192f5a09093f] build(deps): bump docker/build-push-action from 2.6.1 to 2.7.0
git bisect unfixed dd27883a7baee5bee8f53820b577192f5a09093f
# fixed: [4e17505afbb79081cb8c733df289372fb0fd9b96] docs/crd: Simplify some operations
git bisect fixed 4e17505afbb79081cb8c733df289372fb0fd9b96
# unfixed: [8b3f00987553836a63211abac80e839a9b0056af] docs: Fix typo in BGP GSG
git bisect unfixed 8b3f00987553836a63211abac80e839a9b0056af
# fixed: [bc6d5866303474ecf62536e56a56bb77064ea116] .github: Don't run CodeQL for every master push
git bisect fixed bc6d5866303474ecf62536e56a56bb77064ea116
# fixed: [90cb786b946c2bbcf30b631c176824b24ee3a6b7] .github/codeql: Run on same branches for push and PR
git bisect fixed 90cb786b946c2bbcf30b631c176824b24ee3a6b7
# fixed: [835f2e9fe426b986ff23a6689a237bdf78afe83d] bpf: ct: use union to hide the rx_bytes hack
git bisect fixed 835f2e9fe426b986ff23a6689a237bdf78afe83d
# fixed: [0b8a68dc30afad48e994a1500df9af127278edc1] test: Do not create Docker network in k8s provision
git bisect fixed 0b8a68dc30afad48e994a1500df9af127278edc1
# fixed: [35f28766c3caf0c6ba973a3b9b9065ee6b930508] Update stable releases
git bisect fixed 35f28766c3caf0c6ba973a3b9b9065ee6b930508
# unfixed: [6234ad88bc40aa8a32f4a5ec3480c336fab2f2cc] build(deps): bump actions/upload-artifact from 2.2.3 to 2.2.4
git bisect unfixed 6234ad88bc40aa8a32f4a5ec3480c336fab2f2cc
# unfixed: [99230d27098c590b59c19fa17dc22811ee88b6a0] build(deps): bump github.com/aws/aws-sdk-go-v2/feature/ec2/imds
git bisect unfixed 99230d27098c590b59c19fa17dc22811ee88b6a0
# unfixed: [0681343309ef15677c9335802bd724500f1d663d] Removes CEP subresource.
git bisect unfixed 0681343309ef15677c9335802bd724500f1d663d
# fixed: [3a55d743972a8def382831ab75b42eaf8944591e] fix warning log for list IPV6 address: move IPV4 to IPv6
git bisect fixed 3a55d743972a8def382831ab75b42eaf8944591e
# first unfixed commit: [0681343309ef15677c9335802bd724500f1d663d] Removes CEP subresource.
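
For reference, the bisect above uses git bisect's custom terms, since here the older state is the good one; the workflow was roughly (reconstructed from the commands in the log, build/test steps are placeholders):

git bisect start --term-new=unfixed --term-old=fixed
git bisect fixed v1.10.3     # last release where NodePort from an external client works
git bisect unfixed master    # tip where it is broken
# at each step: build the image, redeploy Cilium, retest the NodePort service, then mark:
git bisect fixed             # connection works at this commit
git bisect unfixed           # connection fails at this commit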

@joestringer
Member

CC @Weil0ng any thoughts on why that commit might have broken ipcache synchronization? Maybe an unintended change in the CEP watcher?

@Weil0ng
Contributor

Weil0ng commented Aug 22, 2021

Hmm, the commit itself should not affect how ipcache is synced, but it DOES affect how CEPs should be synced with apiserver, I wonder if this is similar to #16984 (comment)

@borkmann
Member

borkmann commented Sep 1, 2021

@Weil0ng did you have a chance to look (or test) further into it? Every node should have ipcache entries of all the endpoints in the cluster given we need to know whether to reach remote endpoints via tunnel or not.

(If there is no entry, connectivity would break since routing won't happen via tunnel (this info as @vincentmli pointed out is stored in the ipcache meta data).)

@vincentmli
Contributor Author

@Weil0ng did you have a chance to look (or test) further into it? Every node should have ipcache entries of all the endpoints in the cluster given we need to know whether to reach remote endpoints via tunnel or not.

(If there is no entry, connectivity would break since routing won't happen via tunnel (this info as @vincentmli pointed out is stored in the ipcache meta data).)

@borkmann on a side note, just out of curiosity: for a NodePort service in tunnel mode, would it be possible to use the tunnel map lookup for encapsulation? The tunnel map should always include the node pod CIDRs, so there would be no need for a specific endpoint lookup in the ipcache map in case that entry is missing for some reason.
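
For context, the per-node pod CIDR mappings referred to here live in the tunnel map and can be inspected from the agent (illustrative; the pod name is a placeholder):

kubectl -n kube-system exec <cilium-pod> -- cilium bpf tunnel list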

@joestringer
Member

From Cilium meeting, maybe one next step is to look into how the tunnel IP is associated with the CEP resources? Seemed like there was a question whether it could be included in the "subresources" field in which case maybe the update doesn't propagate.

@Weil0ng
Contributor

Weil0ng commented Sep 1, 2021

@vincentmli if it's possible, can you please provide a sysdump? Running cilium-bugtool in the agent that runs on the same node as the backend should include the info needed; I would like to see if the nginx backend endpoint sync has an error... (e.g. the commands below)
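
For reference, the usual ways to collect this (illustrative; pod names are placeholders):

# cluster-wide sysdump via the cilium CLI
cilium sysdump

# or just the bugtool archive from the agent on the backend's node
kubectl -n kube-system exec <cilium-pod-on-backend-node> -- cilium-bugtool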

@vincentmli
Contributor Author

@Weil0ng I have done git bisect reset, but I may still have the image in Docker Hub; let me see if I can re-create the issue.

@vincentmli
Contributor Author

@Weil0ng here is the sysdump; the Cilium image I ran is from a master branch build.
cilium-sysdump-20210901-202302.zip

@Weil0ng
Contributor

Weil0ng commented Sep 1, 2021

Thank you! However, I don't see any nginx-XXX pods; is the original YAML deployed in this cluster? Can you let me know which one is supposed to be the problematic backend?

@vincentmli
Contributor Author

find bugtool-cilium-* -name "cilium-bpf-ipcache-list.md" | xargs cat 
IP PREFIX/ADDRESS   IDENTITY
10.0.0.21/32        37238 0 10.169.72.9   
10.0.0.164/32       6 0 10.169.72.9       
10.0.1.211/32       6 0 10.169.72.19      
10.169.72.9/32      6 0 0.0.0.0           
10.169.72.19/32     6 0 0.0.0.0           
0.0.0.0/0           2 0 0.0.0.0           
10.0.0.10/32        37238 0 10.169.72.9   
10.0.2.128/32       1 0 0.0.0.0           
10.0.2.221/32       3 0 0.0.0.0   <=== nginx pod endpoint missing on the other 2 nodes       
10.3.72.17/32       1 0 0.0.0.0           
10.169.72.17/32     1 0 0.0.0.0           
IP PREFIX/ADDRESS   IDENTITY
10.169.72.9/32      6 0 0.0.0.0           
10.169.72.17/32     6 0 0.0.0.0           
10.169.72.19/32     1 0 0.0.0.0           
0.0.0.0/0           2 0 0.0.0.0           
10.0.0.10/32        37238 0 10.169.72.9   
10.0.1.211/32       1 0 0.0.0.0           
10.0.2.128/32       6 0 10.169.72.17      
10.3.72.19/32       1 0 0.0.0.0           
10.0.0.21/32        37238 0 10.169.72.9   
10.0.0.164/32       6 0 10.169.72.9       
IP PREFIX/ADDRESS   IDENTITY
10.3.72.9/32        1 0 0.0.0.0        
10.169.72.19/32     6 0 0.0.0.0        
0.0.0.0/0           2 0 0.0.0.0        
10.0.2.128/32       6 0 10.169.72.17   
10.169.72.9/32      1 0 0.0.0.0        
10.169.72.17/32     6 0 0.0.0.0        
10.0.0.10/32        37238 0 0.0.0.0    
10.0.0.21/32        37238 0 0.0.0.0    
10.0.0.164/32       1 0 0.0.0.0        
10.0.1.211/32       6 0 10.169.72.19   

@Weil0ng
Contributor

Weil0ng commented Sep 1, 2021

oops, my bad, was looking at a wrong sysdump path...

@Weil0ng
Contributor

Weil0ng commented Sep 1, 2021

A few observations:

  • The nginx-sqvrc CEP indeed does NOT have its status populated, which is why the remote nodes have neither its IP nor its identity:
- apiVersion: cilium.io/v2
  kind: CiliumEndpoint
  metadata:
    creationTimestamp: "2021-09-01T20:20:21Z"
    generation: 1
    labels:
      app: nginx
    managedFields:
    - apiVersion: cilium.io/v2
      fieldsType: FieldsV1
      fieldsV1:
        f:metadata:
          f:labels:
            .: {}
            f:app: {}
          f:ownerReferences:
            .: {}
            k:{"uid":"5170ed73-89b3-4019-86dd-48b0607ee230"}:
              .: {}
              f:apiVersion: {}
              f:blockOwnerDeletion: {}
              f:kind: {}
              f:name: {}
              f:uid: {}
        f:status:
          .: {}
          f:encryption: {}
          f:external-identifiers:
            .: {}
            f:container-id: {}
            f:k8s-namespace: {}
            f:k8s-pod-name: {}
            f:pod-name: {}
          f:id: {}
          f:identity:
            .: {}
            f:id: {}
            f:labels: {}
          f:networking:
            .: {}
            f:addressing: {}
            f:node: {}
          f:state: {}
      manager: cilium-agent
      operation: Update
      time: "2021-09-01T20:20:21Z"
    name: nginx-sqvrc
    namespace: default
    ownerReferences:
    - apiVersion: v1
      blockOwnerDeletion: true
      kind: Pod
      name: nginx-sqvrc
      uid: 5170ed73-89b3-4019-86dd-48b0607ee230
    resourceVersion: "11694246"
    uid: 84c23694-f504-4897-9554-0d4d6b54e74c
  • However, looking at the cilium-agent log, it seems like the CEP sync was skipped (as expected) while the endpoint had not yet received an identity:
2021-09-01T20:08:48.971939164Z level=debug msg="Starting new controller" name="sync-to-k8s-ciliumendpoint (1132)" subsys=controller uuid=1393ddc5-8dcd-4333-a33a-da93b0f316c4
2021-09-01T20:08:48.974901214Z level=debug msg="Skipping CiliumEndpoint update because security identity is not yet available" containerID=9bfc044735 controller="sync-to-k8s-ciliumendpoint (1132)" datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=1132 ipv4=10.0.2.221 ipv6= k8sPodName=default/nginx-sqvrc subsys=endpointsynchronizer

Then a moment later when it did get the identity, it went through with the endpointsynchronization without any error:

2021-09-01T20:08:48.975433142Z level=debug msg="Getting CEP during an initialization" containerID=9bfc044735 controller="sync-to-k8s-ciliumendpoint (1132)" datapathPolicyRevision=0 desiredPolicyRevision=0 endpointID=1132 identity=2912 ipv4=10.0.2.221 ipv6= k8sPodName=default/nginx-sqvrc subsys=endpointsynchronizer
2021-09-01T20:08:52.202489562Z level=debug msg="Controller func execution time: 3.229443121s" name="sync-to-k8s-ciliumendpoint (1132)" subsys=controller uuid=1393ddc5-8dcd-4333-a33a-da93b0f316c4

So at this point, it should've created a CEP in the apiserver with status populated...

  • The CEP was last updated at 2021-09-01T20:20:21Z, but the agent log ends at 2021-09-01T20:11:41Z; I wonder what happened between these two timestamps?

@vincentmli
Contributor Author

To add more confusion :) out of curiosity, I manually collected the operator log

 kubectl logs cilium-operator-64f685c4b7-wkxl9  -n kube-system > /tmp/op.log

and found this log entry:

level=debug msg="Deleting unused identity" identity="&{{CiliumIdentity cilium.io/v2} {2912    9df9c02b-4d5c-4ebd-a2f3-ee63624a4a4d 9591200 1 2021-08-20 21:08:20 +0000 UTC <nil> <nil> map[app:nginx io.cilium.k8s.policy.cluster:default io.cilium.k8s.policy.serviceaccount:default io.kubernetes.pod.namespace:default] map[] [] []  [{cilium-agent Update cilium.io/v2 2021-08-20 21:08:20 +0000 UTC FieldsV1 {\"f:metadata\":{\"f:labels\":{\".\":{},\"f:app\":{},\"f:io.cilium.k8s.policy.cluster\":{},\"f:io.cilium.k8s.policy.serviceaccount\":{},\"f:io.kubernetes.pod.namespace\":{}}},\"f:security-labels\":{\".\":{},\"f:k8s:app\":{},\"f:k8s:io.cilium.k8s.policy.cluster\":{},\"f:k8s:io.cilium.k8s.policy.serviceaccount\":{},\"f:k8s:io.kubernetes.pod.namespace\":{}}}}]} map[k8s:app:nginx k8s:io.cilium.k8s.policy.cluster:default k8s:io.cilium.k8s.policy.serviceaccount:default k8s:io.kubernetes.pod.namespace:default]}" subsys=cilium-operator-generic
level=info msg="Garbage collected identity" identity=2912 subsys=cilium-operator-generic
level=debug msg="Marking identity alive" identity=37238 subsys=identity-heartbeat
level=debug msg="Controller func execution time: 15.150316ms" name=crd-identity-gc subsys=controller uuid=e776da1b-38d2-4caa-bf25-5d247d997e23
level=debug msg="Deleting identity in heartbeat lifesign table" identity=2912 subsys=identity-heartbeat

Note the timestamp seems incorrect, but identity 2912 matches the nginx endpoint identity. I also noticed that the log output above does not have a timestamp on each line, unlike the operator log collected by the sysdump, and the operator log collected by the sysdump does not contain the above log entry.

@Weil0ng
Contributor

Weil0ng commented Sep 2, 2021

Hmm... that could be the manifestation of not having a CEP reference the identity (not sure...). Is it possible to get the apiserver audit log from your env? I wonder if the code ever hits https://github.com/cilium/cilium/blob/master/pkg/k8s/watchers/endpointsynchronizer.go#L168, which it should. So the questions are:

  • If it hits the CREATE, since there's no error, why wouldn't the CEP be created with the status field?
  • If it does not hit the CREATE, then what happened?

@vincentmli
Contributor Author

How do I enable the apiserver audit log? After I enable it, should I re-create the nginx pod?

@Weil0ng
Contributor

Weil0ng commented Sep 2, 2021

how do I enable apiserver audit log, after i enable apiserver audit log, re-create the nginx pod?

Hmm, I'm actually not sure how to enable it; it depends on your environment. Is that a self-managed cluster? If so, there should be a flag for the apiserver (see the sketch below). Yes, after you enable the log, recreate the nginx pod to retrigger this code path.
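
For a self-managed (e.g. kubeadm) cluster, audit logging is enabled with kube-apiserver flags pointing at a policy file; a minimal sketch (the paths are examples):

kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-log-path=/var/log/kubernetes/kube-apiserver-audit.log \
  --audit-log-maxage=7 --audit-log-maxbackup=3 --audit-log-maxsize=100

On kubeadm these flags go into the kube-apiserver static pod manifest, and the policy file and log directory need to be mounted into the apiserver pod.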

@brb brb added the release-blocker/1.11 This issue will prevent the release of the next version of Cilium. label Sep 2, 2021
@vincentmli
Contributor Author

@Weil0ng
do you have a specific audit config you want? I am going to use https://raw.githubusercontent.com/kubernetes/website/main/content/en/examples/audit/audit-policy.yaml and follow https://kubernetes.io/docs/tasks/debug-application-cluster/audit/ for apiserver auditing

@vincentmli
Contributor Author

@Weil0ng Here is another sysdump and apiserver audit log
audit.log.json.txt
cilium-sysdump-20210902-160023.zip

the nginx pod is

# kubectl get po -o wide
NAME          READY   STATUS    RESTARTS   AGE   IP           NODE            NOMINATED NODE   READINESS GATES
nginx-5f9kf   1/1     Running   0          15m   10.0.1.223   cilium-demo-3   <none>           <none>

@Weil0ng
Contributor

Weil0ng commented Sep 3, 2021

Thanks a lot @vincentmli for going through with the effort! I do see the following in the audit log, which confirms the code path that is actually running:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "cf78ce4e-e2d2-427a-b67c-581b9149d8cb",
  "stage": "ResponseComplete",
  "requestURI": "/apis/cilium.io/v2/namespaces/default/ciliumendpoints/nginx-5f9kf",
  "verb": "get",
  "user": {
    "username": "system:serviceaccount:kube-system:cilium",
    "uid": "198bdd5d-5f58-4100-8747-c4effed81854",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:kube-system",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "10.169.72.19"
  ],
  "userAgent": "cilium-agent/v0.0.0 (linux/amd64) kubernetes/$Format",
  "objectRef": {
    "resource": "ciliumendpoints",
    "namespace": "default",
    "name": "nginx-5f9kf",
    "apiGroup": "cilium.io",
    "apiVersion": "v2"
  },
  "responseStatus": {
    "metadata": {},
    "status": "Failure",
    "reason": "NotFound",
    "code": 404
  },
  "requestReceivedTimestamp": "2021-09-02T15:48:08.546493Z",
  "stageTimestamp": "2021-09-02T15:48:08.582018Z",
  "annotations": {
    "authentication.k8s.io/legacy-token": "system:serviceaccount:kube-system:cilium",
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"cilium\" of ClusterRole \"cilium\" to ServiceAccount \"cilium/kube-system\""
  }
}

So above corresponds to the GET call we first do here: https://github.com/cilium/cilium/blob/master/pkg/k8s/watchers/endpointsynchronizer.go#L138

Then I can see right after, apiserver sees the following create event:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "f586f673-bb19-43de-a2aa-b4cd1038f317",
  "stage": "ResponseComplete",
  "requestURI": "/apis/cilium.io/v2/namespaces/default/ciliumendpoints",
  "verb": "create",
  "user": {
    "username": "system:serviceaccount:kube-system:cilium",
    "uid": "198bdd5d-5f58-4100-8747-c4effed81854",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:kube-system",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "10.169.72.19"
  ],
  "userAgent": "cilium-agent/v0.0.0 (linux/amd64) kubernetes/$Format",
  "objectRef": {
    "resource": "ciliumendpoints",
    "namespace": "default",
    "name": "nginx-5f9kf",
    "apiGroup": "cilium.io",
    "apiVersion": "v2"
  },
  "responseStatus": {
    "metadata": {},
    "code": 201
  },
  "requestReceivedTimestamp": "2021-09-02T15:48:08.584419Z",
  "stageTimestamp": "2021-09-02T15:48:08.591432Z",
  "annotations": {
    "authentication.k8s.io/legacy-token": "system:serviceaccount:kube-system:cilium",
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"cilium\" of ClusterRole \"cilium\" to ServiceAccount \"cilium/kube-system\""
  }
}

which is when we create the CEP with https://github.com/cilium/cilium/blob/master/pkg/k8s/watchers/endpointsynchronizer.go#L168

So far so good.

But then we got the following PATCH request right after, which got me a bit confused...

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "auditID": "0ab8e668-7c3e-46ce-a62d-b1ec3a2f188e",
  "stage": "ResponseComplete",
  "requestURI": "/apis/cilium.io/v2/namespaces/default/ciliumendpoints/nginx-5f9kf",
  "verb": "patch",
  "user": {
    "username": "system:serviceaccount:kube-system:cilium",
    "uid": "198bdd5d-5f58-4100-8747-c4effed81854",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:kube-system",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "10.169.72.19"
  ],
  "userAgent": "cilium-agent/v0.0.0 (linux/amd64) kubernetes/$Format",
  "objectRef": {
    "resource": "ciliumendpoints",
    "namespace": "default",
    "name": "nginx-5f9kf",
    "apiGroup": "cilium.io",
    "apiVersion": "v2"
  },
  "responseStatus": {
    "metadata": {},
    "code": 200
  },
  "requestReceivedTimestamp": "2021-09-02T15:48:08.594059Z",
  "stageTimestamp": "2021-09-02T15:48:08.599600Z",
  "annotations": {
    "authentication.k8s.io/legacy-token": "system:serviceaccount:kube-system:cilium",
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"cilium\" of ClusterRole \"cilium\" to ServiceAccount \"cilium/kube-system\""
  }
}

I think the next step would be to set the audit log level to Request so that we can capture what is being created in the apiserver and what is being patched. Not sure if you still have the cluster around to capture that, @vincentmli?

@vincentmli
Contributor Author

I think the next step would be to set the audit log level to Request so that we can capture what is being created in the apiserver and what is being patched. Not sure if you still have the cluster around to capture that, @vincentmli?

Below is the audit policy setting I used for the apiserver; it has the Request level set already. Can you take a look and see what needs to be changed?
https://raw.githubusercontent.com/kubernetes/website/main/content/en/examples/audit/audit-policy.yaml

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log", "pods/status" at Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]

  # Don't log requests to a configmap called "controller-leader"
  - level: None
    resources:
    - group: ""
      resources: ["configmaps"]
      resourceNames: ["controller-leader"]

  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]

  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"

  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]

  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]

  # Log all other resources in core and extensions at the Request level.
  - level: Request
    resources:
    - group: "" # core API group
    - group: "extensions" # Version of group should NOT be included.

  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

@Weil0ng
Contributor

Weil0ng commented Sep 3, 2021

Yeah, you are right, it's already at "Request". Hmm... then I wonder if the object being created is missing its status fields...

Can you help take a look at #17192 (comment) @aanm ? Specifically where does that PATCH come from? Is that expected?

@Weil0ng
Contributor

Weil0ng commented Sep 6, 2021

friendly ping :) @aanm Can you help take a look at #17192 (comment)? Specifically where does that PATCH come from? Is that expected?

@Weil0ng
Contributor

Weil0ng commented Sep 7, 2021

@vincentmli If you don't mind giving this another go, can you please edit the last part of your audit policy to record non-core requests at the "Request" level too? I suspect the reason we did not see the request body is that the ciliumendpoints requests were captured by this rule.

In short, can you change the last item in your audit policy to the following and reproduce the issue?

# A catch-all rule to log all other requests at the Request level.
  - level: Request
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

@Weil0ng Weil0ng self-assigned this Sep 7, 2021
@vincentmli
Contributor Author

cilium-sysdump-20210907-202207.zip
audit.json.txt

@Weil0ng here you go

# kubectl get po -o wide
NAME          READY   STATUS    RESTARTS   AGE     IP          NODE            NOMINATED NODE   READINESS GATES
nginx-444p2   1/1     Running   0          6m21s   10.0.1.21   cilium-demo-3   <none>           <none>

@Weil0ng
Contributor

Weil0ng commented Sep 7, 2021

Thank you! Now we can clearly see that the CEP was created with all fields populated correctly; see below, it has all the addresses and the identity as expected:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Request",
  "auditID": "83aec5e6-1274-40a7-b856-ca961890ffbf",
  "stage": "ResponseComplete",
  "requestURI": "/apis/cilium.io/v2/namespaces/default/ciliumendpoints",
  "verb": "create",
  "user": {
    "username": "system:serviceaccount:kube-system:cilium",
    "uid": "65345e8d-cf79-40b5-8b97-890fce9050c1",
    "groups": [
      "system:serviceaccounts",
      "system:serviceaccounts:kube-system",
      "system:authenticated"
    ]
  },
  "sourceIPs": [
    "10.169.72.19"
  ],
  "userAgent": "cilium-agent/v0.0.0 (linux/amd64) kubernetes/$Format",
  "objectRef": {
    "resource": "ciliumendpoints",
    "namespace": "default",
    "name": "nginx-444p2",
    "apiGroup": "cilium.io",
    "apiVersion": "v2"
  },
  "responseStatus": {
    "metadata": {},
    "code": 201
  },
  "requestObject": {
    "apiVersion": "cilium.io/v2",
    "kind": "CiliumEndpoint",
    "metadata": {
      "creationTimestamp": null,
      "labels": {
        "app": "nginx"
      },
      "name": "nginx-444p2",
      "ownerReferences": [
        {
          "apiVersion": "v1",
          "blockOwnerDeletion": true,
          "kind": "Pod",
          "name": "nginx-444p2",
          "uid": "4530d48a-2da9-4729-ac78-639dcd3b1ba8"
        }
      ]
    },
    "status": {
      "encryption": {},
      "external-identifiers": {
        "container-id": "47de2a8c27f1e2dacf528884bf6e42c6e400af2cd7bc122704a9eb76c2c6b09c",
        "k8s-namespace": "default",
        "k8s-pod-name": "nginx-444p2",
        "pod-name": "default/nginx-444p2"
      },
      "id": 1940,
      "identity": {
        "id": 34093,
        "labels": [
          "k8s:app=nginx",
          "k8s:io.cilium.k8s.policy.cluster=default",
          "k8s:io.cilium.k8s.policy.serviceaccount=default",
          "k8s:io.kubernetes.pod.namespace=default"
        ]
      },
      "networking": {
        "addressing": [
          {
            "ipv4": "10.0.1.21"
          }
        ],
        "node": "10.169.72.19"
      },
      "state": "ready"
    }
  },
  "requestReceivedTimestamp": "2021-09-07T20:20:45.048681Z",
  "stageTimestamp": "2021-09-07T20:20:45.057697Z",
  "annotations": {
    "authentication.k8s.io/legacy-token": "system:serviceaccount:kube-system:cilium",
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"cilium\" of ClusterRole \"cilium\" to ServiceAccount \"cilium/kube-system\""
  }
}

There's also a PATCH request right after this with all fields populated with identical values...

So in summary, so far we've confirmed:

  • The CEP was patched with a correct object (fully populated) as expected
  • The code path that was actually run is also as expected

Questions that still need an answer:

  • Why is there a PATCH request right after with identical values?
  • Why does it then somehow end up in a state without its status fields?

@aanm
Member

aanm commented Sep 15, 2021

@Weil0ng

  1. The patch comes from here
    localCEP, err = ciliumClient.CiliumEndpoints(namespace).Patch(
    ctx, podName,
    types.JSONPatchType,
    createStatusPatch,
    meta_v1.PatchOptions{})
  2. One can't create an object with the status fields populated; it needs to create the object without those fields and then update the status afterwards (see the illustration below).

Let me know if that helps.
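
For illustration only (not from this thread): when a CRD declares status as a subresource, the API server ignores .status on create/patch requests against the main resource, and status changes have to go through the /status endpoint, e.g. with kubectl >= 1.24:

# .status is silently dropped when the status subresource is enabled
kubectl patch cep nginx-444p2 --type=merge -p '{"status":{"state":"ready"}}'

# goes through the /status endpoint instead
kubectl patch cep nginx-444p2 --subresource=status --type=merge -p '{"status":{"state":"ready"}}'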

@Weil0ng
Contributor

Weil0ng commented Sep 15, 2021

Let me know if that helps.

Yes this is the old model with status being a subresource, but in master we have merged the change that makes status a top-level plain field instead of a subresource, and the code path I think should be taken is that:

@Weil0ng
Contributor

Weil0ng commented Oct 28, 2021

Coming back to this, I wonder @vincentmli if you still experience this issue? I cannot repro this on GKE at least (using a master Cilium image). If you can still repro this, can you please provide the output of kubectl get crd ciliumendpoints -o yaml? I wonder if the problem is that the CRD is somehow still using the status subresource...

@vincentmli
Contributor Author

Yes, the problem is still here.

kubectl get crd ciliumendpoints.cilium.io -o yaml


apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  creationTimestamp: "2021-06-14T22:12:59Z"
  generation: 1
  labels:
    io.cilium.k8s.crd.schema.version: 1.23.3
  name: ciliumendpoints.cilium.io
  resourceVersion: "13829675"
  uid: 0ef89eac-8dbd-488a-bbc5-cd8a00918c8d
spec:
  conversion:
    strategy: None
  group: cilium.io
  names:
    kind: CiliumEndpoint
    listKind: CiliumEndpointList
    plural: ciliumendpoints
    shortNames:
    - cep
    - ciliumep
    singular: ciliumendpoint
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - description: Cilium endpoint id
      jsonPath: .status.id
      name: Endpoint ID
      type: integer
    - description: Cilium identity id
      jsonPath: .status.identity.id
      name: Identity ID
      type: integer
    - description: Ingress enforcement in the endpoint
      jsonPath: .status.policy.ingress.enforcing
      name: Ingress Enforcement
      type: boolean
    - description: Egress enforcement in the endpoint
      jsonPath: .status.policy.egress.enforcing
      name: Egress Enforcement
      type: boolean
    - description: Status of visibility policy in the endpoint
      jsonPath: .status.visibility-policy-status
      name: Visibility Policy
      type: string
    - description: Endpoint current state
      jsonPath: .status.state
      name: Endpoint State
      type: string
    - description: Endpoint IPv4 address
      jsonPath: .status.networking.addressing[0].ipv4
      name: IPv4
      type: string
    - description: Endpoint IPv6 address
      jsonPath: .status.networking.addressing[0].ipv6
      name: IPv6
      type: string
    name: v2
    schema:
      openAPIV3Schema:
        description: CiliumEndpoint is the status of a Cilium policy rule.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          status:
            description: EndpointStatus is the status of a Cilium endpoint.
            properties:
              controllers:
                description: Controllers is the list of failing controllers for this
                  endpoint.
                items:
                  description: ControllerStatus is the status of a failing controller.
                  properties:
                    configuration:
                      description: Configuration is the controller configuration
                      properties:
                        error-retry:
                          description: Retry on error
                          type: boolean
                        error-retry-base:
                          description: 'Base error retry back-off time Format: duration'
                          format: int64
                          type: integer
                        interval:
                          description: 'Regular synchronization interval Format: duration'
                          format: int64
                          type: integer
                      type: object
                    name:
                      description: Name is the name of the controller
                      type: string
                    status:
                      description: Status is the status of the controller
                      properties:
                        consecutive-failure-count:
                          format: int64
                          type: integer
                        failure-count:
                          format: int64
                          type: integer
                        last-failure-msg:
                          type: string
                        last-failure-timestamp:
                          type: string
                        last-success-timestamp:
                          type: string
                        success-count:
                          format: int64
                          type: integer
                      type: object
                    uuid:
                      description: UUID is the UUID of the controller
                      type: string
                  type: object
                type: array
              encryption:
                description: Encryption is the encryption configuration of the node
                properties:
                  key:
                    description: Key is the index to the key to use for encryption
                      or 0 if encryption is disabled.
                    type: integer
                type: object
              external-identifiers:
                description: ExternalIdentifiers is a set of identifiers to identify
                  the endpoint apart from the pod name. This includes container runtime
                  IDs.
                properties:
                  container-id:
                    description: ID assigned by container runtime
                    type: string
                  container-name:
                    description: Name assigned to container
                    type: string
                  docker-endpoint-id:
                    description: Docker endpoint ID
                    type: string
                  docker-network-id:
                    description: Docker network ID
                    type: string
                  k8s-namespace:
                    description: K8s namespace for this endpoint
                    type: string
                  k8s-pod-name:
                    description: K8s pod name for this endpoint
                    type: string
                  pod-name:
                    description: K8s pod for this endpoint(Deprecated, use K8sPodName
                      and K8sNamespace instead)
                    type: string
                type: object
              health:
                description: Health is the overall endpoint & subcomponent health.
                properties:
                  bpf:
                    description: bpf
                    type: string
                  connected:
                    description: Is this endpoint reachable
                    type: boolean
                  overallHealth:
                    description: overall health
                    type: string
                  policy:
                    description: policy
                    type: string
                type: object
              id:
                description: ID is the cilium-agent-local ID of the endpoint.
                format: int64
                type: integer
              identity:
                description: Identity is the security identity associated with the
                  endpoint
                properties:
                  id:
                    description: ID is the numeric identity of the endpoint
                    format: int64
                    type: integer
                  labels:
                    description: Labels is the list of labels associated with the
                      identity
                    items:
                      type: string
                    type: array
                type: object
              log:
                description: Log is the list of the last few warning and error log
                  entries
                items:
                  description: "EndpointStatusChange Indication of a change of status
                    \n swagger:model EndpointStatusChange"
                  properties:
                    code:
                      description: 'Code indicate type of status change Enum: [ok
                        failed]'
                      type: string
                    message:
                      description: Status message
                      type: string
                    state:
                      description: state
                      type: string
                    timestamp:
                      description: Timestamp when status change occurred
                      type: string
                  type: object
                type: array
              named-ports:
                description: "NamedPorts List of named Layer 4 port and protocol pairs
                  which will be used in Network Policy specs. \n swagger:model NamedPorts"
                items:
                  description: "Port Layer 4 port / protocol pair \n swagger:model
                    Port"
                  properties:
                    name:
                      description: Optional layer 4 port name
                      type: string
                    port:
                      description: Layer 4 port number
                      type: integer
                    protocol:
                      description: 'Layer 4 protocol Enum: [TCP UDP ANY]'
                      type: string
                  type: object
                type: array
              networking:
                description: Networking is the networking properties of the endpoint.
                properties:
                  addressing:
                    description: IP4/6 addresses assigned to this Endpoint
                    items:
                      description: AddressPair is is a par of IPv4 and/or IPv6 address.
                      properties:
                        ipv4:
                          type: string
                        ipv6:
                          type: string
                      type: object
                    type: array
                  node:
                    description: NodeIP is the IP of the node the endpoint is running
                      on. The IP must be reachable between nodes.
                    type: string
                required:
                - addressing
                type: object
              policy:
                description: EndpointPolicy represents the endpoint's policy by listing
                  all allowed ingress and egress identities in combination with L4
                  port and protocol.
                properties:
                  egress:
                    description: EndpointPolicyDirection is the list of allowed identities
                      per direction.
                    properties:
                      adding:
                        description: Deprecated
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                      allowed:
                        description: AllowedIdentityList is a list of IdentityTuples
                          that species peers that are allowed.
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                      denied:
                        description: DenyIdentityList is a list of IdentityTuples
                          that species peers that are denied.
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                      enforcing:
                        type: boolean
                      removing:
                        description: Deprecated
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                    required:
                    - enforcing
                    type: object
                  ingress:
                    description: EndpointPolicyDirection is the list of allowed identities
                      per direction.
                    properties:
                      adding:
                        description: Deprecated
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                      allowed:
                        description: AllowedIdentityList is a list of IdentityTuples
                          that species peers that are allowed.
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                      denied:
                        description: DenyIdentityList is a list of IdentityTuples
                          that species peers that are denied.
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                      enforcing:
                        type: boolean
                      removing:
                        description: Deprecated
                        items:
                          description: IdentityTuple specifies a peer by identity,
                            destination port and protocol.
                          properties:
                            dest-port:
                              type: integer
                            identity:
                              format: int64
                              type: integer
                            identity-labels:
                              additionalProperties:
                                type: string
                              type: object
                            protocol:
                              type: integer
                          type: object
                        type: array
                    required:
                    - enforcing
                    type: object
                type: object
              state:
                description: State is the state of the endpoint.
                enum:
                - creating
                - waiting-for-identity
                - not-ready
                - waiting-to-regenerate
                - regenerating
                - restoring
                - ready
                - disconnecting
                - disconnected
                - invalid
                type: string
              visibility-policy-status:
                type: string
            type: object
        required:
        - metadata
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: CiliumEndpoint
    listKind: CiliumEndpointList
    plural: ciliumendpoints
    shortNames:
    - cep
    - ciliumep
    singular: ciliumendpoint
  conditions:
  - lastTransitionTime: "2021-06-14T22:12:59Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2021-06-14T22:12:59Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v2

@Weil0ng
Contributor

Weil0ng commented Oct 29, 2021

Thank you! Ah okay, I see it now, this CRD still has status as a subresource:

 subresources:
      status: {}

I checked the CRD schema version at the commit in your initial message; it should be 1.23.4:

CustomResourceDefinitionSchemaVersion = "1.23.4"

But somehow your cluster has 1.23.3, and I am actually not sure why 1.23.3 still has status as a subresource; I don't see it in the tree.


I wonder if you only upgraded your agent from an old version but not the operator? It seems like the CRD was not updated correctly, and that's why we are losing the status field.
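
A quick way to compare the deployed CRD schema version with the one the agent expects (illustrative; the label key is the one visible in the CRD dump above):

kubectl get crd ciliumendpoints.cilium.io \
  -o jsonpath='{.metadata.labels.io\.cilium\.k8s\.crd\.schema\.version}'

Here this prints 1.23.3, while the master agent at the bisected commit expects 1.23.4.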

@vincentmli
Contributor Author

Yes, I always run cilium/operator-generic:stable, so is that the problem?

@Weil0ng
Contributor

Weil0ng commented Oct 29, 2021

I think so. The CRDs are registered by the operator; if it does not restart, I suspect the CRDs were never updated. So eventually you have an outdated CRD (which has status as a subresource) together with the new agent code, which updates the CEP assuming it is using the new CRD (which has status as a plain field), and that's why we see the object fully populated in the API request but not persisted in etcd.

@Weil0ng
Contributor

Weil0ng commented Oct 29, 2021

As a mitigation, can you please try restarting/upgrading the operator and see if it fixes the problem?
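
For example (assuming the default deployment/container name cilium-operator in kube-system; the tag is a placeholder that should match the agent build):

kubectl -n kube-system set image deployment/cilium-operator \
  cilium-operator=quay.io/cilium/operator-generic:<tag-matching-the-agent>

A plain kubectl -n kube-system rollout restart deployment/cilium-operator only helps if the operator image already matches the agent, since it is the operator that registers the CRDs.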

@vincentmli
Contributor Author

@Weil0ng yes, building the operator image with make docker-operator-generic-image and running that operator image fixed the problem. Sorry, I did not realize the operator image needs to match the agent image build :)

@joestringer joestringer removed the release-blocker/1.11 This issue will prevent the release of the next version of Cilium. label Oct 29, 2021
@joestringer
Member

Cilium is expected to be deployed with one consistent version across all components, so that makes sense. Thanks for digging to the bottom of this! Closing out, as it seems the real solution is to upgrade the operator to match the Cilium version.
