CI: controller bpf-map-sync-cilium_lb_affinity_match is failing #12118

Closed
jrajahalme opened this issue Jun 16, 2020 · 7 comments · Fixed by #12134
Labels
area/daemon Impacts operation of the Cilium daemon. kind/bug This is a bug in the Cilium logic.

Comments

@jrajahalme
Member

The test "Endpoint can still connect while Cilium is not running" failed on test-gke:

11:54:38  18:54:37 STEP: Cilium is not ready yet: controllers are failing: cilium-agent 'cilium-d884n': controller bpf-map-sync-cilium_lb_affinity_match is failing: Exitcode: 0 
11:54:38  Stdout:
11:54:38   	 KVStore:                Ok   Disabled
11:54:38  	 Kubernetes:             Ok   1.14+ (v1.14.10-gke.36) [linux/amd64]
11:54:38  	 Kubernetes APIs:        ["CustomResourceDefinition", "cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Endpoint", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "networking.k8s.io/v1::NetworkPolicy"]
11:54:38  	 KubeProxyReplacement:   Probe   []   [SessionAffinity]
11:54:38  	 Cilium:                 Ok      OK
11:54:38  	 NodeMonitor:            Listening for events on 4 CPUs with 64x4096 of shared memory
11:54:38  	 Cilium health daemon:   Ok   
11:54:38  	 IPAM:                   IPv4: 2/255 allocated from 10.4.47.0/24, 
11:54:38  	 Masquerading:           IPTables
11:54:38  	 Controller Status:      17/18 healthy
11:54:38  	   Name                                    Last success   Last error   Count   Message
11:54:38  	   bpf-map-sync-cilium_lb_affinity_match   never          15s ago      23      4 map sync errors   
11:54:38  	   cilium-health-ep                        26s ago        never        0       no error            
11:54:38  	   dns-garbage-collector-job               27s ago        never        0       no error            
11:54:38  	   endpoint-1949-regeneration-recovery     never          never        0       no error            
11:54:38  	   endpoint-827-regeneration-recovery      never          never        0       no error            
11:54:38  	   k8s-heartbeat                           31s ago        never        0       no error            
11:54:38  	   mark-k8s-node-as-available              4m27s ago      never        0       no error            
11:54:38  	   metricsmap-bpf-prom-sync                6s ago         never        0       no error            
11:54:38  	   resolve-identity-1949                   4m26s ago      never        0       no error            
11:54:38  	   resolve-identity-827                    4m27s ago      never        0       no error            
11:54:38  	   sync-endpoints-and-host-ips             27s ago        never        0       no error            
11:54:38  	   sync-lb-maps-with-k8s-services          4m27s ago      never        0       no error            
11:54:38  	   sync-policymap-1949                     25s ago        never        0       no error            
11:54:38  	   sync-policymap-827                      26s ago        never        0       no error            
11:54:38  	   sync-to-k8s-ciliumendpoint (1949)       6s ago         never        0       no error            
11:54:38  	   sync-to-k8s-ciliumendpoint (827)        7s ago         never        0       no error            
11:54:38  	   template-dir-watcher                    never          never        0       no error            
11:54:38  	   update-k8s-node-annotations             4m28s ago      never        0       no error            
11:54:38  	 Proxy Status:     OK, ip 10.4.47.246, 0 redirects active on ports 10000-20000
11:54:38  	 Hubble:           Ok              Current/Max Flows: 336/4096 (8.20%), Flows/s: 1.27   Metrics: Disabled
11:54:38  	 Cluster health:   2/2 reachable   (2020-06-16T18:54:29Z)
@jrajahalme jrajahalme added the area/CI Continuous Integration testing issue or flake label Jun 16, 2020
@jrajahalme
Member Author

Same failure on both Cilium nodes:

11:55:27  ⚠️  Cilium agent 'cilium-d884n': Status: Ok  Health: Ok Nodes "" ContinerRuntime:  Kubernetes: Ok KVstore: Ok Controllers: Total 18 Failed 1
11:55:27  Failed controllers:
11:55:27   controller bpf-map-sync-cilium_lb_affinity_match failure '4 map sync errors'
11:55:27  ⚠️  Cilium agent 'cilium-nxwnv': Status: Ok  Health: Ok Nodes "" ContinerRuntime:  Kubernetes: Ok KVstore: Ok Controllers: Total 18 Failed 1
11:55:27  Failed controllers:
11:55:27   controller bpf-map-sync-cilium_lb_affinity_match failure '4 map sync errors'

@jrajahalme
Member Author

uname -a output from one of the Cilium pods:

Linux gke-jarno-cluster-2-default-pool-e9824c89-z065 4.14.138+ #1 SMP Tue Sep 3 02:58:08 PDT 2019 x86_64 x86_64 x86_64 GNU/Linux

@jrajahalme
Member Author

Build results for the above: https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/1687/

@brb
Member

brb commented Jun 17, 2020

It seems that the affinity match map has not been created:

2020-06-16T18:50:08.71975432Z level=warning msg="Unable to add entry to affinity match map" backendID=5 error="Unable to get object /sys/fs/bpf/tc/globals/cilium_lb_affinity_match: no such file or directory" serviceID=11 subsys=service
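
A quick way to confirm this on a node is to check whether the pin path from the warning exists. The following is only a minimal sketch: the helper name pinExists is hypothetical and not part of Cilium, and the path is copied from the log line above.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// Pin path taken verbatim from the warning in the agent log.
const affinityMatchPin = "/sys/fs/bpf/tc/globals/cilium_lb_affinity_match"

// pinExists is a hypothetical helper: it reports whether anything exists at
// path. A missing file yields (false, nil); any other stat error is returned.
func pinExists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return false, err
}

func main() {
	ok, err := pinExists(affinityMatchPin)
	if err != nil {
		fmt.Println("stat failed:", err)
		return
	}
	fmt.Println("affinity match map pinned:", ok)
}
```

This only checks that the pin file is present; it does not verify that the file is actually a BPF map.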

@brb brb added kind/bug This is a bug in the Cilium logic. area/daemon Impacts operation of the Cilium daemon. and removed area/CI Continuous Integration testing issue or flake labels Jun 17, 2020
@brb
Member

brb commented Jun 17, 2020

OK, this is a bug: initMaps() is called before initKubeProxyReplacementOptions(). In your case, the latter enabled session affinity, but the related maps can only be created by initMaps(), which had already run. Therefore the affinity maps were never created, hence the controller failure.
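
The ordering problem can be illustrated with a small, self-contained Go sketch. Everything here is a simplified stand-in and an assumption for illustration: the config type, probeOptions(), and startup() are hypothetical, and this initMaps() only mimics the behaviour described above rather than reproducing the real Cilium function.

```go
package main

import "fmt"

// config is a hypothetical stand-in for the agent options.
type config struct {
	sessionAffinity bool
}

// initMaps mimics the described behaviour: the affinity map is created only
// if session affinity is already enabled when this function runs.
func initMaps(cfg *config, maps map[string]bool) {
	if cfg.sessionAffinity {
		maps["cilium_lb_affinity_match"] = true
	}
}

// probeOptions stands in for initKubeProxyReplacementOptions(): in probe
// mode it may switch session affinity on.
func probeOptions(cfg *config) {
	cfg.sessionAffinity = true
}

// startup runs both steps in the chosen order and reports whether the
// affinity map exists afterwards.
func startup(mapsFirst bool) bool {
	cfg := &config{}
	maps := map[string]bool{}
	if mapsFirst {
		initMaps(cfg, maps) // option is still off, so no affinity map
		probeOptions(cfg)   // option is enabled too late
	} else {
		probeOptions(cfg)
		initMaps(cfg, maps)
	}
	return maps["cilium_lb_affinity_match"]
}

func main() {
	fmt.Println("maps before probe, affinity map created:", startup(true))
	fmt.Println("probe before maps, affinity map created:", startup(false))
}
```

With the maps initialised first, the feature ends up enabled without its map, which matches the "no such file or directory" warning. Running the probe first is one possible ordering that avoids this in the sketch; it is not necessarily how the actual fix was implemented.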
