
Host-local IP is not discovered after cilium-agent rebuild #25376

Closed
yanhongchang opened this issue May 11, 2023 · 5 comments
Labels

  • kind/bug: This is a bug in the Cilium logic.
  • kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
  • needs/triage: This issue requires triaging to establish severity and next steps.
  • sig/datapath: Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
  • stale: The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale.

Comments


yanhongchang commented May 11, 2023

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

1. Deploy Kubernetes with kubeadm.

2. Deploy Cilium with the following settings:

auto-direct-node-routes: "false"
bpf-lb-external-clusterip: "false"
bpf-lb-map-max: "65536"
bpf-map-dynamic-size-ratio: "0.0025"
bpf-policy-map-max: "16384"
cgroup-root: /run/cilium/cgroupv2
cilium-endpoint-gc-interval: 5m0s
cluster-id: ""
cluster-name: default
custom-cni-conf: "false"
debug: "false"
disable-cnp-status-updates: "true"
enable-auto-protect-node-port-range: "true"
enable-bandwidth-manager: "true"
enable-bpf-clock-probe: "true"
enable-bpf-masquerade: "true"
enable-endpoint-health-checking: "true"
enable-health-check-nodeport: "true"
enable-health-checking: "true"
enable-hubble: "true"
enable-ip-masq-agent: "true"
enable-ipv4: "true"
enable-ipv4-masquerade: "true"
enable-ipv6: "false"
enable-ipv6-masquerade: "true"
enable-l7-proxy: "true"
enable-local-redirect-policy: "false"
enable-metrics: "true"
enable-policy: default
enable-remote-node-identity: "true"
enable-session-affinity: "true"
enable-well-known-identities: "false"
enable-xt-socket-fallback: "true"
hubble-disable-tls: "false"
hubble-listen-address: :4244
hubble-metrics: dns drop tcp flow port-distribution icmp http
hubble-metrics-server: :9099
hubble-socket-path: /var/run/cilium/hubble.sock
hubble-tls-cert-file: /var/lib/cilium/tls/hubble/server.crt
hubble-tls-client-ca-files: /var/lib/cilium/tls/hubble/client-ca.crt
hubble-tls-key-file: /var/lib/cilium/tls/hubble/server.key
identity-allocation-mode: crd
install-iptables-rules: "true"
install-no-conntrack-iptables-rules: "false"
ipam: kubernetes
kube-proxy-replacement: strict
kube-proxy-replacement-healthz-bind-address: ""
monitor-aggregation: medium
monitor-aggregation-flags: all
monitor-aggregation-interval: 5s
native-routing-cidr: 10.52.0.0/16
node-port-bind-protection: "true"
operator-api-serve-addr: 127.0.0.1:9234
operator-prometheus-serve-addr: :6942
preallocate-bpf-maps: "false"
prometheus-serve-addr: :9098
proxy-prometheus-port: "9095"
sidecar-istio-proxy-image: cilium/istio_proxy
tunnel: disabled
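
For reference, assuming a standard Helm/DaemonSet install (not stated in the report), these settings live in the cilium-config ConfigMap in kube-system, and the values actually in effect can be double-checked with:

kubectl -n kube-system get configmap cilium-config -o yaml | grep -E 'tunnel|auto-direct-node-routes|enable-ipv4'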

3. Add a dummy device on the host:

ip link add yhctest type dummy
ip addr add 169.254.20.11 dev yhctest

Then exec into a container on the node and ping 169.254.20.11; the ping succeeds.

(screenshot)

cilium bpf endpoint list shows 169.254.20.11 at this point:
(screenshot)
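
Note for the discussion below: a dummy link created this way is administratively DOWN until it is explicitly brought up. A quick way to confirm its state:

ip -br link show yhctest   # expect state DOWN right after "ip link add"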

4. Rebuild the cilium-agent on this node:

docker stop c1a5819f14d8; docker rm c1a5819f14d8

The cilium-agent container is recreated automatically.
(screenshot)
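
One way to confirm the replacement agent is healthy before retesting (generic checks, assuming the standard k8s-app=cilium pod label; <cilium-pod> is a placeholder):

kubectl -n kube-system get pods -l k8s-app=cilium -o wide
kubectl -n kube-system exec <cilium-pod> -- cilium status --brief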

5. Repeat the test from step 3; this time the ping fails.
(screenshot)

cilium bpf endpoint list no longer shows 169.254.20.11:
(screenshot)

Cilium Version

Client: 1.10.4 2a46fd6 2021-09-01T12:58:41-07:00 go version go1.16.7 linux/amd64
Daemon: 1.10.4 2a46fd6 2021-09-01T12:58:41-07:00 go version go1.16.7 linux/amd64

Kernel Version

kernel: 5.10.23-4.al8.x86_64

Kubernetes Version

Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T13:00:45Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.8", GitCommit:"5575935422cc1cf5169dfc8847cb587aa47bac5a", GitTreeState:"clean", BuildDate:"2021-06-16T12:53:07Z", GoVersion:"go1.15.13", Compiler:"gc", Platform:"linux/amd64"}

Sysdump

No response

Relevant log output

No response

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@yanhongchang added the kind/bug, kind/community-report, and needs/triage labels on May 11, 2023
@tklauser (Member) commented:

There was a regression in the way link-local addresses are handled, which was recently fixed by #25298. Could you check whether that PR addresses your issue?

@yanhongchang (Author) commented:

OK, I will check it. Thanks!

@youngnick added the sig/datapath label on May 11, 2023
@yanhongchang (Author) commented:

> There was a regression in the way link-local addresses are handled, which was recently fixed by #25298. Could you check whether that PR addresses your issue?

I have checked https://github.com/cilium/cilium/pull/25298; I think my issue is different.
I have also re-read the datapath code around the syncEndpointsAndHostIPs() function. The comment there,

// ... also all down devices since they won't be reachable.

says that the IP of a down device is excluded. But when I add the device in step 3, the device is also down, so why does cilium bpf endpoint list show its IP at that point?
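
A follow-up experiment, not part of the original report, that would isolate whether the link's operational state is what changed across the restart: bring the dummy device up before rebuilding the agent, then re-check the map.

ip link set yhctest up   # dummy links start administratively DOWN
# rebuild the agent as in step 4, then, inside the agent container:
cilium bpf endpoint list | grep 169.254.20.11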

@github-actions bot commented:

This issue has been automatically marked as stale because it has not
had recent activity. It will be closed if no further activity occurs.

@github-actions bot added the stale label on Jul 11, 2023
@github-actions bot commented:

This issue has not seen any activity since it was marked stale.
Closing.

@github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jul 26, 2023