
CAPZ (no AKS) with Cilium - podCIDR communication does not work #5700

Open
@s0uky

Description

/kind bug

What steps did you take and what happened:
I'm able to build a CAPZ cluster on top of Azure instances (no AKS) with Cilium as the CNI. I have a private VNet that is not managed by CAPZ. The cluster has 1 control-plane node and 3 worker nodes.

The problem is that the defined cluster CIDR is not reachable between nodes, e.g.:

Cluster CIDR is 10.88.0.0/16
POD1 (IP 10.88.1.64) on WORKER1 (IP 10.117.21.118) can't reach POD2 (IP 10.88.4.77) on WORKER2 (IP 10.117.21.109)
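
A minimal reproduction of the failure looks like the following; the pod name and namespace here are hypothetical placeholders, not objects from the cluster:

  kubectl -n default exec pod1 -- ping -c 3 -W 2 10.88.4.77

The ping times out, which matches the cilium-health output shown further below.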

Communication is allowed through the NSG (Any-Any rule), and traffic within the same subnet should be open by default anyway.
Cilium was added as a cluster addon.

The RouteTable is created and reconciled by CAPZ and is also properly populated:

[screenshot of the CAPZ-managed route table]
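
For reference, the routes from the screenshot can also be listed with the Azure CLI; the resource group and route table names below are placeholders, and the expected entries are roughly one route per node podCIDR (10.88.X.0/24) with the node's private IP as the next hop:

  az network route-table route list \
    --resource-group <cluster-resource-group> \
    --route-table-name <capz-managed-route-table> \
    --output table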

The Cilium interfaces on the node are created properly:

4: cilium_net@cilium_host: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether ea:eb:9d:aa:2b:2b brd ff:ff:ff:ff:ff:ff
    inet6 fe80::e8eb:9dff:feaa:2b2b/64 scope link
       valid_lft forever preferred_lft forever
5: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 7e:02:15:79:39:e2 brd ff:ff:ff:ff:ff:ff
    inet 10.88.4.48/32 scope global cilium_host
       valid_lft forever preferred_lft forever
    inet6 fe80::7c02:15ff:fe79:39e2/64 scope link
       valid_lft forever preferred_lft forever
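
The host routing table can be inspected alongside the interfaces; with the native-routing settings shown further below (routing-mode: native, auto-direct-node-routes: "false"), Cilium itself is only expected to install the local podCIDR route via cilium_host, while routes to the other nodes' podCIDRs are left to the underlying Azure route table (a reading of the configuration below, not a confirmed diagnosis):

  ip route show | grep 10.88
  ip route get 10.88.4.77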

Cilium health status, taken from worker1 (localhost):

Name              IP              Node   Endpoints
  worker1-machinedeployment-9qr7m-qc56f (localhost):
    Host connectivity to 10.117.21.118:
      ICMP to stack:   OK, RTT=125.694µs
      HTTP to agent:   OK, RTT=221.314µs
    Endpoint connectivity to 10.88.1.64:
      ICMP to stack:   OK, RTT=151.943µs
      HTTP to agent:   OK, RTT=525.348µs
  worker2-machinedeployment-9qr7m-72sfk:
    Host connectivity to 10.117.21.109:
      ICMP to stack:   OK, RTT=719.321µs
      HTTP to agent:   OK, RTT=424.48µs
    Endpoint connectivity to 10.88.4.77:
      ICMP to stack:   Connection timed out
      HTTP to agent:   Get "http://10.88.4.77:4240/hello": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  worker3-machinedeployment-9qr7m-7g6ln:
    Host connectivity to 10.117.21.10:
      ICMP to stack:   OK, RTT=692.478µs
      HTTP to agent:   OK, RTT=418.877µs
    Endpoint connectivity to 10.88.2.199:
      ICMP to stack:   Connection timed out
      HTTP to agent:   Get "http://10.88.2.199:4240/hello": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  master1-z9z9s:
    Host connectivity to 10.117.20.100:
      ICMP to stack:   OK, RTT=4.251332ms
      HTTP to agent:   OK, RTT=491.527µs
    Endpoint connectivity to 10.88.0.110:
      ICMP to stack:   Connection timed out
      HTTP to agent:   Get "http://10.88.0.110:4240/hello": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
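
For reproducibility, output of this shape comes from cilium-health inside the agent pod; assuming the chart was installed into kube-system with the default DaemonSet name, something like:

  kubectl -n kube-system exec ds/cilium -c cilium-agent -- cilium-health status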

Cilium configuration:

      identity-allocation-mode: crd
      identity-heartbeat-timeout: 30m0s
      identity-gc-interval: 15m0s
      cilium-endpoint-gc-interval: 5m0s
      nodes-gc-interval: 5m0s
      debug: "true"
      enable-policy: default
      prometheus-serve-addr: :9962
      controller-group-metrics: write-cni-file sync-host-ips sync-lb-maps-with-k8s-services
      proxy-prometheus-port: "9964"
      operator-prometheus-serve-addr: :9963
      enable-metrics: "true"
      enable-ipv4: "true"
      enable-ipv6: "false"
      custom-cni-conf: "false"
      enable-bpf-clock-probe: "false"
      enable-bpf-tproxy: "false"
      monitor-aggregation: medium
      monitor-aggregation-interval: 5s
      monitor-aggregation-flags: all
      bpf-map-dynamic-size-ratio: "0.0025"
      enable-host-legacy-routing: "false"
      bpf-policy-map-max: "16384"
      bpf-lb-map-max: "65536"
      bpf-lb-external-clusterip: "true"
      bpf-events-drop-enabled: "true"
      bpf-events-policy-verdict-enabled: "true"
      bpf-events-trace-enabled: "true"
      preallocate-bpf-maps: "false"
      cluster-name: default
      cluster-id: "0"
      routing-mode: native
      service-no-backend-response: reject
      enable-l7-proxy: "true"
      enable-ipv4-masquerade: "true"
      enable-ipv4-big-tcp: "false"
      enable-ipv6-big-tcp: "false"
      enable-ipv6-masquerade: "true"
      enable-tcx: "true"
      datapath-mode: veth
      enable-bpf-masquerade: "true"
      enable-masquerade-to-route-source: "false"
      enable-xt-socket-fallback: "true"
      install-no-conntrack-iptables-rules: "false"
      auto-direct-node-routes: "false"
      direct-node-routes-skip-unreachable: "true"
      direct-routing-skip-unreachable: "false"
      enable-local-redirect-policy: "false"
      ipv4-native-routing-cidr: 10.88.0.0/16
      enable-runtime-device-detection: "true"
      kube-proxy-replacement: "true"
      kube-proxy-replacement-healthz-bind-address: 0.0.0.0:10256
      bpf-lb-sock: "false"
      bpf-lb-sock-terminate-pod-connections: "false"
      enable-health-check-nodeport: "true"
      enable-health-check-loadbalancer-ip: "false"
      node-port-bind-protection: "true"
      enable-auto-protect-node-port-range: "true"
      bpf-lb-mode: dsr
      bpf-lb-acceleration: disabled
      enable-svc-source-range-check: "true"
      enable-l2-neigh-discovery: "true"
      arping-refresh-period: 30s
      k8s-require-ipv4-pod-cidr: "false"
      k8s-require-ipv6-pod-cidr: "false"
      enable-k8s-networkpolicy: "true"
      write-cni-conf-when-ready: /host/etc/cni/net.d/05-cilium.conflist
      cni-exclusive: "true"
      cni-log-file: /var/run/cilium/cilium-cni.log
      enable-endpoint-health-checking: "true"
      enable-health-checking: "true"
      enable-well-known-identities: "false"
      enable-node-selector-labels: "false"
      synchronize-k8s-nodes: "true"
      operator-api-serve-addr: 127.0.0.1:9234
      enable-hubble: "true"
      hubble-socket-path: /var/run/cilium/hubble.sock
      hubble-metrics-server: :9965
      hubble-metrics-server-enable-tls: "false"
      hubble-metrics: dns drop tcp flow icmp http
      enable-hubble-open-metrics: "false"
      hubble-export-file-max-size-mb: "10"
      hubble-export-file-max-backups: "5"
      hubble-listen-address: :4244
      hubble-disable-tls: "true"
      ipam: kubernetes
      ipam-cilium-node-update-rate: 15s
      cluster-pool-ipv4-cidr: 10.88.0.0/16
      cluster-pool-ipv4-mask-size: "24"
      egress-gateway-reconciliation-trigger-interval: 1s
      enable-vtep: "false"
      vtep-endpoint: ""
      vtep-cidr: ""
      vtep-mask: ""
      vtep-mac: ""
      procfs: /host/proc
      bpf-root: /sys/fs/bpf
      cgroup-root: /run/cilium/cgroupv2
      enable-k8s-terminating-endpoint: "true"
      enable-sctp: "false"
      k8s-client-qps: "10"
      k8s-client-burst: "20"
      remove-cilium-node-taints: "true"
      set-cilium-node-taints: "true"
      set-cilium-is-up-condition: "true"
      unmanaged-pod-watcher-interval: "15"
      dnsproxy-enable-transparent-mode: "true"
      dnsproxy-socket-linger-timeout: "10"
      tofqdns-dns-reject-response-code: refused
      tofqdns-enable-dns-compression: "true"
      tofqdns-endpoint-max-ip-per-hostname: "50"
      tofqdns-idle-connection-grace-period: 0s
      tofqdns-max-deferred-connection-deletes: "10000"
      tofqdns-proxy-response-max-delay: 100ms
      agent-not-ready-taint-key: node.cilium.io/agent-not-ready
      mesh-auth-enabled: "true"
      mesh-auth-queue-size: "1024"
      mesh-auth-rotated-identities-queue-size: "1024"
      mesh-auth-gc-interval: 5m0s
      proxy-xff-num-trusted-hops-ingress: "0"
      proxy-xff-num-trusted-hops-egress: "0"
      proxy-connect-timeout: "2"
      proxy-max-requests-per-connection: "0"
      proxy-max-connection-duration-seconds: "0"
      proxy-idle-timeout-seconds: "60"
      external-envoy-proxy: "false"
      envoy-base-id: "0"
      envoy-keep-cap-netbindservice: "false"
      max-connected-clusters: "255"
      clustermesh-enable-endpoint-sync: "false"
      clustermesh-enable-mcs-api: "false"
      nat-map-stats-entries: "32"
      nat-map-stats-interval: 30s
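
For context, the routing-related subset of this ConfigMap corresponds to Helm values roughly like the ones below. This is an illustrative reconstruction, since the actual values file is not part of the issue:

  helm upgrade --install cilium cilium/cilium --namespace kube-system \
    --set routingMode=native \
    --set ipv4NativeRoutingCIDR=10.88.0.0/16 \
    --set autoDirectNodeRoutes=false \
    --set ipam.mode=kubernetes \
    --set kubeProxyReplacement=true \
    --set bpf.masquerade=true \
    --set loadBalancer.mode=dsr

With routingMode=native and autoDirectNodeRoutes=false, Cilium expects the underlying network (here, the Azure route table) to carry pod traffic for 10.88.0.0/16 between nodes.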

What did you expect to happen:
Pod-to-pod communication within the cluster podCIDR (10.88.0.0/16) should work across nodes.

Environment:

  • cluster-api-provider-azure version: v1.19.4
  • Kubernetes version (kubectl version): 1.31.6
  • OS (e.g. from /etc/os-release): Ubuntu 24
  • Tested with Cilium v1.16.2 and also v1.17.4

Based on discussion from Slack channel: https://kubernetes.slack.com/archives/CEX9HENG7/p1748952223222099
