Error deploying Cilium: Binary "cilium-envoy" cannot be executed #23640

Closed
2 tasks done
fmunteanu opened this issue Feb 8, 2023 · 26 comments
Labels
area/proxy: Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers.
kind/community-report: This was reported by a user in the Cilium community, e.g. via Slack.
kind/question: Frequently asked questions & answers. This issue will be linked from the documentation's FAQ.
stale: The stale bot thinks this issue is old. Add the "pinned" label to prevent this from becoming stale.

Comments

@fmunteanu

fmunteanu commented Feb 8, 2023

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

I'm testing the Cilium installation on a fresh K3S high-availability cluster with two control-plane nodes and one worker node. My goal is to run the cluster with Cilium, MetalLB and Longhorn. This is a fresh install done with Ansible on Raspberry Pi OS Lite 64-bit; I will have six nodes in total once all deployment issues are resolved.

K3S apollo.lan control plane configuration:

# cat /etc/rancher/k3s/config.yaml
cluster-init: true
disable:
  - local-storage
  - servicelb
disable-cloud-controller: true
disable-network-policy: true
flannel-backend: none
node-taint: node.cilium.io/agent-not-ready:NoExecute

K3S boreas.lan control plane configuration:

# cat /etc/rancher/k3s/config.yaml
disable:
  - local-storage
  - servicelb
disable-cloud-controller: true
disable-network-policy: true
flannel-backend: none
node-taint: node.cilium.io/agent-not-ready:NoExecute
server: https://apollo.lan:6443
token: [removed]

K3S cerus.lan node configuration:

# cat /etc/rancher/k3s/config.yaml
node-taint: node.cilium.io/agent-not-ready:NoExecute
server: https://apollo.lan:6443
token: [removed]
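For reference: K3S reads /etc/rancher/k3s/config.yaml automatically, so with the files above the nodes can be brought up with the stock install script. The exact invocation below is an assumption, not taken from this report:

# On apollo.lan and boreas.lan (servers; cluster-init/server/token come from config.yaml)
curl -sfL https://get.k3s.io | sh -s - server

# On cerus.lan (agent; server/token come from config.yaml)
curl -sfL https://get.k3s.io | sh -s - agent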

After initial cluster deployment, I see all pods pending:

# kubectl get pods -A
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE
kube-system   coredns-597584b69b-kbq9p          0/1     Pending   0          3m27s
kube-system   helm-install-traefik-9pr8d        0/1     Pending   0          3m27s
kube-system   helm-install-traefik-crd-tgv85    0/1     Pending   0          3m27s
kube-system   metrics-server-5f9f776df5-l5xgn   0/1     Pending   0          3m27s

Next, I installed the cilium CLI binary on both control planes and deployed the Cilium Helm chart:

# kubectl get pods -A
NAMESPACE     NAME                               READY   STATUS              RESTARTS      AGE
kube-system   cilium-6mdw8                       0/1     CrashLoopBackOff    2 (12s ago)   65s
kube-system   cilium-ngrmw                       0/1     CrashLoopBackOff    2 (9s ago)    65s
kube-system   cilium-operator-5f7d7976fd-l5z44   1/1     Running             0             65s
kube-system   cilium-z9qbg                       0/1     CrashLoopBackOff    2 (5s ago)    65s
kube-system   coredns-597584b69b-kbq9p           0/1     ContainerCreating   0             56m
kube-system   helm-install-traefik-9pr8d         0/1     ContainerCreating   0             56m
kube-system   helm-install-traefik-crd-tgv85     0/1     ContainerCreating   0             56m
kube-system   metrics-server-5f9f776df5-l5xgn    0/1     ContainerCreating   0             56m

The Cilium agent pods are in CrashLoopBackOff:

level=fatal msg="Envoy: Binary \"cilium-envoy\" cannot be executed" error="signal: aborted (core dumped)" subsys=envoy-manager

I followed your Helm installation instructions; I'm not sure what I'm missing.
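For reference, the Helm install from the Cilium docs boils down to roughly the following; the exact values used are not recorded in this report, so treat this as a sketch:

helm repo add cilium https://helm.cilium.io/
helm repo update
helm install cilium cilium/cilium --version 1.12.6 --namespace kube-system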

Cilium Version

# cilium version
cilium-cli: v0.12.12 compiled with go1.19.4 on linux/arm
cilium image (default): v1.12.5
cilium image (stable): v1.12.6
cilium image (running): v1.12.6

Kernel Version

# uname -a
Linux apollo 5.15.84-v8+ #1613 SMP PREEMPT Thu Jan 5 12:03:08 GMT 2023 aarch64 GNU/Linux

Kubernetes Version

# kubectl version --short
Flag --short has been deprecated, and will be removed in the future. The --short output will become the default.
Client Version: v1.25.6+k3s1
Kustomize Version: v4.5.7
Server Version: v1.25.6+k3s1

Sysdump

cilium-sysdump-20230208-160817.zip

Relevant log output

# kubectl logs cilium-6mdw8 -n kube-system
Defaulted container "cilium-agent" out of: cilium-agent, mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init)
level=info msg="Started gops server" address="127.0.0.1:9890" subsys=daemon
level=info msg="Memory available for map entries (0.003% of 8190787584B): 20476968B" subsys=config
level=info msg="option bpf-ct-global-tcp-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-ct-global-any-max set by dynamic sizing to 65536" subsys=config
level=info msg="option bpf-nat-global-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-neigh-global-max set by dynamic sizing to 131072" subsys=config
level=info msg="option bpf-sock-rev-map-max set by dynamic sizing to 65536" subsys=config
level=info msg="  --agent-health-port='9879'" subsys=daemon
level=info msg="  --agent-labels=''" subsys=daemon
level=info msg="  --agent-not-ready-taint-key='node.cilium.io/agent-not-ready'" subsys=daemon
level=info msg="  --allocator-list-timeout='3m0s'" subsys=daemon
level=info msg="  --allow-icmp-frag-needed='true'" subsys=daemon
level=info msg="  --allow-localhost='auto'" subsys=daemon
level=info msg="  --annotate-k8s-node='false'" subsys=daemon
level=info msg="  --api-rate-limit=''" subsys=daemon
level=info msg="  --arping-refresh-period='30s'" subsys=daemon
level=info msg="  --auto-create-cilium-node-resource='true'" subsys=daemon
level=info msg="  --auto-direct-node-routes='false'" subsys=daemon
level=info msg="  --bgp-announce-lb-ip='false'" subsys=daemon
level=info msg="  --bgp-announce-pod-cidr='false'" subsys=daemon
level=info msg="  --bgp-config-path='/var/lib/cilium/bgp/config.yaml'" subsys=daemon
level=info msg="  --bpf-ct-global-any-max='262144'" subsys=daemon
level=info msg="  --bpf-ct-global-tcp-max='524288'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-any='1m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-tcp='6h0m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-tcp-fin='10s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-regular-tcp-syn='1m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-service-any='1m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-service-tcp='6h0m0s'" subsys=daemon
level=info msg="  --bpf-ct-timeout-service-tcp-grace='1m0s'" subsys=daemon
level=info msg="  --bpf-filter-priority='1'" subsys=daemon
level=info msg="  --bpf-fragments-map-max='8192'" subsys=daemon
level=info msg="  --bpf-lb-acceleration='disabled'" subsys=daemon
level=info msg="  --bpf-lb-affinity-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-algorithm='random'" subsys=daemon
level=info msg="  --bpf-lb-dev-ip-addr-inherit=''" subsys=daemon
level=info msg="  --bpf-lb-dsr-dispatch='opt'" subsys=daemon
level=info msg="  --bpf-lb-dsr-l4-xlate='frontend'" subsys=daemon
level=info msg="  --bpf-lb-external-clusterip='false'" subsys=daemon
level=info msg="  --bpf-lb-maglev-hash-seed='JLfvgnHc2kaSUFaI'" subsys=daemon
level=info msg="  --bpf-lb-maglev-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-maglev-table-size='16381'" subsys=daemon
level=info msg="  --bpf-lb-map-max='65536'" subsys=daemon
level=info msg="  --bpf-lb-mode='snat'" subsys=daemon
level=info msg="  --bpf-lb-rev-nat-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-rss-ipv4-src-cidr=''" subsys=daemon
level=info msg="  --bpf-lb-rss-ipv6-src-cidr=''" subsys=daemon
level=info msg="  --bpf-lb-service-backend-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-service-map-max='0'" subsys=daemon
level=info msg="  --bpf-lb-sock='false'" subsys=daemon
level=info msg="  --bpf-lb-sock-hostns-only='false'" subsys=daemon
level=info msg="  --bpf-lb-source-range-map-max='0'" subsys=daemon
level=info msg="  --bpf-map-dynamic-size-ratio='0.0025'" subsys=daemon
level=info msg="  --bpf-nat-global-max='524288'" subsys=daemon
level=info msg="  --bpf-neigh-global-max='524288'" subsys=daemon
level=info msg="  --bpf-policy-map-max='16384'" subsys=daemon
level=info msg="  --bpf-root='/sys/fs/bpf'" subsys=daemon
level=info msg="  --bpf-sock-rev-map-max='262144'" subsys=daemon
level=info msg="  --bypass-ip-availability-upon-restore='false'" subsys=daemon
level=info msg="  --certificates-directory='/var/run/cilium/certs'" subsys=daemon
level=info msg="  --cflags=''" subsys=daemon
level=info msg="  --cgroup-root='/run/cilium/cgroupv2'" subsys=daemon
level=info msg="  --cluster-health-port='4240'" subsys=daemon
level=info msg="  --cluster-id='0'" subsys=daemon
level=info msg="  --cluster-name='default'" subsys=daemon
level=info msg="  --clustermesh-config='/var/lib/cilium/clustermesh/'" subsys=daemon
level=info msg="  --cmdref=''" subsys=daemon
level=info msg="  --cni-chaining-mode=''" subsys=daemon
level=info msg="  --config=''" subsys=daemon
level=info msg="  --config-dir='/tmp/cilium/config-map'" subsys=daemon
level=info msg="  --conntrack-gc-interval='0s'" subsys=daemon
level=info msg="  --crd-wait-timeout='5m0s'" subsys=daemon
level=info msg="  --datapath-mode='veth'" subsys=daemon
level=info msg="  --debug='false'" subsys=daemon
level=info msg="  --debug-verbose=''" subsys=daemon
level=info msg="  --derive-masquerade-ip-addr-from-device=''" subsys=daemon
level=info msg="  --devices=''" subsys=daemon
level=info msg="  --direct-routing-device=''" subsys=daemon
level=info msg="  --disable-cnp-status-updates='true'" subsys=daemon
level=info msg="  --disable-conntrack='false'" subsys=daemon
level=info msg="  --disable-endpoint-crd='false'" subsys=daemon
level=info msg="  --disable-envoy-version-check='false'" subsys=daemon
level=info msg="  --disable-iptables-feeder-rules=''" subsys=daemon
level=info msg="  --dns-max-ips-per-restored-rule='1000'" subsys=daemon
level=info msg="  --dns-policy-unload-on-shutdown='false'" subsys=daemon
level=info msg="  --dnsproxy-concurrency-limit='0'" subsys=daemon
level=info msg="  --dnsproxy-concurrency-processing-grace-period='0s'" subsys=daemon
level=info msg="  --egress-masquerade-interfaces=''" subsys=daemon
level=info msg="  --egress-multi-home-ip-rule-compat='false'" subsys=daemon
level=info msg="  --enable-auto-protect-node-port-range='true'" subsys=daemon
level=info msg="  --enable-bandwidth-manager='false'" subsys=daemon
level=info msg="  --enable-bbr='false'" subsys=daemon
level=info msg="  --enable-bgp-control-plane='false'" subsys=daemon
level=info msg="  --enable-bpf-clock-probe='true'" subsys=daemon
level=info msg="  --enable-bpf-masquerade='false'" subsys=daemon
level=info msg="  --enable-bpf-tproxy='false'" subsys=daemon
level=info msg="  --enable-cilium-endpoint-slice='false'" subsys=daemon
level=info msg="  --enable-custom-calls='false'" subsys=daemon
level=info msg="  --enable-endpoint-health-checking='true'" subsys=daemon
level=info msg="  --enable-endpoint-routes='false'" subsys=daemon
level=info msg="  --enable-envoy-config='false'" subsys=daemon
level=info msg="  --enable-external-ips='true'" subsys=daemon
level=info msg="  --enable-health-check-nodeport='true'" subsys=daemon
level=info msg="  --enable-health-checking='true'" subsys=daemon
level=info msg="  --enable-host-firewall='false'" subsys=daemon
level=info msg="  --enable-host-legacy-routing='false'" subsys=daemon
level=info msg="  --enable-host-port='true'" subsys=daemon
level=info msg="  --enable-host-reachable-services='false'" subsys=daemon
level=info msg="  --enable-hubble='true'" subsys=daemon
level=info msg="  --enable-hubble-recorder-api='true'" subsys=daemon
level=info msg="  --enable-icmp-rules='true'" subsys=daemon
level=info msg="  --enable-identity-mark='true'" subsys=daemon
level=info msg="  --enable-ip-masq-agent='false'" subsys=daemon
level=info msg="  --enable-ipsec='false'" subsys=daemon
level=info msg="  --enable-ipv4='true'" subsys=daemon
level=info msg="  --enable-ipv4-egress-gateway='false'" subsys=daemon
level=info msg="  --enable-ipv4-fragment-tracking='true'" subsys=daemon
level=info msg="  --enable-ipv4-masquerade='true'" subsys=daemon
level=info msg="  --enable-ipv6='false'" subsys=daemon
level=info msg="  --enable-ipv6-masquerade='true'" subsys=daemon
level=info msg="  --enable-ipv6-ndp='false'" subsys=daemon
level=info msg="  --enable-k8s-api-discovery='false'" subsys=daemon
level=info msg="  --enable-k8s-endpoint-slice='true'" subsys=daemon
level=info msg="  --enable-k8s-event-handover='false'" subsys=daemon
level=info msg="  --enable-k8s-terminating-endpoint='true'" subsys=daemon
level=info msg="  --enable-l2-neigh-discovery='true'" subsys=daemon
level=info msg="  --enable-l7-proxy='true'" subsys=daemon
level=info msg="  --enable-local-node-route='true'" subsys=daemon
level=info msg="  --enable-local-redirect-policy='false'" subsys=daemon
level=info msg="  --enable-mke='false'" subsys=daemon
level=info msg="  --enable-monitor='true'" subsys=daemon
level=info msg="  --enable-node-port='false'" subsys=daemon
level=info msg="  --enable-policy='default'" subsys=daemon
level=info msg="  --enable-recorder='false'" subsys=daemon
level=info msg="  --enable-remote-node-identity='true'" subsys=daemon
level=info msg="  --enable-runtime-device-detection='false'" subsys=daemon
level=info msg="  --enable-selective-regeneration='true'" subsys=daemon
level=info msg="  --enable-service-topology='false'" subsys=daemon
level=info msg="  --enable-session-affinity='false'" subsys=daemon
level=info msg="  --enable-stale-cilium-endpoint-cleanup='true'" subsys=daemon
level=info msg="  --enable-svc-source-range-check='true'" subsys=daemon
level=info msg="  --enable-tracing='false'" subsys=daemon
level=info msg="  --enable-unreachable-routes='false'" subsys=daemon
level=info msg="  --enable-vtep='false'" subsys=daemon
level=info msg="  --enable-well-known-identities='false'" subsys=daemon
level=info msg="  --enable-wireguard='false'" subsys=daemon
level=info msg="  --enable-wireguard-userspace-fallback='false'" subsys=daemon
level=info msg="  --enable-xdp-prefilter='false'" subsys=daemon
level=info msg="  --enable-xt-socket-fallback='true'" subsys=daemon
level=info msg="  --encrypt-interface=''" subsys=daemon
level=info msg="  --encrypt-node='false'" subsys=daemon
level=info msg="  --endpoint-gc-interval='5m0s'" subsys=daemon
level=info msg="  --endpoint-interface-name-prefix=''" subsys=daemon
level=info msg="  --endpoint-queue-size='25'" subsys=daemon
level=info msg="  --endpoint-status=''" subsys=daemon
level=info msg="  --envoy-config-timeout='2m0s'" subsys=daemon
level=info msg="  --envoy-log=''" subsys=daemon
level=info msg="  --exclude-local-address=''" subsys=daemon
level=info msg="  --fixed-identity-mapping=''" subsys=daemon
level=info msg="  --force-local-policy-eval-at-source='true'" subsys=daemon
level=info msg="  --fqdn-regex-compile-lru-size='1024'" subsys=daemon
level=info msg="  --gops-port='9890'" subsys=daemon
level=info msg="  --host-reachable-services-protos='tcp,udp'" subsys=daemon
level=info msg="  --http-403-msg=''" subsys=daemon
level=info msg="  --http-idle-timeout='0'" subsys=daemon
level=info msg="  --http-max-grpc-timeout='0'" subsys=daemon
level=info msg="  --http-normalize-path='true'" subsys=daemon
level=info msg="  --http-request-timeout='3600'" subsys=daemon
level=info msg="  --http-retry-count='3'" subsys=daemon
level=info msg="  --http-retry-timeout='0'" subsys=daemon
level=info msg="  --hubble-disable-tls='false'" subsys=daemon
level=info msg="  --hubble-event-buffer-capacity='4095'" subsys=daemon
level=info msg="  --hubble-event-queue-size='0'" subsys=daemon
level=info msg="  --hubble-export-file-compress='false'" subsys=daemon
level=info msg="  --hubble-export-file-max-backups='5'" subsys=daemon
level=info msg="  --hubble-export-file-max-size-mb='10'" subsys=daemon
level=info msg="  --hubble-export-file-path=''" subsys=daemon
level=info msg="  --hubble-listen-address=':4244'" subsys=daemon
level=info msg="  --hubble-metrics=''" subsys=daemon
level=info msg="  --hubble-metrics-server=''" subsys=daemon
level=info msg="  --hubble-recorder-sink-queue-size='1024'" subsys=daemon
level=info msg="  --hubble-recorder-storage-path='/var/run/cilium/pcaps'" subsys=daemon
level=info msg="  --hubble-socket-path='/var/run/cilium/hubble.sock'" subsys=daemon
level=info msg="  --hubble-tls-cert-file='/var/lib/cilium/tls/hubble/server.crt'" subsys=daemon
level=info msg="  --hubble-tls-client-ca-files='/var/lib/cilium/tls/hubble/client-ca.crt'" subsys=daemon
level=info msg="  --hubble-tls-key-file='/var/lib/cilium/tls/hubble/server.key'" subsys=daemon
level=info msg="  --identity-allocation-mode='crd'" subsys=daemon
level=info msg="  --identity-change-grace-period='5s'" subsys=daemon
level=info msg="  --identity-restore-grace-period='10m0s'" subsys=daemon
level=info msg="  --install-egress-gateway-routes='false'" subsys=daemon
level=info msg="  --install-iptables-rules='true'" subsys=daemon
level=info msg="  --install-no-conntrack-iptables-rules='false'" subsys=daemon
level=info msg="  --ip-allocation-timeout='2m0s'" subsys=daemon
level=info msg="  --ip-masq-agent-config-path='/etc/config/ip-masq-agent'" subsys=daemon
level=info msg="  --ipam='cluster-pool'" subsys=daemon
level=info msg="  --ipsec-key-file=''" subsys=daemon
level=info msg="  --iptables-lock-timeout='5s'" subsys=daemon
level=info msg="  --iptables-random-fully='false'" subsys=daemon
level=info msg="  --ipv4-native-routing-cidr=''" subsys=daemon
level=info msg="  --ipv4-node='auto'" subsys=daemon
level=info msg="  --ipv4-pod-subnets=''" subsys=daemon
level=info msg="  --ipv4-range='auto'" subsys=daemon
level=info msg="  --ipv4-service-loopback-address='169.254.42.1'" subsys=daemon
level=info msg="  --ipv4-service-range='auto'" subsys=daemon
level=info msg="  --ipv6-cluster-alloc-cidr='f00d::/64'" subsys=daemon
level=info msg="  --ipv6-mcast-device=''" subsys=daemon
level=info msg="  --ipv6-native-routing-cidr=''" subsys=daemon
level=info msg="  --ipv6-node='auto'" subsys=daemon
level=info msg="  --ipv6-pod-subnets=''" subsys=daemon
level=info msg="  --ipv6-range='auto'" subsys=daemon
level=info msg="  --ipv6-service-range='auto'" subsys=daemon
level=info msg="  --join-cluster='false'" subsys=daemon
level=info msg="  --k8s-api-server=''" subsys=daemon
level=info msg="  --k8s-heartbeat-timeout='30s'" subsys=daemon
level=info msg="  --k8s-kubeconfig-path=''" subsys=daemon
level=info msg="  --k8s-namespace='kube-system'" subsys=daemon
level=info msg="  --k8s-require-ipv4-pod-cidr='false'" subsys=daemon
level=info msg="  --k8s-require-ipv6-pod-cidr='false'" subsys=daemon
level=info msg="  --k8s-service-cache-size='128'" subsys=daemon
level=info msg="  --k8s-service-proxy-name=''" subsys=daemon
level=info msg="  --k8s-sync-timeout='3m0s'" subsys=daemon
level=info msg="  --k8s-watcher-endpoint-selector='metadata.name!=kube-scheduler,metadata.name!=kube-controller-manager,metadata.name!=etcd-operator,metadata.name!=gcp-controller-manager'" subsys=daemon
level=info msg="  --keep-config='false'" subsys=daemon
level=info msg="  --kube-proxy-replacement='disabled'" subsys=daemon
level=info msg="  --kube-proxy-replacement-healthz-bind-address=''" subsys=daemon
level=info msg="  --kvstore=''" subsys=daemon
level=info msg="  --kvstore-connectivity-timeout='2m0s'" subsys=daemon
level=info msg="  --kvstore-lease-ttl='15m0s'" subsys=daemon
level=info msg="  --kvstore-max-consecutive-quorum-errors='2'" subsys=daemon
level=info msg="  --kvstore-opt=''" subsys=daemon
level=info msg="  --kvstore-periodic-sync='5m0s'" subsys=daemon
level=info msg="  --label-prefix-file=''" subsys=daemon
level=info msg="  --labels=''" subsys=daemon
level=info msg="  --lib-dir='/var/lib/cilium'" subsys=daemon
level=info msg="  --local-max-addr-scope='252'" subsys=daemon
level=info msg="  --local-router-ipv4=''" subsys=daemon
level=info msg="  --local-router-ipv6=''" subsys=daemon
level=info msg="  --log-driver=''" subsys=daemon
level=info msg="  --log-opt=''" subsys=daemon
level=info msg="  --log-system-load='false'" subsys=daemon
level=info msg="  --max-controller-interval='0'" subsys=daemon
level=info msg="  --metrics=''" subsys=daemon
level=info msg="  --mke-cgroup-mount=''" subsys=daemon
level=info msg="  --monitor-aggregation='medium'" subsys=daemon
level=info msg="  --monitor-aggregation-flags='all'" subsys=daemon
level=info msg="  --monitor-aggregation-interval='5s'" subsys=daemon
level=info msg="  --monitor-queue-size='0'" subsys=daemon
level=info msg="  --mtu='0'" subsys=daemon
level=info msg="  --node-port-acceleration='disabled'" subsys=daemon
level=info msg="  --node-port-algorithm='random'" subsys=daemon
level=info msg="  --node-port-bind-protection='true'" subsys=daemon
level=info msg="  --node-port-mode='snat'" subsys=daemon
level=info msg="  --node-port-range='30000,32767'" subsys=daemon
level=info msg="  --policy-audit-mode='false'" subsys=daemon
level=info msg="  --policy-queue-size='100'" subsys=daemon
level=info msg="  --policy-trigger-interval='1s'" subsys=daemon
level=info msg="  --pprof='false'" subsys=daemon
level=info msg="  --pprof-address='localhost'" subsys=daemon
level=info msg="  --pprof-port='6060'" subsys=daemon
level=info msg="  --preallocate-bpf-maps='false'" subsys=daemon
level=info msg="  --prepend-iptables-chains='true'" subsys=daemon
level=info msg="  --procfs='/host/proc'" subsys=daemon
level=info msg="  --prometheus-serve-addr=':9962'" subsys=daemon
level=info msg="  --proxy-connect-timeout='1'" subsys=daemon
level=info msg="  --proxy-gid='1337'" subsys=daemon
level=info msg="  --proxy-max-connection-duration-seconds='0'" subsys=daemon
level=info msg="  --proxy-max-requests-per-connection='0'" subsys=daemon
level=info msg="  --proxy-prometheus-port='0'" subsys=daemon
level=info msg="  --read-cni-conf=''" subsys=daemon
level=info msg="  --restore='true'" subsys=daemon
level=info msg="  --route-metric='0'" subsys=daemon
level=info msg="  --sidecar-istio-proxy-image='cilium/istio_proxy'" subsys=daemon
level=info msg="  --single-cluster-route='false'" subsys=daemon
level=info msg="  --socket-path='/var/run/cilium/cilium.sock'" subsys=daemon
level=info msg="  --sockops-enable='false'" subsys=daemon
level=info msg="  --state-dir='/var/run/cilium'" subsys=daemon
level=info msg="  --tofqdns-dns-reject-response-code='refused'" subsys=daemon
level=info msg="  --tofqdns-enable-dns-compression='true'" subsys=daemon
level=info msg="  --tofqdns-endpoint-max-ip-per-hostname='50'" subsys=daemon
level=info msg="  --tofqdns-idle-connection-grace-period='0s'" subsys=daemon
level=info msg="  --tofqdns-max-deferred-connection-deletes='10000'" subsys=daemon
level=info msg="  --tofqdns-min-ttl='3600'" subsys=daemon
level=info msg="  --tofqdns-pre-cache=''" subsys=daemon
level=info msg="  --tofqdns-proxy-port='0'" subsys=daemon
level=info msg="  --tofqdns-proxy-response-max-delay='100ms'" subsys=daemon
level=info msg="  --trace-payloadlen='128'" subsys=daemon
level=info msg="  --tunnel='vxlan'" subsys=daemon
level=info msg="  --tunnel-port='0'" subsys=daemon
level=info msg="  --version='false'" subsys=daemon
level=info msg="  --vlan-bpf-bypass=''" subsys=daemon
level=info msg="  --vtep-cidr=''" subsys=daemon
level=info msg="  --vtep-endpoint=''" subsys=daemon
level=info msg="  --vtep-mac=''" subsys=daemon
level=info msg="  --vtep-mask=''" subsys=daemon
level=info msg="  --write-cni-conf-when-ready=''" subsys=daemon
level=info msg="     _ _ _" subsys=daemon
level=info msg=" ___|_| |_|_ _ _____" subsys=daemon
level=info msg="|  _| | | | | |     |" subsys=daemon
level=info msg="|___|_|_|_|___|_|_|_|" subsys=daemon
level=info msg="Cilium 1.12.6 9cc8d71 2023-01-26T09:40:50+00:00 go version go1.18.10 linux/arm64" subsys=daemon
level=fatal msg="Envoy: Binary \"cilium-envoy\" cannot be executed" error="signal: aborted (core dumped)" subsys=envoy-manager

Anything else?

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@fmunteanu fmunteanu added kind/bug This is a bug in the Cilium logic. kind/community-report This was reported by a user in the Cilium community, eg via Slack. needs/triage This issue requires triaging to establish severity and next steps. labels Feb 8, 2023
@fmunteanu fmunteanu changed the title Error deploying cilium Error deploying Cilium: Binary "cilium-envoy" cannot be executed Feb 8, 2023
@pchaigno
Member

pchaigno commented Feb 8, 2023

If you don't need L7 policies and L7 visibility, the easiest is probably to set enable-l7-proxy=false.

@pchaigno pchaigno added area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. kind/question Frequently asked questions & answers. This issue will be linked from the documentation's FAQ. and removed needs/triage This issue requires triaging to establish severity and next steps. kind/bug This is a bug in the Cilium logic. labels Feb 8, 2023
@fmunteanu
Author

fmunteanu commented Feb 8, 2023

@pchaigno I apologize for my lack of knowledge; is this done with --set l7Proxy=false during the Helm chart install? Looking at the egress-gateway example:

$ helm upgrade cilium cilium/cilium --version 1.12.6 \
   --namespace kube-system \
   --reuse-values \
   --set egressGateway.enabled=true \
   --set bpf.masquerade=true \
   --set kubeProxyReplacement=strict \
   --set l7Proxy=false

@pchaigno
Member

pchaigno commented Feb 8, 2023

Yes. That will disable the L7 proxy and, I believe, remove the need for the cilium-envoy binary. It may also be possible to use a different cilium-envoy build for aarch64, but I'm not familiar with how.
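One way to verify the change took effect after the Helm upgrade (assuming the chart renders the l7Proxy value into the enable-l7-proxy key of the cilium-config ConfigMap):

kubectl -n kube-system get configmap cilium-config -o yaml | grep enable-l7-proxy
kubectl -n kube-system rollout restart daemonset/cilium   # agents pick up the new config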

@fmunteanu
Author

fmunteanu commented Feb 8, 2023

Adding --set l7Proxy=false fixed the deployment issues. I was wondering if anyone can provide some guidance on how to install the cilium-envoy arm64 binary?

Cilium status:

# cilium status
    /¯¯\
 /¯¯\__/¯¯\    Cilium:         OK
 \__/¯¯\__/    Operator:       OK
 /¯¯\__/¯¯\    Hubble:         OK
 \__/¯¯\__/    ClusterMesh:    disabled
    \__/

Deployment        hubble-ui          Desired: 1, Ready: 1/1, Available: 1/1
Deployment        hubble-relay       Desired: 1, Ready: 1/1, Available: 1/1
DaemonSet         cilium             Desired: 8, Ready: 8/8, Available: 8/8
Deployment        cilium-operator    Desired: 1, Ready: 1/1, Available: 1/1
Containers:       cilium             Running: 8
                  hubble-ui          Running: 1
                  hubble-relay       Running: 1
                  cilium-operator    Running: 1
Cluster Pods:     6/6 managed by Cilium
Image versions    cilium             quay.io/cilium/cilium:v1.12.6@sha256:454134506b0448c756398d3e8df68d474acde2a622ab58d0c7e8b272b5867d0d: 8
                  hubble-ui          quay.io/cilium/hubble-ui:v0.9.2@sha256:d3596efc94a41c6b772b9afe6fe47c17417658956e04c3e2a28d293f2670663e: 1
                  hubble-ui          quay.io/cilium/hubble-ui-backend:v0.9.2@sha256:a3ac4d5b87889c9f7cc6323e86d3126b0d382933bd64f44382a92778b0cde5d7: 1
                  hubble-relay       quay.io/cilium/hubble-relay:v1.12.6@sha256:27a68a16f0ee7ed6ba690e91847de6931a5511f85a7f939320df216486764cb9: 1
                  cilium-operator    quay.io/cilium/operator-generic:v1.12.6@sha256:eec4430d222cb2967d42d3b404d2606e66468de47ae85e0a3ca3f58f00a5e017: 1

@sayboras
Member

sayboras commented Feb 9, 2023

For Raspberry Pi, you might need to install one additional package, as per the link below. I understand you are not running Ubuntu, but there's no harm in giving it a crack and seeing how it goes.

https://docs.cilium.io/en/latest/operations/system_requirements/#ubuntu-22-04-on-raspberry-pi
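On Ubuntu that page amounts to installing the extra kernel-modules package and rebooting, roughly:

sudo apt-get install linux-modules-extra-raspi
sudo reboot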

@fmunteanu
Author

The linux-modules-extra-raspi package is specific to Ubuntu; it is not available on RaspiOS Lite.

@sayboras
Member

sayboras commented Feb 9, 2023

What I meant is to check whether there is an equivalent package for RaspiOS Lite.

The docs above were added recently for the same issue when running the proxy on Raspberry Pi.

@fmunteanu
Author

The thing is, I do not know which kernel-related packages need to be installed in order to use the L7 proxy and the cilium-envoy arm64 binary. I was wondering if anyone has a solution for this. Searching for linux-modules returns no results in the list of packages, and looking at the Ubuntu package contents to see which kernel modules are required did not turn up any references either.

@draeron

draeron commented Feb 17, 2023

The file list for the linux-modules-extra package is here; I don't see anything related to eBPF:

https://packages.ubuntu.com/jammy/arm64/linux-modules-extra-5.15.0-1005-raspi/filelist

@jDmacD

jDmacD commented Feb 19, 2023

Any more info on this? I'm trying to run Cilium on the most recent version of DietPi (a Raspberry Pi OS derivative). Disabling the L7 proxy breaks Ingress, among other things.

@jDmacD

jDmacD commented Feb 20, 2023

I've made a bit of progress on this, I think. I rebuilt the RPi kernel with the following options:

CONFIG_ARM64_VA_BITS_39=n 
CONFIG_ARM64_VA_BITS_48=y 
CONFIG_ARM64_VA_BITS=48 
CONFIG_PGTABLE_LEVELS=4

Envoy actually starts now, but I've broken networking in the process.
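Before rebuilding, a quick way to check what the running kernel was built with (assuming the config is exposed via /proc/config.gz or shipped under /boot, which is not the case on every image):

zcat /proc/config.gz | grep -E 'ARM64_VA_BITS|PGTABLE_LEVELS'
grep -E 'ARM64_VA_BITS|PGTABLE_LEVELS' /boot/config-"$(uname -r)"   # if shipped by the distro
getconf PAGE_SIZE   # page size is a separate, related setting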

@draeron

draeron commented Feb 20, 2023

Any more info on this? I'm trying to run Cilium on the most recent version of DietPi (a Raspberry Pi OS derivative). Disabling the L7 proxy breaks Ingress, among other things.

Even with l7Proxy disabled, when I tried on DietPi the cilium pod did something that affects or disables the host network stack in a way where connecting to the device is no longer possible (say, through SSH). Since it's a headless setup, I can't really figure out what's happening, and I need to reboot the node to connect to it and disable Cilium before the pod starts. After 3-4 hours I more or less gave up on running Cilium on Raspberry Pi hardware.

@fmunteanu
Author

fmunteanu commented Feb 20, 2023

I have a fully functional cluster of 3 control planes and 5 nodes, with Cilium, MetalLB, Longhorn and kube-prometheus-stack installed. Cilium works with the L7 proxy disabled; otherwise I would not be able to deploy the Prometheus stack and the rest of the pending pods, because of the node-taint: node.cilium.io/agent-not-ready:NoExecute.

[screenshot attached]

I stopped using DietPi a while ago and switched back to the default arm64 RaspiOS, because I did not see any benefit in running it, especially now with hardware dedicated to Kubernetes. I have a full Ansible deployment for the cluster that I plan to make public, but I'm waiting on the Cilium devs to release a fix.

@jDmacD

jDmacD commented Feb 20, 2023

@fmunteanu Here is a gist I created with steps for compiling the Pi OS kernel so that it works with Envoy and the Cilium L7 proxy.
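For readers who cannot reach the gist, such a rebuild boils down to roughly the following. This is a sketch only; the gist and the official Raspberry Pi kernel-build docs are authoritative, and the defconfig target and boot paths depend on the Pi model and OS release:

sudo apt-get install -y git bc bison flex libssl-dev make gcc
git clone --depth=1 https://github.com/raspberrypi/linux
cd linux
make bcm2711_defconfig   # Pi 4 defconfig
# switch from 39-bit to 48-bit virtual addresses (the options quoted earlier)
./scripts/config --disable ARM64_VA_BITS_39 --enable ARM64_VA_BITS_48
make -j"$(nproc)" Image modules dtbs
sudo make modules_install
sudo cp arch/arm64/boot/Image /boot/kernel8.img   # /boot/firmware/ on newer releases
sudo reboot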

@fmunteanu
Author

fmunteanu commented Feb 20, 2023

@jDmacD while this is a temporary solution, it is not the proper way to go. The existing arm64 kernel should be supported by Cilium directly. Imagine the entire RaspiOS user base needing to recompile the kernel because Cilium has specific requirements. That's not realistic, and it is the response I got so far from the RaspiOS devs in raspberrypi/linux#5354, which is fair.

Every time there is a kernel update, users would have to repeat that kernel compile step. Kernel changes should not be promoted as the solution; Cilium needs to adapt its software to the existing RaspiOS kernel.

I also do not understand why none of the Cilium devs have voiced their opinion here.

@fmunteanu
Author

Any updates on this issue?

@fmunteanu
Author

fmunteanu commented Mar 8, 2023

@joestringer the RaspiOS kernel devs mentioned it should be very easy for the Cilium devs to address the issue. Is it possible to get some clarification on your side? Cilium expects a larger virtual address space: the 48-bit addressing it requires is set to 39 bits in the Raspberry Pi kernel. It should not be difficult at all for the Cilium devs to address this issue.

@joestringer
Member

I can't speak for the difficulty as I have no background on ARM64 platform differences. If it's an easy fix, then great, we would welcome PR proposals to address the issue. I think it's fair to say that upstream Cilium devs would love for Cilium to be more compatible with specific OSes like this, but we rely on the community to report problems and propose solutions in order to make Cilium work better for everyone.

@fmunteanu
Author

@joestringer do you know which Cilium dev could look into this? A simple collaboration with the RaspiOS kernel devs should get you all the answers. Let's get some traction on this please; your help will be much appreciated by the community. All the information required to address the problem is posted in this issue, as well as in raspberrypi/linux#5354.

@fmunteanu
Author

I think a good first step is to change the issue labels, so Cilium devs can notice it.

rpardini added a series of commits to rpardini/armbian-build (Jan 12-16, 2024) and to armbian/build (Jan 17, 2024) that referenced this issue, each carrying the same kernel-config change:

- for tcmalloc (enjoy, cilium, etc) stuff cilium/cilium#23640
- also: `CONFIG_BPF_JIT=y`, `CONFIG_FTRACE_SYSCALLS=y`, `CONFIG_BPF_KPROBE_OVERRIDE=y`
- this commit should contain no DEBUG/BTF changes