Hubble Relay: Failed to create peer client for peers synchronization #20130
Thanks for the report.
# curl hubble-peer.kube-system.svc.cluster.local:4254
<!doctype html><html><head><meta charset="utf-8"/><title>Hubble UI</title><meta http-equiv="X-UA-Compatible" content="IE=edge"/><meta name="viewport" content="width=device-width,user-scalable=0,initial-scale=1,minimum-scale=1,maximum-scale=1"/><link rel="icon" type="image/png" sizes="32x32" href="favicon-32x32.png"/><link rel="icon" type="image/png" sizes="16x16" href="favicon-16x16.png"/><link rel="shortcut icon" href="favicon.ico"/><script defer="defer" src="/bundle.main.3b2369adf2e0c02229aa.js"></script><link href="/bundle.main.9cd671817b2cf4a1a838.css" rel="stylesheet"></head><body><div id="app"></div></body></html>
This looks off to me. The peer service should provide a gRPC interface implemented by the cilium-agent pods, not the Hubble UI frontend. Are you able to provide a sysdump? https://docs.cilium.io/en/v1.11/operations/troubleshooting/#automatic-log-state-collection |
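For anyone wanting to repeat the check above without reading raw curl output: here is a minimal Python sketch that sends a plain HTTP/1.1 request to an endpoint and classifies the first bytes of the reply. The function names `classify_response` and `probe` are made up for this illustration and are not part of Cilium or Hubble. The assumption is only that a web frontend answers with an HTTP/1.x status line or an HTML page, while a gRPC endpoint would not.

```python
import socket


def classify_response(first_bytes: bytes) -> str:
    """Guess what kind of server produced the first bytes of a reply."""
    if first_bytes.startswith(b"HTTP/1."):
        return "http1"  # a plain web server, for example a UI frontend
    lowered = first_bytes.lstrip().lower()
    if lowered.startswith(b"<!doctype html") or lowered.startswith(b"<html"):
        return "html"   # an HTML page, as in the curl output above
    if not first_bytes:
        return "empty"  # connection closed without sending data
    return "other"      # something else, not a web page


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Send a minimal HTTP/1.1 request and classify whatever comes back."""
    request = (
        f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    ).encode()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request)
        try:
            data = sock.recv(256)
        except (socket.timeout, ConnectionResetError):
            data = b""
    return classify_response(data)
```

A result of "http1" or "html" from `probe("hubble-peer.kube-system.svc.cluster.local", <port>)` would suggest that the service points at a web frontend rather than at the cilium-agent pods.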
I've got the same issue after installing Cilium with Hubble on k0s.
The cluster started successfully and networking works fine, but hubble-relay has issues getting peer information:
Here is my sysdump |
I also encountered this problem. But after reinstalling Cilium and Hubble using helm only, Hubble seems to work again. I'm guessing that my mix of cilium-cli and helm broke Hubble: cilium/hubble#599 (comment). Here is the command I used to install Cilium using cilium-cli (which caused the problem):
using helm (works)
|
I faced the same problem.
In my case, I use cert-manager to issue certificates for Hubble and the CA certificate expired.
I solved this problem by reissuing all certificates related to Hubble, including the CA certificate, after setting a longer expiration date for the CA certificate as follows:
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cilium-selfsigned
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: cilium-selfsigned-ca
spec:
  isCA: true
  commonName: cilium-selfsigned-ca
  duration: 438000h # 50y
  secretName: cilium-selfsigned-ca
  privateKey:
    algorithm: ECDSA
    size: 256
  issuerRef:
    name: cilium-selfsigned
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: cilium-selfsigned-ca
spec:
  ca:
    secretName: cilium-selfsigned-ca
|
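To check whether an expired certificate is the cause before reissuing anything, a small sketch like the following may help. It assumes you have already extracted the certificate from its secret and printed its end date with `openssl x509 -noout -enddate`, which prints a line of the form `notAfter=Jun  1 12:00:00 2030 GMT`. The helper name `seconds_until_expiry` is made up for this illustration.

```python
import ssl
import time
from typing import Optional

# 50 years of 365 days, expressed in hours; matches "duration: 438000h # 50y".
HOURS_50Y = 50 * 365 * 24


def seconds_until_expiry(not_after_line: str, now: Optional[float] = None) -> float:
    """Return seconds until the notAfter time; negative means already expired.

    `not_after_line` is the text printed by `openssl x509 -noout -enddate`.
    """
    value = not_after_line.strip()
    if value.startswith("notAfter="):
        value = value[len("notAfter="):]
    # ssl.cert_time_to_seconds parses the "Mon DD HH:MM:SS YYYY GMT" format.
    expires_at = ssl.cert_time_to_seconds(value)
    current = time.time() if now is None else now
    return expires_at - current
```

A negative result for the CA or for any of the Hubble certificates would point to the expiry problem described above.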
I faced the same problem. In my case, the cluster domain was not the default "cluster.local", so relay did not find the peer service. I solved it by setting the helm value:
|
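To see which cluster domain a cluster actually uses, one option is to look at the `search` line of `/etc/resolv.conf` inside any pod. The sketch below assumes the usual Kubernetes layout, where one search entry has the form `svc.<cluster domain>`, and builds the peer service name from it. Both function names are made up for this illustration. The service name and namespace are taken from the address quoted earlier in this thread.

```python
from typing import Optional


def cluster_domain_from_resolv_conf(text: str) -> Optional[str]:
    """Extract the cluster domain from the 'search' line of a pod's resolv.conf."""
    for line in text.splitlines():
        parts = line.split()
        if parts and parts[0] == "search":
            for entry in parts[1:]:
                # Assumption: one search entry looks like "svc.<cluster domain>".
                if entry.startswith("svc."):
                    return entry[len("svc."):].rstrip(".")
    return None


def peer_service_fqdn(cluster_domain: str = "cluster.local",
                      namespace: str = "kube-system",
                      service: str = "hubble-peer") -> str:
    """Build the fully qualified name relay would need to resolve."""
    return f"{service}.{namespace}.svc.{cluster_domain.strip('.')}"
```

If the domain found this way is not "cluster.local", the name relay tries to resolve by default will not exist in the cluster DNS.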
For me it was a hubble-relay certificate issue. I was using cert-manager with a self-issued certificate, and it did not allow the connection to hubble-peer.kube-system.svc.cluster.local |
I guess this problem #20130 (comment) reported by @shlande was caused by this cilium/cilium-cli#1347. |
We install cilium via helm with
The Hubble web UI opens and lists namespaces. Clicking one gives the above error. Cilium is
Hubble relay complains about:
We have been using Cilium for about 2.5 years and have not managed to get Hubble running in all that time. To be clear, using helm with "--set hubble.enabled=true --set hubble.relay.enabled=true --set hubble.ui.enabled=true" works. But if you start with "--set hubble.enabled=false", there is no known way to enable it later on. Even if you start with the default "hubble.enabled=true" and then run
It is totally fine if developers need to run some extra command to make it work, but not having this feature working at all is really sad. To maybe resolve this, could the documentation explain why one would disable Hubble and, if one keeps it enabled, what the overhead is? |
@ensonic It is currently not possible to mix Cilium CLI and Helm install methods. We understand this is not ideal and creates issues for users. This is something we will address (see cilium/cilium-cli#1396).
I don't think we have precise numbers about the overhead of running Hubble. However, we will be working on optimizing Hubble to reduce the overhead even more and possibly create new modes where e.g. Hubble doesn't do anything unless a client runs a query. |
If you're here because you've seen the following Hubble Relay error in the logs
Please check the list below for common root causes of this issue:
If you have checked all of the above and still hit this problem, please open a new issue so we can investigate the problem. |
I'd like to add a point to the above list of things to check while troubleshooting, which I think may not be immediately obvious when setting up Cilium for the first time. I have a restrictive firewall on my node machines. The hubble relay deployment attempts to connect to the service
Hopefully this helps! |
This is indeed the reason. We use a hardened AMI for our clusters, and as the cilium pods run on the host network, we had to open peer port 4244 to resolve this issue. |
This finally helped to solve the issue. Most images provided directly by cloud vendors are hardened with basic iptables rules that block this port. |
Glad to hear this was resolved by opening the port. For future reference, the necessary ports are documented here: https://docs.cilium.io/en/stable/operations/system_requirements/#firewall-requirements |
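A quick way to test whether a host firewall is in the way is to try a plain TCP connection to port 4244 on each node, from another node or from a pod. The following is only a sketch: `port_reachable` is a name made up for this illustration, and the node addresses in the usage note are placeholders.

```python
import socket


def port_reachable(host: str, port: int = 4244, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, looping over your node IPs with `for ip in ["10.0.0.11", "10.0.0.12"]: print(ip, port_reachable(ip))` (placeholder addresses) shows which nodes block the port. Note that this only confirms TCP reachability and says nothing about TLS or gRPC.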
I have the same problem and from what I see it looks like the binding of the pod through the host network is only with IPv6. Here is what it looks like on one of my nodes (there is nothing more installed than Kubernetes 1.29, cert-manager and cilium)
For some reason, port 4244 on the pod, which the peer Service exposes behind port 443, only listens on IPv6. When I curl it, the DNS name resolves, but I get an empty reply. Also, I have not enabled IPv6 in the Cilium values when deploying with Helm. Here's the relevant log line:
|
So, I reinstalled from scratch, once with the CLI and once with helm. It seems something is not OK with my cert-manager deployment, because it only fails when cert-manager is involved. Unfortunately, there is not much information in the logs. Is there an option for more detailed logs? And no, my CA is fine. For instance, I have no problem with Ingress etc. Anyway, the problem is obviously on my side. |
I'm running k3d on Ubuntu. Cilium installs and works as expected, but hubble-relay does not seem to connect.