KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT weren't set in pod env #40973
Comments
@dchen1107 @Random-Liu
Could someone from @kubernetes/sig-node-bugs please take a look? Thanks! cc: @dchen1107
Seeing this intermittently, as you say, in Origin e2e CI.

@pmorie was thinking this may be symptomatic of the
This is occurring with pretty high incidence in our CI. Is someone from @kubernetes/sig-node-bugs assigned to triage this?
@hongchaodeng When generating environment variables for a container, there is a known race: the kubelet may generate the variables before the service has been created. See this: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/kubelet_pods.go#L412-L416 Maybe we should make
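For context, a minimal sketch of the check client-go performs at the start of `rest.InClusterConfig` (illustrative, not the library source; the error string mirrors the one quoted later in this thread):

```go
package incluster

import (
	"errors"
	"os"
)

// errNotInCluster mirrors client-go's rest.ErrNotInCluster.
var errNotInCluster = errors.New("unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined")

// envPresent sketches the first check rest.InClusterConfig performs:
// both variables must already be present in the container's
// environment, which is exactly what the kubelet race above breaks.
func envPresent() error {
	if os.Getenv("KUBERNETES_SERVICE_HOST") == "" || os.Getenv("KUBERNETES_SERVICE_PORT") == "" {
		return errNotInCluster
	}
	return nil
}
```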
I feel this is more like a workaround. This bug should be fixed.
Ping @kubernetes/sig-node-bugs: can we come up with a plan for fixing the race condition in the kubelet, or are we planning on changing client-go, @kubernetes/sig-api-machinery-misc?
Met the same problem. Do we have a workaround here?

```
Client Version: version.Info{Major:"1", Minor:"7+", GitVersion:"v1.7.0-alpha.4.914+b9e8d2aee6d593", GitCommit:"b9e8d2aee6d59330306cb458c1241f5e2578c40b", GitTreeState:"clean", BuildDate:"2017-06-02T04:13:59Z", GoVersion:"go1.8.1", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"7+", GitVersion:"v1.7.0-alpha.4.914+b9e8d2aee6d593", GitCommit:"b9e8d2aee6d59330306cb458c1241f5e2578c40b", GitTreeState:"clean", BuildDate:"2017-06-02T04:13:59Z", GoVersion:"go1.8.1", Compiler:"gc", Platform:"linux/amd64"}
2017/06/02 04:51:34 util.go:131: Step './hack/e2e-internal/e2e-status.sh' finished in 117.680893ms
2017/06/02 04:51:34 util.go:129: Running: ./hack/ginkgo-e2e.sh --ginkgo.focus=ThirdParty
Setting up for KUBERNETES_PROVIDER="local".
Local doesn't need special preparations for e2e tests
Jun 2 04:51:34.868: INFO: >>> kubeConfig:
Jun 2 04:51:34.868: INFO: failed to load config: unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined
panic:
```
Met the same problem, especially when we reboot the master node.
This is biting us with some regularity, so I assume there is a recommended workaround. Any tips? It is sometimes problematic to override these environment variables in pod definitions, since some things expect an IP address:

```yaml
env:
- name: KUBERNETES_SERVICE_HOST
  value: "kubernetes.default.svc.cluster.local"
- name: KUBERNETES_SERVICE_PORT
  value: "443"
```

I'm wondering if the IP address is available somewhere else? Downward API volume file?
We worked around this by polling until the
Re: conditional, we are unlikely to do that because we want to move to explicit service env vars.

What doesn't expect IPs? Nothing about _HOST promises not to be a name, so anything in Kube or its ecosystem should fix assumptions like that (please help us open bugs for that wherever you've hit it).
Issues go stale after 90d of inactivity. Prevent issues from auto-closing with an /lifecycle frozen comment. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/remove-lifecycle stale
[ upstream commit 604dab4 ] Since the k8s service is only created after the container is started, kubelet is not fast enough to set `KUBERNETES_SERVICE_HOST` nor `KUBERNETES_SERVICE_PORT` in a container, which can result in Cilium having unexpected behaviors such as: panicking upon initialization; using an autogenerated IPv4 allocated IP, as Cilium won't detect which podCIDR the k8s node has set; re-allocating the cilium_host router IP address, which can cause network disruption; inability to restore endpoints, since their IPs do not belong to the autogenerated CIDR. As all Cilium DaemonSets have the K8S_NODE_NAME environment variable set, we can detect whether Cilium is running in k8s mode by also checking if this variable is set, and not depend on `KUBERNETES_SERVICE_HOST` nor `KUBERNETES_SERVICE_PORT` for this detection. More info: kubernetes/kubernetes#40973 Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
…t k8s [ upstream commit 1598f74 ] We've seen panics where it seems k8s isn't set up correctly but CRD-related operations occur, and segfault. This occurs when the kubernetes service is not ready by the time cilium starts up, so cilium misses the KUBERNETES_SERVICE_{HOST,PORT} settings, resulting in it being misconfigured. See kubernetes/kubernetes#40973 See #11021 Signed-off-by: Ray Bejjani <ray@isovalent.com> Signed-off-by: John Fastabend <john.fastabend@gmail.com>
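A minimal sketch of the detection change those commit messages describe, with a hypothetical helper name (illustrative, not Cilium's actual code):

```go
package k8sdetect

import "os"

// RunningInK8sMode treats the agent as running under Kubernetes when
// K8S_NODE_NAME (set on every Cilium DaemonSet) is present, instead of
// depending only on the service env vars the kubelet may have missed.
func RunningInK8sMode() bool {
	return os.Getenv("K8S_NODE_NAME") != "" ||
		os.Getenv("KUBERNETES_SERVICE_HOST") != ""
}
```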
Met the same issue. Is there any workaround?
Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

Kubernetes version (use kubectl version):

Environment:
GKE
What happened:
KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT weren't set in the pod's environment.

What you expected to happen:
KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT should be set by default in the pod.
How to reproduce it (as minimally and precisely as possible):
Create an in-cluster client in a pod.
This isn't easily reproducible. We encountered this issue when running extensive e2e tests, and the logs showed that a pod crashed due to:
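For illustration, a minimal sketch of such an in-cluster client, assuming client-go; when the pod was created before the kubernetes service existed, `rest.InClusterConfig` fails with the error quoted earlier in the thread:

```go
package main

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	// Fails with "unable to load in-cluster configuration,
	// KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be
	// defined" when the env vars were never injected into the pod.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	if _, err := kubernetes.NewForConfig(cfg); err != nil {
		panic(err)
	}
}
```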