Kubes ops communicating via websockets and API hanging #602
@bonneaud I have the same issue, have you found a solution to it?

```
Traceback (most recent call last): ...
HTTP response headers: HTTPHeaderDict({'Date': 'Sun, 16 Sep 2018 18:40:37 GMT', 'Content-Length': '139', 'Content-Type': 'application/json'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Upgrade request required","reason":"BadRequest","code":400}
```
@bonneaud I may have tracked down the issue you're hitting. I filed an issue here: kubernetes-client/python-base#106
@PaulFurtado - are you hitting this specifically when using CRI-O? The OP's description of this bug matches my own exactly - intermittent, indefinite hangs when invoking connect_get_namespaced_pod_exec. In my case, however, I'm using just Docker. It doesn't look like the OP specified which runtime they were using.
@arzarif The issue I filed is relevant to every runtime; however, for the commands I tested, something about the framing of the data makes it very reproducible when CRI-O is the runtime.
@PaulFurtado Interesting. Thanks for the heads up. This seems to be a pretty serious issue; it appears that Rundeck's remote pod execution feature is broken because of this. I tried to bypass Rundeck altogether and hit the same problem in my own client before stumbling on this issue.
Yeah, I agree that this is a serious issue for anyone using the exec API from Python. Unfortunately, I'm not a maintainer and none of them have commented on that issue, but maybe I should open a PR to get the ball rolling. On my end, I work around this by monkey-patching the Python client; you could do the same if you need an immediate solution.
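For context on the framing being discussed: the exec endpoint multiplexes stdin/stdout/stderr over a single WebSocket, where (in the `channel.k8s.io` family of protocols) the first byte of each binary frame names the channel (0 = stdin, 1 = stdout, 2 = stderr, 3 = error status) and the rest is payload. A minimal illustrative demuxer, not the client's actual implementation:

```python
# Sketch of the channel.k8s.io framing used by the Kubernetes exec
# WebSocket: first byte of each binary frame = channel index, rest = payload.
STDIN_CHANNEL, STDOUT_CHANNEL, STDERR_CHANNEL, ERROR_CHANNEL = 0, 1, 2, 3

def demux_frames(frames):
    """Split raw WebSocket frames into per-channel byte streams."""
    channels = {}
    for frame in frames:
        if not frame:
            continue  # an empty frame carries neither channel byte nor payload
        channel, payload = frame[0], frame[1:]
        channels.setdefault(channel, b"")
        channels[channel] += payload
    return channels

# Example: a stdout line split across two frames, plus one stderr frame.
frames = [b"\x01hel", b"\x01lo\n", b"\x02warn\n"]
print(demux_frames(frames))  # {1: b'hello\n', 2: b'warn\n'}
```

How payloads get split across frames is up to the runtime, which is one way a different runtime (e.g. CRI-O vs. Docker) can change how reproducible a framing bug is.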
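The monkey-patch itself isn't shown in the thread. If you just need tasks to fail instead of hanging forever, one generic stopgap (a sketch of my own, not the workaround described above) is to run the blocking call under a deadline in a worker thread:

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def call_with_deadline(fn, *args, timeout=30.0, **kwargs):
    """Run fn(*args, **kwargs) but give up after `timeout` seconds.

    Note: a hung worker thread is NOT killed; the caller simply stops
    waiting for it, so it can log the failure and retry or alert.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(fn, *args, **kwargs)
        return future.result(timeout=timeout)  # raises TimeoutError on deadline
    finally:
        pool.shutdown(wait=False)  # don't block on a possibly-stuck thread

# A fast call returns normally; a stuck one raises instead of hanging forever.
print(call_with_deadline(lambda: "ok", timeout=1.0))  # prints: ok
```

You could wrap the `stream(...)` exec call in such a guard; it doesn't fix the underlying WebSocket issue, it only bounds how long a task stays stuck.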
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/remove-lifecycle stale
Having the same issue: packit/sandcastle#23
After being stuck on this for far too long, I was able to solve this by passing
@PaulFurtado @arzarif @danwinship This issue seems to be occurring with https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#list_namespaced_pod as well (kubernetes Python client version 8.0.1). Adding a timeout does not help either; the API call neither returns nor raises an error. It looks like no one from the dev team is assigned to this issue yet.
@TomasTomecek
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
Rotten issues close after 30d of inactivity. Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
@fejta-bot: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
I'm facing the same issue with
Hi,
This is a Bug Report
Problem:
We're using Kubernetes clusters with pods running Airflow. Other pods run tasks that send commands to Airflow. To do that, these tasks use websockets (kubernetes.stream) and Kubernetes' core_v1_api to get information about the pod running Airflow. We run commands in one pod from another by calling connect_get_namespaced_pod_exec [1] via the Kubernetes Python API, which, if we understand correctly, is a wrapper for `kubectl exec`. Our code looks like:
```python
stream(self.api.connect_get_namespaced_pod_exec,
       self.web_pod_name,
       self.namespace,
       command=['airflow'] + cmd,
       container='web',
       stdin=True,
       stdout=True)
```
Most of the time, our tasks send commands and get their output back from Airflow's pods. Once in a while, though, a task hangs forever on the stream call. When it hangs, the task is still visible via Kubernetes and shown as running; it is simply silent. We don't see any error.
[1] https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#connect_get_namespaced_pod_exec
Proposed Solution:
We don't have a solution right now, except to delete the pod via Kubernetes and re-create it.
Environment:
Kubernetes version (kubectl version): 1.9.7-gke.5