Websocket requests are not proxied correctly in Ansible-based operators #2204

Closed
geerlingguy opened this issue Nov 15, 2019 · 6 comments · Fixed by #2716
Labels
kind/bug Categorizes issue or PR as related to a bug. language/ansible Issue is related to an Ansible operator project

Comments

@geerlingguy
Contributor

geerlingguy commented Nov 15, 2019

Bug Report

What did you do?

I am trying to use the Ansible k8s_exec module, which runs the equivalent of kubectl exec against a Pod from Ansible through the Python Kubernetes library. This allows me to write a task like:

- name: Test a simple command.
  k8s_exec:
    namespace: '{{ meta.namespace }}'
    pod: '{{ tower_pod_name }}'
    command: date

This replaces installing kubectl in my operator image (by adding COPY --from=lachlanevenson/k8s-kubectl:v1.16.2 /usr/local/bin/kubectl /usr/local/bin/kubectl to my build/Dockerfile) and writing a task like:

- name: Test kubectl exec.
  command: >
    kubectl exec -n {{ meta.namespace }} {{ tower_pod_name }} date
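
For context, here is a minimal sketch of the kind of request the Python client ends up making for the k8s_exec task, assuming the standard Pod exec subresource path. The helper name build_exec_url, the host, and the Pod name are placeholders for illustration only; they are not part of the Kubernetes library or of this project.

```python
from urllib.parse import urlencode


def build_exec_url(host, namespace, pod, command):
    """Build a WebSocket URL for a Pod exec request (illustrative helper only)."""
    # Each word of the command is sent as its own repeated `command` parameter.
    params = [("command", word) for word in command]
    # Ask the API server to stream stdout and stderr back to the client.
    params += [("stdout", "true"), ("stderr", "true")]
    path = "/api/v1/namespaces/{}/pods/{}/exec".format(namespace, pod)
    return "wss://{}{}?{}".format(host, path, urlencode(params))


# Placeholder host, namespace, and Pod name (assumptions, not values from this issue).
print(build_exec_url("kubernetes.default.svc", "default", "tower-0", ["date"]))
```

The point is that this is a WebSocket request rather than a plain HTTP call, so anything sitting between the client and the API server has to let the protocol upgrade through.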

What did you expect to see?

When I run the same task with my system Ansible against a Kubernetes cluster, or even inside the operator Pod's ansible container using ansible-playbook, it executes successfully and registers the result of the command that was executed.

What did you see instead? Under which circumstances?

When it is run via the operator/ansible-runner using the operator's proxy, it results in the following error:

kubernetes.client.rest.ApiException: (0)
Reason: Handshake status 200 OK

It should be getting a 101 response from the Kubernetes API websocket.
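
To illustrate why a 200 is fatal here, this is a rough sketch of the check a WebSocket client performs on the handshake response. It is not the actual client library code; check_handshake is a hypothetical helper whose error message only mirrors the wording of the failure above.

```python
def check_handshake(status_line):
    """Raise unless the HTTP status line reports 101 Switching Protocols."""
    # A status line looks like "HTTP/1.1 101 Switching Protocols".
    _version, status, reason = status_line.split(" ", 2)
    if int(status) != 101:
        # Any other status means the connection was never upgraded.
        raise ConnectionError("Handshake status {} {}".format(status, reason))
    return True


try:
    check_handshake("HTTP/1.1 200 OK")
except ConnectionError as exc:
    print(exc)  # Handshake status 200 OK
```

A plain 200 OK means the other end answered as if this were an ordinary HTTP request, so there is no upgraded connection to stream the command output over.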

Full error message from the failed task:

TASK [tower : Test a simple command.] ******************************************
task path: /opt/ansible/roles/tower/tasks/main.yml:38
 line 136, in <module>
  File "/tmp/ansible_k8s_exec_payload_x8PPAd/__main__.py", line 123, in main
  File "/usr/lib/python2.7/site-packages/kubernetes/stream/stream.py", line 32, in stream
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/core_v1_api.py", line 835, in connect_get_namespaced_pod_exec
    (data) = self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs)
  File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/core_v1_api.py", line 935, in connect_get_namespaced_pod_exec_with_http_info
    collection_formats=collection_formats)
  File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 321, in call_api
    _return_http_data_only, collection_formats, _preload_content, _request_timeout)
  File "/usr/lib/python2.7/site-packages/kubernetes/client/api_client.py", line 155, in __call_api
    _request_timeout=_request_timeout)
  File "/usr/lib/python2.7/site-packages/kubernetes/stream/stream.py", line 27, in _intercept_request_call
    return ws_client.websocket_call(config, *args, **kwargs)
  File "/usr/lib/python2.7/site-packages/kubernetes/stream/ws_client.py", line 255, in websocket_call
    raise ApiException(status=0, reason=str(e))
kubernetes.client.rest.ApiException: (0)
Reason: Handshake status 200 OK
", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}

Environment

  • operator-sdk version: v0.11.0
  • go version: N/A
  • Kubernetes version information: 1.16.2
  • Kubernetes cluster kind: Molecule
  • Are you writing your operator in ansible, helm, or go? ansible

Possible Solution

N/A

Additional context

Relates to: geerlingguy/tower-operator#5

@geerlingguy
Contributor Author

Note that using kubectl exec works fine inside the operator, even if I set the KUBECONFIG environment variable to the same value the Ansible tasks are using:

- name: Test kubectl exec.
  command: >
    kubectl exec -n {{ meta.namespace }} {{ tower_pod_name }} date
  environment:
    KUBECONFIG: '{{ lookup("env", "KUBECONFIG") }}'

@camilamacedo86
Contributor

Hi @geerlingguy,

Before any further analysis, could you please check this with the latest version of the SDK? That is, could you upgrade your project to use SDK 0.12? Or let us know if you are able to reproduce this scenario using the Memcached sample.

Also, I noticed this path in your traceback:

File "/usr/lib/python2.7/site-packages/kubernetes/client/apis/core_v1_api.py"

Note that the Python version was upgraded to 3. Could you please make sure that your project was upgraded properly and that you are using Python 3 in the environment where it runs?

@camilamacedo86 camilamacedo86 added language/ansible Issue is related to an Ansible operator project triage/support Indicates an issue that is a support question. labels Nov 18, 2019
@geerlingguy
Contributor Author

@camilamacedo86 - Thanks for the suggestion! I'll definitely upgrade and test things (see linked issue above)—I hope to get to this soon.

@geerlingguy
Contributor Author

geerlingguy commented Jan 16, 2020

@camilamacedo86 - I just reproduced the same error on v0.12.0, as well as the current latest version, v0.14.0. Steps to reproduce (requires Molecule, Ansible, and Minikube installed locally):

$ git clone https://github.com/geerlingguy/tower-operator.git
$ cd tower-operator
$ git checkout k8s_exec

$ minikube start --memory 6g --cpus 4
$ molecule test -s test-minikube

# while that's running, when you get to reconciliation, in another terminal, run:
$ kubectl logs -f -l name=tower-operator -c ansible

The operator playbook runs but keeps failing at the k8s_exec task with a message that ends like:

...
  File "/usr/local/lib/python3.6/site-packages/kubernetes/stream/stream.py", line 27, in _intercept_request_call
    return ws_client.websocket_call(config, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/kubernetes/stream/ws_client.py", line 255, in websocket_call
    raise ApiException(status=0, reason=str(e))
kubernetes.client.rest.ApiException: (0)
Reason: Handshake status 200 OK

I was speaking with @fabianvf on Slack, and he mentioned that the likely problem is that the Ansible Operator HTTP proxy, which is injected between the Kubernetes API and the operator itself, is not handling websocket requests correctly. That would explain the error with the 200 OK handshake: the proxy should be continuing on and streaming the response to Python, which it is not.
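
As a rough illustration of the proxy side of this, here is a sketch of how an intermediary could recognize a WebSocket upgrade request so that it can pass it through instead of handling it as a regular HTTP request. This is not how the Ansible Operator proxy is implemented; is_websocket_upgrade is a hypothetical helper, and it only relies on the standard Connection and Upgrade request headers.

```python
def is_websocket_upgrade(headers):
    """Return True if the request headers ask to switch to the WebSocket protocol."""
    # Header names are case-insensitive, so normalise names and values first.
    lowered = {name.lower(): value.lower() for name, value in headers.items()}
    # `Connection` may carry several comma-separated tokens, e.g. "keep-alive, Upgrade".
    tokens = [token.strip() for token in lowered.get("connection", "").split(",")]
    return "upgrade" in tokens and lowered.get("upgrade") == "websocket"


print(is_websocket_upgrade({"Connection": "Upgrade", "Upgrade": "websocket"}))  # True
print(is_websocket_upgrade({"Accept": "application/json"}))  # False
```

A proxy that never makes this distinction, or that drops these headers before forwarding, would plausibly produce the behavior described above: the client asks for an upgrade and receives an ordinary 200 response.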

@geerlingguy
Contributor Author

geerlingguy commented Jan 16, 2020

I found this issue upstream in the client-go/rest package: kubernetes/client-go#45. It seems that issue went stale and was automatically closed.

There's a 2017 blog post linked with a workaround: Writing a Custom Kubectl Exec Command, and there's an HTTPWrappersForConfig function that is "exposed to allow more clients that need HTTP-like behavior but then must hijack the underlying connection (like WebSocket or HTTP2 clients)."

It would be nice if we could make proxy.go (https://github.com/operator-framework/operator-sdk/blob/master/pkg/ansible/proxy/proxy.go) work with websockets, and also add a test case that uses exec on a pod in https://github.com/operator-framework/operator-sdk/blob/master/pkg/ansible/proxy/proxy_test.go

@camilamacedo86 camilamacedo86 added kind/bug Categorizes issue or PR as related to a bug. language/ansible Issue is related to an Ansible operator project and removed language/ansible Issue is related to an Ansible operator project triage/support Indicates an issue that is a support question. labels Jan 16, 2020
@asmacdo asmacdo modified the milestones: v0.16.0, v0.17.0 Mar 9, 2020
@geerlingguy
Contributor Author

#2716 looks like it resolves this issue 🎉


4 participants