
kubectl exec fails #7425

Closed
rsuniev opened this issue Jan 16, 2017 · 14 comments
Labels
area/kubernetes kind/bug Issues that are defects reported by users or that we know have reached a real release

Comments

@rsuniev

rsuniev commented Jan 16, 2017

Rancher Versions:
Server:1.2.2
healthcheck:0.2.0
ipsec:0.0.2
network-services:0.0.8
scheduler:0.2.0
kubernetes (if applicable): 1.4.6v1

Docker Version:

OS and where are the hosts located? (cloud, bare metal, etc): AWS, ALB

Setup Details: (single node rancher vs. HA rancher, internal DB vs. external DB) HA rancher

Environment Type: (Cattle/Kubernetes/Swarm/Mesos) Kubernetes

Steps to Reproduce:

  1. Provision Kubernetes environment using v1.4.6v1 template.
  2. Provision a pod.
  3. run: kubectl exec -it container_name bash

Results:
request fails with the error:

server response object: [{
 "kind": "Status",
 "apiVersion": "v1",
 "metadata": {},
 "status": "Failure",
 "message": "Upgrade request required",
 "reason": "BadRequest",
 "code": 400
}]

Expected:
kubectl exec works

@rawmind0
Contributor

Hi @rsuniev ...

How are you accessing the Kubernetes API? Through an AWS ELB? If so, could you please try to access the Kubernetes API directly instead of going through the ELB?

This may be related to this Kubernetes issue:
kubernetes/kubernetes#19293

@rsuniev
Author

rsuniev commented Jan 16, 2017

@rawmind0 We are using an AWS ALB. Would it be the same issue?

@rawmind0
Contributor

If you are using an AWS ALB, it may be the same issue, with only the kubectl exec commands failing.

Are other kubectl commands working well? Could you please modify your kubectl config file and try to access one of the k8s API servers directly instead of going through the AWS ALB?
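For reference, pointing kubectl at a single API server instead of the load balancer comes down to changing the cluster `server` field in the kubectl config file. A hypothetical kubeconfig fragment (the address, port, and user name here are placeholders, not values from this setup):

```yaml
# Hypothetical kubeconfig fragment -- server address is a placeholder.
apiVersion: v1
kind: Config
clusters:
- name: direct
  cluster:
    server: https://10.0.0.5:6443   # one k8s API server, bypassing the ELB
    insecure-skip-tls-verify: true  # for debugging only
contexts:
- name: direct
  context:
    cluster: direct
    user: default                   # reuse your existing credentials entry
current-context: direct
```

If kubectl exec works with a config like this but fails through the load balancer, that narrows the problem down to the LB path.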

@rsuniev
Author

rsuniev commented Jan 16, 2017

We tried this, but we terminate SSL at the ALB and then forward to port 8080. That didn't work because the server expects the request headers to indicate HTTPS.

@wlan0

wlan0 commented Jan 17, 2017

The issue is that traefik does not pass the Upgrade header.

This tcpdump capture proves it:

01:30:30.597485 IP (tos 0x0, ttl 64, id 38724, offset 0, flags [DF], proto TCP (6), length 811)
    ip-172-17-0-3.us-east-2.compute.internal.44112 > 4f49424675f8.http-alt: Flags [P.], cksum 0x5b45 (incorrect -> 0x0d27), seq 1:760, ack 1, win 229, options [nop,nop,TS val 8195760 ecr 8195760], length 759
E..+.D@.@.Ha.........P..Cs...w......[E.....
.}...}..POST /r/projects/1a5/kubernetes/api/v1/namespaces/default/pods/nginxsvc/exec?command=bash&container=nginxsvc&container=nginxsvc&stderr=true&stdout=true HTTP/1.1
Host: xxxx122.rancher.space
X-Forwarded-Proto: https
X-Forwarded-Port: 443
X-Forwarded-For: x.x.37.17, 172.31.33.227
Upgrade: SPDY/3.1
Connection: Upgrade
Content-Length: 0
X-Amzn-Trace-Id: Root=1-587d73b6-1ad77d992ca15e4776c68cba
User-Agent: kubectl/v1.4.0 (darwin/amd64) kubernetes/a16c0a7
Authorization: Basic RDNCMjAyQTgyRUMyOTI1RUVCNjg6QnRnRXZVNHZCSzhDdFVKMnNWMUtQZWpHM0tiYnRkZThiaVFSWkx5Zg==
--
01:30:30.598155 IP (tos 0x0, ttl 64, id 33408, offset 0, flags [DF], proto TCP (6), length 1081)
    4f49424675f8.http-alt > ip-172-17-0-3.us-east-2.compute.internal.34854: Flags [P.], cksum 0x5c53 (incorrect -> 0xf712), seq 4289:5318, ack 11777, win 1452, options [nop,nop,TS val 8195761 ecr 8195642], length 1029
E..9..@.@.\............&..3.3..:....\S.....
.}...}.:.=020fd030-fafe-4829-bc16-1731d8a3de7b||0||/v1/container-proxy/.~..020fd030-fafe-4829-bc16-1731d8a3de7b||1||{"host":"xxxxx122.rancher.space","method":"POST","url":"http://10.42.14.47:80/api/v1/namespaces/default/pods/nginxsvc/exec?command=bash\u0026container=nginxsvc\u0026container=nginxsvc\u0026stderr=true\u0026stdout=true","headers":{"Accept-Encoding":["gzip"],"Authorization":["Basic RDNCMjAyQTgyRUMyOTI1RUVCNjg6QnRnRXZVNHZCSzhDdFVKMnNWMUtQZWpHM0tiYnRkZThiaVFSWkx5Zg=="],"Content-Length":["0"],"User-Agent":["kubectl/v1.4.0 (darwin/amd64) kubernetes/a16c0a7"],"X-Amzn-Trace-Id":["Root=1-587d73b6-1ad77d992ca15e4776c68cba"],"X-Forwarded-For":["x.x.37.17, 172.31.33.227, 172.17.0.3, ["],"X-Forwarded-Host":["xxxxx122.rancher.space"],"X-Forwarded-Port":["443"],"X-Forwarded-Proto":["https"],"X-Forwarded-Server":["4f49424675f8"],"X-Stream-Protocol-Version":["v4.channel.k8s.io","v3.channel.k8s.io","v2.channel.k8s.io","channel.k8s.io"],"X-Traefik-Reqid":["24111"]}}.5020fd030-fafe-4829-bc16-1731d8a3de7b||1||{"eof":true}
01:30:30.598176 IP (tos 0x0, ttl 64, id 5307, offset 0, flags [DF], proto TCP (6), length 52)
    ip-172-17-0-3.us-east-2.compute.internal.34854 > 4f49424675f8.http-alt: Flags [.], cksum 0x584e (incorrect -> 0xd334), seq 11777, ack 5318, win 1444, options [nop,nop,TS val 8195761 ecr 8195761], length 0
E..4..@.@............&..3..:..7.....XN.....
.}...}..
01:30:30.672428 IP (tos 0x0, ttl 64, id 5308, offset 0, flags [DF], proto TCP (6), length 581)
    ip-172-17-0-3.us-east-2.compute.internal.34854 > 4f49424675f8.http-alt: Flags [P.], cksum 0x5a5f (incorrect -> 0x2097), seq 11777:12306, ack 5318, win 1444, options [nop,nop,TS val 8195779 ecr 8195761], length 529
E..E..@.@............&..3..:..7.....Z_.....
.}...}............3...0...b...7.......5...0...b...4...2...!...g...!...@...f...O...w...X...:...!...w.......f...!...o...w...,...m...!...f...!.../...#...#...4...9...0...N...~...l...9...~.....~...L?..N<...n..S;..G"..O9..I<..Fn...8...>...-...v..\j...n...F...Z...k...F...V...Y..F...V...m..HF...U...l...l...C...m...F..NB..GL..C-...J..Zz..<...jg..</..bx..8)..w{..k...i...86..&6..?%..`>..?7...............................................
01:30:30.672799 IP (tos 0x0, ttl 64, id 33409, offset 0, flags [DF], proto TCP (6), length 95)
    4f49424675f8.http-alt > ip-172-17-0-3.us-east-2.compute.internal.34854: Flags [P.], cksum 0x5879 (incorrect -> 0x45e1), seq 5318:5361, ack 12306, win 1452, options [nop,nop,TS val 8195779 ecr 8195779], length 43
--

If you search for this trace id (1-587d73b6-1ad77d992ca15e4776c68cba, the X-Amzn-Trace-Id), you'll notice that the capture above contains two HTTP requests: the first from kubectl to the traefik proxy, and the second from websocket-proxy to kube-apiserver (on my setup, 10.42.14.47 is the kube-apiserver IP). The second request does not carry the HTTP Upgrade header, but the first request does.

Somewhere between traefik and websocket-proxy the Upgrade header is lost, and since we know websocket-proxy works just fine when traefik is switched off, we concluded that traefik is the one dropping the headers.
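As background on why a proxy can silently drop these headers: RFC 7230 classifies `Connection` and `Upgrade` as hop-by-hop headers, so a generic HTTP/1.1 proxy that doesn't special-case protocol upgrades strips them before forwarding. A minimal sketch of that behaviour (this is illustrative Python, not Rancher or traefik code; the hostname is a placeholder):

```python
# Hop-by-hop headers per RFC 7230 section 6.1. A naive HTTP/1.1 proxy
# removes these (plus anything named in the Connection header) before
# forwarding -- exactly what makes the backend see a plain POST and
# respond 400 "Upgrade request required".
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}

def forward_headers(headers):
    """Emulate a naive proxy: drop hop-by-hop headers and any header
    listed in the Connection header itself."""
    listed = {t.strip().lower()
              for t in headers.get("Connection", "").split(",") if t.strip()}
    return {k: v for k, v in headers.items()
            if k.lower() not in HOP_BY_HOP and k.lower() not in listed}

incoming = {
    "Host": "rancher.example.com",   # placeholder hostname
    "Connection": "Upgrade",
    "Upgrade": "SPDY/3.1",           # what kubectl exec sends
    "X-Forwarded-Proto": "https",
}
forwarded = forward_headers(incoming)
print(sorted(forwarded))  # Connection and Upgrade are gone
```

A proxy that wants upgrades to work has to detect the handshake and re-attach (or tunnel) these headers instead of stripping them.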

@aemneina

Thanks for the expert analysis, @wlan0. @ibuildthecloud is putting a fix into traefik to correct this.

@aemneina aemneina assigned wlan0 and ibuildthecloud and unassigned wlan0 Jan 17, 2017
@TylerRick

We're running into this issue as well. We have nearly the same configuration as the OP, but as far as I know we aren't using traefik.

We've been able to work around it by using the "Execute Shell" feature in the Rancher UI, but we would like to be able to use kubectl exec as well (and the "Execute Shell" sessions get killed after a while).

@wlan0

wlan0 commented Jan 17, 2017

@TylerRick Is your setup a Rancher HA setup, i.e. do you start the rancher server with the --advertise-address option? If so, then traefik is started inside the rancher/server container.

We are submitting a patch to traefik to fix this issue. We'll be updating the traefik version with the patch as soon as possible. I'll comment on this thread once it is fixed.

@TylerRick

@wlan0 No, we just have a single Rancher server currently. Our setup seems about the same otherwise (Rancher v1.3.1, Kubernetes v1.5.1, AWS, and an Application Load Balancer for ports 80/443 that routes to our single Rancher instance on port 8080), so it may not be specific to traefik and may instead have to do with the ALB...? Should I create a separate issue?

@wlan0

wlan0 commented Jan 17, 2017

@TylerRick ALB supports WebSocket, SPDY, and HTTP/2.0. That said, please create a separate issue. We can debug further there.

@deniseschannon deniseschannon added area/kubernetes kind/bug Issues that are defects reported by users or that we know have reached a real release status/to-test labels Jan 19, 2017
@galal-hussein
Contributor

Validated the fix on Rancher v1.2.3-rc2, here are the steps:

  • Run Rancher HA setup with ELB
  • The ELB should be configured to use the PROXY protocol:
$ aws elb create-load-balancer-policy --load-balancer-name my-elb --policy-name myorg-ProxyProtocol-policy --policy-type-name ProxyProtocolPolicyType --policy-attributes AttributeName=ProxyProtocol,AttributeValue=true
$ aws elb set-load-balancer-policies-for-backend-server --load-balancer-name my-elb --instance-port 81 --policy-names myorg-ProxyProtocol-policy
$ aws elb set-load-balancer-policies-for-backend-server --load-balancer-name my-elb --instance-port 444 --policy-names myorg-ProxyProtocol-policy
$ aws elb set-load-balancer-policies-for-backend-server --load-balancer-name my-elb --instance-port 8080 --policy-names myorg-ProxyProtocol-policy
  • Register a few agents with Rancher
  • Install the Kubernetes stack
  • Deploy a few pods and services
  • Test kubectl exec and kubectl logs from a remote kubectl client

Both kubectl exec and kubectl logs work from a remote client.

@jinglejengel

Sorry to ping on a closed issue @galal-hussein, but I'm getting 400s from an Apache load balancer despite following the example Apache configuration here: http://rancher.com/docs/rancher/v1.6/en/installing-rancher/installing-server/basic-ssl-config/#example-apache-configuration

The server is a flat Apache/2.4.7 install (non-container).

I have proxy_wstunnel enabled, and in general everything works fine with Rancher, including the UI and almost all kubectl commands, but exec doesn't work:

$ kubectl exec -it nginx-test-2926275868-6sjss /bin/sh
Error from server (BadRequest): Upgrade request required

And in the apache logs I see:

10.1.8.180 - - [15/Aug/2017:15:01:55 -0700] "GET /r/projects/1a7/kubernetes/api HTTP/1.1" 200 6337 "-" "kubectl/v1.6.2 (linux/amd64) kubernetes/477efc3"
10.1.8.180 - - [15/Aug/2017:15:01:59 -0700] "GET /r/projects/1a7/kubernetes/apis HTTP/1.1" 200 3633 "-" "kubectl/v1.6.2 (linux/amd64) kubernetes/477efc3"
10.1.8.180 - - [15/Aug/2017:15:01:59 -0700] "GET /r/projects/1a7/kubernetes/api/v1/namespaces/default/pods/nginx-test-2926275868-6sjss HTTP/1.1" 200 2992 "-" "kubectl/v1.6.2 (linux/amd64) kubernetes/477efc3"
10.1.8.180 - - [15/Aug/2017:15:01:59 -0700] "POST /r/projects/1a7/kubernetes/api/v1/namespaces/default/pods/nginx-test-2926275868-6sjss/exec?command=%2Fbin%2Fsh&container=nginx-test&container=nginx-test&stdin=true&stdout=true&tty=true HTTP/1.1" 400 6367 "-" "kubectl/v1.6.2 (linux/amd64) kubernetes/477efc3"

In my apache virtual host for rancher I have the following:

            ProxyRequests Off
            ProxyPreserveHost On

            RewriteEngine On
            RewriteCond %{HTTP:Connection} Upgrade [NC]
            RewriteCond %{HTTP:Upgrade} websocket [NC]
            RewriteRule /(.*) balancer://dev-rancher-ws/$1 [P,L]

            RequestHeader set X-Forwarded-Proto "https"
            RequestHeader set X-Forwarded-Port "443"

            ProxyPass        "/" "balancer://dev-rancher/"
            ProxyPassReverse "/" "balancer://dev-rancher/"

The balancer is pulled from a shared apache config:

<Proxy "balancer://dev-rancher">
	BalancerMember "http://dev-rancher-01:8080"
	BalancerMember "http://dev-rancher-02:8080"
</Proxy>

<Proxy "balancer://dev-rancher-ws">
        BalancerMember "ws://dev-rancher-01:8080"
        BalancerMember "ws://dev-rancher-02:8080"
</Proxy>
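One hypothesis worth checking against this config: the tcpdump earlier in this thread shows kubectl exec negotiating the upgrade with `Upgrade: SPDY/3.1`, not `Upgrade: websocket`, so a `RewriteCond %{HTTP:Upgrade} websocket [NC]` would never fire for exec requests, and they would fall through to the plain HTTP balancer (which drops the Upgrade header). The condition is a case-insensitive regex match against the header value, as this quick check illustrates:

```python
import re

# Apache's `RewriteCond %{HTTP:Upgrade} websocket [NC]` performs a
# case-insensitive regex match against the Upgrade header value.
cond = re.compile("websocket", re.IGNORECASE)

print(bool(cond.search("websocket")))  # browser WebSocket upgrade: matched
print(bool(cond.search("SPDY/3.1")))   # kubectl exec upgrade: not matched
```

If that is the cause, broadening the RewriteCond (or otherwise routing SPDY upgrades to the tunnel) would be the direction to investigate; treat this as a guess, not a confirmed fix.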

Do you have any idea what I could be missing?

@Angelinsky7

Hi, I have the same issue, with almost the same config as @Joeskyyy (without the ProxyPass to a balancer):

<IfModule mod_proxy.c>
  ProxyRequests off
  ProxyVia On
  
  ProxyPreserveHost On
  RequestHeader set X-Forwarded-Proto "https"
  RequestHeader set X-Forwarded-Port "443"
  
  RewriteEngine On
  RewriteCond %{HTTP:Connection} Upgrade [NC]
  RewriteCond %{HTTP:Upgrade} WebSocket [NC]
  RewriteRule /(.*) ws://docker.internal:8080/$1 [P,L]

  ProxyPass / http://docker.internal:8080/
  ProxyPassReverse / http://docker.internal:8080/

</IfModule>

@Angelinsky7

After some research, I can say that it doesn't seem to be Rancher's fault: if I connect directly (with kubectl) to the rancher server (at https://docker.internal:443), the exec command works. (I cannot make the direct 8080 connection work; I have a redirect to the https://*:8443 ports.)
So now my questions: what is the correct configuration for an Apache reverse proxy to make this work? And where can I find the error/access logs of the rancher server, to see the differences between connections with and without the Apache reverse proxy?

Thanks a lot for all the help!
