rancher behind proxy #9420

Closed
joshuacox opened this issue Jul 21, 2017 · 15 comments

@joshuacox

joshuacox commented Jul 21, 2017

Rancher versions:
rancher/server: 1.6.4

Docker version: (docker version,docker info preferred)

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

uname -a
Linux hosty 4.9.34-rancher #1 SMP Mon Jun 26 01:54:18 UTC 2017 x86_64 GNU/Linux

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) KVM

Setup details: (single node rancher vs. HA rancher, internal DB vs. external DB) single node

Environment Template: (Cattle/Kubernetes/Swarm/Mesos) cattle

Steps to Reproduce: start a Rancher server, go directly to http://RANCHER_IP:8080, set up auth, and add a few hosts. Set up traefik and point it at Rancher's port 8080 with an associated DNS hostname (in this case I am using a small nginx container with the traefik labels, and it proxies to Rancher's 8080). Now try to log in from the hostname with SSL.

Results: Failure. After a successful login, Rancher reloads back to the login page without logging me in.

nginx-tiny-proxy: this is the container that I use to have Rancher proxy external services (usually not inside Rancher itself), using this template. In this case it proxies to the RANCHER_IP itself on port 8080. This is the Rancher in question, in case you want to see that it does indeed proxy the login page just fine.

@aemneina aemneina added area/server kind/question Issues that just require an answer. No code change needed labels Jul 21, 2017
@aemneina

Hey @joshuacox do the developer tools call out what api call fails? This might in turn cause a redirect to the login page. What auth are you using here? Does the rancher server container log any interesting errors?

@joshuacox
Author

At this point I was using local auth.

@joshuacox
Author

joshuacox commented Jul 21, 2017

logs:

runtime.systemstack_switch()
rax    0x0
rbp    0xbf69ee
rbx    0x11a7608
rcx    0x916a57
r10    0x8o/src/github.co
rip    0x8105f7cal/go/sr
r12    0x2c
r13    0xb53918
r14    0x0
r15    0x8
rip    0x8105f7
rflags 0x206
cs     0x33
fs     0x0
gs     0x0
fatal error: runtime: out of memory


rax    0x0
rbx    0x11a7608
rcx    0x916a57
rdx    0x6
rdi    0x16c8
rsi    0x16c8
rbp    0xd69a9e
rsp    0x7ffd2600cd58
r8     0xa
r9     0x2023880
r10    0x8
r11    0x202
r12    0x2c
r13    0xd3e57c
r14    0x0
r15    0x8
rip    0x916a57
rflags 0x202
cs     0x33
fs     0x0
gs     0x0
runtime/cgo: pthread_create failed: Resource temporarily unavailable
SIGABRT: abort
PC=0x8105f7 m=0

goroutine 0 [idle]:

goroutine 1 [running]:
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_amd64.s:245 fp=0xc82001e770 sp=0xc82001e768
runtime.main()
        /usr/local/go/src/runtime/proc.go:126 +0x62 fp=0xc82001e7c0 sp=0xc82001e770
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1 fp=0xc82001e7c8 sp=0xc82001e7c0

goroutine 17 [syscall, locked to thread]:
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1998 +0x1

rax    0x0
rbx    0xfc1b68
rcx    0x8105f7
rdx    0x6
rdi    0x16c9
rsi    0x16c9

I have 2 GB of RAM in that VM, and it shows around 450 MB free. Moreover, if I log in directly using the IP with no SSL, I get in just fine. After successfully logging in using the IP directly, the logs look like this:

goroutine 1 [running, locked to thread]:
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_amd64.s:281 fp=0xc42019bcb8 sp=0xc42019bcb0
runtime.(*mcache).nextFree(0x7f48f811e000, 0xa7b40b, 0x6c, 0xc400000073, 0x3)
        /usr/local/go/src/runtime/malloc.go:527 +0xb9 fp=0xc42019bd10 sp=0xc42019bcb8
runtime.mallocgc(0xa0, 0x0, 0xc42019be00, 0x40ec52)
        /usr/local/go/src/runtime/malloc.go:679 +0x827 fp=0xc42019bdb0 sp=0xc42019bd10
runtime.rawstring(0x96, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/local/go/src/runtime/string.go:237 +0x85 fp=0xc42019bde0 sp=0xc42019bdb0
runtime.rawstringtmp(0x0, 0x96, 0xa752c0, 0xc42019be68, 0x4b3bf1, 0xd63c60, 0xc4201cd700)
        /usr/local/go/src/runtime/string.go:107 +0x78 fp=0xc42019be20 sp=0xc42019bde0
runtime.slicebytetostring(0x0, 0xc4202da400, 0x96, 0x100, 0x1, 0x1)
        /usr/local/go/src/runtime/string.go:89 +0x3e fp=0xc42019be78 sp=0xc42019be20

The browser console in developer tools only has a bunch of 200 OKs, with the exception of:

DELETE https://rancher.bokbot.com/v2-beta/token/current | 204 No Content |   | 301ms

EDIT: I pasted the wrong line earlier, the 204 above is now the correct line I meant to paste

@joshuacox
Author

joshuacox commented Jul 25, 2017

@deniseschannon @LLParse I have an empty uninitialized instance up here that exhibits all the bad behavior without any auth whatsoever, click on links and tons of failures.

I realize that I have become too reliant on letting traefik run all my cert management. Perhaps I should learn the official method of using Let's Encrypt and the Rancher load balancers.

@moelsayed
Contributor

I just tested the scenario you described with our latest v1.6.8-rc2 and it seems to work as expected:

  • install server v1.6.8-rc1
  • enable local auth
  • deploy traefik from the catalog and configure it on one of the hosts.
  • add the whc catalog and deploy the tiny nginx stack, configured it to use traefik service and point to the rancher server IP

I pointed the DNS record configured in tiny-nginx to the traefik host and I was able to login/logout, add stacks and new hosts using the new DNS record with no problems.

@joshuacox could you please try to reproduce this using v1.6.8-rc1? I just want to make sure I replicated your setup correctly.

@joshuacox
Author

hip hip hooray! It works great! Closing, I'm going to do some more thorough testing, but for now I am very satisfied as this meets one of my edge cases, but a very important one.

@joshuacox
Author

Maybe not. I migrated the ranch to another KVM instance and I'm back to getting booted after auth, with what appears to be mixed content:

Blocked loading mixed active content “http://rancher.bokbot.com/v2-beta/identities?limit=-1&sort=name”[Learn More]
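
For context, a browser reports mixed active content when a page loaded over HTTPS requests a script-like resource over plain HTTP. Below is a minimal Python sketch of that check, using the URLs from this thread. The function name `is_mixed_content` is hypothetical, and real browsers apply more rules than this simplified comparison of schemes.

```python
from urllib.parse import urlsplit


def is_mixed_content(page_url: str, resource_url: str) -> bool:
    """Return True when an HTTPS page requests a plain-HTTP resource.

    Simplified illustration only; browsers apply additional rules.
    """
    page_scheme = urlsplit(page_url).scheme.lower()
    resource_scheme = urlsplit(resource_url).scheme.lower()
    return page_scheme == "https" and resource_scheme == "http"


# Page served through the TLS-terminating proxy (host taken from this thread).
page = "https://rancher.bokbot.com/"
# API URL from the blocked request above: note the http:// scheme.
api = "http://rancher.bokbot.com/v2-beta/identities?limit=-1&sort=name"

print(is_mixed_content(page, api))
```

The http:// scheme on an API URL generated by the server suggests that the server believes the original request arrived over plain HTTP, which points at the forwarded-protocol information being lost somewhere between the proxies.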

@joshuacox joshuacox reopened this Aug 16, 2017
@moelsayed
Contributor

@joshuacox Can you please provide more details on the setup and versions used this time?

@joshuacox
Author

joshuacox commented Aug 19, 2017


Component | Version
-- | --
Rancher | v1.6.8-rc2
Cattle | v0.183.8
User Interface | v1.6.20
Rancher CLI | v0.6.3
Rancher Compose | v0.12.5

^ like that?

sudo ros os version
v1.0.4
docker info
Containers: 5
 Running: 1
 Paused: 0
 Stopped: 4
Images: 4
Server Version: 17.03.1-ce
Storage Driver: overlay
 Backing Filesystem: extfs
 Supports d_type: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 4ab9917febca54791c5f071a9d1f404867857fcc
runc version: 54296cf40ad8143b62dbcaa1d90e520a2136ddfe
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.40-rancher
Operating System: RancherOS v1.0.4
OSType: linux
Architecture: x86_64
CPUs: 3
Total Memory: 3.092 GiB
Name: hippo
ID: 2SWJ:7BFV:FJCK:JPDQ:55X4:NPBV:ZRDN:ZI65:2ETR:SFMR:UWSL:RB6B
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

traefik:
rawmind/alpine-traefik:1.3.3

nginx config:

upstream rancher {
  server ${NGINX_SRC_HOST}:${NGINX_SRC_PORT};
}

map $http_upgrade $connection_upgrade {
    default Upgrade;
    ''      close;
}

server {
    listen 80;
    server_name ${NGINX_HOST}.${NGINX_DOMAIN};

    location / {
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://rancher;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        # This allows the ability for the execute shell window to remain open for up to 15 minutes. Without this parameter, the default is 1 minute and will automatically close.
        proxy_read_timeout 900s;
    }
}

@moelsayed
Contributor

Ok. I managed to reproduce it this time. I will let you know as soon as I have an update. Thank you for the detailed information.

@joshuacox
Author

@moelsayed I look forward to your results, please let me know if I can provide any other information or try out configs, or anything to help.

@rawmind0
Contributor

Hi @joshuacox ,

nginx is overwriting the X-Forwarded headers. Requests that come from traefik already have the X-Forwarded headers added. In this case the protocol changes in the middle (HTTPS to traefik, plain HTTP from traefik to nginx), so you shouldn't overwrite the headers, especially X-Forwarded-Proto.

Please try adding a check in your nginx location configuration so that the header is only set when it is not already present:

...
    location / {

        # Only fall back to nginx's own scheme when no upstream proxy
        # (traefik here) has already set X-Forwarded-Proto.
        if ($http_x_forwarded_proto = '') {
            set $http_x_forwarded_proto $scheme;
        }

        proxy_set_header Host $host;
        # Forward the preserved (or defaulted) value instead of $scheme.
        proxy_set_header X-Forwarded-Proto $http_x_forwarded_proto;
        proxy_set_header X-Forwarded-Port $server_port;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://rancher;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        # Keep the execute shell window open for up to 15 minutes. Without this parameter, the default is 1 minute and the window closes automatically.
        proxy_read_timeout 900s;
    }
...
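
To see why the overwrite matters, the request path can be modeled as two hops: traefik terminates TLS and records the client's scheme in X-Forwarded-Proto, then nginx receives plain HTTP on port 80, so its own `$scheme` is `http`. What follows is a rough Python sketch of both nginx behaviors. The helper names `traefik_hop` and `nginx_hop` are hypothetical, and this models only the header flow, not either proxy's actual implementation.

```python
def traefik_hop(client_scheme: str) -> dict:
    """Model the edge proxy: record the scheme the client actually used."""
    return {"X-Forwarded-Proto": client_scheme}


def nginx_hop(headers: dict, scheme: str, overwrite: bool) -> dict:
    """Model the inner proxy, which itself receives plain HTTP.

    overwrite=True  mimics `proxy_set_header X-Forwarded-Proto $scheme;`
    overwrite=False keeps an upstream value and only fills in a missing one.
    """
    out = dict(headers)  # copy so the caller's headers are not mutated
    if overwrite or not out.get("X-Forwarded-Proto"):
        out["X-Forwarded-Proto"] = scheme
    return out


incoming = traefik_hop("https")

# Original config: the https hint from the edge proxy is lost.
print(nginx_hop(incoming, "http", overwrite=True))

# Preserving config: the backend still sees https.
print(nginx_hop(incoming, "http", overwrite=False))

# Direct access without an edge proxy: fall back to nginx's own scheme.
print(nginx_hop({}, "http", overwrite=False))
```

In the first case the backend is told the request was plain HTTP, which would be consistent with the http:// API URLs and the mixed content errors reported earlier in this thread.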

Another note: Traefik WebSockets (ws) work fine in v1.3.3, but they don't seem to work in v1.3.4 or v1.3.5. This may be related to traefik/traefik#1905

@joshuacox
Author

I've updated my tiny proxy's rancher template to match what you've given above and I think you've nailed it, closing.

@rawmind0
Contributor

Traefik ws is working fine again in v1.3.6....

Upgraded catalog traefik package... rancher/community-catalog#603

@joshuacox
Author

I can verify, I just upgraded and everything appears to be working great.
