Unable to retrieve user's IP address in docker swarm mode #25526

Open
PanJ opened this Issue Aug 9, 2016 · 199 comments

@PanJ

PanJ commented Aug 9, 2016

Output of docker version:

Client:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:00:36 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:00:36 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 155
 Running: 65
 Paused: 0
 Stopped: 90
Images: 57
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 868
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host overlay null bridge
Swarm: active
 NodeID: 0ddz27v59pwh2g5rr1k32d9bv
 Is Manager: true
 ClusterID: 32c5sn0lgxoq9gsl1er0aucsr
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot interval: 10000
  Heartbeat tick: 1
  Election tick: 3
 Dispatcher:
  Heartbeat period: 5 seconds
 CA configuration:
  Expiry duration: 3 months
 Node Address: 172.31.24.209
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-92-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.42 GiB
Name: ip-172-31-24-209
ID: 4LDN:RTAI:5KG5:KHR2:RD4D:MV5P:DEXQ:G5RE:AZBQ:OPQJ:N4DK:WCQQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: panj
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

Steps to reproduce the issue:

  1. Run the following service, which publishes port 80:

docker service create \
--name debugging-simple-server \
--publish 80:3000 \
panj/debugging-simple-server

  2. Try connecting with http://<public-ip>/.

Describe the results you received:
Neither ip nor header.x-forwarded-for is the correct user's IP address.

Describe the results you expected:
ip or header.x-forwarded-for should be the user's IP address. The expected result can be achieved using a standalone Docker container: docker run -d -p 80:3000 panj/debugging-simple-server. You can compare the two results via the following links:
http://swarm.issue-25526.docker.takemetour.com:81/
http://container.issue-25526.docker.takemetour.com:82/

Additional information you deem important (e.g. issue happens only occasionally):
This happens in both global mode and replicated mode.

I am not sure if I missed anything that should solve this issue easily.

In the meantime, I think I have to use a workaround: running a proxy container outside of swarm mode and having it forward to the published port in swarm mode (with SSL termination done on this container too), which defeats the purpose of swarm mode's self-healing and orchestration.

@thaJeztah

Member

thaJeztah commented Aug 9, 2016

/cc @aluzzardi @mrjana ptal

@mavenugo


Contributor

mavenugo commented Aug 9, 2016

@PanJ can you please share some details on how debugging-simple-server determines the IP? Also, what is the expectation if a service is scaled to more than one replica across multiple hosts (or in global mode)?

@PanJ


PanJ commented Aug 9, 2016

@mavenugo it's koa's request object, which uses Node's remoteAddress from the net module. The result should be the same for any other library that can retrieve the remote address.

The expectation is that the ip field should always be the remote address, regardless of configuration.

@marech


marech commented Sep 19, 2016

@PanJ are you still using your workaround, or have you found a better solution?

@sanimej


sanimej commented Sep 19, 2016

@PanJ When I run your app as a standalone container...

docker run -it --rm -p 80:3000 --name test panj/debugging-simple-server

and access the published port from another host I get this

vagrant@net-1:~$ curl 192.168.33.12
{"method":"GET","url":"/","header":{"user-agent":"curl/7.38.0","host":"192.168.33.12","accept":"*/*"},"ip":"::ffff:192.168.33.11","ips":[]}
vagrant@net-1:~$

192.168.33.11 is the IP of the host from which I am running curl. Is this the expected behavior?

@PanJ


PanJ commented Sep 19, 2016

@sanimej Yes, that is the expected behavior, and it should apply in swarm mode as well.

@PanJ


PanJ commented Sep 19, 2016

@marech I am still using the standalone container as a workaround, which works fine.

In my case, there are two nginx instances: a standalone instance and a swarm instance. SSL termination and reverse proxying are done on the standalone nginx. The swarm instance is used to route to other services based on the request host.

@sanimej


sanimej commented Sep 19, 2016

@PanJ The way the published port of a container is accessed is different in swarm mode. In swarm mode a service can be reached from any node in the cluster. To facilitate this we route through an ingress network. 10.255.0.x is the address of the ingress network interface on the host in the cluster from which you try to reach the published port.
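
For anyone who wants to verify this on their own cluster, here is a quick check (a sketch, assuming the default ingress network name and shell access to a node):

# print the subnet the ingress network allocates from (typically 10.255.0.0/16)
docker network inspect ingress --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'

Requests arriving through the routing mesh show a source address from this subnet rather than the client's real IP.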

@PanJ


PanJ commented Sep 19, 2016

@sanimej I kind of saw how it works when I dug into the issue. But the use case (the ability to retrieve the user's IP) is quite common.

I have limited knowledge of how the fix should be implemented. Maybe a special type of network that does not alter the source IP address?

Rancher is similar to Docker swarm mode and seems to have the expected behavior. Maybe it is a good place to start.

@marech


marech commented Sep 20, 2016

@sanimej A good idea could be to add all IPs to the X-Forwarded-For header, if possible; then we could see the whole chain.

@PanJ hmm, and how does your standalone nginx container communicate with the swarm instance, via service name or IP? Maybe you can share the part of the nginx config where you pass traffic to the swarm instance.

@PanJ


PanJ commented Sep 20, 2016

@marech the standalone container listens on port 80 and then proxies to localhost:8181.

server {
  listen 80 default_server;
  location / {
    proxy_set_header        Host $host;
    proxy_set_header        X-Real-IP $remote_addr;
    proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header        X-Forwarded-Proto $scheme;
    proxy_pass          http://localhost:8181;
    proxy_read_timeout  90;
  }
}

If you have to do SSL termination, add another server block that listens on port 443, does the SSL termination there, and proxies to localhost:8181 as well.

Swarm mode's nginx publishes 8181:80 and routes to other services based on the request host.

server {
  listen 80;
  server_name your.domain.com;
  location / {
    proxy_pass          http://your-service:80;
    proxy_set_header Host $host;
    proxy_read_timeout  90;
  }
}

server {
  listen 80;
  server_name another.domain.com;
  location / {
    proxy_pass          http://another-service:80;
    proxy_set_header Host $host;
    proxy_read_timeout  90;
  }
}
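
For reference, the wiring described above might look roughly like this (a sketch; the image, paths, and names are hypothetical, not PanJ's exact setup):

# standalone edge nginx, run outside swarm mode so it sees real client IPs
docker run -d --name edge-nginx --restart=always \
  -p 80:80 -p 443:443 \
  -v /srv/edge-nginx/conf.d:/etc/nginx/conf.d:ro \
  nginx:stable

# swarm-side nginx, published as 8181:80, routing by request host
docker service create --name swarm-nginx \
  --publish 8181:80 \
  --mount type=bind,source=/srv/swarm-nginx/conf.d,target=/etc/nginx/conf.d,readonly \
  nginx:stable

The bind mount assumes the config directory exists on every node that may run the task.
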
@o3o3o


o3o3o commented Oct 24, 2016

In our case, our API rate limiting and other functions depend on the user's IP address. Is there any way to work around this problem in swarm mode?

@darrellenns


darrellenns commented Nov 1, 2016

I've also run into this issue when trying to run logstash in swarm mode (for collecting syslog messages from various hosts). The logstash "host" field always appears as 10.255.0.x instead of the actual IP of the connecting host. This makes it totally unusable, as you can't tell which host the log messages are coming from. Is there some way we can avoid translating the source IP?

@vfarcic


vfarcic commented Nov 2, 2016

+1 for a solution for this issue.

The inability to retrieve the user's IP prevents us from using monitoring solutions like Prometheus.

@darrellenns


darrellenns commented Nov 2, 2016

Perhaps the Linux kernel IPVS capabilities would be of some use here. I'm guessing that the IP change is taking place because the connections are being proxied in user space. IPVS, on the other hand, can redirect and load balance requests in kernel space without changing the source IP address. IPVS could also be good down the road for building in more advanced functionality, such as different load balancing algorithms, floating IP addresses, and direct routing.
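
As an aside, the routing mesh does already use IPVS under the hood (as later comments in this thread confirm); the source rewrite happens in a hidden ingress namespace on each node. You can peek at the IPVS rules there with something like this (a sketch, assuming root access and ipvsadm installed on the host):

# list the IPVS virtual services swarm programs for published ports
sudo nsenter --net=/var/run/docker/netns/ingress_sbox ipvsadm -ln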

@vfarcic


vfarcic commented Nov 2, 2016

For me, it would be enough if I could somehow find out the relation between the virtual IP and the IP of the server the endpoint belongs to. That way, when Prometheus sends an alert related to some virtual IP, I could find out which server is affected. It would not be a good solution, but it would be better than nothing.

@darrellenns


darrellenns commented Nov 2, 2016

@vfarcic I don't think that's possible with the way it works now. All client connections come from the same IP, so you can't translate it back. The only way that would work is if whatever is doing the proxy/NAT of the connections saved a connection log with timestamp, source IP, and source port. Even then, it wouldn't be much help in most use cases where the source IP is needed.

@vfarcic


vfarcic commented Nov 2, 2016

I probably did not explain the use case well.

I use Prometheus, configured to scrape exporters that are running as Swarm global services. It uses tasks.<SERVICE_NAME> to get the IPs of all replicas. So it's not using the service but the replica endpoints (no load balancing). What I'd need is to somehow figure out the IP of the node each of those replica IPs comes from.

@vfarcic


vfarcic commented Nov 3, 2016

I just realized that "docker network inspect <NETWORK_NAME>" provides information about containers and IPv4 addresses on a single node. Can this be extended so that there is cluster-wide information about a network, together with nodes?

Something like:

       "Containers": {
            "57bc4f3d826d4955deb32c3b71550473e55139a86bef7d5e584786a3a5fa6f37": {
                "Name": "cadvisor.0.8d1s6qb63xdir22xyhrcjhgsa",
                "EndpointID": "084a032fcd404ae1b51f33f07ffb2df9c1f9ec18276d2f414c2b453fc8e85576",
                "MacAddress": "02:42:0a:00:00:1e",
                "IPv4Address": "10.0.0.30/24",
                "IPv6Address": "",
                "Node": "swarm-4"
            },
...

Note the addition of the "Node".

If such information were available for the whole cluster (not only a single node), together with a --filter argument, I'd have everything I need to figure out the relation between a container's IPv4 address and the node. It would not be a great solution but still better than nothing. Right now, when Prometheus detects a problem, I need to execute "docker network inspect" on each node until I find the location of the address.
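
Until something like that exists, the per-node lookup can at least be scripted (a sketch, assuming SSH access to every node; the node and network names are hypothetical):

# print "<node>: <container> <ip>" for every container on the network
for host in swarm-1 swarm-2 swarm-3; do
  ssh "$host" \
    "docker network inspect my-net --format '{{range .Containers}}{{.Name}} {{.IPv4Address}}{{println}}{{end}}'" \
    | sed "s/^/$host: /"
done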

@tlvenn


tlvenn commented Nov 3, 2016

I agree with @dack; given that the ingress network is using IPVS, we should solve this issue using IPVS so that the source IP is preserved and presented to the service correctly and transparently.

The solution needs to work at the IP level so that any services that are not HTTP-based can still work properly as well (they can't rely on HTTP headers...).

And I can't stress enough how important this is: without it, there are many services that simply can't operate at all in swarm mode.

@tlvenn

tlvenn commented Nov 3, 2016

@tlvenn


tlvenn commented Nov 3, 2016

@kobolog might be able to shed some light on this matter given his talk on IPVS at DockerCon.

@r3pek


r3pek commented Jun 16, 2018

People really should stop saying "mode: host" = working, because that's not using ingress. It makes it impossible to have just one container of a service running on the swarm while still being able to access it via any host. You either have to make the service "global" or you can only access it on the host it is running on, which kind of defeats the purpose of Swarm.

TL;DR: "mode: host" is a workaround, not a solution

@robertofabrizi


robertofabrizi commented Jul 5, 2018

@r3pek While I agree with you that you lose ingress if you use host mode to solve this predicament, I'd say that it hardly defeats the whole purpose of Swarm, which does so much more than a public-facing ingress network. In our usage scenario we have, in the same overlay swarm:
management replicated containers that should only be accessed over the intranet -> they don't need the caller's IP, therefore they are configured "normally" and take advantage of the ingress.
non-exposed containers -> nothing to say about these (I believe you are underestimating the power of being able to access them via their service name, though).
public-facing service -> this is an nginx proxy that does HTTPS and URL-based routing. It was defined as global even before the need to X-Forward-For the client's real IP, so no real issue there.

Having nginx global and not having ingress means that you can reach it via any IP of the cluster, but it's not load balanced or fault tolerant, so we added a very cheap and easy-to-set-up L4 Azure Load Balancer in front of the nginx service.

As you say, host mode is a workaround, but saying that enabling it completely defeats the purpose of Docker Swarm is a little exaggerated imo.

@sandys


sandys commented Jul 5, 2018

@arno01


arno01 commented Jul 28, 2018

It is clear that a poor load balancer (IPVS) was picked for Docker Swarm's ingress. If it supported at least the L4 proxy protocol, this would not be an issue, although it would still be an L4 (TCP) load balancer without all the extra features an L7 LB can provide.

In Kubernetes there are L4 (TCP) to L7 (HTTP) load balancers like the nginx ingress and haproxy ingress, both of which allow use of the L4 proxy protocol or the L7 HTTP headers to ensure X-Forwarded-For is leveraged for passing the user's real IP to the backend.

I am wondering what the Docker Swarm ingress developers would say. Probably someone has to move this case to https://github.com/docker/swarmkit/issues ?

@djmaze


Contributor

djmaze commented Jul 28, 2018

In Kubernetes there are L4(TCP)-L7(HTTP) load balancers like nginx ingress, haproxy ingress which both allow usage of the L4 proxy protocol or the L7 HTTP headers to ensure X-Forwarded-For is leveraged for passing the user's real IP to the backend.

AFAICS, those LB services are not embedded in K8s but are services which need to be explicitly deployed. You can do the same with Docker Swarm as well. I do not see a difference here. (Apart from the fact that the nginx ingress controller seems to be "official".)

@setiseta


setiseta commented Jul 28, 2018

As far as I know, the difference is that even if you deploy such a load-balancing service, it will be 'called' through the swarmkit load balancer, so you lose the user's IP. You cannot disable the swarmkit load balancer unless you use host mode.

@sandys


sandys commented Jul 28, 2018

@jamiejackson


jamiejackson commented Aug 8, 2018

I thought we were finished ironing out our swarm wrinkles, but then we got to staging and noticed that all external access to the web server container appears to come from the ingress network IP.

I'm running my stack on a single-node swarm and will be doing so for at least the next few months. Can you recommend the least bad workaround for our current (single-node swarm) use case? I can't do without the client IP; too much relies on it.

@maximelb


maximelb commented Aug 8, 2018

Our temporary approach has been to run a simple proxy container in "global" mode (which, IIRC, can get the actual NIC's IP) and have it forward all connections to the internal service running on the swarm overlay network, with proxy headers added.

If getting an X-Forwarded-For header is enough for you, that setup should work AFAICT.

@jamiejackson


jamiejackson commented Aug 8, 2018

Thanks, @maximelb. What did you end up going with (e.g., nginx, haproxy)?

@maximelb


maximelb commented Aug 8, 2018

@jamiejackson that's where things will be a bit different. In our case we are running a server that hosts long-running SSL connections with a custom binary protocol underneath, so HTTP proxies were not possible. So we created a simple TCP forwarder and used a "msgpack" header that we could unpack manually on the internal server.

I'm not super familiar with HTTP proxies, but I suspect most of them would do the trick for you. :-/

@sandys


sandys commented Aug 8, 2018

@maximelb


maximelb commented Aug 8, 2018

@sandys sure, here is an excerpt from our docker-compose with the relevant containers.

This is the reverse proxy docker-compose entry:

reverseproxy:
    image: yourorg/repo-proxy:latest
    networks:
      - network_with_backend_service
    deploy:
      mode: global
    ports:
      - target: 443
        published: 443
        protocol: tcp
        mode: host

This is the backend service entry:

backendservice:
    image: yourorg/repo-backend:latest
    networks:
      - network_with_backend_service
    deploy:
      replicas: 2

The target of the reverseproxy (on the backend side) would be tasks.backendservice (which has A records for every replica). You can skip the networks part if the backend service is on the default swarm overlay network.

The global bit says "deploy this container exactly once on every Docker swarm node." The ports mode: host part says "bind to the native NIC of the node."

Hope it helps.
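
Deploying and sanity-checking a setup like that could look as follows (a sketch; the stack name and hostname are hypothetical):

# deploy the stack, then hit any node from an outside machine
docker stack deploy -c docker-compose.yml edge
curl -k https://node-1.example.com/
# the reverseproxy task on that node accepts the connection on the host NIC,
# so it sees the real client IP and can forward it in a header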

@sandys


sandys commented Aug 8, 2018

@maximelb


maximelb commented Aug 8, 2018

Not 100% sure what you mean, but externally we use DNS with an A record per cluster node. This provides cheap "balancing" without an external moving part. When a client makes a request, they choose a random A record and connect to 443 on one of the cluster nodes.

There, the reverse proxy that is running on that specific node and listening on 443 gets a native connection, including the actual client IP. That reverse proxy container then adds a header and forwards the connection to another internal container using the swarm overlay network (tasks.backend). Since it uses the tasks.backend target, it will also get a random A record for the internal service.

So in the strict sense, it is bypassing the magic of the overlay network that redirects the connection. It instead kind of replicates this behavior with the reverse proxy and adds a header. The final effect is the same (in a loose sense) as the magic of the overlay network. It also runs in parallel with the swarm, meaning I can run all my other services that do not require the client IP on the same cluster without doing anything else for those.

By no means a perfect solution, but until a fix is made (if ever) it gets you by without external components or major Docker configuration.

@oppodeldoc


oppodeldoc commented Aug 8, 2018

@jamiejackson the "least bad" workaround we've found is using Traefik as a global service in host mode. They have a good generic example in their docs. We've seen some bugs that may or may not be related to this setup, but Traefik is a great project and it seems pretty stable on Swarm. There's a whole thread on their issues page on it (that loops back here :) ), with similar workarounds:
containous/traefik#1880

Hope this helps. We also can't use a solution that doesn't allow us to check actual requester IPs so we're stuck with this kludge fix until something changes. It seems like a pretty common need, for security reasons at least.
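
For the record, a global host-mode Traefik service along those lines can be created like this (a sketch, assuming Traefik 1.x; the image tag, flags, and placement are illustrative and will differ per setup):

docker service create --name traefik \
  --mode global \
  --constraint 'node.role == manager' \
  --publish published=80,target=80,mode=host \
  --publish published=443,target=443,mode=host \
  --mount type=bind,source=/var/run/docker.sock,target=/var/run/docker.sock \
  traefik:1.7 --docker --docker.swarmMode --docker.watch

With mode=host the published ports bypass the ingress mesh, so Traefik sees the real client address and can pass it upstream via X-Forwarded-For.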

@sandys


sandys commented Aug 8, 2018

@sandys


sandys commented Aug 8, 2018

@cpuguy83


Contributor

cpuguy83 commented Aug 8, 2018

Well, Docker does not currently touch ingress traffic, so this is definitely not an insignificant thing to add.
Keep in mind also that this is an open source project; if you really want something, it's generally going to be up to you to implement it.

justinclift added a commit to sqlitebrowser/sqlitebrowser that referenced this issue Aug 9, 2018

Revert back to GitHub for downloads atm
This experiment has shown there are some important things to resolve
first:

  * Client IP address is being lost.  Looks like Moby issue 25526:

      moby/moby#25526

    There are some suggested fixes that we'll investigate.

  * The present solution doesn't seem to be handling IPv6

    That'll need fixing too. ;)
@JoelLinn


JoelLinn commented Sep 10, 2018

+1, this really is a showstopper.
I would believe the majority of applications need the real client's IP. Just think of a mail server stack: you cannot afford to accept mail from arbitrary hosts.

@rubot


rubot commented Sep 11, 2018

We switched to a global nginx stream instance in host mode with proxy_protocol enabled, which forwards to the replicated application proxy nginx. This works well enough for the moment.

service global nginx_stream

stream {
    resolver_timeout 5s;
    # 127.0.0.11 is docker swarms dns server
    resolver 127.0.0.11 valid=30s;
    # set does not work in stream module, using map here
    map '' $upstream_endpoint {
        default proxy_nginx:443;
    }

    server {
        listen 443;
        proxy_pass $upstream_endpoint;
        proxy_protocol on;
    }
}

service replicated nginx_proxy

server {
    listen 443 ssl http2 proxy_protocol;
    include /ssl.conf.include;

    ssl_certificate /etc/nginx/certs/main.crt;
    ssl_certificate_key /etc/nginx/certs/main.key;

    server_name example.org;

    auth_basic           "closed site";
    auth_basic_user_file /run/secrets/default.htpasswd;

    # resolver info in nginx.conf
    set $upstream_endpoint app;
    location / {
        # relevant proxy_set_header in nginx.conf
        proxy_pass http://$upstream_endpoint;
    }
}
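
Deployed as swarm services, that two-tier setup might look like this (a sketch; the network and config names are hypothetical, and the two nginx configs above are assumed to have been stored first with docker config create):

# global stream proxy bound to the node NIC, so it sees the real client IP
docker service create --name nginx_stream --mode global \
  --network proxy_net \
  --publish published=443,target=443,mode=host \
  --config source=nginx_stream_conf,target=/etc/nginx/nginx.conf \
  nginx:stable

# replicated TLS-terminating proxy behind it, reached over the overlay network
docker service create --name proxy_nginx --replicas 2 \
  --network proxy_net \
  --config source=nginx_proxy_conf,target=/etc/nginx/conf.d/default.conf \
  nginx:stable
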
@sandys


sandys commented Sep 11, 2018

@djmaze


Contributor

djmaze commented Sep 11, 2018

@sandys I've got a haproxy based solution for the proxy protocol part which is configured via environment variables.

@rubot


rubot commented Sep 11, 2018

Would it be possible to paste the whole nginx config for nginx_stream and nginx_proxy with their Swarm configs? That would be awesome if it works!

@sandys Something like this:
https://gist.github.com/rubot/10c79ee0086a8a246eb43ab631f3581f
