
Unable to retrieve user's IP address in docker swarm mode #25526


Open

PanJ opened this issue Aug 9, 2016 · 362 comments
Labels
area/networking area/swarm kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/needs-attention Calls for a collective discussion during a review session version/1.12

Comments

@PanJ

PanJ commented Aug 9, 2016

Output of docker version:

Client:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:00:36 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.0
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   8eab29e
 Built:        Thu Jul 28 22:00:36 2016
 OS/Arch:      linux/amd64

Output of docker info:

Containers: 155
 Running: 65
 Paused: 0
 Stopped: 90
Images: 57
Server Version: 1.12.0
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 868
 Dirperm1 Supported: false
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: host overlay null bridge
Swarm: active
 NodeID: 0ddz27v59pwh2g5rr1k32d9bv
 Is Manager: true
 ClusterID: 32c5sn0lgxoq9gsl1er0aucsr
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot interval: 10000
  Heartbeat tick: 1
  Election tick: 3
 Dispatcher:
  Heartbeat period: 5 seconds
 CA configuration:
  Expiry duration: 3 months
 Node Address: 172.31.24.209
Runtimes: runc
Default Runtime: runc
Security Options: apparmor
Kernel Version: 3.13.0-92-generic
Operating System: Ubuntu 14.04.4 LTS
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.42 GiB
Name: ip-172-31-24-209
ID: 4LDN:RTAI:5KG5:KHR2:RD4D:MV5P:DEXQ:G5RE:AZBQ:OPQJ:N4DK:WCQQ
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: panj
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Insecure Registries:
 127.0.0.0/8

Additional environment details (AWS, VirtualBox, physical, etc.):

Steps to reproduce the issue:

  1. Run the following service, which publishes port 80:
docker service create \
--name debugging-simple-server \
--publish 80:3000 \
panj/debugging-simple-server
  2. Try connecting with http://<public-ip>/.

Describe the results you received:
Neither ip nor header.x-forwarded-for is the user's correct IP address.

Describe the results you expected:
ip or header.x-forwarded-for should be the user's IP address. The expected result can be achieved using a standalone docker container: docker run -d -p 80:3000 panj/debugging-simple-server. You can see both results via the following links:
http://swarm.issue-25526.docker.takemetour.com:81/
http://container.issue-25526.docker.takemetour.com:82/

Additional information you deem important (e.g. issue happens only occasionally):
This happens on both global mode and replicated mode.

I am not sure if I missed anything that should solve this issue easily.

In the meantime, I think I have to use a workaround: running a proxy container outside of swarm mode and letting it forward to the published port in swarm mode (SSL termination should be done on this container too), which defeats the purpose of swarm mode's self-healing and orchestration.

@thaJeztah thaJeztah added kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. area/networking area/swarm labels Aug 9, 2016
@thaJeztah
Member

/cc @aluzzardi @mrjana ptal

@mavenugo
Contributor

mavenugo commented Aug 9, 2016

@PanJ can you please share some details on how debugging-simple-server determines the ip? Also, what is the expectation if a service is scaled to more than one replica across multiple hosts (or global mode)?

@PanJ
Author

PanJ commented Aug 9, 2016

@mavenugo it's koa's request object, which uses node's remoteAddress from the net module. The result should be the same for any other library that can retrieve the remote address.

The expectation is that the ip field should always be the remote address, regardless of any configuration.

@marech

marech commented Sep 19, 2016

@PanJ are you still using your workaround, or have you found a better solution?

@sanimej

sanimej commented Sep 19, 2016

@PanJ When I run your app as a standalone container..

docker run -it --rm -p 80:3000 --name test panj/debugging-simple-server

and access the published port from another host I get this

vagrant@net-1:~$ curl 192.168.33.12
{"method":"GET","url":"/","header":{"user-agent":"curl/7.38.0","host":"192.168.33.12","accept":"*/*"},"ip":"::ffff:192.168.33.11","ips":[]}
vagrant@net-1:~$

192.168.33.11 is the IP of the host on which I am running curl. Is this the expected behavior?

@PanJ
Author

PanJ commented Sep 19, 2016

@sanimej Yes, that is the expected behavior, and it should apply in swarm mode as well.

@PanJ
Author

PanJ commented Sep 19, 2016

@marech I am still using the standalone container as a workaround, which works fine.

In my case, there are two nginx instances: a standalone instance and a swarm instance. SSL termination and reverse proxying are done on the standalone nginx. The swarm instance is used to route to other services based on the request host.

@sanimej

sanimej commented Sep 19, 2016

@PanJ The way the published port of a container is accessed is different in swarm mode. In swarm mode a service can be reached from any node in the cluster. To facilitate this, we route through an ingress network. 10.255.0.x is the address of the ingress network interface on the host in the cluster from which you try to reach the published port.
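
You can see this on any node by inspecting the ingress network directly; a quick sketch (10.255.0.0/16 is only the default subnet and may differ on your cluster):

# Subnet of the swarm ingress network (10.255.0.0/16 by default)
docker network inspect ingress --format '{{json .IPAM.Config}}'

# Address of this node's ingress sandbox; connections arriving through the
# routing mesh are source-NATed to this address, which is why services see
# 10.255.0.x instead of the real client IP.
docker network inspect ingress --format '{{(index .Containers "ingress-sbox").IPv4Address}}'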

@PanJ
Author

PanJ commented Sep 19, 2016

@sanimej I kind of saw how it works when I dug into the issue, but the use case (the ability to retrieve the user's IP) is quite common.

I have limited knowledge of how the fix should be implemented. Maybe a special type of network that does not alter the source IP address?

Rancher is similar to Docker swarm mode and seems to have the expected behavior. Maybe it is a good place to start.

@marech

marech commented Sep 20, 2016

@sanimej Good idea: if possible, all IPs could be added to the X-Forwarded-For header so we can see the whole chain.

@PanJ Hmm, and how does your standalone nginx container communicate with the swarm instance, via service name or IP? Maybe you can share the part of the nginx config where you pass traffic to the swarm instance.

@PanJ
Author

PanJ commented Sep 20, 2016

@marech The standalone container listens on port 80 and then proxies to localhost:8181.

server {
  listen 80 default_server;
  location / {
    proxy_set_header        Host $host;
    proxy_set_header        X-Real-IP $remote_addr;
    proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header        X-Forwarded-Proto $scheme;
    proxy_pass          http://localhost:8181;
    proxy_read_timeout  90;
  }
}

If you have to do SSL termination, add another server block that listens on port 443, does the SSL termination, and proxies to localhost:8181 as well.

The swarm-mode nginx publishes 8181:80 and routes to other services based on the request host.

server {
  listen 80;
  server_name your.domain.com;
  location / {
    proxy_pass          http://your-service:80;
    proxy_set_header Host $host;
    proxy_read_timeout  90;
  }
}

server {
  listen 80;
  server_name another.domain.com;
  location / {
    proxy_pass          http://another-service:80;
    proxy_set_header Host $host;
    proxy_read_timeout  90;
  }
}
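
For reference, roughly how such a swarm-mode nginx can be created; this is only a sketch, and the service name, network name and image are placeholders rather than my exact setup:

# Overlay network shared by the router and the backend services (placeholder name)
docker network create --driver overlay app-net

# Swarm-mode nginx: receives traffic from the standalone proxy on port 8181
# and routes to other services on the overlay network by request host.
docker service create \
  --name swarm-router \
  --network app-net \
  --publish 8181:80 \
  nginx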

@o3o3o

o3o3o commented Oct 24, 2016

In our case, our API rate limiting and other functions depend on the user's IP address. Is there any way to work around this problem in swarm mode?

@darrellenns

I've also run into the issue when trying to run logstash in swarm mode (for collecting syslog messages from various hosts). The logstash "host" field always appears as 10.255.0.x, instead of the actual IP of the connecting host. This makes it totally unusable, as you can't tell which host the log messages are coming from. Is there some way we can avoid translating the source IP?

@vfarcic

vfarcic commented Nov 2, 2016

+1 for a solution for this issue.

The inability to retrieve the user's IP prevents us from using monitoring solutions like Prometheus.

@darrellenns

Perhaps the linux kernel IPVS capabilities would be of some use here. I'm guessing that the IP change is taking place because the connections are being proxied in user space. IPVS, on the other hand, can redirect and load balance requests in kernel space without changing the source IP address. IPVS could also be good down the road for building in more advanced functionality, such as different load balancing algorithms, floating IP addresses, and direct routing.
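
For illustration only (this is not how Docker actually configures its mesh): IPVS in direct-routing mode preserves the client source IP because the real server replies straight to the client. A rough ipvsadm sketch, with placeholder addresses:

# Illustrative only: an IPVS virtual service in direct-routing ("gatewaying")
# mode. Packets keep their original source IP because only the destination MAC
# is rewritten and the real server answers the client directly.
# 203.0.113.10 (VIP) and 10.0.0.5 (real server) are placeholder addresses.
ipvsadm -A -t 203.0.113.10:80 -s rr                 # add virtual service, round-robin
ipvsadm -a -t 203.0.113.10:80 -r 10.0.0.5:80 -g     # add real server, direct routing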

@vfarcic

vfarcic commented Nov 2, 2016

For me, it would be enough if I could somehow find out the relation between the virtual IP and the IP of the server the endpoint belongs to. That way, when Prometheus sends an alert related to some virtual IP, I could find out which server is affected. It would not be a good solution, but it would be better than nothing.

@darrellenns

@vfarcic I don't think that's possible with the way it works now. All client connections come from the same IP, so you can't translate it back. The only way that would work is if whatever is doing the proxy/nat of the connections saved a connection log with timestamp, source ip, and source port. Even then, it wouldn't be much help in most use cases where the source IP is needed.

@vfarcic

vfarcic commented Nov 2, 2016

I probably did not explain the use case well.

I use Prometheus, which is configured to scrape exporters that run as Swarm global services. It uses tasks.<SERVICE_NAME> to get the IPs of all replicas, so it's not using the service but the replica endpoints (no load balancing). What I need is to somehow figure out the IP of the node that each of those replica IPs comes from.
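
Those per-task IPs can be checked by hand; a sketch, where monitoring-net and cadvisor are placeholder names and the overlay network must have been created with --attachable for a standalone container to join it:

# Swarm DNS returns one A record per running task for the tasks.<service> name,
# which is exactly what Prometheus scrapes; the node behind each IP is not exposed.
docker run --rm --network monitoring-net alpine nslookup tasks.cadvisor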

@vfarcic

vfarcic commented Nov 3, 2016

I just realized that "docker network inspect <NETWORK_NAME>" provides information about containers and IPv4 addresses only for a single node. Can this be extended to provide cluster-wide information about a network, together with the nodes?

Something like:

       "Containers": {
            "57bc4f3d826d4955deb32c3b71550473e55139a86bef7d5e584786a3a5fa6f37": {
                "Name": "cadvisor.0.8d1s6qb63xdir22xyhrcjhgsa",
                "EndpointID": "084a032fcd404ae1b51f33f07ffb2df9c1f9ec18276d2f414c2b453fc8e85576",
                "MacAddress": "02:42:0a:00:00:1e",
                "IPv4Address": "10.0.0.30/24",
                "IPv6Address": "",
                "Node": "swarm-4"
            },
...

Note the addition of the "Node".

If such information were available for the whole cluster, not only for a single node, together with a --filter argument, I'd have everything I need to figure out the relation between a container's IPv4 address and its node. It would not be a great solution, but still better than nothing. Right now, when Prometheus detects a problem, I need to execute "docker network inspect" on each node until I find the location of the address.
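
Until then, that node-by-node search can at least be scripted; a rough sketch, run from a manager node and assuming SSH access to every node (the network name and target IP are placeholders):

#!/bin/sh
# Ask every swarm node whether the target container IP appears in its view of
# the overlay network, and report the node that hosts it. This only automates
# the manual "docker network inspect on each node" search described above.
NETWORK="my-overlay"     # overlay network to search (placeholder)
TARGET_IP="10.0.0.30"    # container IP reported by Prometheus (placeholder)

for node in $(docker node ls --format '{{.Hostname}}'); do
  if ssh "$node" "docker network inspect $NETWORK 2>/dev/null" | grep -q "\"$TARGET_IP/"; then
    echo "$TARGET_IP found on node: $node"
  fi
done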

@tlvenn

tlvenn commented Nov 3, 2016

I agree with @dack: given that the ingress network is using IPVS, we should solve this issue using IPVS so that the source IP is preserved and presented to the service correctly and transparently.

The solution needs to work at the IP level so that any services that are not based on HTTP can still work properly as well (we can't rely on HTTP headers).

And I can't stress enough how important this is: without it, there are many services that simply can't operate at all in swarm mode.

@tlvenn

tlvenn commented Nov 3, 2016

This is how HAProxy solves the issue: http://blog.haproxy.com/2012/06/05/preserve-source-ip-address-despite-reverse-proxies/

@tlvenn

tlvenn commented Nov 3, 2016

@kobolog might be able to shed some light on this matter given his talk on IPVS at DockerCon.

@thaJeztah thaJeztah added the status/needs-attention Calls for a collective discussion during a review session label Nov 3, 2016
@struanb

struanb commented Jun 14, 2022

Hi @jerrac. Thank you for trying Docker Ingress Routing Daemon (DIND). Please raise an issue describing the symptoms you're experiencing and we'll be happy to help. DIND is used in production, so we know it does work (at least in many standard configurations).

@funkypenguin

> Traefik is also involved, so I've posted on their forums as well. Full details of my set up are there.

FWIW, I've used Traefik in host mode to avoid this issue - details at https://geek-cookbook.funkypenguin.co.nz/ha-docker-swarm/traefik/

@jerrac

jerrac commented Jun 15, 2022

@funkypenguin Huh, I thought that since docker stack deploy said it ignores network_mode, the mode option on the ports would be ignored as well. (I had actually run across your post during my searching...)

I just tried adding mode: host to my ports config on my Traefik service. It doesn't seem to have done anything; I still don't see my client IP address.

@funkypenguin

IIRC it was necessary to use the "long form" of the ports definition, like so:

    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host
      - target: 8080
        published: 8080
        protocol: tcp

@jerrac

jerrac commented Jun 15, 2022

@funkypenguin Yep, that's what I'm using.

    ports:
      - target: 80
        published: 80
        protocol: tcp
        mode: host
      - target: 443
        published: 443
        protocol: tcp
        mode: host

@benz0li

benz0li commented Jun 15, 2022

@funkypenguin

@jerrac what version of Docker are you using? Mine doesn't complain about not supporting network mode upon deploy...

root@swarm:/var/data/config/traefikv2# docker stack deploy traefikv2 -c traefikv2.yml
Updating service traefikv2_app (id: i7httr4imjwdmnqhg9vmv0cca)
root@swarm:/var/data/config/traefikv2# docker -v
Docker version 19.03.12, build 48a66213fe
root@swarm:/var/data/config/traefikv2#

@benz0li

benz0li commented Jun 15, 2022

@jerrac What compose file version (3.9 being the latest) are you specifying in your YAML file to deploy the stack?

@jerrac

jerrac commented Jun 15, 2022

@funkypenguin I'm on Docker 20.10.17.

Here is my full docker-compose file: https://pastebin.com/wcCw5i7C

I've been applying it via docker stack deploy --compose-file docker-compose.yml servicename.

@kaysond

kaysond commented Jun 15, 2022

Re: Traefik - don't you also have to deploy it as global? Maybe for a single node it doesn't matter, but for multiple nodes I don't believe the mesh network will route the traffic.
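
The pattern being discussed, expressed as a plain CLI sketch; the image, ports and network name here are illustrative only:

# Host-mode published ports bypass the ingress mesh, so the proxy sees the real
# client IP; --mode global puts one proxy task on every node that can receive
# traffic. Image, ports and network name are placeholders.
docker service create \
  --name reverse-proxy \
  --mode global \
  --network app-net \
  --publish published=80,target=80,mode=host \
  --publish published=443,target=443,mode=host \
  traefik:v2.10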

@jerrac

jerrac commented Jun 15, 2022

Update (posting this in all 3 places I asked for help...):
I turned off the docker-ingress-routing-daemon and configured the ports on my Traefik service to use mode: host. This left me with just the load balancer's IP in XFF, the same effect that running the daemon had.

A while later, after a meeting, I randomly decided to try terminating HTTPS at my external load balancer. After re-configuring Traefik to not redirect to 443 and configuring my service to use the 80 entrypoint, I can see my client IP in my container logs.

Does this make sense?

@struanb

struanb commented Jun 15, 2022

@jerrac - as also explained here: newsnowlabs/docker-ingress-routing-daemon#24 (comment) :-

To be clear, DIND exists to transform Docker's ingress routing mesh to use policy routing instead of SNAT, to redirect client traffic to service nodes. It will only work to preserve the client IP if incoming requests directly reach a load-balancer node on a port published for a service via the ingress routing mesh. DIND is a network-layer tool (IPv4) and cannot inspect or modify HTTP headers.

I understand Traefik has often been used as a reverse proxy to work around the same limitation as DIND. In this model, incoming requests must directly reach the reverse proxy, which presumably must not be using the ingress routing mesh, but instead have its ports published using host mode, and be launched using --mode global. The Traefik reverse proxy will see the client IP of requests and can add these to the XFF header before reverse proxying them to an internal application service.

DIND therefore exists to solve a similar problem as a Traefik reverse proxy service placed in front of an internal application service, but without the need for the extra Traefik service (or for proxying, or for introduction/modification of XFF headers) and therefore without modification of the application service (if it doesn't natively support XFF headers).

Combining DIND with Traefik should allow Traefik itself to be deployed using the ingress routing mesh, which could be useful if Traefik is providing additional benefits in one's setup.

However, I'm not sure I can see a use-case for combining DIND with an internal application service published via the ingress routing mesh, and still fronted by a Traefik reverse proxy. Since the reverse proxy node is the client for the internal application service request, doing this will just expose the Docker network IP of that node, instead of the ingress network IP, to the internal application service.

Hope this makes sense.

@jerrac

jerrac commented Jun 16, 2022

@struanb

Yes, that makes sense. Thanks for the clear explanation. :)

Traefik also provides routing based on hostname. So I think I'll likely stick to using it.

Thanks again, your help was appreciated!

@xucian

xucian commented Feb 25, 2024

> To be clear, DIND exists to transform Docker's ingress routing mesh to use policy routing instead of SNAT, to redirect client traffic to service nodes. […]

Hi! Thanks for posting this. Any idea whether this comes with any unexpected limitations (compared to how k3s ingress load balancers handle it)?
I'd really like to keep my Docker Compose files and run swarm for now, and anything that helps me see the originating IP (which is an obvious expectation) is crucial.

@bohdan-shulha

Are there any recommended solutions yet?

This bug eliminates a whole class of applications: basically, every application that needs the IP address of a user. So: no dockerized firewalls, no anti-spam/anti-fraud detection software, no IP address whitelists. Awful.

@struanb

struanb commented Aug 25, 2024

@xucian @bohdan-shulha Have you tried Docker Ingress Routing Daemon (DIRD, formerly DIND), or are there particulars of your server/infrastructure setup that prevent its use?

If so, I'd be keen to hear what those are. Several years on, we continue to run DIRD stably in production at NewsNow.

I know it's not an officially recommended solution but, as explained elsewhere, once DIRD is installed across the nodes of a Docker swarm, there's really nothing else you have to do for swarm services to see the originating client IP. This would therefore allow a range of software, like anti-spam, to be deployed using docker service create.

@bohdan-shulha

No, not yet tried, sorry.

Had to ask just because the thread is soooo long already (an 8-year-long discussion leaves its fingerprints) and I was curious whether there is an official solution to the problem.

Thanks once more for the tips. <3

@bohdan-shulha

Nice, seems to be a working solution.

Thanks, @struanb !


@xucian

xucian commented Aug 29, 2024

Docker Ingress Routing Daemon

finally some light, thanks for sharing!

@scyto

scyto commented Nov 28, 2024

Is there any risk to installing DIRD on an existing swarm? Will it break the normal overlay routing, or is there any other downside?

@struanb

struanb commented Nov 28, 2024

@scyto There is always risk in modifying a live service, but it should work if done carefully. My recommendations for launching DIRD safely are:

  • work out the <ips> argument for --ingress-gateway-ips <ips> carefully in advance - getting this wrong will easily break things;
    • a command to run on a manager to generate <ips> is docker node ls --format "{{.Hostname}}" | pdsh -N -R ssh -w - "docker network inspect ingress --format '{{index (split (index .Containers \"ingress-sbox\").IPv4Address \"/\") 0}}' 2>/dev/null" | sort -t . -n -k 1,1 -k 2,2 -k 3,3 -k 4,4 (pdsh and ssh needed), though N.B. this <ips> list is overkill if you only use a subset of nodes as load balancers;
    • be sure to embed <ips> in double quotes if it contains spaces e.g. --ingress-gateway-ips "10.0.0.2 10.0.0.3"; otherwise separate IPs with commas e.g. --ingress-gateway-ips 10.0.0.2,10.0.0.3;
  • use the --preexisting option so that when DIRD is launched it applies policy routing rules to each matching preexisting container;
  • use --services <services>, --tcp-ports <tcp-ports> and if you need it --udp-ports <udp-ports> to whitelist DIRD behaviour for only the specific swarm services and ports that you need to;
  • use --iptables-wait or --iptables-wait-seconds <n> to avoid possible errors resulting from contention with other firewall apps for the iptables lock;
  • using [OPTIONS] derived accordingly, prepare a dird.service systemd unit that launches docker-ingress-routing-daemon --install [OPTIONS] and deploy it (to /etc/systemd/system, or /usr/local/lib/systemd/system according to your distribution) to all swarm nodes (whether load balancer and/or service container nodes) BUT without running it;
  • enable the systemd unit across all nodes using e.g. docker node ls --format "{{.Hostname}}" | pdsh -R ssh -w - 'systemctl daemon-reload; systemctl enable dird'
  • launch across all nodes as quickly and as simultaneously as possible, using e.g. docker node ls --format "{{.Hostname}}" | pdsh -R ssh -w - systemctl start dird

Even using the pdsh launch method, a few (milli)seconds of downtime might be unavoidable. I'll look into whether there is any way to avoid even that, and if I find a way I'll update this comment.
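
Putting the options above together, a minimal sketch of the command such a dird.service unit might launch; all values shown (gateway IPs, service name, ports) are placeholders for your own swarm:

# Sketch of a DIRD launch command assembled from the options described above;
# adjust the gateway IPs, service names and ports to your own setup.
docker-ingress-routing-daemon --install \
  --preexisting \
  --ingress-gateway-ips "10.0.0.2 10.0.0.3" \
  --services my-web \
  --tcp-ports 80,443 \
  --iptables-wait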

@struanb

struanb commented Nov 30, 2024

@scyto From my tests I've found that during the migration to DIRD, if you can restrict incoming public requests to a subset of your nodes (i.e. nodes you nominate as "load balancers"), a recipe that works very smoothly is as follows:

  1. First, launch DIRD, fully configured, on all nominated non-load-balancer nodes; this will not break connection handling, as long as the incoming connections are terminated by a nominated load-balancer node not yet running DIRD.
  2. Second, bring up DIRD simultaneously on all remaining nodes, i.e. on the nominated load-balancer nodes.

Put another way, before launching DIRD on some node(s), make sure you have previously removed their public load-balancer IPs from the pool of load balancer endpoints your public requests are reaching. Repeat the process as many times as needed until you are left with a minimal set of load balancer node(s). Finally launch DIRD on the remaining load balancer node(s).

This works because:

  • Nodes already running DIRD will be able to load-balance only to containers on nodes already running DIRD.
  • Nodes not yet running DIRD will be able to load-balance to containers, both on nodes already running and not yet running DIRD.

Hope this helps.
