server ip is marked as down when using different internal networks #1132

Closed
gtaspider opened this issue May 8, 2018 · 42 comments

Labels
kind/bug Issue reporting a bug

Comments

@gtaspider commented May 8, 2018

I have a docker-compose file with multiple internal docker networks. The proxy seems to find the correct IP and set it in the correct upstream, but the server is marked as down, even though the comment says nginx was able to connect to the right network:

default.conf:

#[...]
# cloud.website.com
upstream cloud.website.com {
	# Cannot connect to network of this container
	server 127.0.0.1 down;
	## Can be connected with "test_net-dashboard" network
	# test_nextcloud_1
	server 172.31.0.4 down;
}
server {
	server_name cloud.website.com;
	listen 80 ;
	access_log /var/log/nginx/access.log vhost;
	location / {
		proxy_pass http://cloud.website.com;
	}
}

When I remove the down, save the file and then reload nginx with docker exec -it test_proxy_1 nginx -s reload, it works like a charm. But whenever a container starts/stops the file gets rewritten (which is the whole reason I want to use this tool...).

I created a simple docker-compose file to show the problem:

version: '3'

services:
  db:
    image: postgres:alpine
    restart: always
    volumes:
      - db:/var/lib/postgresql/data
    environment:
      - POSTGRES_PASSWORD=
      - POSTGRES_DB=nextcloud
      - POSTGRES_USER=usr
    networks:
      - net-dashboard
  nextcloud:  
    image: nextcloud:apache
    restart: always
    volumes:
      - nextcloud:/var/www/html
    environment:
      - VIRTUAL_HOST=cloud.website.com
      - POSTGRES_HOST=db
      - MM_USERNAME=usr
      - MM_PASSWORD=
      - MM_DBNAME=nextcloud
    depends_on:
      - db
    networks:
     - net-dashboard
  #NGINX Proxy
  proxy:
    image: jwilder/nginx-proxy:alpine
    restart: always
    ports:
      - 80:80
    volumes:
#      - certs:/etc/nginx/certs:ro
      - vhost.d:/etc/nginx/vhost.d
      - conf.d:/etc/nginx/conf.d
      - html:/usr/share/nginx/html
      - /var/run/docker.sock:/tmp/docker.sock:ro
    networks:
      - net-dashboard
      - default
 

volumes:
  db:
  nextcloud:
#  certs:
  vhost.d:
  conf.d:
  html:


networks:
  net-dashboard:
    internal: true

This seems to be a bug, so I posted it here.
Thanks in advance,
Spider

@phhutter

Hi Spider,

Same issue here. Tested with jwilder/nginx-proxy 0.6.0 and 0.7.0 on Red Hat 7.5 with Docker 1.13.

I've noticed that when you start up a container with an internal network only, the ports won't get displayed in a simple docker ps. Is it possible that the proxy can't fetch the port information either?

[root@in001 opt]# docker network create int_network --internal
20985d85d58aef2880642422fa9e5649babada8d8d556327dfc64f693f1f772e

[root@in001 opt]# docker run -d --network int_network -p 80:80 nginx
544847612ffe5f3582c449ada154bb4774598d14fd9f9b13433a04c9fb7cab64

[root@in001 opt]# docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                  NAMES
544847612ffe        nginx               "nginx -g 'daemon ..."   3 seconds ago       Up 2 seconds                               awesome_lamport

When I run docker inspect I can see the ExposedPorts, even if it's an internal network. But the "Ports" section is empty.

....
            "ExposedPorts": {
                "80/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "NGINX_VERSION=1.13.7-1~stretch",
                "NJS_VERSION=1.13.7.0.1.15-1~stretch"
            ],
            "Cmd": [
                "nginx",
                "-g",
                "daemon off;"
            ],
            "ArgsEscaped": true,
            "Image": "nginx",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "maintainer": "NGINX Docker Maintainers <docker-maint@nginx.com>"
            },
            "StopSignal": "SIGTERM"
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "43831a9bc2278a82023019733015148a5bca2d9f20bd1fd63a175203c68d5fad",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
...

I would appreciate it if anyone could verify that.
Regards,
phhutter

@phhutter

I'm digging...

I just need to know how the values get filled into the template.

{{ $CurrentContainer := where $ "ID" .Docker.CurrentContainerID | first }}

{{ define "upstream" }}
        {{ if .Address }}
                {{/* If we got the containers from swarm and this container's port is published to host, use host IP:PORT */}}
                {{ if and .Container.Node.ID .Address.HostPort }}
                        # {{ .Container.Node.Name }}/{{ .Container.Name }}
                        server {{ .Container.Node.Address.IP }}:{{ .Address.HostPort }};
                {{/* If there is no swarm node or the port is not published on host, use container's IP:PORT */}}
                {{ else if .Network }}
                        # {{ .Container.Name }}
                        server {{ .Network.IP }}:{{ .Address.Port }};
                {{ end }}
        {{ else if .Network }}
                # {{ .Container.Name }}
                {{ if .Network.IP }}
                        server {{ .Network.IP }} down;
                {{ else }}
                        server 127.0.0.1 down;
                {{ end }}
        {{ end }}

{{ end }}

@phhutter

I think the problem is an empty .Address variable.

#https://github.com/jwilder/docker-gen/blob/a4f4f0167148a5ae52ce063474d850ef8b78f93c/generator.go

		for k, v := range container.NetworkSettings.Ports {
			address := Address{
				IP:           container.NetworkSettings.IPAddress,
				IP6LinkLocal: container.NetworkSettings.LinkLocalIPv6Address,
				IP6Global:    container.NetworkSettings.GlobalIPv6Address,
				Port:         k.Port(),
				Proto:        k.Proto(),
			}

But when I inspect the containers, I get an empty map for the container on the internal network.

external:

[root@in001 ~]# docker inspect --format=" {{ .NetworkSettings.Ports }} " d99315310f36
 map[80/tcp:[{0.0.0.0 80}]]

internal:

[root@in001 ~]# docker inspect --format=" {{ .NetworkSettings.Ports }} " 70c4c3b1d313
 map[]

@gtaspider (Author)

I can verify your posts. So there is actually no workaround, right?
Thanks for your support!

@meron1122

If you don't need a separate subnet but you do have a separate docker-compose file, a simple workaround is to point the default network at the nginx proxy's network.

Add this at the bottom of your app's docker-compose file:

networks:
  default:
    external:
      name: webproxy
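
For this to work, the webproxy network must exist (docker network create webproxy) and the nginx-proxy container must be attached to it as well. A minimal sketch of the proxy side, assuming the shared network is indeed called webproxy as in the snippet above:

# Sketch only: proxy-side compose file joining the pre-created external "webproxy" network.
version: '3'

services:
  proxy:
    image: jwilder/nginx-proxy:alpine
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
    networks:
      - webproxy

networks:
  webproxy:
    external: true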

@gtaspider (Author)

Thanks for your response, but unfortunately I need a separate internal subnet for security purposes...

@lunedis commented Jun 14, 2018

Ran into the same issue. Any plan on this getting fixed, or is it working as intended / wontfix?

@phhutter commented Jun 18, 2018

At the moment there's a workaround: just add a second (non-internal) network.

But as gtaspider said, for security reasons generator.go needs a fix. I'm not sure whether that's possible, and I don't know why the .Address parameter was chosen instead of a parameter that reflects the exposed ports.

P.S. I think it would fix #1066 as well.

@webdevotion commented Jul 14, 2018

Thanks @meron1122 for the pointer.

I can confirm that removing all internal networks from my app's docker-compose.yml fixed it for me. My default.conf had similar lines with server 127.0.0.1 down; under the upstream entry.

After removing all internal networks and declaring the default, external network everything started working as expected. Mixing in the webproxy network together with other internal networks did not work for me.

For future Googlers.

Put the snippet from @meron1122's comment at the bottom of your docker compose file and remove all other entries of networks in each of the container's configuration. Please understand that there might be security implications as pointed out by @gtaspider.

Important notice

Don't forget to create the external network webproxy. That network will be available for all your dockerized environments. You can do so by running:

$ docker network create webproxy

This might help

You might want to stop all your services ( nginx-proxy and your application's stack ) before restarting them one at a time. First the nginx-proxy stack, then your application(s).

I am only using the webproxy network

No other networks are used anywhere in my docker-compose.yml file. I will probably try to get things working again with multiple (internal) networks to split up frontend, backend, and so on, but this works for now. As you can read in the comments above, it seems to be quite a challenge to find a permanent fix.

@zedtux commented Jul 15, 2018

I'm in the same boat :(

@muesli commented Jul 17, 2018

I'm experiencing the same issue for any container that exposes two ports (e.g. gitea, one for SSH & one for HTTP), even though it is in the same network as the nginx-proxy. I'm not sure if it's really related, but since a few people here already debugged the surrounding code, I'm wondering if there's any advice for this scenario?

@frederikbosch

@muesli That can be solved by using PR #1157. I don’t know if it also solves the problem mentioned by @gtaspider.

@phhutter commented Aug 2, 2018

We should give it a shot. It seems to solve the issue (I just reviewed the changes), but I haven't tested it yet.

@willtho89

I can use multiple internal networks with nginx-gen and an older template. Here is the diff between the two:

diff working.tmpl nginx.tmpl
21a22
>
132c133
<                 ## Can be connect with "{{ $containerNetwork.Name }}" network
---
>                 ## Can be connected with "{{ $containerNetwork.Name }}" network
143a145,147
>             {{ else }}
>                 # Cannot connect to network of this container
>                 server 127.0.0.1 down;
159,161d162
< {{/* Get the NETWORK_ACCESS defined by containers w/ the same vhost, falling back to "external" */}}
< {{ $network_tag := or (first (groupByKeys $containers "Env.NETWORK_ACCESS")) "external" }}
<

This was committed in #1106 to fix #1105.

@Redsandro

The fix by @webdevotion works and my containers are now proxied properly.

It is very tempting to just up this stack. However, I don't understand the security implications mentioned by @gtaspider and @phhutter. Can someone explain these, so I can make a more educated decision on whether or not I should use this setup?

@willtho89

@Redsandro
You no longer have a strong separation of networks.
Let's say you have a vulnerable webapp A and another "secure" webapp B with B_critical_db. Normally an attacker would have to attack A->B->B_critical_db to get at your data, because only A & B, and B & B_critical_db, share a network. With this fix an attacker could go directly A->B_critical_db.
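
For illustration, the separated layout described above could look roughly like this in compose terms (the roles A, B and B_critical_db come from the comment; the images and the network names front/back are placeholders):

# Sketch only: A and B share "front", B and its database share "back", A never touches "back".
version: '3'

services:
  webapp-a:                    # "A": exposed, potentially vulnerable
    image: example/webapp-a
    networks:
      - front
  webapp-b:                    # "B": bridges the two networks
    image: example/webapp-b
    networks:
      - front
      - back
  critical-db:                 # "B_critical_db": only reachable from B
    image: postgres:alpine
    networks:
      - back

networks:
  front:
  back:
    internal: true

Collapsing everything onto one shared proxy network removes that front/back boundary, which is exactly the trade-off being discussed.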

@Redsandro

@willtho89 thanks for clarifying.

So to put this in perspective, all ports are open, and an attacker could find a vector in ports that are not supposed to be public.

So while in theory B_critical_db should not be vulnerable over any port, in reality it just became a lot easier to probe for vulnerabilities.

I hope we can find a solution that is not too complex to implement

@MicahZoltu

Does anyone know why https://github.com/hwellmann/nginx-proxy/blob/b61c84192960d937a5e5b517264efe9653f8f899/nginx.tmpl#L17 exists? I'm no nginx expert, but it feels somewhat pointless to hard-code a server as down; it appears that the config builder isn't even trying to determine whether the service is up or not. I'm also unclear what causes `Addresses` to be empty, since that is what causes this code path to be hit rather than the code path that does the actual port assignment.

I would love to get this figured out so I can put nginx on an external network and internal network, then put all of my services on an internal network only.

@Redsandro

@MicahZoltu when I reviewed this code, I assumed it was there to block connections to the container when a certain unsafe/bridged/shared network type was connected, e.g. when a private isolated network for the container fails, to prevent this from happening.

But in all honesty, I don't know.

@frederikbosch

@MicahZoltu Nginx acts as a proxy, which means requests for a virtual host have to be passed on to another resource. Those other resources are the upstream servers: the upstream directive is a list of resources to forward the connection to, and a host is linked to its upstream on line 125.

Then line 17: because there is no IP address for the specific container (see the else if on line 14), the resource is added to the upstream list as down, meaning permanently unavailable.

@MicahZoltu

@frederikbosch In this particular code path, the upstream has an IP address and nginx-proxy knows about it. When you look at the generated config, you can see server 172.20.0.5 down. If nginx were to try to send something to 172.20.0.5 it would be routed over the shared network to that host. However, since it is hard-coded as down nginx won't even attempt to route to that host.

@MicahZoltu

Hmm, maybe I see what you are saying. Is .Network.IP in that context the IP of the nginx container? I didn't bother to check previously whether that was the case or not.

@Redsandro

If nginx were to try to send something to 172.20.0.5 it would be routed over the shared network to that host. However, since it is hard-coded as down nginx won't even attempt to route to that host.

I assumed in the right direction. 😎

@areaeuro commented Nov 21, 2018

This works for me, using separate networks for front-end and back-end containers.
Here's the trimmed-down version of docker-compose.yml:
version: '3'

services:

  nginx-proxy:
    image: jwilder/nginx-proxy:alpine
    container_name: nginx-proxy
    ports:
      - "80:80"
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
    networks:
      - proxy-net
    restart: always

  wolf-static:
    build: ./wolf/.
    container_name: wolf-static-web
    environment:
      - VIRTUAL_HOST=wolf.area-europa.net
      - TIMEZONE=Europe/Madrid
    networks:
      - proxy-net
    depends_on:
      - nginx-proxy
    restart: always

networks:
  db-net:
  proxy-net:

There are a whole bunch of other containers connecting to both networks, some just to the proxy-net and some just to the db-net.

I don't know if the way you are defining the network (the internal/external part) has an impact; I didn't use it.

Here's how it shows up in cat /etc/nginx/conf.d/default.conf on nginx-proxy:

# wolf.area-europa.net
upstream wolf.area-europa.net {
	## Can be connected with "docker-test-environment_proxy-net" network
	# wolf-static-web
	server 192.168.48.5:80;
}
Hope this can help someone

@Redsandro commented Dec 8, 2018

@areaeuro if almost all containers use proxy-net, you have no strict separation of networks.

@areaeuro commented Dec 8, 2018

@Redsandro Please read the whole comment.

There are a whole bunch of other containers connecting to both networks, some just to the proxy-net and some just to the db-net.

Not all containers in the compose file used were pasted.

@Redsandro commented Dec 8, 2018

@areaeuro I'm afraid I failed to communicate what I mean.

What you seem to have is this:

   ┌───────┬───────┬───────┬───────┐      // proxy-net
[NGINX] [Dock1] [Dock2] [Dock3] [Dock4]
           ├───────────────┤              // db-net
        [DB-01]         [DB-02]

Separation of networks would be more like this:

                     [NGINX]
           ┌──────────┘││└─────────┐      // net01 through net04
           │       ┌───┘└──┐       │
        [Dock1] [Dock2] [Dock3] [Dock4]
           │               │              // net05 and net06
        [DB-01]         [DB-02]

@areaeuro commented Dec 8, 2018

Hey @Redsandro
If you think it will help people, post a docker-compose file.
Cheers

@Redsandro

@areaeuro unfortunately there is no better solution. What you do is what we all do: sharing a single proxy network that connects nginx-proxy to all the containers it serves. This is the solution @meron1122 suggested back in May.

I thought that you thought that you had figured out a better solution, so I pointed out that it is the same solution, and we need to keep looking for a solution that has strict separation of networks. To my understanding, as of now it's still not possible. Which is slightly confusing to me because @willtho89 made it sound like it's a (simple?) regression (#1132 (comment)). But none of the mentioned relevant PRs in this issue are merged, so I guess it's not that simple.

@orangelynx

still relevant.

@VinnieNZ

I've also just been hit by this, using docker-compose and connecting a container to multiple networks.

I added the port I wanted exposed to my docker-compose file, and this seemed to make it work (but far from an ideal solution) - checking with the command that @phhutter provided, I can now see the port is mapped:

root@host # docker inspect --format=" {{ .NetworkSettings.Ports }} " container
 map[8080/tcp:[{0.0.0.0 8080}]]

And in the nginx default.conf file:

upstream container.domain.name {
	## Can be connected with "reverseproxy" network
	# container
	server 172.26.0.5:8080;
	# Cannot connect to network of this container
	server 127.0.0.1 down;
}

@milis92 commented Jun 30, 2019

Networks marked as internal can't publish ports, and this is by design. More details here.
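
In compose terms, the behaviour reported earlier in this thread looks roughly like this (a sketch with placeholder names, mirroring phhutter's docker run example): with only an internal network attached, a ports: mapping produces no host binding and no entry in NetworkSettings.Ports, while expose: metadata stays visible under Config.ExposedPorts.

# Sketch only: internal-only network, so the published port never materialises on the host.
version: '3'

services:
  web:
    image: nginx
    expose:
      - "80"       # still listed in Config.ExposedPorts
    ports:
      - "80:80"    # not bound on the host while only int_network is attached
    networks:
      - int_network

networks:
  int_network:
    internal: true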

jpic added a commit to jpic/nginx-proxy that referenced this issue Jul 8, 2019
According to nginx upstream docs:

> down
>    marks the server as permanently unavailable. 

I suppose this may be useful for users who are manually editing their nginx configuration file:
they can mark a server down without losing the configuration. Is there any other purpose?
For users who generate the configuration: just put the working servers in the configuration and skip
the others.

The most disturbing for me is:

    server 127.0.0.1 down;

What is 127.0.0.1? It's the nginx-proxy container.
Is it supposed to be an upstream? Not in any case that I can think of.
What does having such a line cause? Well, I've had:

   [emerg] 28#28: invalid number of arguments in "upstream" directive

But also:

    Generated '/etc/nginx/conf.d/default.conf' from 12 containers
    Running 'nginx -s reload'
    Error running notify command: nginx -s reload, exit status 1
    Received event start for container 5c6cb0bf8e05

Yep, the whole LB is down and no helpful error message about what is going on.

I believe this fixes what most users have been facing in nginx-proxy#1132,
as well as the regression introduced in nginx-proxy#1106.

> Deleted code is debugged code.
        — Jeff Sickel
@jpic commented Jul 8, 2019

Please try with #1302 and report your results ;)

jpic added a commit to jpic/nginx-proxy that referenced this issue Jul 8, 2019
This patch fixes a critical condition in which the whole LB is down.

    [emerg] 28#28: invalid number of arguments in "upstream" directive

And:

    Generated '/etc/nginx/conf.d/default.conf' from 12 containers
    Running 'nginx -s reload'
    Error running notify command: nginx -s reload, exit status 1
    Received event start for container 5c6cb0bf8e05

What is 127.0.0.1? Isn't it the nginx-proxy container?
How can it be an upstream at all?

Related: nginx-proxy#1132 nginx-proxy#1106
Comments (not OP) of: nginx-proxy#375
Maybe: nginx-proxy#1144
@gfiasco commented Mar 12, 2020

Hey guys,

A common mistake is to forget to EXPOSE your port in your Dockerfile.

You can double-check this with:

docker-compose ps
docker inspect --format=" {{ .NetworkSettings.Ports }} "  service_name_1

BAD Result:

docker inspect --format=" {{ .NetworkSettings.Ports }} " riphop_web_1
 map[]

GOOD Result

docker inspect --format=" {{ .NetworkSettings.Ports }} " riphop_web_1
 map[8000/tcp:[]]

My fix was:

FROM  gcr.io/distroless/python3-debian10

COPY --from=build-env /app /app
COPY --from=build-env /usr/local/lib/python3.7/site-packages /usr/local/lib/python3.7/site-packages
COPY --from=build-env /usr/local/bin/gunicorn /bin/gunicorn

ARG BUILD=local_dev

ENV BUILD_ID=${BUILD}

WORKDIR /app

ENV PYTHONPATH=/usr/local/lib/python3.7/site-packages

# EXPOSE port 8000 to allow communication to/from server
EXPOSE 8000

ENTRYPOINT ["/bin/sh", "docker-entrypoint.sh"]

@huhudev-git

@gfiasco Yes, that is exactly the cause of the problem.
For those of you who use docker-compose, just write:

expose:
      - your_port

and make sure you set VIRTUAL_PORT=your_port; now it works like a charm.
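
Putting the two pieces together, a minimal compose sketch (the service name, image, hostname and port 8000 are placeholders; the shared external proxy network is assumed to be the webproxy network discussed earlier):

# Sketch only: expose the port and point VIRTUAL_PORT at it.
version: '3'

services:
  app:
    image: example/app
    expose:
      - "8000"                          # make the port visible to docker-gen
    environment:
      - VIRTUAL_HOST=app.example.com
      - VIRTUAL_PORT=8000               # must match the exposed port
    networks:
      - webproxy

networks:
  webproxy:
    external: true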

@SimonLammer commented Apr 20, 2020

I've also had this issue and weirdly enough, starting the application container via docker run resulted in the error, whereas using docker-compose with a seemingly equivalent configuration worked:

Manual docker run command: sudo docker run -d --name busybox -p 8080:80 --expose 80 -e "VIRTUAL_PORT=80" -e "VIRTUAL_HOST=lammer.biz.tm" -e "LETSENCRYPT_HOST=lammer.biz.tm" -e "LETSENCRYPT_EMAIL=lammer.simon@gmail.com" hypriot/rpi-busybox-httpd

docker-compose file:

version: '3'
services:
  busybox:
    image: hypriot/rpi-busybox-httpd
    ports:
      - 80
    expose:
      - 80
    restart: always
    environment:
    - VIRTUAL_HOST=lammer.biz.tm
    - VIRTUAL_PORT=80
    - LETSENCRYPT_HOST=lammer.biz.tm
    - LETSENCRYPT_EMAIL=lammer.simon@gmail.com

At this point I'm just glad I found a way that works and wanted to potentially help others that stumble onto this issue with my message.

Edit:
Further inspection showed that the network mode was different in the containers.
When I copied the docker-compose file to another directory and started it there, the error returned.
A quick diff of the docker inspect output for the two containers created from the docker-compose files yielded (among many ID changes):

<             "NetworkMode": "busybox_default",
---
>             "NetworkMode": "rpi-nginx-proxy_default",

I assume this means that the first docker-compose file worked because docker-compose put it on the same network as the nginx-proxy container when started from within the same directory (just a guess, though).
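
One way to make that independent of the directory name is to attach the app to the proxy project's network explicitly instead of relying on compose's per-directory default. A sketch, assuming the proxy's network really is named rpi-nginx-proxy_default as in the diff above:

# Sketch only: reuse the nginx-proxy project's default network for this compose file,
# so it no longer matters which directory it is started from.
networks:
  default:
    external:
      name: rpi-nginx-proxy_default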

@iolpltuisciaoln commented May 23, 2020

still relevant

root@4fb9ee9c7177:/app# ping wiki -c 1
PING wiki (10.0.0.4) 56(84) bytes of data.
64 bytes from wiki.DMZ (10.0.0.4): icmp_seq=1 ttl=64 time=0.186 ms
root@4fb9ee9c7177:/app# cat /etc/nginx/conf.d/default.conf:
...
server {
        server_name _; # This is just an invalid value which will never trigger on a real hostname.
        listen 80;
        access_log /var/log/nginx/access.log vhost;
        return 503;
}
# wiki
upstream wiki {
	## Can be connected with "DMZ" network
	# wiki
	server 10.0.0.4 down;
	# Cannot connect to network of this container
	server 127.0.0.1 down;
}
server {
        server_name wiki;
        listen 80 ;
        access_log /var/log/nginx/access.log vhost;
        location / {
                proxy_pass http://wiki;
        }
}
nginx.1    | 2020/05/23 13:50:21 [error] 190#190: *12 no live upstreams while connecting to upstream, client: 172.40.0.1, server: wiki, request: "GET / HTTP/1.1", upstream: "http://wiki/", host: "wiki"
nginx.1    | wiki 172.40.0.1 - - [23/May/2020:13:50:21 +0000] "GET / HTTP/1.1" 502 157 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4 Supplemental Update) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/12.1.1 Safari/605.1.15"
root@4fb9ee9c7177:/app# wget wiki
--2020-05-23 13:58:57--  http://wiki/
Resolving wiki (wiki)... 10.0.0.4
Connecting to wiki (wiki)|10.0.0.4|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'index.html.2'

index.html.2                         [ <=>                                                     ]  12.24K  --.-KB/s    in 0s

2020-05-23 13:58:57 (289 MB/s) - 'index.html.2' saved [12536]

@pini-gh (Contributor) commented Nov 4, 2020

As I understand it the nginx.tmpl template relies on exposed ports only, and doesn't honor VIRTUAL_PORT if the considered container exposes only one port:

if (reachable $container) {
  if ($container.exposed_ports.length == 1) {
    use $container.IP:$container.exposed_ports[0]
  }
  else {
    if (not defined VIRTUAL_PORT) {
      VIRTUAL_PORT=80
    }
    if ($VIRTUAL_PORT in $container.exposed_ports) {
      use $container.IP:$VIRTUAL_PORT
    }
    else {
      mark $container.IP as down
    }
  }
}
else {
  mark 127.0.0.1 as down
}

Instead I think that:

  • VIRTUAL_PORT should always be honored (least surprise principle)
  • the server shouldn't be marked down unless it is unreachable
  • a comment could be added when VIRTUAL_PORT is not exposed (troubleshooting hint)

Something like this:

if (reachable $container ) {
  if (not defined VIRTUAL_PORT) {
    if ($container.exposed_ports.length == 1) {
      VIRTUAL_PORT=$container.exposed_ports[0]
    }
    else {
      VIRTUAL_PORT=80
    }
  }
  if ($VIRTUAL_PORT not in $container.exposed_ports) {
    comment "Port $VIRTUAL_PORT not exposed"
  }
  use $container.IP:$VIRTUAL_PORT
}
else {
  mark 127.0.0.1 as down
}

Patch proposal:

diff --git a/nginx.tmpl b/nginx.tmpl
index 07e2b50..14d3825 100644
--- a/nginx.tmpl
+++ b/nginx.tmpl
@@ -17,7 +17,7 @@
 	{{ else if .Network }}
 		# {{ .Container.Name }}
 		{{ if .Network.IP }}
-			server {{ .Network.IP }} down;
+			server {{ .Network.IP }}:{{ .VirtualPort }};
 		{{ else }}
 			server 127.0.0.1 down;
 		{{ end }}
@@ -178,23 +178,18 @@ server {
 upstream {{ $upstream_name }} {
 
 {{ range $container := $containers }}
-	{{ $addrLen := len $container.Addresses }}
-
+	{{/* If only 1 port exposed, use that as a default, else 80 */}}
+        {{ $defaultPort := (when (eq (len $container.Addresses) 1) (first $container.Addresses) (dict "Port" "80")).Port }}
 	{{ range $knownNetwork := $CurrentContainer.Networks }}
 		{{ range $containerNetwork := $container.Networks }}
 			{{ if (and (ne $containerNetwork.Name "ingress") (or (eq $knownNetwork.Name $containerNetwork.Name) (eq $knownNetwork.Name "host"))) }}
 				## Can be connected with "{{ $containerNetwork.Name }}" network
-
-				{{/* If only 1 port exposed, use that */}}
-				{{ if eq $addrLen 1 }}
-					{{ $address := index $container.Addresses 0 }}
-					{{ template "upstream" (dict "Container" $container "Address" $address "Network" $containerNetwork) }}
-				{{/* If more than one port exposed, use the one matching VIRTUAL_PORT env var, falling back to standard web port 80 */}}
-				{{ else }}
-					{{ $port := coalesce $container.Env.VIRTUAL_PORT "80" }}
-					{{ $address := where $container.Addresses "Port" $port | first }}
-					{{ template "upstream" (dict "Container" $container "Address" $address "Network" $containerNetwork) }}
+				{{ $port := (coalesce $container.Env.VIRTUAL_PORT $defaultPort) }}
+				{{ $address := where $container.Addresses "Port" $port | first }}
+				{{ if not $address }}
+				# /!\ Virtual port not exposed
 				{{ end }}
+				{{ template "upstream" (dict "Container" $container "Address" $address "Network" $containerNetwork "VirtualPort" $port) }}
 			{{ else }}
 				# Cannot connect to network of this container
 				server 127.0.0.1 down;

@MaartenSanders

@pini-gh thank you.
I can confirm this solves my issue in a Docker setup with a few VLANs (with associated bridge networks) and some internal networks between containers, where I would get server a.b.c.d down; if the server was only attached to an internal network (the internal network also being attached to the docker-gen and nginx containers).
I took https://github.com/pini-gh/nginx-proxy/blob/pini/app/nginx.tmpl and put it in the volume that maps to /etc/docker-gen/templates inside my docker-gen container, instead of the jwilder-supplied template.
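
In compose form that mount looks roughly like the sketch below. Only the /etc/docker-gen/templates path comes from the comment above; the image name and socket mount are the usual separate-docker-gen setup, and docker-gen's command-line arguments are omitted:

# Sketch only: mount the patched nginx.tmpl into a separate docker-gen container.
services:
  docker-gen:
    image: jwilder/docker-gen
    volumes:
      - /var/run/docker.sock:/tmp/docker.sock:ro
      - ./nginx.tmpl:/etc/docker-gen/templates/nginx.tmpl:ro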

@buchdag (Member) commented Apr 29, 2021

@pini-gh a PR would be more than welcome 👍

@pini-gh (Contributor) commented Apr 30, 2021

@buchdag : #1609.

@buchdag (Member) commented May 28, 2021

Closed by #1609

@buchdag closed this as completed May 28, 2021