
DNS resolution works only for one network when two networks are attached #8399

Closed
simonkrenger opened this issue Nov 18, 2020 · 27 comments

simonkrenger commented Nov 18, 2020

Is this a BUG REPORT or FEATURE REQUEST?

/kind bug

Description

PR #7460 introduced podman network create along with DNS-based name resolution between containers on the same network. The following works as expected, both for web-a.dns.podman and for plain web-a:

$ podman network create foo-a
$ podman run -d --name web-a --hostname web --network foo-a nginx:alpine
$ podman run --rm --network foo-a registry.fedoraproject.org/fedora-minimal:33 curl http://web-a.dns.podman
$ podman run --rm --network foo-a registry.fedoraproject.org/fedora-minimal:33 curl http://web-a

However, when two networks are attached to the client container (for example a reverse proxy), DNS is only resolved for the last --network specified on the podman run command line.
Accessing the containers by IP works as expected, so connectivity itself is fine; only DNS resolution fails.
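
For reference, the IP-based check looks roughly like this (a sketch; the inspect format key is an assumption and may vary between Podman versions):

$ IP_A=$(podman inspect -f '{{(index .NetworkSettings.Networks "foo-a").IPAddress}}' web-a)
$ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl "http://${IP_A}"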

Steps to reproduce the issue:

  1. Create two networks and launch an nginx container into each:

    $ podman network create foo-a
    $ podman network create foo-b
    $ podman run -d --name web-a --hostname web-a --network foo-a nginx:alpine
    $ podman run -d --name web-b --hostname web-b --network foo-b nginx:alpine
    
  2. Use a third client container to access the other two containers:

    $ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl http://web-a.dns.podman
    $ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl http://web-b.dns.podman
    

Describe the results you received:

The container fails to resolve any DNS entries for containers in the first network specified:

$ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl http://web-a.dns.podman
...
curl: (6) Could not resolve host: web-a.dns.podman
$ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl -q http://web-b.dns.podman
...
<title>Welcome to nginx!</title>
....

Describe the results you expected:

DNS resolution is expected to work for both networks, as follows:

$ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl http://web-a.dns.podman
...
<title>Welcome to nginx!</title>
....
$ podman run --rm --network foo-a --network foo-b registry.fedoraproject.org/fedora-minimal:33 curl http://web-b.dns.podman
...
<title>Welcome to nginx!</title>
....

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

$ podman version
Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.13.15
Built:        Sat Oct 31 00:23:30 2020
OS/Arch:      linux/amd64

Output of podman info --debug:

$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.0.21-1.el8.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 3701b9d2b0bd16229427f6f372cb3b96243fe19b-dirty'
  cpus: 2
  distribution:
    distribution: '"centos"'
    version: "8"
  eventLogger: journald
  hostname: centos8-test.krenger.ch
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 4.18.0-193.28.1.el8_2.x86_64
  linkmode: dynamic
  memFree: 6576820224
  memTotal: 8189632512
  ociRuntime:
    name: runc
    package: runc-1.0.0-65.rc10.module_el8.2.0+305+5e198a41.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-0.4.2-3.git21fdece.module_el8.2.0+305+5e198a41.x86_64
    version: |-
      slirp4netns version 0.4.2+dev
      commit: 21fdece2737dc24ffa3f01a341b8a6854f8b13b4
  swapFree: 3221221376
  swapTotal: 3221221376
  uptime: 1h 11m 7.07s (Approximately 0.04 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/simon/.config/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 3
    stopped: 1
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-5.module_el8.2.0+305+5e198a41.x86_64
      Version: |-
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.2.1
        using FUSE kernel interface version 7.26
  graphRoot: /home/simon/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 18
  runRoot: /run/user/1000
  volumePath: /home/simon/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1604100210
  BuiltTime: Sat Oct 31 00:23:30 2020
  GitCommit: ""
  GoVersion: go1.13.15
  OsArch: linux/amd64
  Version: 2.1.1

Package info (e.g. output of rpm -q podman or apt list podman):

$ rpm -q podman
podman-2.1.1-10.el8.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

Tested on VirtualBox with centos-release-8.2-2.2004.0.2.el8.x86_64

openshift-ci-robot added the kind/bug label Nov 18, 2020

mheon commented Nov 18, 2020

@baude PTAL


Luap99 commented Nov 19, 2020

Question for @mheon @baude: if I specify several networks, shouldn't I have several interfaces in the container? Right now (tested with master) I only have one (the last one specified). This happens both rootless and rootful. That would also explain the DNS issue, since you can only resolve DNS entries in your own network, right?


Luap99 commented Nov 19, 2020

$ sudo podman run --rm --network foo-a --network foo-b -it alpine ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if12: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 3e:40:18:09:d7:7a brd ff:ff:ff:ff:ff:ff
    inet 10.89.21.3/24 brd 10.89.21.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::3c40:18ff:fe09:d77a/64 scope link tentative 
       valid_lft forever preferred_lft forever
$ podman run --rm --network foo-a --network foo-b -it alpine ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if14: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether b6:77:2f:5c:ac:df brd ff:ff:ff:ff:ff:ff
    inet 10.88.6.10/24 brd 10.88.6.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b477:2fff:fe5c:acdf/64 scope link tentative 
       valid_lft forever preferred_lft forever

Luap99 self-assigned this Nov 19, 2020
Luap99 added the In Progress label Nov 19, 2020

Luap99 commented Nov 19, 2020

OK, so there are two issues here.
First, you are using the wrong syntax to join two networks. The correct one would be --network foo-a,foo-b. However, I find this unintuitive and opened PR #8410 to also allow your syntax.

Second, even after joining both networks correctly, DNS still does not work for both names.

$ podman run --rm --network foo-a,foo-b -it alpine sh
/ # ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if32: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether b2:c3:6a:99:e1:ec brd ff:ff:ff:ff:ff:ff
    inet 10.88.5.11/24 brd 10.88.5.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::b0c3:6aff:fe99:e1ec/64 scope link 
       valid_lft forever preferred_lft forever
5: eth1@if33: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP 
    link/ether 26:5d:bf:d0:05:71 brd ff:ff:ff:ff:ff:ff
    inet 10.88.6.20/24 brd 10.88.6.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::245d:bfff:fed0:571/64 scope link 
       valid_lft forever preferred_lft forever
/ # cat /etc/resolv.conf 
nameserver 10.88.5.1
nameserver 10.88.6.1
/ # nslookup web-a
Server:         10.88.5.1
Address:        10.88.5.1:53

Name:   web-a
Address: 10.88.5.2

/ # nslookup web-b
Server:         10.88.5.1
Address:        10.88.5.1:53

** server can't find web-b: NXDOMAIN

** server can't find web-b: NXDOMAIN

/ # nslookup web-b 10.88.6.1
Server:         10.88.6.1
Address:        10.88.6.1:53

Name:   web-b
Address: 10.88.6.2

/ # 

So both DNS servers are working, but the container will only contact the first one. @baude @mheon Ideas?

simonkrenger commented

@Luap99 Thanks for the hint regarding the syntax, that makes sense and I'll use that instead.

Good to see that with the correct syntax the entries in /etc/resolv.conf are populated correctly. I believe what we see is the expected behaviour, as NXDOMAIN is considered a definitive answer and a client will not try the second nameserver in this case.

So then I am not sure if being able to resolve names in both networks is even a valid expectation in the first place.


mheon commented Nov 19, 2020

This may be a design limitation based on the architecture of the current dnsname plugin, but we need @baude to comment to be certain.

mheon closed this as completed Nov 19, 2020
mheon reopened this Nov 19, 2020

mheon commented Nov 19, 2020

Did not mean to close, sorry


baude commented Nov 19, 2020

Does this fail for rootless and rootful?


baude commented Nov 19, 2020

As for design, I don't think this is a limitation, but I would need to poke at this further. It seems reasonable that this should work. @Luap99 I see you have put this in progress, which indicates you are working on it. Ping me if you have any questions ... I'm happy to help.


Luap99 commented Nov 19, 2020

@baude It fails for rootless and rootful.

The problem is that the dnsname plugin creates a DNS server per network, and a container will always use the first server in /etc/resolv.conf. If that server returns NXDOMAIN, the resolver will not retry a different server, which seems to be normal behavior for DNS queries.

I don't think I can do anything here.
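
The only mitigation I can think of for now is pinning static /etc/hosts entries, since hosts entries are consulted before DNS (a sketch; the IPs are the ones from my test above and will differ on other machines):

$ podman run --rm --network foo-a,foo-b \
    --add-host web-a:10.88.5.2 --add-host web-b:10.88.6.2 \
    registry.fedoraproject.org/fedora-minimal:33 curl http://web-a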

Luap99 removed the In Progress label Nov 19, 2020

simonkrenger commented

@baude Any other input?

I think this should at least be documented if that is the expected behaviour.


idelsink commented Jan 8, 2021

I too am running into this issue: I want a setup with a reverse proxy and multiple containers behind it, each with its own network to the reverse proxy container, for network segregation.

I assume this is not intended behaviour if there is no workaround. Docker, for example, does seem to handle this setup, and not being able to segregate my networks is a bit of a show-stopper for security reasons.


deuill commented Jan 9, 2021

I'm hitting this issue as well, and I agree it's a show-stopper when attempting to segregate the network across various domains. Apparently there's been some discussion about this over at containers/dnsname#12, and the reasonable resolution (running dnsmasq as part of each container) doesn't seem like a quick fix...
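
For anyone experimenting with that approach, the forwarder configuration might look roughly like this (a sketch only: the file path is hypothetical, the upstream IPs are the per-network resolvers from the output earlier in this thread, and whether a positive answer wins over an NXDOMAIN from the other upstream depends on dnsmasq version and timing):

# hypothetical /etc/dnsmasq.d/multi-net.conf inside the client container
no-resolv           # ignore /etc/resolv.conf, use only the servers listed below
all-servers         # query all upstream servers in parallel instead of just the first
server=10.88.5.1    # dnsname resolver for network foo-a
server=10.88.6.1    # dnsname resolver for network foo-b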



rhatdan commented Mar 8, 2021

@simonkrenger @Luap99 @deuill Is this fixed now by the dnsname patch that was merged?


mheon commented Mar 8, 2021

@rhatdan No. This is going to require major changes - it's what we were discussing last week.


rhatdan commented Mar 9, 2021

OK, got it.



Luap99 added the network label Jun 21, 2021

dblitt commented Dec 13, 2021

Is there any update on this? I would love to properly separate networks but still have predictable DNS resolution in containers with multiple networks.

rhatdan added 4.0 and removed stale-issue labels Dec 13, 2021

rhatdan commented Dec 13, 2021

We are nearing completion of the network stack rewrite, so issues like this will be reviewed when the stack is ready.


baude commented Jan 10, 2022

This will work with aardvark in Podman 4.0.

jwhonce assigned flouthoc and unassigned baude Jan 12, 2022

baude commented Feb 4, 2022

This is in Podman 4 now.
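
A quick way to verify, reusing the reproducer from the top of this issue on a netavark-backed Podman 4 host (a sketch; both commands should now return the nginx welcome page):

$ podman run --rm --network foo-a,foo-b registry.fedoraproject.org/fedora-minimal:33 curl -s http://web-a.dns.podman
$ podman run --rm --network foo-a,foo-b registry.fedoraproject.org/fedora-minimal:33 curl -s http://web-b.dns.podman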

baude closed this as completed Feb 4, 2022

deuill commented Feb 15, 2022

Just to confirm: DNS resolution for containers with multiple networks attached is expected to work as of Podman 4.x only for netavark-type networks -- the old cni network type remains limited in this regard? Is there (going to be) a guide for migrating existing systems from cni to netavark (assuming that changing the type in containers.conf and rebooting isn't enough)?


Luap99 commented Feb 15, 2022

@deuill Yes, it will not work for CNI. If you change the config setting and reboot, it will work, except for your existing CNI networks: we do not migrate the CNI config files, so you will have to recreate your networks with podman network create after changing the network backend.
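
Concretely, the switch would look something like this (a sketch; network_backend goes in the [network] table of containers.conf, and foo-a/foo-b are the networks from this issue):

# 1. Remove the old CNI networks while the cni backend is still active:
$ podman network rm foo-a foo-b

# 2. Switch the backend in containers.conf
#    (~/.config/containers/containers.conf for rootless,
#    /etc/containers/containers.conf system-wide):
[network]
network_backend = "netavark"

# 3. Reboot, then recreate the networks under netavark:
$ podman network create foo-a
$ podman network create foo-b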


mheon commented Feb 15, 2022

I'm working on a blog post for the new Netavark feature - I can cover how to upgrade in that. It should be out concurrently with the final release.

github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023