docker-compose: network communication between containers does not work #16939

Closed
TomaszGasior opened this issue Dec 24, 2022 · 9 comments
Labels: aardvark, kind/bug, locked - please file new issue/PR

Comments

@TomaszGasior

TomaszGasior commented Dec 24, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Containers in a docker-compose managed network cannot communicate reliably. Communication sometimes works and sometimes doesn't.

I have a simple PHP-based project as an example: https://github.com/TomaszGasior/RadioLista-v3 . Apache in the http container communicates with PHP-FPM in the app container, and the PHP app in the app container communicates with MySQL in the db container. In addition, the custom entry point scripts of the http and app containers begin with until nc -z HOSTNAME PORT; do sleep 1; done; sleep 1 to wait until the service they depend on is ready to accept connections.
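
The wait loop quoted above retries forever, so if the host name never resolves, the entry point never proceeds. A minimal sketch of a bounded variant follows. The wait_for name and its retry limit are hypothetical and are not taken from the RadioLista-v3 entry point scripts.

```shell
#!/bin/sh
# wait_for HOST PORT [MAX_TRIES]
# Hypothetical helper: polls HOST:PORT with `nc -z` once per second and
# gives up after MAX_TRIES attempts (default 30) instead of looping forever.
wait_for() {
    host=$1
    port=$2
    tries=${3:-30}
    while [ "$tries" -gt 0 ]; do
        # -z: only probe whether the port accepts connections, send no data.
        if nc -z "$host" "$port"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "wait_for: $host:$port not reachable" >&2
    return 1
}

# Example use in an entry point (host name and port are placeholders):
#   wait_for db 3306 60 || exit 1
```

Failing with a non-zero exit status makes a persistent DNS problem visible in the docker-compose up output instead of leaving the container waiting.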

Steps to reproduce the issue:

In an up-to-date Fedora 37 VM:

sudo dnf install podman podman-docker docker-compose
git clone https://github.com/TomaszGasior/RadioLista-v3.git
cd RadioLista-v3
export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock
systemctl enable --now --user podman.socket
docker-compose up

Describe the results you received:

Go to https://127.0.0.1:2012 in a web browser and refresh multiple times. Sometimes Apache cannot connect to PHP-FPM and fails with the error "DNS lookup failure for: app", but sometimes the connection works. Sometimes the PHP app cannot connect to the MySQL database and fails with a similar DNS-related error. The docker-compose up output contains errors like nc: bad address 'app' and nc: bad address 'db', which should not happen.

Describe the results you expected:

There are no nc: bad address 'app' or nc: bad address 'db' messages in the output. Connections between containers always work.
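
Because the failure is intermittent, it can help to measure how often a lookup fails rather than rely on refreshing the browser. The following is a rough sketch that assumes getent is available inside the container image, which may not hold on minimal images. The function name and the way it is invoked are illustrative only.

```shell
#!/bin/sh
# count_lookup_failures NAME ATTEMPTS
# Hypothetical helper: resolves NAME ATTEMPTS times and prints how many
# lookups failed. Assumes `getent` exists in the container image.
count_lookup_failures() {
    name=$1
    attempts=$2
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if ! getent hosts "$name" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# Example, run inside the http container (service name from this report):
#   count_lookup_failures app 100
```

A non-zero count for a service name that is known to be running would point at name resolution rather than at the service itself.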

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Client:       Podman Engine
Version:      4.3.1
API Version:  4.3.1
Go Version:   go1.19.2
Built:        Fri Nov 11 16:01:27 2022
OS/Arch:      linux/amd64

Output of podman info:

host:
  arch: amd64
  buildahVersion: 1.28.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.5-1.fc37.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 79.03
    systemPercent: 6.86
    userPercent: 14.11
  cpus: 4
  distribution:
    distribution: fedora
    variant: workstation
    version: "37"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.0.14-300.fc37.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 89931776
  memTotal: 2064703488
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.7.2-2.fc37.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.7.2
      commit: 0356bf4aff9a133d655dc13b1d9ac9424706cac4
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-8.fc37.x86_64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 289206272
  swapTotal: 2064642048
  uptime: 0h 14m 6.00s
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/tomasz/.config/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 4
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/tomasz/.local/share/containers/storage
  graphRootAllocated: 23957864448
  graphRootUsed: 5125406720
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 25
  runRoot: /run/user/1000/containers
  volumePath: /home/tomasz/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 1668178887
  BuiltTime: Fri Nov 11 16:01:27 2022
  GitCommit: ""
  GoVersion: go1.19.2
  Os: linux
  OsArch: linux/amd64
  Version: 4.3.1


Package info (e.g. output of rpm -q podman or apt list podman or brew info podman):

podman-4.3.1-1.fc37.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

Fedora 37 in VM

[tomasz@fedora RadioLista-v3]$ rpm -q docker-compose 
docker-compose-1.29.2-6.fc37.noarch
[tomasz@fedora RadioLista-v3]$ docker-compose --version
docker-compose version 1.29.2, build unknown

openshift-ci bot added the kind/bug label Dec 24, 2022
@TomaszGasior
Author

I found that switching Podman back to the older CNI networking stack fixes the DNS instability:

echo -n cni >  ~/.local/share/containers/storage/defaultNetworkBackend
sudo dnf in containernetworking-plugins podman-plugins

Then run docker-compose down and docker-compose up.
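
After switching backends, it may be worth confirming which one Podman actually uses before retesting. This is a small sketch that relies on the networkBackend field visible in the podman info output earlier in this report. The backend_is helper name is made up for illustration.

```shell
#!/bin/sh
# backend_is NAME
# Hypothetical helper: succeeds if `podman info` reports NAME (for example
# "cni" or "netavark") as the active network backend.
backend_is() {
    [ "$(podman info --format '{{.Host.NetworkBackend}}')" = "$1" ]
}

# Example:
#   backend_is cni && echo "CNI is active" || echo "still on another backend"
```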

@rhatdan
Member

rhatdan commented Dec 29, 2022

@Luap99 @flouthoc PTAL

Could this be an issue with aardvark/netavark?

Luap99 added the aardvark label Jan 4, 2023
@Luap99
Member

Luap99 commented Jan 4, 2023

Likely related to containers/aardvark-dns#248 and #16369.

@TomaszGasior Can you check for messages from aardvark-dns in the journal when this happens?

@TomaszGasior
Author

TomaszGasior commented Jan 6, 2023

@Luap99 Please take a look. This is right after the DNS error occurred.

[tomasz@fedora RadioLista-v3]$ LANG=C journalctl -r | grep aardvark-dns
Jan 06 22:33:29 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:33:29 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:33:29 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:33:29 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:33:26 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:33:26 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:33:26 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:33:26 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:30:30 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:30:30 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:30:30 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:30:30 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:30:28 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:30:28 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:30:28 fedora aardvark-dns[43002]: None received while parsing dns message, this is not expected server will ignore this message
Jan 06 22:30:28 fedora aardvark-dns[43002]: Failed while parsing message: unexpected end of input reached
Jan 06 22:29:40 fedora aardvark-dns[43002]: Received SIGHUP will refresh servers: 1
Jan 06 22:29:40 fedora aardvark-dns[43002]: Received SIGHUP will refresh servers: 1
Jan 06 22:29:39 fedora aardvark-dns[43002]: Received SIGHUP will refresh servers: 1
Jan 06 22:29:39 fedora systemd[1457]: Started run-r10b933223ab24568abe6ae9b298e0a83.scope - /usr/libexec/podman/aardvark-dns --config /run/user/1000/containers/networks/aardvark-dns -p 53 run.
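
To see whether the parse failures cluster around the moments the DNS errors appear, the journal lines can be grouped by minute. This is only a sketch that assumes the journalctl short timestamp format shown in the log above. summarize_parse_errors is a hypothetical name.

```shell
#!/bin/sh
# summarize_parse_errors
# Hypothetical helper: reads journal lines on stdin and prints, for each
# minute, how many "Failed while parsing message" lines aardvark-dns logged.
summarize_parse_errors() {
    awk '
        /aardvark-dns\[[0-9]+\]: Failed while parsing message/ {
            split($3, t, ":")                  # $3 is HH:MM:SS
            key = $1 " " $2 " " t[1] ":" t[2]  # e.g. "Jan 06 22:33"
            count[key]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Example:
#   LANG=C journalctl | grep aardvark-dns | summarize_parse_errors
```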

@DynamoFox

@TomaszGasior It sounds like this issue: containers/aardvark-dns#151

@github-actions

github-actions bot commented Mar 6, 2023

A friendly reminder that this issue had no activity for 30 days.

@Luap99
Member

Luap99 commented Mar 6, 2023

Did you test with aardvark v1.5?

@TomaszGasior
Author

@Luap99 Sorry for the long delay. With a clean, up-to-date Fedora 37 VM the issue still occurs.

[tomasz@fedora RadioLista-v3]$ LANG=C rpm -qi aardvark-dns 
Name        : aardvark-dns
Version     : 1.5.0
Release     : 4.fc37
Architecture: x86_64
Install Date: Fri Mar 10 21:32:58 2023
Group       : Unspecified
Size        : 2330184
License     : ASL 2.0 and BSD and MIT
Signature   : RSA/SHA256, Thu Feb  9 13:05:08 2023, Key ID f55ad3fb5323552a
Source RPM  : aardvark-dns-1.5.0-4.fc37.src.rpm
Build Date  : Wed Feb  8 16:11:01 2023
Build Host  : buildvm-x86-06.iad2.fedoraproject.org
Packager    : Fedora Project
Vendor      : Fedora Project
URL         : https://github.com/containers/aardvark-dns
Bug URL     : https://bugz.fedoraproject.org/aardvark-dns
Summary     : Authoritative DNS server for A/AAAA container records
Description :
Authoritative DNS server for A/AAAA container records

Forwards other request to configured resolvers.
Read more about configuration in `src/backend/mod.rs`.

@TomaszGasior
Author

I am not able to reproduce the issue on an up-to-date Fedora 38. I think it's fine to close the issue.

github-actions bot added the locked - please file new issue/PR label Aug 24, 2023
github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023