
[macOS] podman logs --tail # value not interpreted correctly when used against restarted containers #13098

Closed
PayBas opened this issue Jan 31, 2022 · 8 comments · Fixed by #13857
Labels
- locked - please file new issue/PR (assist humans wanting to comment on an old issue or PR with locked comments)
- macos (MacOS (OSX) related)
- remote (problem is in podman-remote)


PayBas commented Jan 31, 2022

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

Steps to reproduce the issue:

brew install podman
podman machine init
podman machine start
podman pull postgres:latest
podman run --name=postgres --detach \
    -v postgresql:/var/lib/postgresql -v postgresql_data:/var/lib/postgresql/data \
    -e POSTGRES_USER=user -e POSTGRES_PASSWORD=pass \
    postgres:latest
sleep 20
podman logs --tail 3 postgres # correct
podman stop postgres
podman start postgres
sleep 20
podman logs --tail 3 postgres # wrong

Describe the results you received:
First logs call

2022-01-31 22:27:34.172 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-01-31 22:27:34.176 UTC [56] LOG:  database system was shut down at 2022-01-31 22:27:34 UTC
2022-01-31 22:27:34.179 UTC [1] LOG:  database system is ready to accept connections

Second logs call

2022-01-31 22:28:44.986 UTC [1] LOG:  background worker "logical replication launcher" (PID 62) exited with exit code 1
2022-01-31 22:28:44.987 UTC [57] LOG:  shutting down
2022-01-31 22:28:44.997 UTC [1] LOG:  database system is shut down

PostgreSQL Database directory appears to contain a database; Skipping initialization

2022-01-31 22:28:54.435 UTC [1] LOG:  starting PostgreSQL 14.1 (Debian 14.1-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2022-01-31 22:28:54.436 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-01-31 22:28:54.436 UTC [1] LOG:  listening on IPv6 address "::", port 5432
2022-01-31 22:28:54.440 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"
2022-01-31 22:28:54.444 UTC [20] LOG:  database system was shut down at 2022-01-31 22:28:44 UTC
2022-01-31 22:28:54.448 UTC [1] LOG:  database system is ready to accept connections

Describe the results you expected:
I would expect --tail to not be affected by container restarts.

Additional information you deem important (e.g. issue happens only occasionally):
The reason I am filing this bug report is that I am relying on things like this:

( timeout 30s podman logs -f --tail 0 postgres & ) | sed '/database system is ready to accept connections$/ q'

This works fine when creating a new container. But restarting an existing container will break this, since the output of logs -f --tail 0 will very likely already contain database system is ready to accept connections, which makes the whole thing fire while it's not actually ready yet.

As a workaround, I'm resorting to this:

podman start postgres
containerStartTime=$(date +"%s")
# workaround for https://github.com/containers/podman/issues/13098
( timeout 30s podman logs -f --since $containerStartTime postgres & ) | sed '/database system is ready to accept connections$/ q'
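
A slightly more defensive variant of the same workaround (sketch only, not from the original report; the pattern and the 30-second budget are reused from above), which also fails loudly if readiness is never seen within the timeout:

podman start postgres
containerStartTime=$(date +"%s")
# grep -q exits 0 as soon as the line appears; if the timeout expires first,
# grep hits EOF without a match and the pipeline returns non-zero
if timeout 30s podman logs -f --since "$containerStartTime" postgres 2>&1 \
    | grep -q -m1 'database system is ready to accept connections'; then
  echo "postgres is ready"
else
  echo "postgres did not become ready within 30s" >&2
  exit 1
fi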

Output of podman version:

Client:
Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.17.3
Built:        Wed Dec  8 19:41:11 2021
OS/Arch:      darwin/amd64

Server:
Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.16.8
Built:        Wed Dec  8 22:45:07 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.30-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.30, commit: '
  cpus: 1
  distribution:
    distribution: fedora
    variant: coreos
    version: "35"
  eventLogger: journald
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.15.17-200.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 1420251136
  memTotal: 2061381632
  ociRuntime:
    name: crun
    package: crun-1.4-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4
      commit: 3daded072ef008ef0840e8eccb0b52a7efbd165d
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 12m 0.08s
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 3
  runRoot: /run/user/1000/containers
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 1638999907
  BuiltTime: Wed Dec  8 21:45:07 2021
  GitCommit: ""
  GoVersion: go1.16.8
  OsArch: linux/amd64
  Version: 3.4.4

Package info (e.g. output of rpm -q podman or apt list podman):

$ brew list podman
/usr/local/Cellar/podman/3.4.4/bin/podman
/usr/local/Cellar/podman/3.4.4/bin/podman-remote
/usr/local/Cellar/podman/3.4.4/etc/bash_completion.d/podman
/usr/local/Cellar/podman/3.4.4/libexec/gvproxy
/usr/local/Cellar/podman/3.4.4/share/fish/vendor_completions.d/podman.fish
/usr/local/Cellar/podman/3.4.4/share/man/ (160 files)
/usr/local/Cellar/podman/3.4.4/share/zsh/site-functions/_podman

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

github-actions bot added the macos and remote labels on Jan 31, 2022
@petr-ujezdsky

Hi, I am using docker and docker-compose, but this might be relevant. I have also been parsing container logs to check startup, and even on docker this has its own pitfalls, exactly as you describe. I eventually implemented a proper HEALTHCHECK and started listening for event=health_status in docker events instead.

The HEALTHCHECK can internally parse the logfile or (better) run some dummy SQL command.
You might also read the logfile via an exec command, something like podman exec postgres tail -f logfile.log | sed '/database system is ready to accept connections$/ q'.
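
For the "dummy SQL command" route without defining a HEALTHCHECK, a minimal sketch (not from the comment above) that polls pg_isready, which ships in the official postgres image, through podman exec:

ready=""
for i in $(seq 1 30); do
  # pg_isready exits 0 once the server accepts connections
  if podman exec postgres pg_isready -U user >/dev/null 2>&1; then
    ready=yes
    break
  fi
  sleep 1
done
[ -n "$ready" ] || { echo "postgres not ready after 30s" >&2; exit 1; }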


PayBas commented Feb 1, 2022

@petr-ujezdsky Thanks for the feedback. Naturally I would prefer to use the built-in health check functionality. Unfortunately, few containers actually implement it. So it might be possible in some situations, but definitely not all.

I'll look into your exec workaround, but imho that is even dirtier ;).

Having podman logs behave like it says in the documentation is desirable in any case.

@petr-ujezdsky

I think you should be able to create your own HEALTHCHECK via a docker-compose file or from docker run, see the documentation. I am not sure how this works when using podman run.

Or you might use a time restriction instead of a line restriction, as in your workaround. However, note that this solution (and maybe even yours) has a pitfall: if postgres initializes before you start parsing the log, you will end up waiting indefinitely. This might seem unlikely, but it can actually happen on a heavily loaded server, you simply never know :)
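
For what it's worth, podman run does accept health-check flags, so here is a sketch of wiring this up on the podman side (the flag names exist in podman run; the pg_isready command and the interval/retry values are assumptions, and the volume flags from the original run command are omitted for brevity):

podman run --name=postgres --detach \
    -e POSTGRES_USER=user -e POSTGRES_PASSWORD=pass \
    --health-cmd='pg_isready -U user' \
    --health-interval=5s --health-retries=5 \
    postgres:latest

# poll the health check directly instead of parsing logs;
# podman healthcheck run exits 0 once the check passes
until podman healthcheck run postgres >/dev/null 2>&1; do sleep 1; done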


github-actions bot commented Mar 4, 2022

A friendly reminder that this issue had no activity for 30 days.

@scottcwang

I ran into this issue too on Fedora 35 -- I'm not using healthchecks. Here's a minimal example.

# podman run --name test --detach alpine sh -c 'echo hello; echo world; sleep infinity'
1dca9c8e069451b9bfd5eb906806b976609844745e24ccdf88af2f6c855a5bbf
# podman logs --tail=1 test
world
# podman restart test
1dca9c8e069451b9bfd5eb906806b976609844745e24ccdf88af2f6c855a5bbf
# podman logs --tail=1 test
world
hello
world

@scottcwang

PS additional debug info:

# rpm -q podman
podman-3.4.4-1.fc35.x86_64
host:
  arch: amd64
  buildahVersion: 1.23.1
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - misc
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.30-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.30, commit: '
  cpus: 4
  distribution:
    distribution: fedora
    variant: workstation
    version: "35"
  eventLogger: journald
  hostname: chengyuwang.duckdns.org
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.14.10-300.fc35.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 379691008
  memTotal: 16652582912
  ociRuntime:
    name: crun
    package: crun-1.1-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.1
      commit: 5b341a145c4f515f96f55e3e7760d1c79ec3cf1f
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 8588357632
  swapTotal: 8589930496
  uptime: 8h 21m 5.03s (Approximately 0.33 days)
plugins:
  log:
  - k8s-file
  - none
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  docker.io:
    Blocked: false
    Insecure: false
    Location: docker.io
    MirrorByDigestOnly: false
    Mirrors:
    - Insecure: true
      Location: 100.86.43.60
    Prefix: docker.io
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /etc/containers/storage.conf
  containerStore:
    number: 16
    paused: 0
    running: 16
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageStore:
    number: 43
  runRoot: /run/containers/storage
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 3.4.4
  Built: 1638999907
  BuiltTime: Wed Dec  8 15:45:07 2021
  GitCommit: ""
  GoVersion: go1.16.8
  OsArch: linux/amd64
  Version: 3.4.4

@github-actions

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Apr 13, 2022

This seems to be an error only with the journald log driver.

$ podman run --log-driver=k8s-file --name test --detach alpine sh -c 'echo hello; echo world; sleep infinity'
720d9156bdf529bc169b77bd1dd61e870f76607d84288b9f45bf5d10a0da380d
$ podman logs --tail=1 test
world
$ podman restart -t 0 test
720d9156bdf529bc169b77bd1dd61e870f76607d84288b9f45bf5d10a0da380d
$ podman logs --tail=1 test
world
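
If journald is the culprit, a possible stopgap besides passing --log-driver on every run is to make k8s-file the default log driver via containers.conf. A sketch of the relevant stanza (the log_driver key exists in containers.conf; where the file lives depends on the setup, e.g. inside the podman machine VM for the macOS case above):

[containers]
log_driver = "k8s-file"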

rhatdan added a commit to rhatdan/podman that referenced this issue Apr 13, 2022
--tail=1 is not working if you restart a container with journald logging.

We see the exit status and then call into the logging a second time
causing all of the logs to print.

Removing the tail log on exited seems to fix the problem.

Fixes: containers#13098

Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
gbraad pushed a commit to gbraad-redhat/podman that referenced this issue Jul 13, 2022
github-actions bot added the locked - please file new issue/PR label on Sep 20, 2023
github-actions bot locked as resolved and limited conversation to collaborators on Sep 20, 2023