Podman fails to daemon postgres on Windows with run -d or -dt, Mac and Linux daemonize fine no issue #13965

Closed
AddictArts opened this issue Apr 21, 2022 · 13 comments · Fixed by #14250
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. remote Problem is in podman-remote windows issue/bug on Windows

Comments

@AddictArts

AddictArts commented Apr 21, 2022

/kind bug

When using podman run -dt --name postgres -p 5432:5432 -e POSTGRES_PASSWORD=postgres docker.io/library/postgres:14.2 on Windows, the container exits without an error, as though it received a shutdown. The same command on Linux and Mac OS daemonizes properly. Also, other podman run -d containers do run as daemons on the Windows WSL2 backend.

Steps to reproduce the issue:

  1. Execute the above on Windows: podman run -dt --name postgres --network apls -p 5432:5432 -e POSTGRES_PASSWORD=postgres -v postgres:/var/lib/postgresql/data docker.io/library/postgres:14.2

Using -d instead of -dt ends with the same result. Also, --network=host does not work either.

Describe the results you received:

It just stops running, as if a shutdown was received, with no errors in podman logs etc.

Describe the results you expected:

Stay as a daemon like Linux and Mac OS.
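
One way to confirm the symptom is to check, some time after the detached run, whether the container is still listed by podman ps. The following is a minimal sketch, assuming a POSIX shell with podman on the PATH; the helper name container_state is made up for illustration and is not part of podman.

```sh
# Hypothetical helper: prints "running" if a container with exactly the given
# name is listed by `podman ps`, otherwise "exited".
container_state() {
    name="$1"
    # --format '{{.Names}}' prints one container name per line;
    # grep -F -x only accepts a whole-line, literal match.
    if podman ps --format '{{.Names}}' | grep -q -x -F -- "$name"; then
        echo running
    else
        echo exited
    fi
}

# Example (needs podman, so it is left commented out):
#   podman run -d --name postgres -p 5432:5432 -e POSTGRES_PASSWORD=postgres docker.io/library/postgres:14.2
#   sleep 30
#   container_state postgres
```

With the behavior described in this report, the last call would be expected to print "exited" on Windows even though the run command itself returned without an error.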

Output of podman version:

4.0.3

Output of podman info --debug:

host:
  arch: amd64
  buildahVersion: 1.24.3
  cgroupControllers: []
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.0-2.fc35.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.0, commit: '
  cpus: 16
  distribution:
    distribution: fedora
    variant: container
    version: "35"
  eventLogger: file
  hostname: DESKTOP-73DQB37
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.10.16.3-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 50316668928
  memTotal: 53706846208
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun-1.4.4-1.fc35.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.4.4
      commit: 6521fcc5806f20f6187eb933f9f45130c86da230
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.12-2.fc35.x86_64
    version: |-
      slirp4netns version 1.1.12
      commit: 7a104a101aa3278a2152351a082a6df71f57c9a3
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 13958643712
  swapTotal: 13958643712
  uptime: 30h 56m 17.1s (Approximately 1.25 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /home/user/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.7.1-2.fc35.x86_64
      Version: |-
        fusermount3 version: 3.10.5
        fuse-overlayfs: version 1.7.1
        FUSE library version 3.10.5
        using FUSE kernel interface version 7.31
  graphRoot: /home/user/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1000/containers
  volumePath: /home/user/.local/share/containers/storage/volumes
version:
  APIVersion: 4.0.3
  Built: 1648837274
  BuiltTime: Fri Apr  1 11:21:14 2022
  GitCommit: ""
  GoVersion: go1.16.15
  OsArch: linux/amd64
  Version: 4.0.3

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/main/troubleshooting.md)

Yes

@openshift-ci openshift-ci bot added the kind/bug Categorizes issue or PR as related to a bug. label Apr 21, 2022
@github-actions github-actions bot added the remote Problem is in podman-remote label Apr 21, 2022
@AddictArts AddictArts changed the title from "Podman fails to daemon postgres on Windows with run -d, Mac and Linux daemonize fine no issue" to "Podman fails to daemon postgres on Windows with run -d or -dt, Mac and Linux daemonize fine no issue" Apr 21, 2022
@vrothberg
Member

Thanks for reaching out, @AddictArts, and apologies for the silence.

@containers/podman-maintainers, any ideas?

@rhatdan
Member

rhatdan commented May 4, 2022

@n1hility PTAL

@n1hility
Member

n1hility commented May 4, 2022

(copied from 13966)

Sure can @n1hility

From a Windows C:\ prompt using CMD.exe run podman run -d --name postgres -p 5432:5432 -e POSTGRES_PASSWORD=postgres docker.io/library/postgres:14.2

Postgres will not stay running. This is easier than using the one in this issue, since that one requires a running postgres. Also note you cannot have any other containers running. If another is running (for example, one started without -d as above) and you then launch another on a different port, that one will oddly stay a daemon. I know that sounds odd, but be sure podman ps shows nothing.

Note: After the above run terminates shortly afterwards, if I run podman start --attach postgres, it continues running, of course showing the log on stdout in the CMD window. If you press CTRL-C, the process terminates. Also note that -dt does not help for the initial run either.

Sorry for the delay in replying. I am having a really hard time reproducing this.

After it exits immediately, if you do echo %errorlevel%, do you see a non-zero value?

When it fails, if you do a wsl -l -v, do you see wsl running?

If you run it from PowerShell instead of CMD, does it work? (BTW, I highly recommend installing Windows Terminal, super useful: winget install Microsoft.WindowsTerminal)

Just to confirm: if you run podman --version at the Windows prompt, do you also see version 4.0.3?

As to wsl -d podman-machine-default, it is running and it does have podman. However, a podman ps or a podman container list -a does not show the containers started from the Windows podman.

Sorry, I forgot to mention to do a su user at the wsl prompt first. By default podman is configured to use rootless networking, but when you enter the wsl prompt you are root, so you need to switch to the user named "user" to get to the same underlying source.

@AddictArts
Author

AddictArts commented May 6, 2022

Thanks for the help @n1hility

C:\>echo %errorlevel%
0

There is no error it just exits gracefully.

Yes podman-machine-default remains running.

C:\>wsl -l -v
  NAME                      STATE           VERSION
* Ubuntu-20.04              Running         2
  podman-machine-default    Running         2
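
The same check can be scripted by pulling the STATE column out of that listing. This is only a sketch: distro_state is a hypothetical helper, and it assumes the listing arrives as plain text on stdin. wsl.exe may emit UTF-16 text, in which case it would need converting first (for example with iconv).

```sh
# distro_state NAME: reads `wsl -l -v` style text on stdin and prints the
# STATE column for the named distribution (prints nothing if it is not listed).
distro_state() {
    awk -v want="$1" '
        { sub(/^[ *]+/, "") }      # drop the indent and the "*" default marker
        $1 == want { print $2 }    # after that, column 1 is NAME, column 2 is STATE
    '
}

# Example (from a shell that can call wsl.exe; encoding caveat above applies):
#   wsl.exe -l -v | distro_state podman-machine-default
```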

If I open wsl -d podman-machine-default, run su user, and keep that session open, the podman postgres container does not exit and continues to run. Also, podman container list -a does show it as expected. It appears that all TTYs for user close and that makes the process exit, or something like that. Something like screen or tmux would be a workaround, but I know that is not a real solution, and they don't exist in the limited podman-machine-default.

If I close the session, by say exit multiple times, then postgres will exit. Hope this helps.

@n1hility
Member

n1hility commented May 9, 2022

@AddictArts glad to help, and thanks for your patience.

Is it only postgres that has the issue? Do you observe the same behavior with other daemons (e.g. httpd, nginx etc)

Can you check the output of dmesg and see if there are any oom_killer events or something else that might explain a process being terminated?
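
A rough way to scan the kernel log for such events is to filter it for OOM-related wording. This is a sketch only: oom_events is a hypothetical helper, and the patterns are a guess at typical kernel messages rather than a complete list.

```sh
# oom_events: filters kernel log text on stdin down to lines that look like
# OOM-killer activity. Prints nothing (and still succeeds) if none are found.
oom_events() {
    grep -i -E 'out of memory|oom[-_ ]?kill' || true
}

# Example, inside the machine (reading dmesg may require root):
#   dmesg | oom_events
```

An empty result does not rule out other causes of termination; it only suggests the OOM killer was not involved.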

Do you have any special wslconfig settings (memory constraints etc)?

If you run as rootful, do you observe the same behavior? You can do so without switching the VM by adding a -c to specify the rootful connection, like so:

podman -c podman-machine-default-root run .....

Be sure to keep using it when running ps / logs etc
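
To avoid forgetting the flag on follow-up commands, the connection can be wrapped in a small shell function. This is a sketch assuming a POSIX shell (for example Git Bash on Windows); rpodman and PODMAN_ROOT_CONNECTION are made-up names, and the connection name is the one mentioned above.

```sh
# Connection name taken from the comment above; override it if yours differs.
PODMAN_ROOT_CONNECTION="${PODMAN_ROOT_CONNECTION:-podman-machine-default-root}"

# rpodman: hypothetical wrapper that always passes the rootful connection,
# so that run, ps and logs all talk to the same engine.
rpodman() {
    podman -c "$PODMAN_ROOT_CONNECTION" "$@"
}

# Example (needs podman, so it is left commented out):
#   rpodman run -d --name postgres -p 5432:5432 -e POSTGRES_PASSWORD=postgres docker.io/library/postgres:14.2
#   rpodman ps
#   rpodman logs postgres
```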

@AddictArts
Author

@n1hility No, not only postgres. In that other issue you helped with (networking between WSL and Windows), hasura did exactly the same thing. If I started postgres with podman and then ran hasura against that podman instance, postgres would close and hasura would stay running. If I run hasura pointing at a postgres Windows service, it gracefully closes and exits, just like postgres does as I describe.

@AddictArts
Author

@n1hility I run podman rootless fyi

@n1hility
Member

n1hility commented May 9, 2022

@AddictArts thanks for confirming it's multiple types of containers. In addition to my questions above, I forgot to ask if you tried a full system restart. I assume you already did, but I just want to mention that WSL caches the kernel and a Hyper-V instance. You can force a kernel restart with wsl --shutdown, but sometimes Hyper-V can have issues as well.

The reason I asked about trying rootful is that the implementations are subtly different in a few areas. I'm hoping to find another clue as to why this is breaking for you.

@AddictArts
Author

AddictArts commented May 9, 2022

@n1hility No Hyper-V. WSL no longer needs it, so I did not install it. Yes, a full restart was performed. As mentioned, it appears the TTYs go away and the process exits due to that. Thanks

@n1hility
Member

@AddictArts right, to be clear, I was just referring to the internal dependency on the core Hyper-V hypervisor layer (not the full Hyper-V feature and tool chain), which, yes, you don't need.

For a -d with no -t there wouldn't be a tty, but the behavior would match session termination. Can you give the following a try:

podman machine stop

wsl --shutdown
wsl -d podman-machine-default 
# touch /var/lib/systemd/linger/user
# chown user:user /var/lib/systemd/linger/user
# chmod 644 /var/lib/systemd/linger/user
# exit

wsl --shutdown
podman machine start

Then retry and see if that solves the issue. I have a feeling it will.

BTW, if it doesn't, could you run the following and confirm the result:

wsl -d podman-machine-default
# su user
$ loginctl user-status

You should see that Linger is "yes".
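
The Linger field can also be extracted programmatically. The following sketch assumes the loginctl user-status text is piped in on stdin; linger_state is a hypothetical helper name. On systems where logind is reachable, loginctl enable-linger user is the usual way to create the same marker file under /var/lib/systemd/linger, though this thread does not confirm that it works inside the WSL machine.

```sh
# linger_state: reads `loginctl user-status` text on stdin and prints the
# value of its "Linger:" field (for example "yes" or "no").
linger_state() {
    awk '$1 == "Linger:" { print $2; exit }'
}

# Example, inside the machine after `su user`:
#   loginctl user-status | linger_state
```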

@n1hility
Member

Knowing this was the likely issue I was finally able to reproduce and will fix this in 4.1.1. You can use the above fix until then.

@AddictArts
Author

AddictArts commented May 20, 2022

Hi sorry @n1hility I've been traveling and gone. Looks like the issue is resolved. Thanks

[root /]# su user
[user /]$ loginctl user-status
Failed to execute 'pager', using next fallback pager: Permission denied
Failed to execute 'less', using next fallback pager: Permission denied
user (1000)
           Since: Fri 2022-05-20 15:43:55 PDT; 11min ago
           State: active
        Sessions: *c18
          Linger: yes
            Unit: user-1000.slice
                  ├─session-c18.scope
                  │ ├─602 su user
                  │ ├─603 bash
                  │ ├─622 loginctl user-status
                  │ └─623 more
                  └─user@1000.service
                    ├─app.slice
                    │ ├─dbus-broker.service
                    │ │ ├─122 /usr/bin/dbus-broker-launch --scope user
                    │ │ └─123 dbus-broker --log 4 --controller 9 --machine-id e158b87efc4e4362b6ab68681717dbf5 --max-bytes 100000000000000 --max-fds 25000000000000 --max-matches 5000000000
                    │ └─linger-example.service
                    │   └─44 /usr/bin/sleep infinity
                    ├─init.scope
                    │ ├─36 /usr/lib/systemd/systemd --user
                    │ └─37 "(sd-pam)"
                    └─user.slice
                      └─podman-pause-7fd93bb8.scope
                        └─100 catatonit -P

@n1hility
Member

@AddictArts cool! thanks for your patience in tracking that one down

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023

4 participants