Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Random behaviour when using stdout/stderr inside an entrypoint. #7978

Closed
patchon opened this issue Oct 9, 2020 · 7 comments
Closed

Random behaviour when using stdout/stderr inside an entrypoint. #7978

patchon opened this issue Oct 9, 2020 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@patchon
Copy link

patchon commented Oct 9, 2020

/kind bug

Description

Steps to reproduce the issue:

  1. Use the following Dockerfile,
FROM registry.access.redhat.com/ubi8/ubi-minimal:8.2
COPY entrypoint.sh /
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
  1. Use the following as entrypoint.sh,
#!/bin/bash

OC_NAMESPACE_ENV="weird"
oc_namespace_file="/run/secrets/kubernetes.io/serviceaccount/namespace"

echo ""

if ! [[ -f ${oc_namespace_file} ]]; then
  echo " [ WRN ] the namespace file doesn't exist (${oc_namespace_file})," \
       "using the OC_NAMESPACE ENV instead"  >&2

  if [[ -z ${OC_NAMESPACE_ENV} ]]; then
    echo -e " [ ERR ] could not use the namespace file (${oc_namespace_file})," \
            "nor the environment variable OC_NAMESPACE_ENV\n" >&2
    exit
  else
    oc_namespace=${OC_NAMESPACE_ENV}
  fi
else
  oc_namespace=$(cat ${oc_namespace_file})
fi
echo " [ INF ] found namespace '${oc_namespace}'"

Note how the error messages are redirected to stderr.

  1. Build with,
$ > podman build . -t weird:latest
  1. Run with,
$ > podman run weird:latest
  1. Notice the output.
    In my world the "[ INF ]"-string at the end of the entrypoint.sh should never be able to be printed before the "[WRN]"-string. But in random cases, the order is shifted, see output below,
$ >  podman run pisa-new:latest
 [ INF ] found namespace 'manpi-dev'
 [ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead

$ > podman run pisa-new:latest
 [ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
 [ INF ] found namespace 'manpi-dev'

$ > podman run pisa-new:latest
 [ INF ] found namespace 'manpi-dev'
 [ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead

$ > podman run pisa-new:latest
 [ INF ] found namespace 'manpi-dev'
 [ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
 
$ > podman run pisa-new:latest
 [ WRN ] the namespace file doesn't exist (/run/secrets/kubernetes.io/serviceaccount/namespace), using the OC_NAMESPACE ENV instead
 [ INF ] found namespace 'manpi-dev'

Note that if I remove the redirection in entrypoint.sh from the "[ERR]"-strings (>&2) the messages always gets printed in the right order (or at least I haven't been able to reproduce the issue).

As you see from the output above, the order is not consistent - which to me indicates that there is something weird going on, or that I'm totally missunderstanding something here.

Describe the results you received:
See point 5 above.

Describe the results you expected:
I expect the messages to be printed in an predictable way and not random, ie. "[INF]"-string should always be printed last.

Additional information you deem important (e.g. issue happens only occasionally):
It seems to me that it happens randomly, some sort of race triggers somewhere.

Output of podman version:

podman version
Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.15.2
Built:        Wed Oct  7 15:22:27 2020
OS/Arch:      linux/amd64

Output of podman info --debug:

$ > podman info --debug
host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.21-3.fc33.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 0f53fb68333bdead5fe4dc5175703e22cf9882ab'
  cpus: 1
  distribution:
    distribution: fedora
    version: "33"
  eventLogger: journald
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.8.13-300.fc33.x86_64
  linkmode: dynamic
  memFree: 612429824
  memTotal: 4119355392
  ociRuntime:
    name: crun
    package: crun-0.15-5.fc33.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.15
      commit: 56ca95e61639510c7dbd39ff512f80f626404969
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-4.dev.giteecccdb.fc33.x86_64
    version: |-
      slirp4netns version 1.1.4+dev
      commit: eecccdb96f587b11d7764556ffacfeaffe4b6e11
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.0
  swapFree: 2054680576
  swapTotal: 2059399168
  uptime: 18h 18m 4.83s (Approximately 0.75 days)
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - docker.io
store:
  configFile: /home/manpi/.config/containers/storage.conf
  containerStore:
    number: 166
    paused: 0
    running: 0
    stopped: 166
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.1.2-1.fc33.x86_64
      Version: |-
        fusermount3 version: 3.9.3
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.9.3
        using FUSE kernel interface version 7.31
  graphRoot: /home/manpi/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 410
  runRoot: /run/user/1000/containers
  volumePath: /home/manpi/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1602076947
  BuiltTime: Wed Oct  7 15:22:27 2020
  GitCommit: ""
  GoVersion: go1.15.2
  OsArch: linux/amd64
  Version: 2.1.1

Package info (e.g. output of rpm -q podman or apt list podman):

$ > rpm -q podman
podman-2.1.1-11.fc33.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?
I've checked the troubleshooting guide - but didn't find anything.
I have not built my own version of podman though =\

Additional environment details (AWS, VirtualBox, physical, etc.):
I'm running this in VirtualBox, with Fedora 33 as guest OS.
My host OS is macOS Catalina 10.15.7 (19H2).
I'm just using a standard terminal (iterm2), ssh'ing into my virtual Fedora and run podman inside there - I dont think that this should be an issue. Everything else is working "as normal", and as stated above, I get expected behaviour if I remove the redirection to stderr from within the entrypoint.sh.

Thanks for an awesome project! ❤️

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 9, 2020
@github-actions
Copy link

github-actions bot commented Nov 9, 2020

A friendly reminder that this issue had no activity for 30 days.

@rhatdan
Copy link
Member

rhatdan commented Nov 9, 2020

@baude @vrothberg @nalind @mtrmac @mheon @giuseppe

This looks like are randomly getting the STDERR message back after the STDOUT message even though STDERR happened first. This looks like it is an issue with go threading, and I am not sure we can fix it.

Thoughts?

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 9, 2020

UNIX pipes/sockets don’t come with timing information attached to every byte, so if stdout/stderr is consumed by conmon, or whatever it is, from two separate file descriptors, it’s not possible to reconstruct the original order. If you want that, you have to use a single pipe/socket for both, like a terminal (and then it’s not possible to distinguish output directed to stdout vs. stderr).

@patchon
Copy link
Author

patchon commented Nov 9, 2020

Thanks for clearing it up, I guess its not a huge issue, I was just very confused about the output order.

@mtrmac
Copy link
Collaborator

mtrmac commented Nov 9, 2020

Basically, with a

while stdout_open && stderr_open:
    stdout_readable, stdout_open, stderr_readable, stderr_open = poll(stdout_fd, stderr_fd)
    if stdout_readable: forward_stdout()
    if stderr_readable: forward_stderr()

it’s possible for the producer to write to stdout1, stderr, stdout2, in that order, and only later for the consumer to wake, and see both stderr_readable and stdout_writable, and the stdout buffer to contain (trailing parts of)stdout1+(leading parts of)stdout2, and no way to reconstruct the timing.

@rhatdan
Copy link
Member

rhatdan commented Nov 10, 2020

Since this can not be fixed easily and is really not a bug. Closing.

@rhatdan rhatdan closed this as completed Nov 10, 2020
@patchon
Copy link
Author

patchon commented Nov 11, 2020

Again, thanks for clearing it up and for the awesome work you're doing with podman. Really appreciate your hard work! 🙋🏼‍♂️

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 22, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

No branches or pull requests

4 participants