/dev/null can't be found inside the container #1412

Closed
EmilienM opened this issue Sep 5, 2018 · 25 comments

@EmilienM
Contributor

EmilienM commented Sep 5, 2018

/kind bug

Description
/dev/null can't be found inside the container, which is a problem for applications such as iSCSI and other OpenStack services that require access to /dev/null.

Steps to reproduce the issue:
Start a container which needs access to /dev/null

Describe the results you received:
From strace:
[pid 64887] stat("/dev/null", 0xc420106818) = -1 ENOENT (No such file or directory)

Full podman debug:
http://paste.openstack.org/show/hDyRznDV1Wv5K4lqNNVr/
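
For anyone trying to confirm the same symptom from inside a container, here is a minimal sketch of a check that mirrors the `stat("/dev/null")` call in the strace output above. The function name `describe_dev_null` is hypothetical, and the device numbers (major 1, minor 3) are the usual Linux values for the null device; this is a diagnostic aid, not something podman ships.

```python
import os
import stat


def describe_dev_null(path="/dev/null"):
    """Return a short status string for the device node at `path`."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        # Same condition that strace reports as ENOENT.
        return "missing"
    if not stat.S_ISCHR(st.st_mode):
        return "not a character device"
    # Assumption: Linux numbering, where the null device is char 1:3.
    if (os.major(st.st_rdev), os.minor(st.st_rdev)) != (1, 3):
        return "unexpected device numbers"
    return "ok"


if __name__ == "__main__":
    print(describe_dev_null())
```

Running the script inside the affected container should print `missing` if the node is absent, as in the strace line above.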

Describe the results you expected:
Container starts correctly.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

Version:       0.9.1-dev                                                                      
Go Version:    go1.10.2
OS/Arch:       linux/amd64

Output of podman info:

host:
  Conmon:
    package: podman-0.8.2.1-1.gitf38eb4f.el7.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 1.12.0-dev, commit: 74bba30adae4fdcacf887d35e4d6331318765028-dirty'
  MemFree: 4904316928
  MemTotal: 16657346560
  OCIRuntime:
    package: runc-1.0.0-37.rc5.dev.gitad0f525.el7.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.0'
  SwapFree: 1610608640
  SwapTotal: 1610608640
  arch: amd64
  cpus: 4
  hostname: undercloud.localdomain
  kernel: 3.10.0-862.11.6.el7.x86_64
  os: linux
  uptime: 5h 29m 8.39s (Approximately 0.21 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
  - registry.centos.org
  - registry.access.redhat.com
store:
  ContainerStore:
    number: 49
  GraphDriverName: overlay
  GraphOptions:
  - overlay.override_kernel_check=true
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
  ImageStore:
    number: 29
  RunRoot: /var/run/containers/storage

Additional environment details (AWS, VirtualBox, physical, etc.):
Running it in a VM managed by libvirt/kvm.

@mheon
Member

mheon commented Sep 5, 2018

This looks like it's related to a /dev:/dev mount - we're mounting in /dev/ as a bind-mount, but still leaving the default devtmpfs mount in the config. Need to ensure we don't allow conflicting mounts in the config.
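
To illustrate what "no conflicting mounts in the config" could mean, here is a minimal sketch that merges a default mount list with user-supplied mounts so that each destination appears only once, with the user mount winning. The function name `merge_mounts`, the dict layout, and the `origin` key are assumptions made for the example; this is not podman's actual code or the eventual fix.

```python
def merge_mounts(default_mounts, user_mounts):
    """Merge two mount lists so each destination appears at most once.

    Each mount is a dict with at least a "destination" key. When a user
    mount targets the same destination as a default, the user mount wins.
    """
    def norm(dest):
        # Treat "/dev/" and "/dev" as the same destination.
        return dest.rstrip("/") or "/"

    merged = {}
    for mount in list(default_mounts) + list(user_mounts):
        # Later entries (user mounts) overwrite earlier ones (defaults).
        merged[norm(mount["destination"])] = mount
    return list(merged.values())


if __name__ == "__main__":
    defaults = [
        {"destination": "/proc", "origin": "default"},
        {"destination": "/dev", "origin": "default"},
    ]
    user = [{"destination": "/dev/", "source": "/dev", "origin": "user"}]
    for mount in merge_mounts(defaults, user):
        print(mount["destination"], mount["origin"])
```

With the sample input, the default `/dev` entry is dropped in favor of the user's `/dev/` bind mount, so only one mount targets that destination.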

@rhatdan
Member

rhatdan commented Sep 6, 2018

What is the exact podman command that is causing this issue?

@EmilienM
Contributor Author

EmilienM commented Sep 6, 2018

@rhatdan

"Error running ['podman', '--log-level=debug', 'run', '--name', 'iscsid', '--label', 'config_id=tripleo_step3', '--label', 'container_name=iscsid', '--label', 'managed_by=paunch', '--label', 'config_data={"start_order": 2, "healthcheck": {"test": "/openstack/healthcheck"}, "image": "docker.io/tripleomaster/centos-binary-iscsid:9ad93affedba8870315dd72c714770875ce24759_b72f0c42", "environment": ["KOLLA_CONFIG_STRATEGY=COPY_ALWAYS"], "volumes": ["/etc/hosts:/etc/hosts:ro", "/etc/localtime:/etc/localtime:ro", "/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro", "/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro", "/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro", "/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro", "/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro", "/dev/log:/dev/log", "/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro", "/etc/puppet:/etc/puppet:ro", "/var/lib/kolla/config_files/iscsid.json:/var/lib/kolla/config_files/config.json:ro", "/dev/:/dev/", "/run/:/run/", "/sys:/sys", "/lib/modules:/lib/modules:ro", "/etc/iscsi:/var/lib/kolla/config_files/src-iscsid:ro"], "net": "host", "privileged": true, "restart": "always"}', '--detach=true', '--env=KOLLA_CONFIG_STRATEGY=COPY_ALWAYS', '--net=host', '--privileged=true', '--volume=/etc/hosts:/etc/hosts:ro', '--volume=/etc/localtime:/etc/localtime:ro', '--volume=/etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro', '--volume=/etc/pki/ca-trust/source/anchors:/etc/pki/ca-trust/source/anchors:ro', '--volume=/etc/pki/tls/certs/ca-bundle.crt:/etc/pki/tls/certs/ca-bundle.crt:ro', '--volume=/etc/pki/tls/certs/ca-bundle.trust.crt:/etc/pki/tls/certs/ca-bundle.trust.crt:ro', '--volume=/etc/pki/tls/cert.pem:/etc/pki/tls/cert.pem:ro', '--volume=/dev/log:/dev/log', '--volume=/etc/ssh/ssh_known_hosts:/etc/ssh/ssh_known_hosts:ro', '--volume=/etc/puppet:/etc/puppet:ro', 
'--volume=/var/lib/kolla/config_files/iscsid.json:/var/lib/kolla/config_files/config.json:ro', '--volume=/dev/:/dev/', '--volume=/run/:/run

@mheon mheon added the bug label Sep 6, 2018
@mheon
Member

mheon commented Sep 6, 2018

Going to give this a go after lunch

@rhatdan
Member

rhatdan commented Sep 6, 2018

Why would /dev/null disappear?

@mheon
Member

mheon commented Sep 6, 2018

@rhatdan @mrunalp was fairly certain it related to an overlapping mount on /dev/ - the default devtmpfs and a bind-mount of host /dev over it. runc seems to have issues dealing with this.

openstack-gerrit pushed a commit to openstack-archive/tripleo-heat-templates that referenced this issue Sep 7, 2018
We currently hit this bug: containers/podman#1412
In order to move forward, let's bind-mount /dev/null into the container
until the bug is fixed. Note, it doesn't hurt docker deployment as we
already mounted /dev.

Related-Bug: #1791167
Change-Id: I0e885c248bb08c04fb9b7efa9e075e692879b450
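
The workaround described in that commit message amounts to adding one more bind mount to the container's volume list. A minimal sketch of the idea, assuming the `--volume=SRC:DST` argument form seen in the podman command earlier in this thread; the helper name `volume_args` is hypothetical and this is not the actual tripleo-heat-templates change:

```python
def volume_args(volumes, extra=("/dev/null:/dev/null",)):
    """Build `--volume=` arguments, appending any missing extra binds."""
    result = list(volumes)
    for bind in extra:
        # Avoid adding the same bind twice if it is already configured.
        if bind not in result:
            result.append(bind)
    return ["--volume=" + v for v in result]


if __name__ == "__main__":
    print(volume_args(["/dev/:/dev/", "/run/:/run/"]))
```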
@rhatdan
Member

rhatdan commented Sep 8, 2018

I would like to see a simpler reproducer for this.
@mheon Have you attempted to run the huge command above and seen no /dev/null?

@mheon
Member

mheon commented Sep 8, 2018

@rhatdan I haven't managed to reproduce independently, but @EmilienM has a consistent reproducer on his environment, and let me debug over there

@mheon
Member

mheon commented Sep 12, 2018

@EmilienM With #1419 merged, this should be fixed, but I'll leave it open until you can confirm

@EmilienM
Contributor Author

@mheon ack - I'll test it asap this week. Thanks for the heads-up.

@EmilienM
Contributor Author

@mheon I built podman from master and tried a redeploy, it's still failing. Complete trace: https://paste.fedoraproject.org/paste/Rx9jBzlH8VAw2iRUlilzow
I'm now trying to re-deploy without the /dev/null bind-mount that I had to do and see how it would work.

@EmilienM
Contributor Author

well, it's still not working: http://paste.openstack.org/show/V7VOIzpvBDLPVbBI02NS/
I'll rebuild podman and retry tomorrow morning.

@EmilienM
Contributor Author

I haven't been able to make it work yet, but I'm not giving up. I'll update here when I'm done with this bug.

@mheon
Member

mheon commented Sep 13, 2018

Alright, we've moved from /dev/null missing to CGroups errors. That's progress, at least

@mheon
Member

mheon commented Sep 13, 2018

I bet this is related to the issues with -v /sys

@EmilienM
Contributor Author

oh right, we moved forward indeed. I guess I'll revert my workaround. Thanks! I'll probably close the card if we can confirm it's -v /sys or next week.

@EmilienM
Contributor Author

These are the latest results from today:

"stderr: container create failed: container_linux.go:336: starting container process caused "process_linux.go:399: container init caused \"rootfs_linux.go:58: mounting \\\"/var/lib/containers/storage/overlay-containers/20c572c4fa8107b525c274c9dcf47ef57de4177d1486651ade322ab9cf78dda8/userdata/cgroup\\\" to rootfs \\\"/var/lib/containers/storage/overlay/c398946e69285ed58a50530082b05958bb2b8e1e2d9b309de022affe30dca57e/merged\\\" at \\\"/sys/fs/cgroup\\\" caused \\\"stat /sys/fs/cgroup/systemd/machine.slice/libpod-20c572c4fa8107b525c274c9dcf47ef57de4177d1486651ade322ab9cf78dda8.scope: no such file or directory\\\"\""",
": internal libpod error",
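
The error above is a chain of `caused "..."` messages, each level adding another layer of backslash escaping, which makes the real failure hard to spot. A minimal sketch of a helper that strips the escaping and returns the last link of the chain; the name `innermost_cause` is hypothetical, and the splitting rule is an assumption based only on the message format shown here.

```python
def innermost_cause(message):
    """Return the last link in a runc-style `caused "..."` error chain."""
    # Drop the quoting and backslash escaping added at each nesting level.
    flat = message.replace("\\", "").replace('"', "")
    # The innermost error is whatever follows the final " caused ".
    return flat.split(" caused ")[-1].strip()


if __name__ == "__main__":
    sample = (
        r'starting container process caused "container init caused '
        r'\"mounting ... caused \\\"stat /sys/fs/cgroup/systemd/x.scope: '
        r'no such file or directory\\\"\""'
    )
    print(innermost_cause(sample))
```

Applied to the message above, this would surface the failing `stat` on the cgroup scope path rather than the outer wrappers.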

@mheon
Member

mheon commented Sep 14, 2018

Alright, this does sound like /sys - I'll keep looking into it on Monday

@EmilienM
Contributor Author

so my iscsid container is not working at all:

 podman logs iscsid
+ sudo -E kolla_set_configs
INFO:__main__:Loading config file at /var/lib/kolla/config_files/config.json
INFO:__main__:Validating config file
INFO:__main__:Kolla config strategy set to: COPY_ALWAYS
INFO:__main__:Copying service configuration files
INFO:__main__:Deleting /etc/iscsi/iscsid.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src-iscsid/iscsid.conf to /etc/iscsi/iscsid.conf
INFO:__main__:Copying /var/lib/kolla/config_files/src-iscsid/initiatorname.iscsi to /etc/iscsi/initiatorname.iscsi
INFO:__main__:Writing out command to execute
++ cat /run_command
+ CMD='/usr/sbin/iscsid -f'
+ ARGS=
+ [[ ! -n '' ]]
+ . kolla_extend_start
++ [[ ! -f /etc/iscsi/initiatorname.iscsi ]]
+ echo 'Running command: '\''/usr/sbin/iscsid -f'\'''
Running command: '/usr/sbin/iscsid -f'
+ exec /usr/sbin/iscsid -f
iscsid: Could not set session1 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection1:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.15,3260] through [iface: default] is operational now
iscsid: Could not set session2 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection2:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.6,3260] through [iface: default] is operational now
iscsid: Could not set session3 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection3:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.8,3260] through [iface: default] is operational now
iscsid: Connection1:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.15,3260] through [iface: default] is shutdown.
iscsid: Connection2:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.6,3260] through [iface: default] is shutdown.
iscsid: Connection3:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.8,3260] through [iface: default] is shutdown.
iscsid: Could not set session4 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection4:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.11,3260] through [iface: default] is operational now
iscsid: Connection4:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.11,3260] through [iface: default] is shutdown.
iscsid: Could not set session5 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection5:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.15,3260] through [iface: default] is operational now
iscsid: Connection5:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.15,3260] through [iface: default] is shutdown.
iscsid: Could not set session6 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection6:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.9,3260] through [iface: default] is operational now
iscsid: Connection6:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.9,3260] through [iface: default] is shutdown.
iscsid: Could not set session7 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection7:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.14,3260] through [iface: default] is operational now
iscsid: Connection7:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.14,3260] through [iface: default] is shutdown.
iscsid: Could not set session8 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection8:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.8,3260] through [iface: default] is operational now
iscsid: Connection8:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.8,3260] through [iface: default] is shutdown.
iscsid: Could not set session9 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection9:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.22,3260] through [iface: default] is operational now
iscsid: Connection9:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.22,3260] through [iface: default] is shutdown.
iscsid: Could not set session10 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection10:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.12,3260] through [iface: default] is operational now
iscsid: Connection10:0 to [target: iqn.2008-10.org.openstack:a07fb57d-fe2e-4f10-9fd4-8cec3d47aa0f, portal: 192.168.24.12,3260] through [iface: default] is shutdown.
iscsid: Could not set session11 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection11:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.15,3260] through [iface: default] is operational now
iscsid: Connection11:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.15,3260] through [iface: default] is shutdown.
iscsid: Could not set session12 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection12:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.11,3260] through [iface: default] is operational now
iscsid: Connection12:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.11,3260] through [iface: default] is shutdown.
iscsid: Could not set session13 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection13:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.13,3260] through [iface: default] is operational now
iscsid: Connection13:0 to [target: iqn.2008-10.org.openstack:3c8e6a6a-a6fc-450b-8c0b-3ce4e97aab5f, portal: 192.168.24.13,3260] through [iface: default] is shutdown.
time="2018-09-14T22:00:43Z" level=error msg="exec failed: container_linux.go:336: starting container process caused \"open /dev/ptmx: no such file or directory\"
"
iscsid: Could not set session14 priority. READ/WRITE throughout and latency could be affected.
iscsid: Connection14:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.16,3260] through [iface: default] is operational now
iscsid: Connection14:0 to [target: iqn.2008-10.org.openstack:ea9c786e-7efa-4efc-b4cc-ca81a3f34237, portal: 192.168.24.16,3260] through [iface: default] is shutdown.
time="2018-09-14T22:01:08Z" level=error msg="exec failed: container_linux.go:336: starting container process caused \"open /dev/ptmx: no such file or directory\"
"
time="2018-09-14T22:01:39Z" level=error msg="exec failed: container_linux.go:336: starting container process caused \"open /dev/ptmx: no such file or directory\"
"

@rhatdan
Member

rhatdan commented Sep 16, 2018

Could you try this with podman 0.9.2?

@EmilienM
Contributor Author

testing today with master, will post results shortly.

@EmilienM
Contributor Author

So I removed my workaround that mounted /dev/null in the iscsid container, and it now works with podman 0.9.2!
@mheon @rhatdan Thank you!

Next: I'm going to redeploy with my systemd unit files and see how far I can go.

@mheon
Member

mheon commented Sep 16, 2018

Huh. I still don't have /sys containers working locally, but that could be an environment issue on my end... I'll check it out on Monday.

@rhatdan
Member

rhatdan commented Sep 17, 2018

@mheon I think you are seeing a red herring, or a different issue.

@rhatdan rhatdan closed this as completed Sep 17, 2018
@mheon
Member

mheon commented Sep 17, 2018

@rhatdan Agree it's a different issue, still tracking it down. I think it has something to do with our /dev/pts mount.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023