This repository has been archived by the owner on Dec 30, 2020. It is now read-only.

Cannot spawn container - no loop devices available #373

Open
ghost opened this issue Apr 14, 2020 · 2 comments
Labels
bug Something isn't working

Comments

@ghost

ghost commented Apr 14, 2020

What are the steps to reproduce this issue?

  1. Create a kubernetes deployment
  2. Pods fail to be deployed

What happens?

Pods fail to be created with:

Error: could not create container: could not spawn container: could not create oci bundle: could not create SIF bundle: failed to find loop device: failed to attach image /var/lib/singularity/c894da36f0c207a33e862b9a38b3a66d7e02857aa959493df3ff455830f305f8: no loop devices available

What were you expecting to happen?

Pod to be created.

Any logs, error output, comments, etc?

There are plenty of loop devices:

[k8swrk3]/tmp/singularity-cri% ls -l /dev | grep -i loop | wc -l       
1612

I've tried this with setuid both set and unset on the singularity binary, with no joy, as I saw reports that it can cause issues.

sycri logs and kube manifest appended at the bottom.

I can also run containers happily on the workers directly:

[robinsla@k8swrk3]/tmp/singularity-cri% singularity run shub://GodloveD/lolcow 
 ________________________________________
/ Tomorrow will be cancelled due to lack \
\ of interest.                           /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Environment?

OS distribution and version: RHEL 7 (3.10.0-1062.12.1.el7.x86_64)

go version: go1.13.3

Singularity-CRI version: 1.0.0-beta.7

Singularity version: 3.5.2-1.1

Kubernetes version: v1.18.0

Manifest

  1 apiVersion: apps/v1
  2 kind: Deployment
  3 metadata:
  4   labels:
  5     run: hello-world
  6   name: hello-world
  7 spec:
  8   selector:
  9     matchLabels:
 10       run: hello-world
 11   replicas: 4 
 12   template:
 13     metadata:
 14       labels:
 15         run: hello-world
 16     spec:
 17       hostNetwork: true
 18       containers:
 19       - name: nginx
 20         image: nginx:1.7.9
 21         ports:
 22         - containerPort: 80
 23           name: web
 24           protocol: TCP

SYCRI logs

Apr 14 07:08:39  sycri[30593]: E0414 07:08:39.924688   30593 main.go:276] /runtime.v1alpha2.RuntimeService/CreateContainer
Apr 14 07:08:39  sycri[30593]: Request: {"pod_sandbox_id":"a03bcbd94bbde213829948d35dd4a58a8d94752c48805185e2a87993d20d6d38","config":{"metadata":{"name":"nginx"},"image":{"image":"a76b355b668c43aca9432a3e8e15b2f17878966fbebadebcb7d45df68b314dd3"},"envs":[{"key":"KUBERNETES_SERVICE_HOST","value":"192.168.240.1"},{"key":"KUBERNETES_SERVICE_PORT","value":"443"},{"key":"KUBERNETES_SERVICE_PORT_HTTPS","value":"443"},{"key":"KUBERNETES_PORT","value":"tcp://192.168.240.1:443"},{"key":"KUBERNETES_PORT_443_TCP","value":"tcp://192.168.240.1:443"},{"key":"KUBERNETES_PORT_443_TCP_PROTO","value":"tcp"},{"key":"KUBERNETES_PORT_443_TCP_PORT","value":"443"},{"key":"KUBERNETES_PORT_443_TCP_ADDR","value":"192.168.240.1"}],"mounts":[{"container_path":"/var/run/secrets/kubernetes.io/serviceaccount","host_path":"/var/lib/kubelet/pods/8bff18c4-3e7b-4a0d-b732-883a3cef54b9/volumes/kubernetes.io~secret/default-token-85h4q","readonly":true,"selinux_relabel":true},{"container_path":"/etc/hosts","host_path":"/var/lib/kubelet/pods/8bff18c4-3e7b-4a0d-b732-883a3cef54b9/etc-hosts","selinux_relabel":true},{"container_path":"/dev/termination-log","host_path":"/var/lib/kubelet/pods/8bff18c4-3e7b-4a0d-b732-883a3cef54b9/containers/nginx/31a5bd44","selinux_relabel":true}],"labels":{"io.kubernetes.container.name":"nginx","io.kubernetes.pod.name":"hello-world-686ff49dc9-pv2rr","io.kubernetes.pod.namespace":"default","io.kubernetes.pod.uid":"8bff18c4-3e7b-4a0d-b732-883a3cef54b9"},"annotations":{"io.kubernetes.container.hash":"cf08d707","io.kubernetes.container.ports":"[{\"name\":\"web\",\"hostPort\":80,\"containerPort\":80,\"protocol\":\"TCP\"}]","io.kubernetes.container.restartCount":"0","io.kubernetes.container.terminationMessagePath":"/dev/termination-log","io.kubernetes.container.terminationMessagePolicy":"File","io.kubernetes.pod.terminationGracePeriod":"30"},"log_path":"nginx/0.log","linux":{"resources":{"cpu_period":100000,"cpu_shares":2,"oom_score_adj":1000},"security_context":{"namespace_options"
:{"network":2,"pid":1},"run_as_user":{},"seccomp_profile_path":"unconfined","masked_paths":["/proc/acpi","/proc/kcore","/proc/keys","/proc/latency_stats","/proc/timer_list","/proc/timer_stats","/proc/sched_debug","/proc/scsi","/sys/firmware"],"readonly_paths":["/proc/asound","/proc/bus","/proc/fs","/proc/irq","/proc/sys","/proc/sysrq-trigger"]}}},"sandbox_config":{"metadata":{"name":"hello-world-686ff49dc9-pv2rr","uid":"8bff18c4-3e7b-4a0d-b732-883a3cef54b9","namespace":"default"},"log_directory":"/var/log/pods/default_hello-world-686ff49dc9-pv2rr_8bff18c4-3e7b-4a0d-b732-883a3cef54b9","port_mappings":[{"container_port":80,"host_port":80}],"labels":{"io.kubernetes.pod.name":"hello-world-686ff49dc9-pv2rr","io.kubernetes.pod.namespace":"default","io.kubernetes.pod.uid":"8bff18c4-3e7b-4a0d-b732-883a3cef54b9","pod-template-hash":"686ff49dc9","run":"hello-world"},"annotations":{"kubernetes.io/config.seen":"2020-04-14T07:00:40.341542095-04:00","kubernetes.io/config.source":"api"},"linux":{"cgroup_parent":"/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-pod8bff18c4_3e7b_4a0d_b732_883a3cef54b9.slice","security_context":{"namespace_options":{"network":2,"pid":1}}}}}
Apr 14 07:08:39  sycri[30593]: Response: null
Apr 14 07:08:39  sycri[30593]: Error: rpc error: code = Internal desc = could not create container: could not spawn container: could not create oci bundle: could not create SIF bundle: failed to find loop device: failed to attach image /var/lib/singularity/a76b355b668c43aca9432a3e8e15b2f17878966fbebadebcb7d45df68b314dd3: no loop devices available
@ghost ghost added the bug (Something isn't working) label Apr 14, 2020
@LincolnBryant

I want to plus one this issue. We also see it on Kubernetes 1.17.

@LincolnBryant

We have found a reason why this happens.

When containers are stuck in a CrashLoopBackOff state, Singularity-CRI seems to exhaust the pool of loop devices faster than it can clean them up.

You can confirm this by comparing the number of attached loop devices to the number of loop device nodes in your /dev filesystem:

losetup --list | wc -l
ls /dev/loop[0-9]* | wc -l

If they match, all of your available loop devices are in use.
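The comparison above can be wrapped in a small script (my own sketch, not part of Singularity-CRI). One caveat worth noting: `losetup --list` prints a header row whenever any devices are attached, so the attached count below skips the first line:

```shell
#!/bin/sh
# Sketch: compare attached loop devices to available loop device nodes.
# `losetup --list` emits a header row when its output is non-empty, so skip it.
attached=$(losetup --list 2>/dev/null | tail -n +2 | wc -l)
total=$(ls /dev/loop[0-9]* 2>/dev/null | wc -l)
echo "attached=$attached total=$total"
if [ "$attached" -ge "$total" ] && [ "$total" -gt 0 ]; then
    echo "loop device pool exhausted"
fi
```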

I suppose you could try increasing the number of loop devices, but it really should be investigated why Singularity is unable to clean up the old, unused ones.
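If you do want to experiment with enlarging the pool as a stopgap, a rough sketch follows (requires root; it assumes the standard block major number 7 for loop devices, which you should verify on your system). On kernels with the loop-control interface, `losetup -f` can also allocate a free device on demand.

```shell
#!/bin/sh
# Sketch (run as root): create 64 additional loop device nodes.
# Loop devices use block major number 7; the minor number is the device index.
next=$(ls /dev/loop[0-9]* 2>/dev/null | wc -l)
for i in $(seq "$next" $((next + 63))); do
    [ -e "/dev/loop$i" ] || mknod -m 0660 "/dev/loop$i" b 7 "$i"
done
```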
