singularity won't start and shows "Bus error (core dumped)" #5334

Closed
soichih opened this issue Jun 2, 2020 · 2 comments

Contributor

soichih commented Jun 2, 2020

I am using Singularity version 3.5.2.

When I run the following command on our Slurm cluster (and on the head node)

$ singularity -d exec -e docker://brainlife/mrtrix3:3.0_RC3 ./run.sh

I get the following error message.

$ singularity -d exec -e docker://brainlife/mrtrix3:3.0_RC3 ./run.sh
DEBUG   [U=1001,P=3485]    persistentPreRunE()           Singularity version: 3.5.2
DEBUG   [U=1001,P=3485]    handleConfDir()               /home/user/.singularity already exists. Not creating.
DEBUG   [U=1001,P=3485]    setValue()                    Updated flag 'bind' value to: [/export]
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/library
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/oci-tmp
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/oci
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/net
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/shub
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/oras
DEBUG   [U=1001,P=3485]    parseURI()                    Parsing docker://brainlife/mrtrix3:3.0_RC3 into reference
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/oci-tmp/c71e252ab45b69ce96ff07e4ea175392811eae8def326e90a74c3a89b4f9a7e9
DEBUG   [U=1001,P=3485]    updateCacheSubdir()           Caching directory set to /export/singularity/cache/oci-tmp/c71e252ab45b69ce96ff07e4ea175392811eae8def326e90a74c3a89b4f9a7e9
DEBUG   [U=1001,P=3485]    execStarter()                 Checking for encrypted system partition
DEBUG   [U=1001,P=3485]    Init()                        Image format detection
DEBUG   [U=1001,P=3485]    Init()                        Check for sandbox image format
DEBUG   [U=1001,P=3485]    Init()                        sandbox format initializer returned: not a directory image
DEBUG   [U=1001,P=3485]    Init()                        Check for sif image format
DEBUG   [U=1001,P=3485]    Init()                        sif image format detected
VERBOSE [U=1001,P=3485]    SetContainerEnv()             Not forwarding SINGULARITY_BINDPATH from user to container environment
VERBOSE [U=1001,P=3485]    SetContainerEnv()             Not forwarding SINGULARITY_CACHEDIR from user to container environment
VERBOSE [U=1001,P=3485]    SetContainerEnv()             HOME=/home/user
DEBUG   [U=1001,P=3485]    init()                        Use starter binary /usr/local/libexec/singularity/bin/starter-suid
VERBOSE [U=0,P=3485]       print()                       Set messagelevel to: 5
VERBOSE [U=0,P=3485]       init()                        Starter initialization
DEBUG   [U=0,P=3485]       load_overlay_module()         Trying to load overlay kernel module
DEBUG   [U=0,P=3485]       load_overlay_module()         Overlay seems supported by the kernel
DEBUG   [U=0,P=3485]       get_pipe_exec_fd()            PIPE_EXEC_FD value: 9
VERBOSE [U=0,P=3485]       is_suid()                     Check if we are running as setuid
VERBOSE [U=0,P=3485]       priv_drop()                   Drop root privileges
DEBUG   [U=1001,P=3485]    init()                        Read engine configuration
DEBUG   [U=1001,P=3485]    init()                        Wait completion of stage1
VERBOSE [U=1001,P=3512]    priv_drop()                   Drop root privileges permanently
DEBUG   [U=1001,P=3512]    set_parent_death_signal()     Set parent death signal to 9
VERBOSE [U=1001,P=3512]    init()                        Spawn stage 1
DEBUG   [U=1001,P=3512]    startup()                     singularity runtime engine selected
VERBOSE [U=1001,P=3512]    startup()                     Execute stage 1
DEBUG   [U=1001,P=3512]    StageOne()                    Entering stage 1
DEBUG   [U=1001,P=3512]    prepareAutofs()               Found "/proc/sys/fs/binfmt_misc" as autofs mount point
DEBUG   [U=1001,P=3512]    prepareAutofs()               Could not keep file descriptor for user bind path /export: no mount point
DEBUG   [U=1001,P=3512]    prepareAutofs()               Could not keep file descriptor for bind path /etc/localtime: no mount point
DEBUG   [U=1001,P=3512]    prepareAutofs()               Could not keep file descriptor for bind path /etc/hosts: no mount point
DEBUG   [U=1001,P=3512]    prepareAutofs()               Could not keep file descriptor for home directory /home/user: no mount point
DEBUG   [U=1001,P=3512]    prepareAutofs()               Could not keep file descriptor for current working directory /export/prod/5eb954f615be15254a4c4011/5ed64bf1529ab43f0984245c: no mount point
DEBUG   [U=1001,P=3512]    Init()                        Image format detection
DEBUG   [U=1001,P=3512]    Init()                        Check for sandbox image format
DEBUG   [U=1001,P=3512]    Init()                        sandbox format initializer returned: not a directory image
DEBUG   [U=1001,P=3512]    Init()                        Check for sif image format
DEBUG   [U=1001,P=3512]    Init()                        sif image format detected
DEBUG   [U=1001,P=3512]    setSessionLayer()             Overlay seems supported and allowed by kernel
DEBUG   [U=1001,P=3512]    setSessionLayer()             Attempting to use overlayfs (enable overlay = try)
VERBOSE [U=1001,P=3485]    wait_child()                  stage 1 exited with status 0
DEBUG   [U=1001,P=3485]    cleanup_fd()                  Close file descriptor 4
DEBUG   [U=1001,P=3485]    cleanup_fd()                  Close file descriptor 5
DEBUG   [U=1001,P=3485]    cleanup_fd()                  Close file descriptor 6
DEBUG   [U=1001,P=3485]    init()                        Set child signal mask
DEBUG   [U=1001,P=3485]    init()                        Create socketpair for master communication channel
DEBUG   [U=1001,P=3485]    init()                        Create RPC socketpair for communication between stage 2 and RPC server
VERBOSE [U=1001,P=3485]    priv_escalate()               Get root privileges
VERBOSE [U=0,P=3485]       priv_escalate()               Change filesystem uid to 1001
VERBOSE [U=0,P=3485]       init()                        Spawn master process
DEBUG   [U=0,P=3530]       set_parent_death_signal()     Set parent death signal to 9
VERBOSE [U=0,P=3530]       create_namespace()            Create mount namespace
VERBOSE [U=0,P=3485]       enter_namespace()             Entering in mount namespace
DEBUG   [U=0,P=3485]       enter_namespace()             Opening namespace file ns/mnt
VERBOSE [U=0,P=3485]       priv_drop()                   Drop root privileges
VERBOSE [U=0,P=3530]       create_namespace()            Create mount namespace
DEBUG   [U=0,P=3531]       set_parent_death_signal()     Set parent death signal to 9
VERBOSE [U=0,P=3531]       init()                        Spawn RPC server
DEBUG   [U=1001,P=3485]    startup()                     singularity runtime engine selected
VERBOSE [U=1001,P=3485]    startup()                     Execute master process
DEBUG   [U=0,P=3531]       startup()                     singularity runtime engine selected
VERBOSE [U=0,P=3531]       startup()                     Serve RPC requests
DEBUG   [U=1001,P=3485]    setupSessionLayout()          Using Layer system: overlay
DEBUG   [U=1001,P=3485]    setupOverlayLayout()          Creating overlay SESSIONDIR layout
DEBUG   [U=1001,P=3485]    addRootfsMount()              Mount rootfs in read-only mode
DEBUG   [U=1001,P=3485]    addRootfsMount()              Image type is 4096
DEBUG   [U=1001,P=3485]    addRootfsMount()              Mounting block [squashfs] image: /export/singularity/cache/oci-tmp/c71e252ab45b69ce96ff07e4ea175392811eae8def326e90a74c3a89b4f9a7e9/mrtrix3_3.0_RC3.sif
DEBUG   [U=1001,P=3485]    addKernelMount()              Checking configuration file for 'mount proc'
DEBUG   [U=1001,P=3485]    addKernelMount()              Adding proc to mount list
VERBOSE [U=1001,P=3485]    addKernelMount()              Default mount: /proc:/proc
DEBUG   [U=1001,P=3485]    addKernelMount()              Checking configuration file for 'mount sys'
DEBUG   [U=1001,P=3485]    addKernelMount()              Adding sysfs to mount list
VERBOSE [U=1001,P=3485]    addKernelMount()              Default mount: /sys:/sys
DEBUG   [U=1001,P=3485]    addDevMount()                 Checking configuration file for 'mount dev'
DEBUG   [U=1001,P=3485]    addDevMount()                 Adding dev to mount list
VERBOSE [U=1001,P=3485]    addDevMount()                 Default mount: /dev:/dev
DEBUG   [U=1001,P=3485]    addHostMount()                Not mounting host file systems per configuration
VERBOSE [U=1001,P=3485]    addBindsMount()               Found 'bind path' = /etc/localtime, /etc/localtime
VERBOSE [U=1001,P=3485]    addBindsMount()               Found 'bind path' = /etc/hosts, /etc/hosts
DEBUG   [U=1001,P=3485]    addHomeStagingDir()           Staging home directory (/home/user) at /usr/local/var/singularity/mnt/session/home/user
DEBUG   [U=1001,P=3485]    addHomeMount()                Adding home directory mount [/usr/local/var/singularity/mnt/session/home/user:/home/user] to list using layer: overlay
DEBUG   [U=1001,P=3485]    addUserbindsMount()           Adding /export to mount list
DEBUG   [U=1001,P=3485]    addUserbindsMount()           Checking for 'user bind control' in configuration file
DEBUG   [U=1001,P=3485]    addTmpMount()                 Checking for 'mount tmp' in configuration file
VERBOSE [U=1001,P=3485]    addTmpMount()                 Default mount: /tmp:/tmp
VERBOSE [U=1001,P=3485]    addTmpMount()                 Default mount: /var/tmp:/var/tmp
DEBUG   [U=1001,P=3485]    addScratchMount()             Not mounting scratch directory: Not requested
DEBUG   [U=1001,P=3485]    addCwdMount()                 Using /export/prod/5eb954f615be15254a4c4011/5ed64bf1529ab43f0984245c as current working directory
VERBOSE [U=1001,P=3485]    addCwdMount()                 Default mount: /export/prod/5eb954f615be15254a4c4011/5ed64bf1529ab43f0984245c: to the container
DEBUG   [U=1001,P=3485]    addLibsMount()                Checking for 'user bind control' in configuration file
DEBUG   [U=1001,P=3485]    addFilesMount()               Checking for 'user bind control' in configuration file
DEBUG   [U=1001,P=3485]    addResolvConfMount()          Adding /etc/resolv.conf to mount list
VERBOSE [U=1001,P=3485]    addResolvConfMount()          Default mount: /etc/resolv.conf:/etc/resolv.conf
DEBUG   [U=1001,P=3485]    addHostnameMount()            Skipping hostname mount, not virtualizing UTS namespace on user request
DEBUG   [U=1001,P=3485]    create()                      Mount all
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting tmpfs to /usr/local/var/singularity/mnt/session
DEBUG   [U=1001,P=3485]    mountImage()                  Mounting loop device /dev/loop0 to /usr/local/var/singularity/mnt/session/rootfs of type squashfs
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting overlay to /usr/local/var/singularity/mnt/session/final
DEBUG   [U=1001,P=3485]    setPropagationMount()         Set RPC mount propagation flag to SLAVE
VERBOSE [U=1001,P=3485]    Passwd()                      Checking for template passwd file: /usr/local/var/singularity/mnt/session/rootfs/etc/passwd
VERBOSE [U=1001,P=3485]    Passwd()                      Creating passwd content
VERBOSE [U=1001,P=3485]    Passwd()                      Creating template passwd file and appending user data: /usr/local/var/singularity/mnt/session/rootfs/etc/passwd
DEBUG   [U=1001,P=3485]    addIdentityMount()            Adding /etc/passwd to mount list
VERBOSE [U=1001,P=3485]    addIdentityMount()            Default mount: /etc/passwd:/etc/passwd
VERBOSE [U=1001,P=3485]    Group()                       Checking for template group file: /usr/local/var/singularity/mnt/session/rootfs/etc/group
VERBOSE [U=1001,P=3485]    Group()                       Creating group content
DEBUG   [U=1001,P=3485]    addIdentityMount()            Adding /etc/group to mount list
VERBOSE [U=1001,P=3485]    addIdentityMount()            Default mount: /etc/group:/etc/group
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /dev to /usr/local/var/singularity/mnt/session/final/dev
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /etc/localtime to /usr/local/var/singularity/mnt/session/final/usr/share/zoneinfo/Zulu
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /etc/hosts to /usr/local/var/singularity/mnt/session/final/etc/hosts
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /usr/local/etc/singularity/actions to /usr/local/var/singularity/mnt/session/final/.singularity.d/actions
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final/.singularity.d/actions
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /proc to /usr/local/var/singularity/mnt/session/final/proc
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final/proc
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting sysfs to /usr/local/var/singularity/mnt/session/final/sys
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /home/user to /usr/local/var/singularity/mnt/session/home/user
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/home/user
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /usr/local/var/singularity/mnt/session/home/user to /usr/local/var/singularity/mnt/session/final/home/user
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /tmp to /usr/local/var/singularity/mnt/session/final/tmp
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final/tmp
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /var/tmp to /usr/local/var/singularity/mnt/session/final/var/tmp
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final/var/tmp
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /export/prod/5eb954f615be15254a4c4011/5ed64bf1529ab43f0984245c to /usr/local/var/singularity/mnt/session/final/export/prod/5eb954f615be15254a4c4011/5ed64bf1529ab43f0984245c
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final/export/prod/5eb954f615be15254a4c4011/5ed64bf1529ab43f0984245c
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /usr/local/var/singularity/mnt/session/etc/resolv.conf to /usr/local/var/singularity/mnt/session/final/etc/resolv.conf
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /usr/local/var/singularity/mnt/session/etc/passwd to /usr/local/var/singularity/mnt/session/final/etc/passwd
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /usr/local/var/singularity/mnt/session/etc/group to /usr/local/var/singularity/mnt/session/final/etc/group
DEBUG   [U=1001,P=3485]    mountGeneric()                Mounting /export to /usr/local/var/singularity/mnt/session/final/export
DEBUG   [U=1001,P=3485]    mountGeneric()                Remounting /usr/local/var/singularity/mnt/session/final/export
DEBUG   [U=1001,P=3485]    create()                      Chroot into /usr/local/var/singularity/mnt/session/final
DEBUG   [U=0,P=3531]       Chroot()                      Hold reference to host / directory
DEBUG   [U=0,P=3531]       Chroot()                      Called pivot_root on /usr/local/var/singularity/mnt/session/final
DEBUG   [U=0,P=3531]       Chroot()                      Change current directory to host / directory
DEBUG   [U=0,P=3531]       Chroot()                      Apply slave mount propagation for host / directory
DEBUG   [U=0,P=3531]       Chroot()                      Called unmount(/, syscall.MNT_DETACH)
DEBUG   [U=0,P=3531]       Chroot()                      Changing directory to / to avoid getpwd issues
DEBUG   [U=1001,P=3485]    create()                      Chdir into / to avoid errors
VERBOSE [U=0,P=3530]       wait_child()                  rpc server exited with status 0
DEBUG   [U=0,P=3530]       apply_container_privileges()  Set user ID to 1001
DEBUG   [U=1001,P=3530]    set_parent_death_signal()     Set parent death signal to 9
DEBUG   [U=1001,P=3530]    startup()                     singularity runtime engine selected
VERBOSE [U=1001,P=3530]    startup()                     Execute stage 2
DEBUG   [U=1001,P=3530]    StageTwo()                    Entering stage 2
DEBUG   [U=1001,P=3485]    PostStartProcess()            Post start process
DEBUG   [U=1001,P=3485]    Master()                      Child exited due to signal 7
Bus error (core dumped)

I am able to run another container with a more recent tag, and I can run the 3.0_RC3 container as a different user just fine, so I believe something went wrong with the caching, or the cache is corrupted. We run a lot of jobs simultaneously on this cluster, so could some kind of race condition have caused the cache to go bad?

The OS I am using is Ubuntu 18.04.4 LTS.

Is there a way to force re-caching of a single container (both the blobs and SIF image)?

Contributor Author

soichih commented Jun 2, 2020

Update: I went through the singularity/cache/oci-tmp directory, found the container I was having a problem with, removed it, and reran singularity exec. The SIF image was recreated, and this time I was able to run it just fine.
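
For anyone hitting the same problem, the manual fix can be sketched as a small shell function. This is only an illustration: the name `remove_cached_sif` is hypothetical, and the layout `<cache dir>/oci-tmp/<digest>/<name>.sif` is an assumption inferred from the paths in the debug log, so it may differ in other Singularity versions.

```shell
#!/bin/sh
# Hypothetical helper: delete cached oci-tmp entries that contain a given SIF file.
# Assumes the layout seen in the debug log: <cache_dir>/oci-tmp/<digest>/<sif_name>
remove_cached_sif() {
    cache_dir=$1   # e.g. /export/singularity/cache
    sif_name=$2    # e.g. mrtrix3_3.0_RC3.sif

    for sif in "$cache_dir"/oci-tmp/*/"$sif_name"; do
        # If the glob matched nothing, the pattern stays literal; skip it.
        [ -f "$sif" ] || continue
        entry=$(dirname "$sif")
        echo "removing $entry"
        rm -rf -- "$entry"
    done
}

# Usage (paths taken from the log above):
# remove_cached_sif /export/singularity/cache mrtrix3_3.0_RC3.sif
```

Only the directory holding the matching SIF is removed, so other cached images stay in place. The next singularity exec against the docker:// URI should then rebuild the SIF.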

Contributor

dtrudg commented Jun 2, 2020

The caching in Singularity <= 3.5 is not safe for concurrent operations on a cluster. If you are running concurrent operations, you should run singularity pull docker://xxxx once, and then run the resulting SIF file.
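
A minimal sketch of that pull-once workflow, assuming it is run a single time (for example on the head node) before any jobs are submitted. The function name `pull_once` and the `/export/images` path are hypothetical, and the function does not guard against two concurrent callers.

```shell
#!/bin/sh
# Hypothetical helper: pull an image to a SIF file only if it is not already there.
# Intended to be run once, before jobs start; it does no locking of its own.
pull_once() {
    uri=$1   # e.g. docker://brainlife/mrtrix3:3.0_RC3
    sif=$2   # e.g. /export/images/mrtrix3_3.0_RC3.sif (hypothetical path)

    if [ -f "$sif" ]; then
        echo "using existing $sif"
        return 0
    fi
    singularity pull "$sif" "$uri"
}

# Usage:
# pull_once docker://brainlife/mrtrix3:3.0_RC3 /export/images/mrtrix3_3.0_RC3.sif
# singularity exec -e /export/images/mrtrix3_3.0_RC3.sif ./run.sh
```

Each job then runs the SIF file directly instead of the docker:// URI, so concurrent jobs no longer build the image through the shared cache.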

3.6 will have caching that is safe, as long as the underlying filesystem supports atomic rename operations.
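
To illustrate why atomic rename matters, here is the generic write-then-rename pattern in shell. This is not Singularity's implementation, and `publish_atomically` is a hypothetical name. The writer builds the file under a temporary name in the same directory and renames it into place only on success, so a reader sees either no file or a complete one, provided the filesystem's rename is atomic.

```shell
#!/bin/sh
# Generic write-then-rename pattern; NOT Singularity's code.
# Usage: publish_atomically <final_path> <command...>
# The command is called with the temporary path appended as its last argument.
publish_atomically() {
    final=$1
    shift
    # The temp file lives in the same directory, so mv is a rename on one filesystem.
    tmp=$(mktemp "$(dirname "$final")/.tmp.XXXXXX") || return 1
    if "$@" "$tmp"; then
        mv -f -- "$tmp" "$final"
    else
        # The writer failed: discard the partial file and never expose it.
        rm -f -- "$tmp"
        return 1
    fi
}
```

On filesystems where rename is not atomic, this pattern gives no such guarantee, which is the caveat attached to the 3.6 cache above.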

singularity cache list and singularity cache clean exist to provide (limited) functionality for identifying and removing things in the cache.

@dtrudg dtrudg closed this as completed Jun 2, 2020