bug: Container exited immediately #459

Closed
terrytangyuan opened this issue Jun 18, 2022 · 9 comments

Comments

@terrytangyuan
Member

terrytangyuan commented Jun 18, 2022

Description

envd reports that the container did not start, but the container actually started and then exited immediately with exit code 1. See the `docker inspect` output under Additional Info below.

DEBU[2022-06-18T16:38:49-04:00] loaded docker image successfully              tag="r-basic:dev"
DEBU[2022-06-18T16:38:49-04:00] setting up container working directory        build-context=/Users/yuan.tang/go/src/github.com/tensorchord/envd/examples/r-basic container=r-basic gpu=false mount-path=/Users/yuan.tang/go/src/github.com/tensorchord/envd/examples/r-basic numGPUs=-1 tag="r-basic:dev" working-dir=/home/envd/r-basic
DEBU[2022-06-18T16:38:49-04:00] starting r-basic container                    build-context=/Users/yuan.tang/go/src/github.com/tensorchord/envd/examples/r-basic container=r-basic entrypoint="[tini -- bash -c set -e\n/var/envd/bin/envd-ssh --authorized-keys /var/envd/authorized_keys --port 54455 --shell bash &\n\nwait -n]" gpu=false numGPUs=-1 tag="r-basic:dev" working-dir=/home/envd/r-basic
DEBU[2022-06-18T16:38:49-04:00] waiting to start                              container=/r-basic
error: failed to start the envd environment: failed to wait until the container is running: timeout 30s: container did not start
error: timeout 30s: container did not start
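The log above shows the wait loop timing out after 30s even though the container had already exited. A minimal sketch of a wait loop that reports an exited container right away could look like the following. This is not envd's actual implementation; `wait_until_running`, `ContainerExited`, and the `get_state` callback (assumed to return the `State` object from `docker inspect`) are hypothetical names used only for illustration.

```python
import time
from typing import Callable, Dict


class ContainerExited(RuntimeError):
    """Raised when the container has already stopped, so waiting longer is pointless."""


def wait_until_running(
    get_state: Callable[[], Dict],
    timeout: float = 30.0,
    interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll a container's State dict until it is running, has exited, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state.get("Running"):
            return
        # "exited" or "dead" means the container started and then stopped:
        # report that immediately instead of waiting for the timeout.
        if state.get("Status") in ("exited", "dead"):
            raise ContainerExited(
                f"container exited with code {state.get('ExitCode')}"
            )
        if time.monotonic() >= deadline:
            raise TimeoutError(f"timeout {timeout:g}s: container did not start")
        sleep(interval)


# Usage with the state reported in this issue (hard-coded here for illustration).
exited_state = {"Status": "exited", "Running": False, "ExitCode": 1}
try:
    wait_until_running(lambda: exited_state)
except ContainerExited as err:
    print(err)  # container exited with code 1
```

Separating the two failure modes keeps the "did not start" message for containers that are genuinely stuck in the created state.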

Reproduction

After the changes in #457, run this example: https://github.com/tensorchord/envd/tree/main/examples/r-basic

Additional Info

docker inspect output:
[
    {
        "Id": "23d61ce8ad2dec1d83abd727b50ef659e19f120113aa149cc1dfbc5577cc1eef",
        "Created": "2022-06-18T20:38:49.471058096Z",
        "Path": "tini",
        "Args": [
            "--",
            "bash",
            "-c",
            "set -e\n/var/envd/bin/envd-ssh --authorized-keys /var/envd/authorized_keys --port 54455 --shell bash \u0026\n\nwait -n"
        ],
        "State": {
            "Status": "exited",
            "Running": false,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 0,
            "ExitCode": 1,
            "Error": "",
            "StartedAt": "2022-06-18T20:38:49.74131568Z",
            "FinishedAt": "2022-06-18T20:38:49.993284555Z"
        },
        "Image": "sha256:697b54f6bebd73781554d1c424c0096f87cad631006e98c62892f625ec317b00",
        "ResolvConfPath": "/var/lib/docker/containers/23d61ce8ad2dec1d83abd727b50ef659e19f120113aa149cc1dfbc5577cc1eef/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/23d61ce8ad2dec1d83abd727b50ef659e19f120113aa149cc1dfbc5577cc1eef/hostname",
        "HostsPath": "/var/lib/docker/containers/23d61ce8ad2dec1d83abd727b50ef659e19f120113aa149cc1dfbc5577cc1eef/hosts",
        "LogPath": "/var/lib/docker/containers/23d61ce8ad2dec1d83abd727b50ef659e19f120113aa149cc1dfbc5577cc1eef/23d61ce8ad2dec1d83abd727b50ef659e19f120113aa149cc1dfbc5577cc1eef-json.log",
        "Name": "/r-basic",
        "RestartCount": 0,
        "Driver": "overlay2",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": null,
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {}
            },
            "NetworkMode": "default",
            "PortBindings": {
                "54455/tcp": [
                    {
                        "HostIp": "127.0.0.1",
                        "HostPort": "54455"
                    }
                ]
            },
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": null,
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": null,
            "DeviceCgroupRules": null,
            "DeviceRequests": null,
            "KernelMemory": 0,
            "KernelMemoryTCP": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "Mounts": [
                {
                    "Type": "bind",
                    "Source": "/Users/yuan.tang/go/src/github.com/tensorchord/envd/examples/r-basic",
                    "Target": "/home/envd/r-basic"
                }
            ],
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/f618ec47e40fe26f41092321ac49e5b59157ab7f93881cd0fedf26f8cc042100-init/diff:/var/lib/docker/overlay2/00fd5ee141efd06eee43ec1fceb4b10770dad38b8b6646e99a5fbf94202e2378/diff:/var/lib/docker/overlay2/ff2258dc382ff21742fd8a9cd17472721fcbdd696a6b298c7eab1690581c8001/diff:/var/lib/docker/overlay2/8f469f1f20d0efc9488a0b896bb588b8372acf523eed800de87a43aedd231beb/diff:/var/lib/docker/overlay2/3aa0081a308f5616b111d36b6787f228852b32a18f0bba077fb125d51e8071fe/diff:/var/lib/docker/overlay2/65792da369be06c40b025e491e290e22733e4199fba64732041d268cf7c4cc96/diff:/var/lib/docker/overlay2/9aee4b6c784816c38933dd34c5ea7ed6aa880007882e243343d27a7f3807fd33/diff:/var/lib/docker/overlay2/4227b6cc7585abd69f2769315a9e64ab32be04e0b5e1c86bbf2a959cf75228a8/diff:/var/lib/docker/overlay2/6936caabf4f5bbe1e084bd950bd08f8c6c11d6162f0a1c03e5a325ada56acc12/diff:/var/lib/docker/overlay2/ee71884ac029927b725d39085f0c19245cd74e3a30c49de0435d4ce32ba86fb6/diff:/var/lib/docker/overlay2/2ab489ad0ab653961362d22e7ecd609884ccc968c28be1d474ff8a70440164bc/diff:/var/lib/docker/overlay2/43fedf743a7abfa39f3dea350832844180a9aaf209c8937df1df8d8c575f576e/diff:/var/lib/docker/overlay2/bb04e9f6bd732cfc17416de3865b108fc3e3330cbca98c5bd39a42b0e1bfca16/diff:/var/lib/docker/overlay2/01f3679891dd572e678067e1b92e937ff1ee33f10334287ea836862025e3c33a/diff:/var/lib/docker/overlay2/640bccbf3ffa8ef28e683857c3bade488b1bf43241559661e4983305add20801/diff",
                "MergedDir": "/var/lib/docker/overlay2/f618ec47e40fe26f41092321ac49e5b59157ab7f93881cd0fedf26f8cc042100/merged",
                "UpperDir": "/var/lib/docker/overlay2/f618ec47e40fe26f41092321ac49e5b59157ab7f93881cd0fedf26f8cc042100/diff",
                "WorkDir": "/var/lib/docker/overlay2/f618ec47e40fe26f41092321ac49e5b59157ab7f93881cd0fedf26f8cc042100/work"
            },
            "Name": "overlay2"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/Users/yuan.tang/go/src/github.com/tensorchord/envd/examples/r-basic",
                "Destination": "/home/envd/r-basic",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "23d61ce8ad2d",
            "Domainname": "",
            "User": "envd",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "54455/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/conda/bin"
            ],
            "Cmd": null,
            "Image": "r-basic:dev",
            "Volumes": null,
            "WorkingDir": "/home/envd/r-basic",
            "Entrypoint": [
                "tini",
                "--",
                "bash",
                "-c",
                "set -e\n/var/envd/bin/envd-ssh --authorized-keys /var/envd/authorized_keys --port 54455 --shell bash \u0026\n\nwait -n"
            ],
            "OnBuild": null,
            "Labels": {
                "ai.tensorchord.envd.apt.packages": "[]",
                "ai.tensorchord.envd.build.context": "/Users/yuan.tang/go/src/github.com/tensorchord/envd/examples/r-basic",
                "ai.tensorchord.envd.name": "r-basic",
                "ai.tensorchord.envd.pypi.packages": "[]",
                "ai.tensorchord.envd.r.packages": "[\"remotes\",\"rlang\"]",
                "ai.tensorchord.envd.ssh.port": "54455",
                "ai.tensorchord.envd.vendor": "envd"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "6160602bba6af9dcf5943df4781fad15749a18e0e432d8cfc739fc8b38c4a529",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/6160602bba6a",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "bridge": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "1ddd6af6e40bf312674c4c5af037f075fb96c510d6de81080c13ecfb5a9ec81b",
                    "EndpointID": "",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "",
                    "DriverOpts": null
                }
            }
        }
    }
]
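The fields that matter in this output are `State.Status`, `State.ExitCode`, `State.StartedAt`, and `State.FinishedAt`. As a rough sketch, they can be pulled out of the JSON with a few lines of Python. The helper names `parse_docker_time` and `summarize` are made up for this example, and the parser assumes UTC timestamps ending in `Z`, as in the output above.

```python
import json
from datetime import datetime, timezone


def parse_docker_time(value: str) -> datetime:
    """Parse a UTC timestamp such as 2022-06-18T20:38:49.74131568Z."""
    body = value.rstrip("Z")
    date_part, _, fraction = body.partition(".")
    # datetime only stores microseconds, so pad or truncate to six digits.
    micros = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{date_part}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def summarize(inspect_json: str) -> dict:
    """Return status, exit code, and run time from `docker inspect` JSON output."""
    state = json.loads(inspect_json)[0]["State"]
    started = parse_docker_time(state["StartedAt"])
    finished = parse_docker_time(state["FinishedAt"])
    return {
        "status": state["Status"],
        "exit_code": state["ExitCode"],
        "ran_for_seconds": (finished - started).total_seconds(),
    }


# The State values below are copied from the output in this issue.
sample = json.dumps([{
    "State": {
        "Status": "exited",
        "ExitCode": 1,
        "StartedAt": "2022-06-18T20:38:49.74131568Z",
        "FinishedAt": "2022-06-18T20:38:49.993284555Z",
    }
}])
print(summarize(sample))
```

For this container the summary shows a run time of roughly a quarter of a second, which matches the "started but exited immediately" description.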


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@gaocegege
Member

Is it related to the sshd server?

@terrytangyuan
Member Author

> Is it related to the sshd server?

I think so. It only happens with the R environment, though. Perhaps we are missing some steps or stages that are required for it to work.

@gaocegege
Member

Should we push the R base image and then give it another try?

@terrytangyuan
Member Author

terrytangyuan commented Jun 19, 2022

> Should we push the R base image and then give it another try?

It's pushed to my personal account on Docker Hub since I don't have permission to push to the tensorchord org. See #457 (comment).

@gaocegege
Member

An invitation for terrytangyuan@gmail.com was sent.

@gaocegege
Member

The tensorchord/r4.2 repository has been created, too: https://hub.docker.com/repository/docker/tensorchord/r4.2

@terrytangyuan
Member Author

@gaocegege I didn't get a chance to test this again but I assume it's fixed by your recent PR? Were you able to reproduce this?

@gaocegege
Member

Yep, I think it should be fixed in #491.

@gaocegege
Member

The sshd server was not installed in the image, so the container exited before we could attach to it. #491 already resolved this, so I am closing the issue.
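To illustrate why a failing SSH server takes the whole container down: the entrypoint in the logs runs the server in the background under `set -e` and then blocks on `wait -n`, so the shell exits as soon as that background job ends. The sketch below reproduces the same script shape with a stand-in command that fails shortly after launch. The stand-in is hypothetical and is not the actual failure from this issue, and the script assumes a bash version that supports `wait -n` (4.3 or later).

```python
import subprocess

# Same shape as the entrypoint in the issue, with a stand-in for the SSH
# server that fails shortly after launch (hypothetical failure mode).
SCRIPT = """set -e
(sleep 0.2; exit 1) &

wait -n
"""


def run_entrypoint(script: str) -> int:
    """Run the script under bash and return the shell's exit status."""
    result = subprocess.run(["bash", "-c", script], check=False)
    return result.returncode


exit_code = run_entrypoint(SCRIPT)
print(f"entrypoint exited with status {exit_code}")
```

Because `wait -n` returns the status of the job that finished, the shell should exit with the background job's non-zero status, and a container whose main process is this shell stops along with it.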
