Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[master] [TAR-303] Start docker.service after containerd.service #290

Merged

Conversation

corbin-coleman
Copy link
Contributor

Reproduced the error seen here docker/for-linux#556 on a Centos 7 machine. Placing an After=containerd.service into the docker.service file fixed the issue and new docker containers that were ran would restart properly.

[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker version
Client:
 Version:           18.09.1
 API version:       1.39
 Go version:        go1.10.6
 Git commit:        4c52b90
 Built:             Wed Jan  9 19:35:01 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.1
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.6
  Git commit:       4c52b90
  Built:            Wed Jan  9 19:06:30 2019
  OS/Arch:          linux/amd64
  Experimental:     false
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker info | grep -i live
Live Restore Enabled: true
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker run -itd busybox
Unable to find image 'busybox:latest' locally
latest: Pulling from library/busybox
57c14dd66db0: Pull complete
Digest: sha256:7964ad52e396a6e045c39b5a44438424ac52e12e4d5a25d94895f2058cb863a0
Status: Downloaded newer image for busybox:latest
24f67dcb1e15257cd0d3a0552e16c1004b97df0f825cc244e9e4c69500ef47d0
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
24f67dcb1e15        busybox             "sh"                3 seconds ago       Up 2 seconds                            sad_swartz
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ sudo systemctl restart docker
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                       PORTS               NAMES
24f67dcb1e15        busybox             "sh"                20 seconds ago      Exited (255) 6 seconds ago                       sad_swartz
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ ps -ef | grep containerd
root      2477     1  0 23:30 ?        00:00:00 /usr/bin/containerd
root      2697     1  0 23:32 ?        00:00:00 containerd-shim -namespace moby -workdir /var/lib/docker/containerd/daemon/io.containerd.runtime.v1.linux/moby/24f67dcb1e15257cd0d3a0552e16c1004b97df0f825cc244e9e4c69500ef47d0 -address /var/run/docker/containerd/containerd.sock -containerd-binary /usr/bin/containerd -runtime-root /var/run/docker/runtime-runc
docker    2928  1552  0 23:33 pts/0    00:00:00 grep --color=auto containerd
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ cat /usr/lib/systemd/system/docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
BindsTo=containerd.service
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket

[Service]
Type=notify
# the default is not to use systemd for cgroups because the delegate issues still
# exists and systemd currently does not support the cgroup feature set required
# for containers run by docker
ExecStart=/usr/bin/dockerd -H fd://
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always

# Note that StartLimit* options were moved from "Service" to "Unit" in systemd 229.
# Both the old, and new location are accepted by systemd 229 and up, so using the old location
# to make them work for either version of systemd.
StartLimitBurst=3

# Note that StartLimitInterval was renamed to StartLimitIntervalSec in systemd 230.
# Both the old, and new name are accepted by systemd 230 and up, so using the old name to make
# this option work for either version of systemd.
StartLimitInterval=60s

# Having non-zero Limit*s causes performance problems due to accounting overhead
# in the kernel. We recommend using cgroups to do container-local accounting.
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity

# Comment TasksMax if your systemd version does not supports it.
# Only systemd 226 and above support this option.
TasksMax=infinity

# set delegate yes so that systemd does not reset the cgroups of docker containers
Delegate=yes

# kill only the docker process, not all processes in the cgroup
KillMode=process

[Install]
WantedBy=multi-user.target
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ sudo systemctl daemon-reload
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker run -itd busybox
4ad923ba118e561506de84cefa0e8c74ab9cff203a554c7e31ed2f2eccf72050
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
4ad923ba118e        busybox             "sh"                3 seconds ago       Up 3 seconds                            nervous_blackwell
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ sudo systemctl restart docker
[docker@corbincoleman-testkit-12C2D4-centos-0 ~]$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
4ad923ba118e        busybox             "sh"                12 seconds ago      Up 12 seconds                           nervous_blackwell

Signed-off-by: corbin-coleman corbin.coleman@docker.com

Signed-off-by: corbin-coleman <corbin.coleman@docker.com>
@corbin-coleman corbin-coleman changed the title [master] Start docker.service after containerd.service [master] [TAR-303] Start docker.service after containerd.service Jan 14, 2019
@seemethere
Copy link
Contributor

Going to merge this in so we can get this into nightlies, will revert if this doesn't solve our issue.

@seemethere seemethere merged commit 5e99162 into docker:master Jan 15, 2019
@thaJeztah
Copy link
Member

thaJeztah commented Jan 15, 2019

I wonder why we don't set the --containerd option instead (--containerd=/run/containerd/containerd.sock). If that option is set, dockerd will never try to start containerd; it only starts containerd as a child process if that option is empty;

https://github.com/moby/moby/blob/1edf943dc7b6fc56050ac028a79ba4dc83e7fbe7/cmd/dockerd/daemon.go#L141-L162

	ctx, cancel := context.WithCancel(context.Background())
	if cli.Config.ContainerdAddr == "" && runtime.GOOS != "windows" {
		if !systemContainerdRunning() {
			opts, err := cli.getContainerdDaemonOpts()
			if err != nil {
				cancel()
				return errors.Wrap(err, "failed to generate containerd options")
			}

			r, err := supervisor.Start(ctx, filepath.Join(cli.Config.Root, "containerd"), filepath.Join(cli.Config.ExecRoot, "containerd"), opts...)
			if err != nil {
				cancel()
				return errors.Wrap(err, "failed to start containerd")
			}
			cli.Config.ContainerdAddr = r.Address()

			// Try to wait for containerd to shutdown
			defer r.WaitTimeout(10 * time.Second)
		} else {
			cli.Config.ContainerdAddr = containerddefaults.DefaultAddress
		}
	}

@seemethere
Copy link
Contributor

func systemContainerdRunning() bool {
	_, err := os.Lstat(containerddefaults.DefaultAddress)
	return err == nil
}

https://github.com/moby/moby/blob/1edf943dc7b6fc56050ac028a79ba4dc83e7fbe7/cmd/dockerd/daemon.go#L665-L668

That shouldn't really matter since systemContainerdRunning should automatically default to the containerddefaults which should be

const (
	// DefaultRootDir is the default location used by containerd to store
	// persistent data
	DefaultRootDir = "/var/lib/containerd"
	// DefaultStateDir is the default location used by containerd to store
	// transient data
	DefaultStateDir = "/run/containerd"
	// DefaultAddress is the default unix socket address
	DefaultAddress = "/run/containerd/containerd.sock"
	// DefaultDebugAddress is the default unix socket address for pprof data
	DefaultDebugAddress = "/run/containerd/debug.sock"
	// DefaultFIFODir is the default location used by client-side cio library
	// to store FIFOs.
	DefaultFIFODir = "/run/containerd/fifo"
)

https://github.com/containerd/containerd/blob/master/defaults/defaults_unix.go#L29,

so there's no real need to set the default to /run/containerd/containerd.sock since the default is already /run/containerd/containerd.sock

@thaJeztah
Copy link
Member

thaJeztah commented Jan 15, 2019

There is a need; look at the second line;

if cli.Config.ContainerdAddr == "" && runtime.GOOS != "windows" {

That part of the code is only executed if --containerd="" (ContainerdAddr == ""). If not, the !systemContainerdRunning() isn't executed, and the daemon doesn't start containerd as a child process (r, err := supervisor.Start(... ) is what starts containerd)

chadswen added a commit to chadswen/kargo that referenced this pull request Jan 22, 2019
The `docker-ce` 18.09.1 packaging missed an `After` dependency on containerd in the systemd service. Upstream PR: docker/docker-ce-packaging#290
LuckySB pushed a commit to southbridgeio/kubespray that referenced this pull request Feb 17, 2019
The `docker-ce` 18.09.1 packaging missed an `After` dependency on containerd in the systemd service. Upstream PR: docker/docker-ce-packaging#290
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants