18.09: containerd depends on overlay #475

Open
ikus060 opened this Issue Nov 8, 2018 · 22 comments

ikus060 commented Nov 8, 2018

  • This is a bug report
  • This is a feature request
  • I searched existing issues before opening this one

Expected behavior

Installing Docker using curl -s https://get.docker.com/ | sh should work in all common environments.

Actual behavior

After installing with curl -s https://get.docker.com/ | sh on a system where the overlay driver cannot be used, the Docker service fails to start.

Steps to reproduce the behavior

After installing, I need to manually edit the containerd.service to remove the following line:

ExecStartPre=/sbin/modprobe overlay

geonunez commented Nov 8, 2018

Today I upgraded a server that runs Debian 9.5 (Stretch), and apt reported that a new Docker version was available, labeled 5:18.09.03-0debian-stretch.

After applying the upgrade, Docker could not start because 'modprobe overlay' was failing.

I couldn't find a way to fix it, so I installed the previous Docker version, '18.06.1ce3-0~debian', and everything works again.


Additional info:

The server runs a custom kernel (Linode's latest)

modprobe: ERROR: ../libkmod/libkmod.c:514 lookup_builtin_file() could not open builtin file '/lib/modules/4.18.16-x86_64-linode118/modules.builtin.bin'

modprobe: FATAL: Module overlay not found in directory /lib/modules/4.18.16-x86_64-linode118

I have informed them too, just in case.
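
The failure above can be anticipated before upgrading by checking whether the running kernel offers overlay at all. The following is a minimal sketch, not part of the Docker packages: the function names overlay_registered and overlay_available are made up for illustration, and it assumes a Linux system with /proc mounted and modinfo installed.

```shell
#!/bin/sh
# overlay_registered FILE: succeed if the filesystem list in FILE
# (default: /proc/filesystems) already contains overlay, meaning overlay is
# built into the kernel or its module is already loaded.
# (Hypothetical helper name, for illustration only.)
overlay_registered() {
    awk '$NF == "overlay" { found = 1 } END { exit !found }' \
        "${1:-/proc/filesystems}"
}

# overlay_available: succeed if overlay is already registered, or if modinfo
# can find a loadable overlay module for the running kernel.
overlay_available() {
    overlay_registered || modinfo overlay >/dev/null 2>&1
}
```

For example, running overlay_available || echo "overlay is not available on this kernel" before the upgrade would flag a kernel like the one above, where modprobe cannot find the module.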

Member

thaJeztah commented Nov 8, 2018

Collaborator

andrewhsu commented Nov 8, 2018

Hmm...I believe we're good on stock kernel that comes with Debian Stretch. Custom kernels will need overlay kernel module to be avail. Digging a little on google i see this discussion on Linode kernel modules for docker: https://www.linode.com/community/questions/17114/docker-wont-start-using-the-latest-linode-kernel

I'm not a user of Linode nor do we support that env when building the docker-ce packages for testing, but that thread may be a place to start.

ikus060 commented Nov 8, 2018

Hmm...I believe we're good on stock kernel that comes with Debian Stretch. Custom kernels will need overlay kernel module to be avail.

The problem is not related to the kernel we are running. I'm running the "stock" kernel, but the overlay module cannot be used because we are not using xfs or ext4. For that reason, overlay is not loaded and only aufs is loaded.
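
To see which situation a host is in, one can look at the filesystem behind Docker's data directory. This is a rough sketch under stated assumptions: backing_fs and overlay_candidate are hypothetical helper names, backing_fs relies on GNU df (the --output option), and the ext4/xfs list only mirrors the premise of the comment above rather than a full compatibility check.

```shell
#!/bin/sh
# backing_fs DIR: print the filesystem type that DIR lives on.
# Assumes GNU df (coreutils 8.21 or later) for the --output option.
backing_fs() {
    df --output=fstype "$1" | tail -n 1
}

# overlay_candidate FSTYPE: succeed only for the two backing filesystems
# named in the comment above (ext4 and xfs). This encodes that comment's
# premise; it is not an exhaustive compatibility check.
overlay_candidate() {
    case "$1" in
        ext4|xfs) return 0 ;;
        *)        return 1 ;;
    esac
}
```

A call such as overlay_candidate "$(backing_fs /var/lib/docker)" || echo "overlay is unlikely to be usable here" would then report setups like the aufs, btrfs, or zfs ones described in this thread.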

geonunez commented Nov 9, 2018

@andrewhsu It was an excellent start.

I changed Linode's kernel to the generic one, and the latest Docker seems to work without problems.

Thanks for the help guys.

onlime commented Nov 9, 2018

Same here, we are unable to use overlay as we run Docker in an LXC container which runs on zfs (ZFSonLinux). We have mounted an external btrfs volume as /var/lib/docker inside LXC container and set Docker's storage-driver to btrfs via /etc/docker/daemon.json:

{
  "storage-driver": "btrfs"
}

This all works nicely, but since upgrading to Docker 18.09 (18.09.03-0debian-stretch), which introduced a dependency on containerd.io, Docker won't start up. Commenting out that line in /lib/systemd/system/containerd.service does the trick, but seems like a dirty workaround:

#ExecStartPre=/sbin/modprobe overlay

Any idea of how to fix this in a clean way?
Should we mount some extra volume for containerd and also let it use btrfs storage driver?

lbschenkel commented Nov 9, 2018

Got bitten by this as well, reported to upstream: containerd/containerd#2772

Member

thaJeztah commented Nov 9, 2018

@onlime Thanks for your report! I opened an internal ticket to get more information about the overlay requirement here, and to see if there are ways to remove that requirement.

While that is pending: I think the overlay requirement exists for images and containers pulled (and run) by containerd itself (instead of managed by dockerd). That capability is used by the docker engine activate feature introduced in Docker 18.09 (which allows you to upgrade the engine through containerd).

If I'm correct, then disabling the ExecStartPre should be fine, as long as you don't need the docker engine activate feature.

Regarding your workaround; while commenting-out that line "works", I would highly recommend not editing the containerd.service unit file itself.

By editing the unit file, future updates of the package won't automatically update the containerd.service if a newer version is available (rpm/deb will notice the file has been edited, so not automatically replace it with the new version).

Instead, it's best to use a systemd "drop-in" file (an "override" file).

You can do this through systemd itself, or manually (note: DigitalOcean has a great article on using systemctl: How To Use Systemctl to Manage Systemd Services and Units):

A. Create and edit a drop-in file with systemctl edit

  • First, revert the changes you made to /lib/systemd/system/containerd.service
  • Run systemctl edit containerd.service. This automatically creates a draft override file, and opens it in your editor.
  • Edit the file to look like this;
    [Service]
    ExecStartPre=
    
  • Save the file

B. Manually creating/editing a drop-in file

  • First, revert the changes you made to /lib/systemd/system/containerd.service
  • Create a directory /etc/systemd/system/containerd.service.d/
  • Create a drop-in file (name doesn't matter), e.g. override.conf
  • Edit the file to look like this;
    [Service]
    ExecStartPre=
    
  • Save the file

Verify your edits

Now verify if the edits are correct (again, use sudo if needed);

  • Use systemctl cat containerd.service to see the content of both the containerd.service unit, and all override/drop-in files.
  • Use systemctl show containerd.service to show the full configuration of the service.
    • Use systemctl show containerd.service | grep ExecStartPre to verify there's no ExecStartPre

Reload and restart the service for your edits to become active

If you're satisfied with the changes you made, you can reload and restart the containerd service;

systemctl daemon-reload
systemctl restart containerd.service
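
The grep check from the verification step can be wrapped in a small helper so that a script can act on the result. This is only a sketch: no_exec_start_pre is a hypothetical name, and it assumes that systemctl show prints one KEY=VALUE pair per line and omits ExecStartPre when no such command is configured.

```shell
#!/bin/sh
# no_exec_start_pre: read `systemctl show`-style KEY=VALUE lines on stdin
# and succeed only if none of them sets ExecStartPre.
# (Hypothetical helper name, for illustration only.)
no_exec_start_pre() {
    ! grep -q '^ExecStartPre='
}
```

It would be used as systemctl show containerd.service | no_exec_start_pre && echo "override is active".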

lbschenkel commented Nov 9, 2018

@thaJeztah: I thought of doing that, but my understanding is that you cannot override ExecStartPre= via a systemd drop-in file because this entry can be specified multiple times, so you can only add new commands in a drop-in. Is my understanding incorrect?

Member

thaJeztah commented Nov 9, 2018

I thought of doing that, but my understanding is that you cannot override ExecStartPre= via a systemd drop-in file because this entry can be specified multiple times

I think the trick there (similar to ExecStart) is (if you want to replace the option) to first reset the option, then specify the new one. So;

To add an extra ExecStartPre;

[Service]
ExecStartPre=/bin/echo hello world

Will result in

[Service]
ExecStartPre=/sbin/modprobe overlay
ExecStartPre=/bin/echo hello world

To reset / remove the ExecStartPre

[Service]
ExecStartPre=

Will result in the ExecStartPre being removed

[Service]

To replace / override the ExecStartPre

Effectively a combination of the above; first reset the existing one, then immediately set the new one;

[Service]
ExecStartPre=
ExecStartPre=/bin/echo hello world

Will result in

[Service]
ExecStartPre=/bin/echo hello world

(at least I think that's how it works; I didn't try it)
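
The add / reset / replace rule described above can be modelled in a few lines of shell. This is a simplified illustration, not systemd's actual parser: effective_exec_start_pre is a hypothetical name, and the sketch ignores section headers, line continuations, and every setting other than ExecStartPre.

```shell
#!/bin/sh
# effective_exec_start_pre: read unit-file text on stdin (the main unit
# followed by its drop-ins, in the order they are applied) and print the
# ExecStartPre commands that remain.
# Simplified model: an empty assignment resets the list, any other
# assignment appends to it.
effective_exec_start_pre() {
    awk '
        /^ExecStartPre=/ {
            value = substr($0, length("ExecStartPre=") + 1)
            if (value == "") {
                n = 0               # empty assignment: forget earlier commands
            } else {
                cmds[++n] = value   # non-empty assignment: append
            }
        }
        END { for (i = 1; i <= n; i++) print cmds[i] }
    '
}
```

Feeding it the packaged unit followed by a drop-in, for example cat /lib/systemd/system/containerd.service /etc/systemd/system/containerd.service.d/override.conf | effective_exec_start_pre, prints nothing when the drop-in contains only the empty ExecStartPre= line.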

cpuguy83 commented Nov 9, 2018

Seems like this should be ignoring errors from modprobe.

lbschenkel commented Nov 9, 2018

Agreed, at the very least

ExecStartPre=-/sbin/modprobe overlay

Would already be a solution. (Errors are ignored when the command starts with -.)
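
Until the packaged unit changes, the same effect could be had locally with a drop-in that resets ExecStartPre and re-adds the modprobe call with the leading -. The sketch below is an assumption-laden illustration: write_tolerant_override and its optional ROOT argument are made up here (ROOT exists so the function can be tried against a scratch directory), and override.conf is just the file name already used elsewhere in this thread.

```shell
#!/bin/sh
# write_tolerant_override [ROOT]: create a containerd drop-in under ROOT
# (default: the real filesystem root) that resets ExecStartPre and re-adds
# modprobe with a leading "-", so a failing modprobe no longer blocks the
# service. (Hypothetical helper name, for illustration only.)
write_tolerant_override() {
    dir="${1:-}/etc/systemd/system/containerd.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/override.conf" <<'EOF'
[Service]
ExecStartPre=
ExecStartPre=-/sbin/modprobe overlay
EOF
}
```

Called as root without arguments, and followed by systemctl daemon-reload and systemctl restart containerd.service, this keeps the module load attempt for hosts that have overlay while letting the others start.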

onlime commented Nov 9, 2018

@thaJeztah Thanks so much for your very detailed and perfectly clear instructions! I thought about using a systemd override.conf instead of patching containerd.service, but didn't know about the option of resetting ExecStartPre, which can be specified multiple times.

Your solution works perfectly, tested (by systemctl show containerd.service | grep ExecStartPre and by restarting the service).

# /etc/systemd/system/containerd.service.d/override.conf
[Service]
ExecStartPre=

Confirmed working 100%.
So now let's wait for upstream to fix this. Hopefully we can use the new docker engine activate feature without the overlay requirement.

jdespatis commented Nov 12, 2018

@thaJeztah Thanks a lot for your workaround, it works like a charm with my Ubuntu 16.04.5 LTS!

geerlingguy commented Nov 12, 2018

I've run into this error when running some tests installing Docker-in-Docker on Travis CI on Debian 9, Ubuntu 18.04, CentOS 7, and Ubuntu 16.04 too (see related: geerlingguy/ansible-role-docker#97).

It looks like this was fixed in the containerd project here: containerd/containerd#2776

Is there a timeline for the next Docker package release (e.g. 18.09.1)? Or should I build in the hack to override the ExecStartPre command on all my builds for the foreseeable future, or lock my roles to only install 18.06 or earlier? Right now I can't get Docker to start in many of my regular build environments using the instructions in Docker's docs (and the same goes for tons of my downstream users).
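
For the "lock to 18.06" option on Debian or Ubuntu, one possibility is an APT pin. The following is a sketch only: write_docker_pin, its optional ROOT argument, the file name docker-ce, and the priority value 1001 are all choices made for this example, and it does not cover the CentOS builds mentioned above.

```shell
#!/bin/sh
# write_docker_pin [ROOT]: write an APT preferences file under ROOT
# (default: the real filesystem root) that prefers docker-ce 18.06.x over
# newer versions. A priority above 1000 also permits a downgrade.
# (Hypothetical helper name and file name, for illustration only.)
write_docker_pin() {
    dir="${1:-}/etc/apt/preferences.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/docker-ce" <<'EOF'
Package: docker-ce
Pin: version 18.06.*
Pin-Priority: 1001
EOF
}
```

After calling it as root, apt-cache policy docker-ce should show which candidate version APT would now pick.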

Member

thaJeztah commented Nov 12, 2018

The same fix as was merged in containerd has been merged in the packaging repository, but I don't have an ETA for when updated packages will be available.

geerlingguy commented Nov 13, 2018

So for now, the fix is:

Add a new file with the ExecStartPre override:

sudo mkdir -p /etc/systemd/system/containerd.service.d
sudo nano /etc/systemd/system/containerd.service.d/override.conf

Contents of override.conf:

[Service]
ExecStartPre=

Reload the systemd configuration:

sudo systemctl daemon-reload

Now you can start Docker successfully (sudo systemctl start docker).

ericsysmin commented Nov 13, 2018

Ha, @geerlingguy , I've been trying to fix this the past 3 days as well....closest I got was figuring out overlay was the problem

aMozejko1 commented Nov 20, 2018

Not sure if this is the same issue, but I fixed it by symlinking /lib/systemd/system/docker.service into /etc/systemd/system.

Also, I don't know whether I'm doing something wrong, but installing 18.09 on Ubuntu 16.04 didn't seem to install a docker binary?

sudo apt-get install docker-ce=18.06.1~ce~3-0~ubuntu
sudo apt-get install docker-ce=5:18.09.0~3-0~ubuntu-xenial
cd /etc/systemd/system
sudo mv docker.service _docker.service
sudo ln -s /lib/systemd/system/docker.service docker.service
sudo systemctl daemon-reload
sudo service docker start
docker version

Client:
 Version:           18.09.0
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        4d60db4
 Built:             Wed Nov 7 00:48:57 2018
 OS/Arch:           linux/amd64
 Experimental:      false

Server: Docker Engine - Community
 Engine:
  Version:          18.09.0
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       4d60db4
  Built:            Wed Nov 7 00:16:44 2018
  OS/Arch:          linux/amd64
  Experimental:     false

hswong3i commented Nov 26, 2018

The upstream PR got merged, see containerd/containerd#2776

toddpi314 commented Nov 27, 2018

@aMozejko1's fix confirmed working on PixelBook i7 ChromeOS 70.0.3538.76 (crostini Penguin container).

Also, @thaJeztah's workaround works as well.

Thank you both.

RobertLHarris commented Dec 7, 2018

No go for me with @aMozejko1's version. I still get a message that docker.service couldn't start due to a dependency job.
