Documentation: add initial troubleshooting doc

It's useful to have common problems with their solutions or workarounds
in one document.

This starts a troubleshooting document.
iaguis committed Jan 5, 2018
1 parent 8c7fb08 commit 76c821fdda4d6571a04b9cdf99a3f7f56fe5420c
Showing with 98 additions and 10 deletions.
  1. +1 −1 Documentation/dependencies.md
  2. +96 −0 Documentation/troubleshooting.md
  3. +1 −9 README.md
@@ -66,7 +66,7 @@ For the most part the codebase is self-contained (e.g. all dependencies are vend

## Run-time dependencies

-* Linux 3.18+ (ideally 4.3+ to have overlay-on-overlay working), with the following options configured:
+* Linux 3.18+ (ideally 4.9+ to avoid the issues listed in the [troubleshooting document](troubleshooting.md)), with the following options configured:
* CONFIG_CGROUPS
* CONFIG_NAMESPACES
* CONFIG_UTS_NS
@@ -0,0 +1,96 @@
# Troubleshooting

This document lists common rkt problems and how to fix or work around them.

## Missing container logs

When checking the logs of a container, they might be missing with an error like this:

```
$ journalctl -M rkt-3f045be0-1632-42f1-ba15-df984a82636f
Journal file /var/lib/rkt/pods/run/3f045be0-1632-42f1-ba15-df984a82636f/stage1/rootfs/var/log/journal/3f045be0163242f1ba15df984a82636f/system.journal uses an unsupported feature, ignoring file.
-- No entries --
```

This is because rkt's journald integration is only supported if systemd is compiled with `lz4` compression enabled.

You can check if it is enabled by making sure you see `+LZ4` when running `systemctl --version`:

```
$ systemctl --version
systemd 235
[...] +LZ4 [...]
```
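
The check above can be scripted. The following is a minimal sketch, not part of rkt: the helper name `has_lz4` is hypothetical, and it assumes the feature flags are printed in the `+NAME`/`-NAME` form shown above.

```shell
# Succeeds if the text on stdin (expected: `systemctl --version` output)
# contains the +LZ4 feature flag. A disabled feature is printed as -LZ4,
# which does not match the fixed string "+LZ4".
has_lz4() {
    grep -qF -- '+LZ4'
}

# Example usage on a host with systemd:
#   systemctl --version | has_lz4 || echo "systemd was built without LZ4 support"
```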

## Bad system call

During rkt execution, you might encounter the message `Bad system call` followed by rkt terminating.
This is most likely the result of an overly restrictive seccomp profile.

As a workaround, you can disable seccomp with `--insecure-options=seccomp`.

As a proper fix, you can [tweak the seccomp profile][seccomp-guide].

## Operation not permitted errors

During rkt execution, you might encounter an `Operation not permitted` message followed by rkt exiting.
Your image probably uses more capabilities than allowed in rkt's default list.

As a workaround, you can disable capabilities enforcement with `--insecure-options=capabilities`.

As a proper fix, you can [create your own list][capabilities-guide].

## BTRFS + overlay

When `/var/lib/rkt` is on btrfs, you might see messages like these when starting a rkt pod:

```
prepare-app@opt-stage2-alpine\x2dsh-rootfs.service: Job prepare-app@opt-stage2-alpine\x2dsh-rootfs.service/start failed with result 'dependency'.
systemd-journald.service: Unit entered failed state.
systemd-journald.service: Failed with result 'signal'.
systemd-journald.service: Service has no hold-off time, scheduling restart.
```

To solve this, update to Linux 4.5.2 or newer (see [#2175](https://github.com/rkt/rkt/issues/2175)).

## SELinux + overlay

You might see an error like this one when starting a rkt pod:

```
/usr/lib/systemd/systemd: error while loading shared libraries: libselinux.so.1: cannot open shared object file: Permission denied
```

The overlay filesystem doesn't work with SELinux in kernels older than 4.9 (see [#1727](https://github.com/rkt/rkt/issues/1727)).
Please update your kernel to a newer version.

## Garbage collect not working in old kernels

You might see messages like these when running `rkt gc`:

```
Unable to remove pod "42e78965-c60b-4f4f-b412-484cd381fe90": remove /var/lib/rkt/pods/exited-garbage/42e78965-c60b-4f4f-b412-484cd381fe90/stage1/rootfs: device or resource busy
```

This might be due to using a kernel older than 3.18 (see [lazy umounts on unlinked files and directories](https://github.com/torvalds/linux/commit/8ed936b) and [#1922](https://github.com/rkt/rkt/issues/1922)).
Please update your kernel to a newer version.
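
To compare the running kernel against the minimum versions mentioned in this document (3.18, 4.5.2, and 4.9), something like the following sketch may help. The helper name `kernel_at_least` is hypothetical, and the comparison assumes a `sort` that supports `-V` (GNU coreutils). Pre-release suffixes such as `-rc1` are stripped, so a release candidate is treated like the final release.

```shell
# Succeeds if kernel release $2 (default: output of `uname -r`) is at
# least version $1.
kernel_at_least() {
    required=$1
    # Drop everything from the first character that is not a digit or a
    # dot, e.g. "4.14.0-generic" becomes "4.14.0".
    current=$(printf '%s\n' "${2:-$(uname -r)}" | sed 's/[^0-9.].*$//')
    # With version sort, the lower of the two versions comes first.
    lowest=$(printf '%s\n%s\n' "$required" "$current" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Example usage:
#   kernel_at_least 3.18 || echo "kernel too old for reliable rkt gc"
```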

## Running rkt on top of an overlay filesystem

Due to limitations in the Linux kernel, using rkt's overlay support on top of an overlay filesystem requires the upperdir and workdir to support the creation of trusted.* extended attributes and valid d_type in readdir responses (see [kernel/Documentation/filesystems/overlayfs.txt](https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt)).

The symptom is an error message like this:

```
stage0: error setting up stage1
└─error rendering overlay filesystem
└─problem mounting overlay filesystem
└─error mounting overlay with options 'lowerdir=/var/lib/rkt/cas/tree/deps-sha512-f3d5f69d7faba1be7067d610f33131c18ac59eb43b1495016ade65bd13912578/rootfs,upperdir=/var/lib/rkt/pods/run/307bd207-7eab-4028-8d12-2d525e5b8ed9/overlay/deps-sha512-f3d5f69d7faba1be7067d610f33131c18ac59eb43b1495016ade65bd13912578/upper,workdir=/var/lib/rkt/pods/run/307bd207-7eab-4028-8d12-2d525e5b8ed9/overlay/deps-sha512-f3d5f69d7faba1be7067d610f33131c18ac59eb43b1495016ade65bd13912578/work' and dest '/var/lib/rkt/pods/run/307bd207-7eab-4028-8d12-2d525e5b8ed9/stage1/rootfs'
└─invalid argument
```

This problem typically happens when trying to run rkt inside rkt.
To successfully run rkt inside rkt, use one of the following workarounds:
- set up `/var/lib/rkt` in the outer rkt as a host volume
- use `--no-overlay` for either the outer or the inner rkt
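
To find out whether `/var/lib/rkt` itself sits on an overlay filesystem, you can look up its mount in `/proc/mounts`. The following is a rough sketch under stated assumptions: the helper name `fstype_for_dir` is hypothetical, it picks the longest matching mount point, and it does not handle mount points containing escaped characters such as spaces.

```shell
# Print the filesystem type of the mount containing directory $1, reading
# a mount table in /proc/mounts format from file $2 (default: /proc/mounts).
fstype_for_dir() {
    awk -v dir="$1" '
        {
            mp = $2
            # A mount point matches if it is the root, the directory
            # itself, or a path prefix of the directory.
            if (mp == "/" || dir == mp || index(dir, mp "/") == 1) {
                # Prefer the longest mount point; on ties, later entries win.
                if (length(mp) >= best) { best = length(mp); type = $3 }
            }
        }
        END { print type }
    ' "${2:-/proc/mounts}"
}

# Example usage:
#   fstype_for_dir /var/lib/rkt
```

If this prints `overlay`, the situation described in this section likely applies.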

[capabilities-guide]: capabilities-guide.md
[seccomp-guide]: seccomp-guide.md
@@ -81,15 +81,7 @@ For more information, see the [CoreOS security disclosure page](https://coreos.c

## Known issues

-Due to limitations in the Linux kernel, using rkt's overlay support on top of an overlay filesystem requires the upperdir and workdir to support the creation of trusted.* extended attributes and valid d_type in readdir responses (see [kernel/Documentation/filesystems/overlayfs.txt](https://www.kernel.org/doc/Documentation/filesystems/overlayfs.txt)). When starting rkt inside rkt this means that either:
-- the inner `/var/lib/rkt` directory needs to be mounted on a host volume.
-- the outer or inner rkt container needs to be started using `--no-overlay`.
-
-Due to a bug in the Linux kernel, using rkt when `/var/lib/rkt` is on btrfs requires Linux 4.5.2+ ([#2175](https://github.com/rkt/rkt/issues/2175)).
-
-Due to a bug in the Linux kernel, using rkt's overlay support in conjunction with SELinux requires a set of patches that are only currently available on some Linux distributions (for example, [CoreOS Linux](https://github.com/coreos/coreos-overlay/tree/master/sys-kernel/coreos-sources/files)). Work is ongoing to merge this work into the mainline Linux kernel ([#1727](https://github.com/rkt/rkt/issues/1727#issuecomment-173203129)).
-
-Linux 3.18+ is required to successfully garbage collect rkt pods when system services such as udevd are in a slave mount namespace (see [lazy umounts on unlinked files and directories](https://github.com/torvalds/linux/commit/8ed936b) and [#1922](https://github.com/rkt/rkt/issues/1922)).
+Check the [troubleshooting document](Documentation/troubleshooting.md).

## Related Links
