CentOS 7 Docker test gets stuck starting Fail2Ban #21

Closed
geerlingguy opened this issue Sep 16, 2016 · 10 comments

@geerlingguy
Owner

During the Travis tests, CentOS 7 gets stuck on:

TASK [role_under_test : Ensure fail2ban is running and enabled on boot.] *******

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.

The build has been terminated

I can reproduce the issue on Docker for Mac 1.12.0 and 1.12.1, and if I run journalctl --no-pager, I see a lot of the following:

Sep 16 15:33:54 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:33:56 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:33:57 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:33:58 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:34:00 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:34:01 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:34:02 1b08686e974b systemd[1]: Looping too fast. Throttling execution a little.
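
To gauge how often systemd is throttling, the journal output can be piped through a small filter. This is only a minimal sketch for a POSIX shell; the function name count_throttle_msgs is made up for illustration and is not part of any tool:

```shell
# Count systemd "Looping too fast" messages in journal text read from stdin.
# count_throttle_msgs is a hypothetical helper name.
count_throttle_msgs() {
    # grep -c prints the number of matching lines (0 if none). It exits
    # non-zero when nothing matches, so mask that for callers using `set -e`.
    grep -c 'systemd\[1\]: Looping too fast' || true
}
```

Usage would look something like: docker exec [id] journalctl --no-pager | count_throttle_msgs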

These messages seem to start after I install fail2ban (with yum install -y fail2ban), which triggers an update to systemd:

Sep 16 15:39:31 7c6e69e84968 yum[201]: Installed: python-decorator-3.4.0-3.el7.noarch
Sep 16 15:39:31 7c6e69e84968 yum[201]: Updated: systemd-libs-219-19.el7_2.13.x86_64
Sep 16 15:39:31 7c6e69e84968 systemd[1]: Closed udev Control Socket.
Sep 16 15:39:31 7c6e69e84968 systemd[1]: Closed udev Kernel Socket.
Sep 16 15:39:31 7c6e69e84968 systemd[1]: Stopped udev Kernel Device Manager.
Sep 16 15:39:32 7c6e69e84968 systemd[1]: Reexecuting.
Sep 16 15:39:32 7c6e69e84968 systemd[1]: systemd 219 running in system mode. (+PAM +AUDIT +SELINUX +IMA -APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 -SECCOMP +BLKID +ELFUTILS +KMOD +IDN)
Sep 16 15:39:32 7c6e69e84968 systemd[1]: Detected virtualization docker.
Sep 16 15:39:32 7c6e69e84968 systemd[1]: Detected architecture x86-64.
Sep 16 15:39:32 7c6e69e84968 systemd[1]: Failed to install release agent, ignoring: File exists
Sep 16 15:39:57 7c6e69e84968 systemd[1]: Failed to register match for Disconnected message: Connection timed out
Sep 16 15:40:22 7c6e69e84968 systemd[1]: Failed to register match for Disconnected message: Connection timed out
Sep 16 15:40:22 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:22 7c6e69e84968 yum[201]: Updated: systemd-219-19.el7_2.13.x86_64
Sep 16 15:40:23 7c6e69e84968 systemd[1]: Configuration file /usr/lib/systemd/system/ebtables.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Sep 16 15:40:23 7c6e69e84968 systemd[1]: Configuration file /usr/lib/systemd/system/ebtables.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Sep 16 15:40:23 7c6e69e84968 systemd[1]: Reloading.
Sep 16 15:40:23 7c6e69e84968 yum[201]: Installed: ebtables-2.0.10-13.el7.x86_64
Sep 16 15:40:24 7c6e69e84968 yum[201]: Installed: systemd-python-219-19.el7_2.13.x86_64
Sep 16 15:40:24 7c6e69e84968 yum[201]: Installed: grubby-8.28-17.el7.x86_64
Sep 16 15:40:24 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:24 7c6e69e84968 yum[201]: Installed: ssmtp-2.64-14.el7.x86_64
Sep 16 15:40:24 7c6e69e84968 yum[201]: Installed: libselinux-python-2.2.2-6.el7.x86_64
Sep 16 15:40:24 7c6e69e84968 yum[201]: Installed: python-slip-0.4.0-2.el7.noarch
Sep 16 15:40:24 7c6e69e84968 yum[201]: Installed: python-slip-dbus-0.4.0-2.el7.noarch
Sep 16 15:40:25 7c6e69e84968 systemd[1]: Configuration file /usr/lib/systemd/system/ebtables.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Sep 16 15:40:25 7c6e69e84968 systemd[1]: Reloading.
Sep 16 15:40:25 7c6e69e84968 systemd[1]: Configuration file /usr/lib/systemd/system/ebtables.service is marked executable. Please remove executable permission bits. Proceeding anyway.
Sep 16 15:40:25 7c6e69e84968 yum[201]: Installed: firewalld-0.3.9-14.el7.noarch
Sep 16 15:40:25 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:26 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:27 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:28 7c6e69e84968 yum[201]: Installed: linux-firmware-20150904-43.git6ebf5d5.el7.noarch
Sep 16 15:40:29 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:30 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:31 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
Sep 16 15:40:32 7c6e69e84968 systemd[1]: Looping too fast. Throttling execution a little.
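
The telltale lines in a log like this are the yum "Updated: systemd..." entries. A rough sketch of pulling them out of journal text, assuming the line format shown above; the function name systemd_updates is hypothetical:

```shell
# List systemd packages that yum updated, given journal text on stdin.
# systemd_updates is a hypothetical helper name.
systemd_updates() {
    # Matches lines such as
    #   "... yum[201]: Updated: systemd-219-19.el7_2.13.x86_64"
    # and prints only the package string. "Installed:" lines are ignored,
    # since a fresh install does not replace the running systemd.
    sed -n 's/.*yum\[[0-9]*\]: Updated: \(systemd[^ ]*\)$/\1/p'
}
```

If this prints anything for a running container, systemd was replaced underneath PID 1 during the transaction.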

Some references that could be helpful:

@geerlingguy
Owner Author

Steps to reproduce locally:

  1. docker run --detach --volume=$(pwd):/etc/ansible/roles/role_under_test:ro --privileged --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro geerlingguy/docker-centos7-ansible:latest /usr/lib/systemd/systemd
  2. docker exec --tty [id] env TERM=xterm journalctl --no-pager
  3. docker exec --tty [id] env TERM=xterm yum install -y fail2ban
  4. docker exec --tty [id] env TERM=xterm journalctl --no-pager
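
The steps above can be wrapped in a couple of shell functions so they are easy to re-run. This is just a sketch: DRY_RUN and the function names run, start_container and install_and_compare are illustrative and not part of the original steps.

```shell
# Sketch of the reproduction steps as shell functions.
# DRY_RUN, run, start_container and install_and_compare are illustrative names.
IMAGE="geerlingguy/docker-centos7-ansible:latest"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: start the container with systemd as PID 1.
# On a real run, docker prints the new container's ID on stdout.
start_container() {
    run docker run --detach \
        --volume="$(pwd)":/etc/ansible/roles/role_under_test:ro \
        --privileged --volume=/sys/fs/cgroup:/sys/fs/cgroup:ro \
        "$IMAGE" /usr/lib/systemd/systemd
}

# Steps 2-4: dump the journal, install fail2ban, dump the journal again.
install_and_compare() {
    id="$1"
    run docker exec --tty "$id" env TERM=xterm journalctl --no-pager
    run docker exec --tty "$id" env TERM=xterm yum install -y fail2ban
    run docker exec --tty "$id" env TERM=xterm journalctl --no-pager
}
```

A run would then be: id=$(start_container) && install_and_compare "$id". Setting DRY_RUN=1 first prints the docker commands instead of executing them.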

@geerlingguy
Owner Author

Unfortunately, it looks like for many people the fix was simply waiting for a newer systemd release (past 219). But on CentOS / EPEL with fail2ban, 219 is the current release, and it might be some time (if ever) before a newer systemd version comes out.

So... what to do?

@geerlingguy
Owner Author

It's not just the fail2ban installation; apparently just running yum update breaks systemd inside the container. After running docker exec --tty [id] env TERM=xterm yum update -y, I got the same issues and systemd was effectively broken.

@geerlingguy
Owner Author

Likely because systemd is PID 1 inside the container. Anyway, I have to mothball this for now and will take it up again later. Not sure where to go with it unless/until CentOS updates the repo upstream.

Maybe I can do a yum update in my downstream docker container so it's already running the latest systemd version on first boot?

@geerlingguy
Owner Author

@geerlingguy
Owner Author

The above seems to have fixed the issue locally. Will re-run latest tests on Travis too.

@geerlingguy
Owner Author

To summarize: the problem was that systemd was out of date, and since it was updated alongside fail2ban (while the container was running, with systemd as PID 1), systemd started misbehaving and the "Ensure fail2ban is running" task got stuck.

To fix it, I made sure yum update runs as part of the Docker image build, so the newer systemd version is already installed on the fresh image. That way, when a container from that image is run in this test, yum install no longer pulls in a new systemd version and breaks systemctl operations.
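
As a rough illustration of that fix, a Dockerfile with the update baked into the build could be generated like this. The base image (centos:7), the function name print_dockerfile and the exact RUN line are assumptions for the sketch; the real image's Dockerfile is not shown in this thread.

```shell
# Print a Dockerfile that applies package updates at build time.
# The base image and the RUN line are assumptions for illustration only.
print_dockerfile() {
    cat <<'EOF'
FROM centos:7
# Update everything (including systemd) while building the image, so a later
# `yum install` inside a running container has no systemd update to apply.
RUN yum -y update && yum clean all
EOF
}
```

Something like print_dockerfile > Dockerfile followed by docker build on that directory would then produce an image where systemd is already current before it ever runs as PID 1.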

@Ranjandas

@geerlingguy Thanks a ton. This saved me.

@geerlingguy
Owner Author

Lol, this just saved me again on a separate role. I should really search my own issue queues for these weird problems before spending another 3 hours debugging the problem!
