On container creation: Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms) #3808

Closed
nefelim4ag opened this issue May 29, 2020 · 22 comments

Comments

@nefelim4ag

nefelim4ag commented May 29, 2020


BUG REPORT INFORMATION

Description
We moved to CRI-O 1.17.2 on our clusters.
OS: latest Ubuntu 18.04
K8s: 1.16.8

Since then we sometimes experience various errors from the D-Bus daemon, like these:

May 29 12:11:22 K8S-VA1 dbus-daemon[1516]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
May 29 12:11:47 K8S-VA1 dbus-daemon[1516]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
May 29 12:12:12 K8S-VA1 dbus-daemon[1516]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
May 29 12:12:37 K8S-VA1 dbus-daemon[1516]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)

systemctl reload dbus helps for some time, because it drops all dbus connections, IIRC.

I also see some messages about limits, so I raised them in the hope that this would help:

<busconfig>
  <limit name="max_connections_per_user">2048</limit>   
  <limit name="max_replies_per_connection">2048</limit>
</busconfig>
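
For completeness, a minimal sketch of how such an override can be applied (assuming the snippet lives in /etc/dbus-1/system-local.conf; the include path and the required DOCTYPE header can vary by distribution):

# drop the <busconfig> limits above into /etc/dbus-1/system-local.conf,
# then reload dbus so the new limits take effect
systemctl reload dbus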

I currently think this happens because too many pods try to start at the same time, i.e. we mostly experience it when updating workloads.

Maybe you can point me to where to look?

Steps to reproduce the issue:
(Will be added later, once I have reliable steps to reproduce it)

Thanks!

@mrunalp
Member

mrunalp commented May 29, 2020

@nefelim4ag Can you check if there are any leaks in the number of units/scopes with systemctl list-units?

@nefelim4ag
Author

nefelim4ag commented Jun 1, 2020

@mrunalp
You mean something like this?

K8S-VA1:~# crictl ps | wc -l
242
K8S-VA1:~# systemctl list-units | grep scope | wc -l
843
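
A quick sketch for comparing the total number of crio scopes to the failed ones (systemctl accepts unit-name patterns):

# all crio container scopes known to systemd
systemctl list-units --all --no-legend 'crio-*.scope' | wc -l
# only the failed ones
systemctl list-units --all --no-legend --state=failed 'crio-*.scope' | wc -l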

@haircommander
Member

@nefelim4ag I have a suspicion that many of your units are inactive and could be cleaned up.
What's the output of systemctl list-units --all | grep crio | grep inactive ?

For context, we are working on having systemd do this automatically (by setting the CollectMode to inactive-or-failed), but we need to do some branching on the systemd version to make sure the option is available. You can follow that work here.
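
For illustration only (this is not the CRI-O code path), the property in question can be set on a transient unit at creation time, e.g. with systemd-run; CollectMode requires systemd >= 236:

# transient scope that systemd garbage-collects even if it ends up failed
systemd-run --scope -p CollectMode=inactive-or-failed /bin/true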

@nefelim4ag
Author

nefelim4ag commented Jun 2, 2020

@haircommander all of them are in the failed state:

K8S-VA1:~# systemctl status crio-fd525157e9bc5c56c9b89d50d5cefa768577e215a9b2a07b7aa87248087d9547.scope
● crio-fd525157e9bc5c56c9b89d50d5cefa768577e215a9b2a07b7aa87248087d9547.scope - libcontainer container fd525157e9bc5c56c9b89d50d5cefa768577e215a9b2a07b7aa87248087d9547
   Loaded: loaded (/run/systemd/transient/crio-fd525157e9bc5c56c9b89d50d5cefa768577e215a9b2a07b7aa87248087d9547.scope; transient)
Transient: yes
   Active: failed (Result: resources)
      CPU: 0

i.e. I don't have any inactive units :)

@haircommander
Member

Ah! Per https://www.freedesktop.org/software/systemd/man/systemd.unit.html, CollectMode defaults to inactive, and failed scopes are not cleaned up. Changing the CollectMode would solve the leak.
(I should have asked for failed scopes, but we arrived at the same conclusion despite my being confused 😃)

@nefelim4ag
Author

nefelim4ag commented Jun 3, 2020

@haircommander, can you please explain why those units lead to problems with D-Bus?
And can I fix that somehow?
i.e. write a simple service that collects/deletes the failed units?

@mrunalp
Member

mrunalp commented Jun 4, 2020

@nefelim4ag Can you try running systemctl reset-failed and see if the failed units get cleaned up? Meanwhile, we will fix this in CRI-O by setting the CollectMode for the runc containers.
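
As a stopgap until that lands, a periodic sweep of failed units is one option (a sketch, not an official recommendation; the unit name is arbitrary):

# clear failed units every 5 minutes via a transient timer
systemd-run --on-calendar='*:0/5' --unit=reset-failed-sweep /bin/systemctl reset-failed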

@nefelim4ag
Author

@mrunalp It doesn't help, i.e. we recreated many containers and got this again:

Jun 04 16:53:53 P20B-SR7-1R1-K8S-VA6 dbus-daemon[1620]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Jun 04 16:54:18 P20B-SR7-1R1-K8S-VA6 dbus-daemon[1620]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Jun 04 16:54:43 P20B-SR7-1R1-K8S-VA6 dbus-daemon[1620]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Jun 04 16:55:08 P20B-SR7-1R1-K8S-VA6 dbus-daemon[1620]: [system] Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Jun 04 13:38:11 P20B-SR7-1R1-K8S-VA6 crio[1661]: time="2020-06-04 13:38:11.072010439+03:00" level=info msg="exec'd [rm /tmp/rbd-remounter.health] in kube-system/rbd-remounter-gjdwk/rbd-remounter" id=e8c726d9-8265-4105-8240-17583cef186f
Jun 04 13:38:43 P20B-SR7-1R1-K8S-VA6 crio[1661]: time="2020-06-04 13:38:43.476211150+03:00" level=error msg="Container creation error: time=\"2020-06-04T13:38:18+03:00\" level=warning msg=\"signal: killed\"\ntime=\"2020-06-04T13:38:43+03:00\" level=error msg=\"container_linux.go:349: starting container process caused \\\"process_linux.go:297: applying cgroup configuration for process caused \\\\\\\"Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)\\\\\\\"\\\"\"\ncontainer_linux.go:349: starting container process caused \"process_linux.go:297: applying cgroup configuration for process caused \\\"Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)\\\"\"\n"

Running systemctl reset-failed does not resolve the problem.

@nefelim4ag
Author

nefelim4ag commented Jun 29, 2020

lnykryn/systemd-rhel#266

Looks like a bug in old systemd/dbus that was fixed in 242.
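
For reference, the running systemd version can be checked with:

systemctl --version | head -n1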

@haircommander
Member

Ahh, good find @nefelim4ag. Can we close this issue then?

@nefelim4ag
Author

@haircommander
Not tested yet; I will close this after testing the new systemd.

@nefelim4ag
Author

@haircommander
We updated our clusters to this systemd version: https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.41
It has the fix for the cookie overflow, but that does not fix the timeout problem.

@nefelim4ag
Author

#  dbus-send --system --dest=org.freedesktop.systemd1 --type=method_call --print-reply /org/freedesktop/systemd1 org.freedesktop.DBus.Introspectable.Introspect
Failed to open connection to "system" message bus: The maximum number of active connections for UID 0 has been reached
# dbus-monitor --system
signal time=1594289017.252996 sender=org.freedesktop.DBus -> destination=:1.4767 serial=2 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameAcquired
   string ":1.4767"
signal time=1594289017.253031 sender=org.freedesktop.DBus -> destination=:1.4767 serial=4 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=NameLost
   string ":1.4767"
signal time=1594289027.003428 sender=:1.3 -> destination=(null destination) serial=13 path=/org/freedesktop/Accounts/User1001; interface=org.freedesktop.Accounts.User; member=Changed
error time=1594289037.374962 sender=org.freedesktop.DBus -> destination=:1.8 error_name=org.freedesktop.DBus.Error.TimedOut reply_serial=4311
   string "Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)"
method call time=1594289037.375331 sender=:1.8 -> destination=org.freedesktop.systemd1 serial=4312 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=StartTransientUnit
   string "kubepods-burstable-pod14dede99_34ea_42f8_98fd_36097937b515.slice"
   string "replace"
   array [
      struct {
         string "Description"
         variant             string "libcontainer container kubepods-burstable-pod14dede99_34ea_42f8_98fd_36097937b515.slice"
      }
      struct {
         string "Wants"
         variant             array [
               string "kubepods-burstable.slice"
            ]
      }
      struct {
         string "Delegate"
         variant             boolean true
      }
      struct {
         string "MemoryAccounting"
         variant             boolean true
      }
      struct {
         string "CPUAccounting"
         variant             boolean true
      }
      struct {
         string "BlockIOAccounting"
         variant             boolean true
      }
      struct {
         string "DefaultDependencies"
         variant             boolean false
      }
      struct {
         string "MemoryLimit"
         variant             uint64 2151677952
      }
      struct {
         string "CPUShares"
         variant             uint64 870
      }
      struct {
         string "CPUQuotaPerSecUSec"
         variant             uint64 1050000
      }
   ]
   array [
   ]
method call time=1594289047.602720 sender=:not.active.yet -> destination=org.freedesktop.DBus serial=1 path=/org/freedesktop/DBus; interface=org.freedesktop.DBus; member=Hello
error time=1594289047.602736 sender=org.freedesktop.DBus -> destination=:not.active.yet error_name=org.freedesktop.DBus.Error.LimitsExceeded reply_serial=1
   string "The maximum number of active connections for UID 0 has been reached"
error time=1594289062.384765 sender=org.freedesktop.DBus -> destination=:1.8 error_name=org.freedesktop.DBus.Error.TimedOut reply_serial=4312
   string "Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)"
method call time=1594289062.385155 sender=:1.8 -> destination=org.freedesktop.systemd1 serial=4313 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=StartTransientUnit
   string "kubepods-burstable-pod1f4c3e81_e98e_4523_bd88_734405c91afd.slice"
   string "replace"
   array [
      struct {
         string "Description"
         variant             string "libcontainer container kubepods-burstable-pod1f4c3e81_e98e_4523_bd88_734405c91afd.slice"
      }
      struct {
         string "Wants"
         variant             array [
               string "kubepods-burstable.slice"
            ]
      }
      struct {
         string "Delegate"
         variant             boolean true
      }
      struct {
         string "MemoryAccounting"
         variant             boolean true
      }
      struct {
         string "CPUAccounting"
         variant             boolean true
      }
      struct {
         string "BlockIOAccounting"
         variant             boolean true
      }
      struct {
         string "DefaultDependencies"
         variant             boolean false
      }
      struct {
         string "MemoryLimit"
         variant             uint64 1742733312
      }
      struct {
         string "CPUShares"
         variant             uint64 665
      }
      struct {
         string "CPUQuotaPerSecUSec"
         variant             uint64 750000
      }
   ]
   array [
   ]
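
One way to see how many connections root currently holds on the system bus (a sketch; busctl ships with systemd, and the column layout may differ between versions):

# count system-bus connections owned by root
busctl list --no-legend | awk '$4 == "root"' | wc -l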

@haircommander
Member

If you set the max connections to something really big (100000 or so), does this still happen? I wonder if we're still leaking, or if your container workload really needs > 2048 connections.

@haircommander
Member

Unrelated aside: why are your k8s and CRI-O versions not in sync? Usually users should match minor versions (CRI-O 1.17.z with k8s 1.17.z).

@nefelim4ag
Author

@haircommander, because when we moved to CRI-O we were not yet ready for a K8s update, and I think of version compatibility like the matching in the go/python k8s client libraries, i.e. in my opinion the client must be at least at the cluster version - so I'm not afraid of issues with that.

Does that really matter in this case with the sessions?

@nefelim4ag
Author

nefelim4ag commented Jul 9, 2020

@haircommander
You mean this?

<busconfig>
<limit name="max_connections_per_user">2048</limit>
</busconfig>

On Ubuntu the default value was 256; I've tried 2048 on one node (I assume at least 1-3 connections per container in a bad scenario; we have a pod limit of 500 per node, but we never reach it).

I want to ask: is 256 just a very small limit for CRI-O? i.e. do you assume that every modern Linux system has a limit of at least 2048 or so?

@haircommander
Member

Bumping max_connections_per_user is a sanity check for me at this point. No problem if you think 2048 is sufficient, but I'm mostly curious what happens with a preposterously large value.

Also, what version of runc are you using?
Are you using the systemd cgroup manager (the default in CRI-O)? Does this happen with cgroupfs?
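
If it helps, the configured cgroup manager can usually be read straight from the CRI-O config (a sketch; the option may be commented out in the default config, and the path can differ per installation):

# show which cgroup manager CRI-O is configured with
grep cgroup_manager /etc/crio/crio.conf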

I don't think it's related; I'm mostly curious why you didn't try CRI-O 1.16.z first.

@nefelim4ag
Author

nefelim4ag commented Jul 9, 2020

@haircommander
The latest runc from the CRI-O PPA:

~# apt-cache policy cri-o-runc 
cri-o-runc:
  Installed: 1.0.0-3~dev2
  Candidate: 1.0.0-3~dev2
  Version table:
 *** 1.0.0-3~dev2 500

Yep, we use the systemd cgroup manager.
No, I've never tried cgroupfs with CRI-O, and I'm not sure it's needed. Of course I never experienced such dbus problems with docker/cgroupfs, but I have seen different, more interesting issues :)

IIRC I did try 1.16 - that did not change the dbus behavior.

@nefelim4ag
Author

I think you can try to reproduce this behavior; I just use the deployment below as a test and scale it up/down several times (one time in my case):

---
#application/deployment.yaml 
apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 100
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 50m
            ephemeral-storage: 32Mi
            memory: 128Mi
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - node-name
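
To exercise it, something like the following (a sketch; node-name above is a placeholder for the target node's hostname):

kubectl apply -f application/deployment.yaml
kubectl -n default scale deployment nginx-deployment --replicas=0
kubectl -n default scale deployment nginx-deployment --replicas=100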

@nefelim4ag
Author

I tried setting 131768 as the connection limit; yep, it's harder to hit (takes more time), but still reproducible.

@nefelim4ag
Author

nefelim4ag commented Nov 20, 2020

Everything works on 20.04 (systemd 245.4-4ubuntu3.3), with k8s 1.18 & cri-o 1.18.
