
High journald CPU usage #1162

Closed
ragnard opened this Issue Mar 8, 2016 · 19 comments

Comments


ragnard commented Mar 8, 2016

Summary

We're seeing the systemd-journald process consistently occupying ~100% of one CPU when running a Docker container that uses the journald log driver; the containerized application logs roughly ~50 messages/s at peak.

Setup

  • CoreOS-beta-899.5.0-hvm on eu-west-1 (ami-f961d48a)
  • Tried on t2.medium, c4.large, c4.xlarge and c4.2xlarge

Findings

We've straced the systemd-journald process and one thing that stood out was comparatively slow writev calls to /dev/console.

11:22:29.363557 open("/dev/console", O_WRONLY|O_NOCTTY|O_CLOEXEC) = 30 <0.000029>
11:22:29.363613 ioctl(30, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig -icanon -echo ...}) = 0 <0.000021>
11:22:29.363664 writev(30, [{"[71658.329457] ", 15}, {"docker", 6}, {"[804]: ", 7}, {"{\"@timestamp\":\"2016-03-08T11:22:29.361+00:00\",\"@version\":1,\"logger_name\":\"nocrm-bigquery-transform.core\",\"thread_name\":\"main\",\"level_name\":\"INFO\",\"message\":\"inserted batch with 1000 rows\"}", 188}, {"\n", 1}], 5) = 217 <0.003830>
11:22:29.367539 close(30)               = 0 <0.000021>

As you can see, that particular writev call took 0.003830 seconds, but across that trace we saw durations between 0.002328 and 0.055282 seconds for the same call.

We tried turning off ForwardToConsole using a drop-in, but were unsuccessful. We see that CoreOS sets this option to "yes" via the boot options (https://github.com/coreos/coreos-overlay/blob/master/coreos-base/oem-ec2-compat/files/grub-ec2.cfg#L7), and we're not sure whether that means it can't actually be overridden.

We resorted to setting TTYPath to /dev/null, at which point CPU usage dropped back down to ~1%.
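
For reference, the drop-in we experimented with looked roughly like this (a sketch; the file name is ours, and as noted the ForwardToConsole override did not appear to take effect, presumably because the systemd.journald.forward_to_console= kernel command line option takes precedence over the config file):

# /etc/systemd/journald.conf.d/10-console.conf (illustrative file name)
[Journal]
# What we tried first -- did not seem to stick with the boot option present:
ForwardToConsole=no
# What actually brought CPU usage back down -- forwarding still happens,
# but the writes go to /dev/null instead of the serial console:
TTYPath=/dev/null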

kayrus commented Mar 8, 2016

I have reproduced the issue inside a QEMU VM.

Let's use this unit file:

[Service]
TimeoutStartSec=10s
Type=notify
ExecStart=/bin/bash -c "i=0; while true; do ((i++)); echo Hello World $i; done"

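For reference, a minimal way to install and start this unit (assuming it is saved as hello.service, matching the journalctl output below; presumably the Type=notify + TimeoutStartSec=10s combination bounds each run to roughly ten seconds, since bash never sends READY=1):

sudo cp hello.service /etc/systemd/system/hello.service
sudo systemctl daemon-reload
sudo systemctl start hello.service   # reports a start timeout after ~10s; that window is the test run
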
And run it inside a qemu VM with the following /usr/share/oem/grub.cfg:

set linux_append="systemd.journald.forward_to_console=yes"

And then without it. The result without the grub hack is:

$ journalctl -n1 -u hello
Mar 08 14:25:26 coreos1 bash[757]: Hello World 384795

And with it:

$ journalctl -n1 -u hello
Mar 08 14:27:41 coreos1 bash[762]: Hello World 22080

I ran these tests several times, so the results are approximations.

Relates to this commit.

kayrus commented Mar 8, 2016

It's probably worth investigating on the systemd-journald / kernel side why it takes so much time to print logs.

vcaputo commented Mar 8, 2016

Yes, we turned on forwarding to the console on these providers to aid in troubleshooting situations when the machine is inaccessible, since there's no interactive console provided.

But the writev() calls should not be so costly; this suggests a console or serial driver issue on AWS.

We may have to shut it off if there's no obvious fix to make the console output less costly to perform.

kayrus commented Mar 8, 2016

@vcaputo it is reproducible even inside the qemu VM, so I don't think it is related to a serial driver issue on AWS.

vcaputo commented Mar 8, 2016

@kayrus What you reproduced in the qemu VM is that logging from a service goes slower when forwarding to console. But you haven't necessarily reproduced the high CPU usage that @ragnard reports correcting by wiring the console forwarding to /dev/null.

In your qemu VM there's a serial port being emulated and the data is getting output to the emulated monitor; this can block those writev() calls down in journald when it's forwarding to the console on a serial port.

Additionally, your qemu VM has to draw the console, so you're generating additional load on the system in general, which may also contend with the log-producing test and affect the results. Did you use SDL output? -nographic? Did you redirect the qemu console to a file or /dev/null?

What is strange in this issue (to me anyways) is the supposed high CPU usage. The writev() taking a relatively long time could simply be blocking on the emulated serial port, but I'm not clear on why that would also entail high CPU usage unless the driver is polling.

Addendum:
If changing the journald TTYPath to /dev/null lowered the CPU utilization substantially, it suggests that the serial driver (since /dev/console is a console over a serial port) is introducing substantial CPU utilization. As far as journald is concerned, when forwarding to the console, all the forwarding machinery is the same when you've only changed the TTYPath.

However, /dev/null is never going to block those writev() calls in any significant way, whereas the serial console is going to introduce some delays. Delays due to blocking are one thing; delays accompanied by high CPU utilization are another, and suggest there may be a deficiency in the serial port emulation.

vcaputo commented Mar 8, 2016

In any case, since journald implements the console forwarding synchronously we'll have to turn it off.

For what we intended, an asynchronous journal follower piping to the console would be more appropriate, something like how rsyslog consumes journals.
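
To illustrate the idea (a hypothetical sketch, not something journald provides today): a separate unit that follows the journal and writes to the console, so a slow console only delays the follower rather than journald itself.

# /etc/systemd/system/console-log-forwarder.service (hypothetical)
[Unit]
Description=Asynchronously forward journal messages to the console
After=systemd-journald.service

[Service]
# journalctl -f tails the journal; any blocking on /dev/console happens
# in this process instead of inside systemd-journald.
ExecStart=/bin/sh -c 'exec journalctl -f --no-pager > /dev/console'
Restart=always

[Install]
WantedBy=multi-user.target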

kayrus commented Mar 8, 2016

@vcaputo thanks. In my case I used -chardev socket,id=charmonitor,path=/home/user/coreos1.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 127.0.0.1:0. After the service was stopped, the qemu instance kept printing queued log messages for ~5 seconds, and the elevated CPU load lasted longer than in the run without forwarding.

ragnard commented Mar 8, 2016

@kayrus @vcaputo We just ran the following experiment on a node that experiences the reported issue:

  1. Create a service unit similar to what @kayrus did above, but with an (admittedly crude) limit on logs/s:

    [Service]
    TimeoutStartSec=10s
    Type=notify
    ExecStart=/bin/bash -c "i=0; while true; do ((i++)); sleep 0.005; echo Hello World $i; done"
    
  2. systemctl start logtest

  3. Observe CPU usage using top; it reaches ~60% during multiple test runs (one way to watch journald's usage specifically is sketched after this list).

  4. Change /etc/systemd/journald.conf to have:

    [Journal]
    TTYPath=/dev/null
    
  5. systemctl restart systemd-journald

  6. Observe CPU usage using top; it reaches ~4% during multiple test runs.
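
For reference, one way to watch journald's CPU usage specifically rather than scanning top interactively (a sketch; assumes pidof resolves the daemon by name):

# batch-mode top, refreshed every second, limited to the journald PID
top -b -d 1 -p "$(pidof systemd-journald)"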

On a side note, we were unable to set TTYPath=/dev/null through a drop-in.

kayrus commented Mar 9, 2016

Quick workaround:

#cloud-config
write_files:
  - path: "/etc/systemd/journald.conf.d/10-override-tty.conf"
    permissions: "0644"
    owner: "root"
    content: |
      [Journal]
      # disable console output entirely
      # TTYPath=/dev/null
      # https://www.freedesktop.org/software/systemd/man/journald.conf.html#MaxLevelStore=
      MaxLevelConsole=crit
coreos:
  units:
    - name: systemd-journald.service
      command: restart
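
On a host that is already running, the same drop-in can be applied by hand (a sketch of the equivalent steps):

sudo mkdir -p /etc/systemd/journald.conf.d
printf '[Journal]\nMaxLevelConsole=crit\n' | sudo tee /etc/systemd/journald.conf.d/10-override-tty.conf
sudo systemctl restart systemd-journald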

vcaputo referenced this issue Mar 9, 2016: Asynchronous forwarding options for journald #2815 (open)
vcaputo commented Mar 9, 2016

You may also set MaxLevelConsole to something like "crit" or "emerg" to reduce its utilization without completely losing this troubleshooting facility, if it's something you value.

kayrus commented Mar 9, 2016

@vcaputo thanks. updated.

vcaputo commented Mar 9, 2016

@kayrus no, thank you sir. 👍

joe1chen commented Mar 28, 2016

Thanks to both @kayrus and @vcaputo for helping us work around this issue. This problem combined with kubernetes/kubernetes#8620 created 50%+ cpu load, which immediately dropped to almost nothing once I set MaxLevelConsole=notice.

brycechesternewman commented Mar 30, 2016

Is there a way to change this permanently when oem-cloudinit.service runs?
For example...

write_files:
  - path: /etc/systemd/journald.conf
    permissions: 0644
    owner: root
    content: |
      [Journal]
      MaxLevelConsole=crit

This seems to have no effect, and /etc/systemd/journald.conf still has the commented-out default #MaxLevelConsole=info.

crawford commented Mar 30, 2016

@brycechesternewman cloudinit runs way too late to configure this. You can use Ignition instead.
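
For example, a minimal Ignition sketch that writes the same journald drop-in at first boot (assuming an Ignition release that accepts the 2.0.0 config spec; check the spec version your CoreOS image ships):

{
  "ignition": { "version": "2.0.0" },
  "storage": {
    "files": [{
      "filesystem": "root",
      "path": "/etc/systemd/journald.conf.d/10-override-tty.conf",
      "mode": 420,
      "contents": { "source": "data:,%5BJournal%5D%0AMaxLevelConsole%3Dcrit%0A" }
    }]
  }
}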

crawford added a commit to crawford/coreos-overlay that referenced this issue Apr 5, 2016

coreos-overlay: cease ForwardToConsole in journald
This was introduced to assist users in troubleshooting on cloud
providers without an interactive console but with read-only access to
console output.

However, due to how journald implements the forwarding (synchronously)
it can have a dramatic impact on the overall logging performance.

Fixes coreos/bugs#1162

crawford added this to the CoreOS 1011.0.0 milestone Apr 5, 2016

crawford self-assigned this Apr 6, 2016

tilgovi commented Apr 7, 2016

Does this mean that SSH host key fingerprints won't show up in the console output? That could make verifying host keys difficult.

tilgovi commented Apr 7, 2016

I think I am mistaken and the host keys are printed directly to the console, not through the journal. Apologies for the noise.

egoldschmidt commented Jun 4, 2016

We are running CoreOS 1010.4.0 on EC2 and just ran into this issue, symptom for symptom. We too found that @kayrus's workaround resolves the problem immediately.

I see that coreos/coreos-overlay#1823 ought to help, and I can confirm that systemd.journald.forward_to_console=yes is absent from my grub.cfg, but shouldn't the journald config change be added to the stock distro to guarantee that other folks won't run into this ticking time bomb? At least until journald changes go in (though the linked issue doesn't look particularly active).

crawford commented Jun 20, 2016

@egoldschmidt it looks like this change was mistakenly left out of the non-Alpha builds. We'll be sure to include it in the next Stable.
