journalctl slowness #246
Comments
@bmr-cymru, what do you think? Sounds OK to me, but I'm not up to speed enough on systemd yet to make any suggestions.
I think there are a few things we need to understand here:
Even so, we may need to take steps to detect this and do something. I'm very surprised, though, that --this-boot is so slow even with a large journal. Surely it doesn't need to read $everything to show the latest records? This doesn't feel like the Brave New World we were sold on with this newfangled logamathing. :-)
'journalctl is slow' seems to be a rather common meme...
So I did a quick show of hands at work regarding the sizes of /var/log/journal. Here's the result: I guess I definitely fall outside of the average. It might be because I never reinstall and always just yum update, so I've continually upgraded this box for about 2 years and 3 months. I'll try to profile these commands and raise BZs to get the systemd folks' opinion on this.
There are a few threads on systemd-devel discussing the problem, but so far not even a comment from a maintainer: http://lists.freedesktop.org/archives/systemd-devel/2013-September/013376.html
I added some simple command profiling in commit fd68a0c:
This allows you to easily see the most expensive commands sos is running, by command name or by plugin:
See Measuring command run times for the scripts to process the raw logs. Obviously RPM is the bigger problem for most systems right now; I have some changes planned to improve this somewhat. But even on my 8-week-old F20 install I'm seeing quite long run times in journalctl (~35s typically). This is with around 600M in /var/log/journal:
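As a rough illustration of the kind of post-processing those scripts might do, here is a minimal sketch that totals and ranks per-command run times. The one-entry-per-line "seconds command" log format is an assumption for the example, not the actual format sos writes.

```python
# Sketch: summarize per-command run times from a timing log.
# Assumes (hypothetically) one "elapsed_seconds command" entry per line,
# e.g. "35.2 journalctl --verify"; the real sos log format may differ.
from collections import defaultdict


def slowest_commands(lines, top=5):
    totals = defaultdict(float)
    for line in lines:
        secs, _, cmd = line.partition(" ")
        totals[cmd.split()[0]] += float(secs)  # group by command name
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:top]


log = [
    "301.0 journalctl --verify",
    "120.4 journalctl --all --this-boot --no-pager",
    "88.7 rpm -Va",
]
for cmd, secs in slowest_commands(log):
    print(f"{secs:8.1f}s  {cmd}")
```

Grouping by the first token gives the per-command view; dropping the `.split()[0]` would give the per-invocation view instead.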
On my FC20 system, "journalctl --verify" took well over a minute to run. Is it necessary, given that we just want to collect data for post-processing? I have been disabling this plugin for our uses today; it's just too heavy for a summary report.
It's one of the pieces the systemd folks asked us to include: https://bugzilla.redhat.com/show_bug.cgi?id=879619#c0 I've been keeping an eye on its cost ever since, with a view to making it optional. This and the additional package verification now done in the rpm plugin account for about 90% of the runtime of a default invocation on my test host. One option I'm considering is to introduce a global '--verify' command line option, which individual plugins can then test to decide whether or not to run costly verification actions.
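The proposed gate could look something like the following. This is a minimal self-contained sketch of the idea, not the actual sos plugin API; the class names and the `options` dict accessor are illustrative assumptions.

```python
# Sketch of a global --verify gate: a plugin checks the option before
# queuing costly verification commands. Illustrative, not the sos API.
class Plugin:
    def __init__(self, options):
        self.options = options   # parsed command-line options (assumed dict)
        self.commands = []       # commands queued for collection

    def add_cmd_output(self, cmd):
        self.commands.append(cmd)


class Systemd(Plugin):
    def setup(self):
        # Cheap collection always runs.
        self.add_cmd_output("journalctl --all --this-boot --no-pager")
        # Expensive verification only runs when explicitly requested.
        if self.options.get("verify"):
            self.add_cmd_output("journalctl --verify")


p = Systemd({"verify": False})
p.setup()
print(p.commands)  # the expensive verify pass is skipped by default
```

With the switch defaulting to off, the common case pays nothing, while users who need the integrity check can opt in.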
I had almost 4 GB of journal files on my system, so a verify operation took about 10 minutes. I changed my settings so that I only keep 2 weeks of data, and a maximum of 200 MB.
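For reference, limits like those described above can be set in journald's configuration; `SystemMaxUse` and `MaxRetentionSec` are the relevant documented options, though the exact values here are just the ones the commenter mentions.

```ini
# /etc/systemd/journald.conf — cap the persistent journal:
# keep at most ~2 weeks and 200 MB of data.
[Journal]
SystemMaxUse=200M
MaxRetentionSec=2week
```

After editing, restart systemd-journald for the limits to take effect.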
That's similar to the results Michele was seeing with a 4.1GB journal. In some ways we have a bit of a get-out here, in that only Fedora currently enables the on-disk journal (so RHEL7 won't see this problem). With the numbers I'm hearing, though, I'm more inclined to work towards a --verify switch (default off) for 3.2.
Would be great to get some feedback on how this behaves now with and without the new
Ping, anyone had a chance to test this? I'll try to update the f21 / rawhide builds this week for anyone testing on Fedora or RHEL7.
Hi Bryn, apologies for the late reply (moving countries is one big PITA). I like the --verify approach: those that need that level of assurance can specify it, and the default case is much more reasonable. Here are my results: If you agree, I'd say we can close this one? cheers,
2 minutes is still a long time to wait for sosreport to finish |
We're improving; journalctl is no longer the worst offender in my tests but I'd like to see some more general improvement before we get 3.2 out. |
I think the current state's reasonably acceptable here; the longer runtimes are only seen on systems with huge journals or when running with
We currently collect the following in the systemd plugin:
```python
self.add_cmd_output("journalctl --verify")
self.add_cmd_output("journalctl --all --this-boot --no-pager")
self.add_cmd_output("journalctl --all --this-boot --no-pager -o verbose")
```
At least on my F20 system, which has a 4.1G /var/log/journal directory, this takes:
- ~5 min for `journalctl --verify`
- ~2 min for `journalctl --all --this-boot --no-pager > /tmp/foo`
- ~2 min for `journalctl --all --this-boot --no-pager -o verbose > /tmp/foo`
Without the systemd plugin, the full sosreport run takes 1 min 40 s.
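The per-command numbers above can be reproduced with a small stdlib timing harness along these lines; the commands listed are the ones from this report, and the helper itself is just an illustration.

```python
# Sketch: time each journalctl invocation the plugin runs, to reproduce
# the per-command figures above. Pure stdlib; output is discarded so the
# measurement reflects journalctl itself, not terminal rendering.
import subprocess
import time


def time_cmd(cmd):
    start = time.monotonic()
    subprocess.run(cmd, shell=True,
                   stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL)
    return time.monotonic() - start


for cmd in ("journalctl --verify",
            "journalctl --all --this-boot --no-pager"):
    print(f"{time_cmd(cmd):7.1f}s  {cmd}")
```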
Two thoughts on this:
If this has been discussed before and consensus was already reached, feel free to close this one out ;)