Boot time regressed 20 times after upgrading from 244.1 to 244.2 #14828
Version numbers 244.1 and 244.2 are meaningless outside of Arch Linux. They likely contain some number of back-ported changes. Please attempt to identify the specific commit that introduced the problem.
That's not true... since 243 the systemd-stable repository does tag .1 and .2 releases, specifically for this kind of bug report... that said, Arch might carry some specific patches...
Ah ha, that's a nice improvement. It really deserves an announcement on the mailing list.
@floppym - after a bit of git bisect and a few reboots, here's the git bisect result. After rebooting so many times, I think I can hear the HDD spinning when I get the 30-40 second boot times, vs. no noise when I get 3-4 seconds. I am not running SecureBoot (or so I understand this output): I was a bit unsure whether I had messed up the git bisect, so I checked out 4c2d72b (the commit before 84c0487), rebuilt, installed it, and rebooted three times on that commit. I got a 3 second boot time consistently on it. I did the same with 84c0487 and got roughly 40 second boot times on 3 out of 3 reboots. So it does indeed seem to be the commit that introduced the slower boot.
Colleagues, do you have any estimate for when this will be resolved? The original author is not alone here, and honestly the author has already done a fair amount of groundwork that you can at least build on.
so LVM takes 20s to initialize? can you reproduce without LVM in the mix?
somehow the bisect doesn't look right to me, it's unlikely that one would cause any slowdown.
I don't use LVM at all, in any form. It is present in the critical chain, but I can't imagine how it could influence my situation. I can try to bisect myself; I haven't been able to find the time for it at all, but it seems this needs to be done. I'll report my findings in parallel, but I need a couple of days to find spare time.
So, here we are: c7d26ac (actually, exactly the same as for the original author). Can the following help somehow?
@poettering - is there any additional info you would suggest gathering from me or @stellarator? The EFI-related changes do look odd as the root cause of this regression, and I recall doing the bisect twice before posting. Glad to see confirmation from stellarator on the same commit too. It does look like the original commit (c7d26ac) is also associated with issue #14864. @openmindead FYI. I know very little about EFI and am not sure how to debug this further. I am staying on a previous systemd that does not include this change. Instead of investing the time to compile it with the commit reverted whenever I need to update systemd, I'd rather invest that time in helping investigate and resolve this. Please do let me know if there is anything else I can try to help narrow this down. Thank you!
The associated issue is about a bug in the test suite (which I should fix), but it is not related. I've been staring at my patch for the past half hour, but for the life of me I cannot figure out why it would have caused this regression. Most of it is just moving functions around. The only functional change is that, before reading the options, we now check whether SecureBoot is false. Could it be that reading the SecureBoot EFI variable is slow? Any time we read an option from the kernel cmdline, we check this variable first.
I think this is just fall-out from the behaviour we work around here: i.e. the fix for this issue is simple: just cache the value once we read it, so that we don't keep hitting efivarfs all the time. Otherwise the kernel will ratelimit us...
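A minimal sketch of that caching idea in plain C; here `read_efi_bool()` is a hypothetical stand-in for an efivarfs read, not systemd's actual helper:

```c
#include <stdbool.h>

/* Hypothetical helper: reads a boolean EFI variable through efivarfs and
 * returns 1, 0, or a negative errno. Name and signature are illustrative,
 * not systemd's actual API. */
int read_efi_bool(const char *name);

bool secure_boot_enabled(void) {
        static int cache = -1;          /* -1 means "not read yet" */

        if (cache < 0)
                cache = read_efi_bool("SecureBoot") > 0;

        /* Later calls return the cached value and never touch efivarfs
         * again, so the kernel's rate limit is hit at most once. */
        return cache;
}
```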
Also see #15598 |
Ah, so we're just more likely to hit this kernel rate limit now because we're doing more reads. Do we generically want to cache all EFI variables? UEFI makes a distinction between immutable and mutable variables. Generic caching seems to make more sense, so we don't run into these rate limits again in the future.
EFI variable access is nowadays subject to rate limiting by the kernel. Thus, let's cache the results of checking them, in order to minimize how often we access them. Fixes: systemd#14828
Fix waiting in #15627 |
Would be fantastic if anyone could test if #15627 (or git master) fixes this issue properly. |
@poettering thank you for looking into this. I cherry-picked the change onto stable and tested. Boot time is now 20 seconds in user space. That is a significant improvement over the 40 second user-space start-up I was getting before, but it is still a huge increase from the 2-3 seconds I used to have. I will try repeating this a few more times and report back if I see additional differences. EDIT: I have been running the patch since yesterday and it consistently lands very close to 20 seconds in user space. This is an improvement over the original regression, but hopefully there is still a way to get back to 2-3 seconds.
Seems like we should still keep this issue open then, as the regression is still 10x.
@poettering I had another idea for getting rid of this EFI variable read during boot: how about we move the reading of the options into the EFI stub? The EFI stub already has the "only do this if SecureBoot is disabled" logic (see https://github.com/systemd/systemd/blob/master/src/boot/efi/stub.c#L66-L76), which I duplicated into systemd and which is what made things slower here. So we get the "don't do that unless Secure Boot is off" check for free, as far as I can see.
Tested #15627 today, and at least it makes my system boot correctly, and makes it usable again (I was down to almost 10 minutes for some devices to be detected, now it's back to normal). |
Quoting systemd#14828 (comment):
> [kernel uses] msleep_interruptible() and that means when the process receives
> any kind of signal masked or not this will abort with EINTR. systemd-logind
> gets signals from the TTY layer all the time though.
> Here's what might be happening: while logind reads the EFI stuff it gets a
> series of signals from the TTY layer, which causes the read() to be aborted
> with EINTR, which means logind will wait 50ms and retry. Which will be
> aborted again, and so on, until quite some time passed. If we'd not wait for
> the 50ms otoh we wouldn't wait so long, as then on each signal we'd
> immediately retry again.
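The retry pattern described in that quote, sketched in C purely for illustration (this is not the actual systemd or kernel code): a fixed 50ms sleep on every EINTR adds up quickly when signals keep arriving.

```c
#include <errno.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Illustrative only: retry a read() that keeps getting interrupted.
 * With a 50ms sleep per EINTR, a steady stream of signals turns one
 * logical read into many multiples of 50ms of wall-clock time. */
static ssize_t read_with_sleepy_retry(int fd, void *buf, size_t count) {
        for (;;) {
                ssize_t n = read(fd, buf, count);
                if (n >= 0 || errno != EINTR)
                        return n;

                /* The problematic part: wait 50ms before retrying. Retrying
                 * immediately on EINTR instead avoids the accumulated delay. */
                struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };
                nanosleep(&ts, NULL);
        }
}
```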
Tested #15986; here is the result: efi-log.txt
It looks like we should merge #15986, but also consider adding a global cache. |
Summary: There is a known issue upstream where boot time regressed due to an API breakage in the kernel and a code change in systemd (systemd/systemd#14828). You can track the root cause in the linked issue. In FB this manifests as SMI increases when Chef runs. So far the patches upstream don't fully resolve the problem, but reverting the blamed commit (systemd/systemd@c7d26ac) does. Reviewed By: NaomiReeves Differential Revision: D21896056 fbshipit-source-id: fa8f0b515ac9b06b7bc376c1123f0bd0b6d49848
The problem with #15627 is that it's a per-process cache, and as reported in #16097 a bunch of processes each try to parse the options themselves. @arianvp :
Why not systemd PID 1 itself, but rather early boot? See my other suggestion about caching the kernel command line somewhere.
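One way such an early-boot cache could look, as a rough sketch; the file path and helper names here are invented for illustration and are not what systemd actually uses:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, purely for illustration: a cache file written once in
 * early boot, and an expensive helper that actually goes through efivarfs. */
#define EFI_OPTIONS_CACHE "/run/efi-options.cache"
int read_efi_options(char **ret);

/* Early boot, done exactly once: read the EFI variable and persist it. */
int cache_efi_options(void) {
        char *opts = NULL;
        if (read_efi_options(&opts) < 0)
                return -1;

        FILE *f = fopen(EFI_OPTIONS_CACHE, "w");
        if (!f) {
                free(opts);
                return -1;
        }
        fputs(opts, f);
        free(opts);
        return fclose(f);
}

/* Everyone else: read the cheap cached copy instead of touching efivarfs. */
char *get_cached_efi_options(void) {
        char buf[4096];
        FILE *f = fopen(EFI_OPTIONS_CACHE, "r");
        if (!f)
                return NULL;
        char *r = fgets(buf, sizeof buf, f) ? strdup(buf) : NULL;
        fclose(f);
        return r;
}
```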
With all those fixes merged now, including the latest (@filbranden's #16139), where are we now with this? Do people still see slowdowns? (would be great to get a new log output with the time measurements with everything merged) |
I compiled 246.0 from master (6fe95d3) and ran with debug logging. Here is the grep for EFI variable accesses: efi-vars.txt
The Linux kernel fixes for this issue have made a step forward: they were accepted by the EFI maintainer and should appear in the linux-next tree tomorrow (when the next-20200616 tag is created).
…operty call Apparently some clients use GetAll() to acquire all properties we expose at once (gdbus?). Given that EFI variables are no longer cheap, due to recent kernel changes, let's simply mark them so that they are not included in GetAll(). This is an API break by some level, but I don't see how we can fix this otherwise. It's not our own API break after all, but the kernel's, we just propagate it. Yes, sucks. See: systemd#14828
@oleastre So, in a build with all of #16139 and also #16190, are we back at good performance? Your output only showed one slow variable access left. #16190 is an experiment; we might not need it if we cache more. I just want to verify that it is caused by some client issuing the D-Bus GetAll() property call on our interface, which makes us acquire everything we know, which means going to the EFI variables. That was fast before, but due to the kernel compat borkage it is now extremely slow in some cases...
The build I tested only included #16139; I did not try #16190. @poettering, do you want a trace with that one?
Yes, please. |
@poettering - I am happy to report that running systemd from the tip of master (a51a324) plus a cherry-pick of #16190 has brought my boot times back to the 244.1 numbers. In case it helps, here are the debug dmesg logs: dmesg.debug.txt Here are the times with the debug flag set. Slightly up from 244.1, but probably because of the extra console I/O due to the debug messages. And without the debug flag set: almost exactly what it was back then. Thank you!
With this we are now caching all EFI variables that we expose as properties in logind. Thus a client invoking GetAllProperties() should only trigger a single read of each variable, never repeated ones. Obsoletes: systemd#16190 Fixes: systemd#14828
@andreesteve thanks for playing around with this. Based on your info it appears it's simply the EFI-backed D-Bus props of logind that triggered the problem. I prepped #16281 now, which adds caches for them. With that in place all EFI-backed D-Bus properties should be unconditionally cheap again, as they used to be, and the problems should go away. I am pretty sure this will fix the issue for good. Would of course be great if you could verify that. I will now close #16190, since I think #16281 is a much nicer, less invasive fix.
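On the logind side, a cached D-Bus property getter along these lines would make GetAll() cheap again. This is a sketch against the public sd-bus API; the getter, helper, and EFI variable name are chosen for illustration and are not necessarily what #16281 does:

```c
#include <systemd/sd-bus.h>

/* Hypothetical helper that reads a boolean EFI variable (the expensive part). */
int read_efi_bool(const char *name);

/* D-Bus property getter that reads the EFI variable at most once per process,
 * so a client calling GetAll()/GetAllProperties() never forces repeated
 * efivarfs reads. Property and variable names are illustrative. */
static int property_get_reboot_to_firmware(
                sd_bus *bus,
                const char *path,
                const char *interface,
                const char *property,
                sd_bus_message *reply,
                void *userdata,
                sd_bus_error *error) {

        static int cache = -1;

        if (cache < 0)
                cache = read_efi_bool("OsIndications") > 0;

        return sd_bus_message_append(reply, "b", cache);
}
```

The point is simply that the expensive efivarfs read happens at most once per process, no matter how often clients enumerate the properties.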
As suggested by: systemd/systemd#14828 (comment)
Quoting systemd/systemd#14828 (comment): > [kernel uses] msleep_interruptible() and that means when the process receives > any kind of signal masked or not this will abort with EINTR. systemd-logind > gets signals from the TTY layer all the time though. > Here's what might be happening: while logind reads the EFI stuff it gets a > series of signals from the TTY layer, which causes the read() to be aborted > with EINTR, which means logind will wait 50ms and retry. Which will be > aborted again, and so on, until quite some time passed. If we'd not wait for > the 50ms otoh we wouldn't wait so long, as then on each signal we'd > immediately retry again.
systemd version the issue has been seen with
Used distribution
Expected behaviour you didn't see
Unexpected behaviour you saw
Steps to reproduce the problem
Boot times & critical chain on 244.1
Boot times & critical chain on 244.2