systemd-journald: very large virt memory leak (heap) #11502
From the debian changelog:
Confirmed that rolling back to 240-3 fixes the issue. Looks like it might be unique to downstream distros (e.g. Debian, Red Hat), possibly related to https://salsa.debian.org/systemd-team/systemd/commit/ce0e48e43979e955df2413dca23c64088a729ed8 https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=918848
This fixes a crash where we would read the commandline, whose length is under control of the sending program, and then crash when trying to create a stack allocation for it. CVE-2018-16864 https://bugzilla.redhat.com/show_bug.cgi?id=1653855 The message actually doesn't get written to disk, because journal_file_append_entry() returns -E2BIG.
FWIW, it does seem that something is wrong, but I can't pinpoint the issue. I see growth of RSS and VIRT over time, and it doesn't seem to go back down. I don't think this is related to recent changes, though: I get very similar behaviour with git master, v240-stable~10, v239-stable~10, v238-stable~10, and finally with v238 (built with some work-arounds for the unrelated build issues). So right now I don't think that the recent patches are responsible. There was 3de8ff5 'journald: bump rate limits', which makes it easier to reach high memory use because there's less throttling. So maybe the problem was there before, but it's been exposed by the limit changes. @nyetwurk note that the debian package didn't restart journald on upgrade (I didn't check recently, but it was certainly true in the past), so it is possible that despite some upgrades you were in fact running a yet-older version. If you can reliably reproduce the issue, then please bisect it to a specific patch.
I'll also note that …
@keszybz Note that RSS is apparently not growing, so it might not be a memory leak that valgrind can detect; it may be memory fragmentation or some other mmap-related leak that only leaks VM address space. Regarding journald restarting: downgrading to deb 240-3 definitely fixes it, and reinstalling 240-4 definitely brings the problem back; there was no case where swapping between the two had no effect. Note that this is using deb packages, which do not track this upstream repo 1:1. Time permitting, I will try to narrow it down to a single deb patch.
Maybe ef30f7c?
I'm also suffering from this issue, also on 240-4 and also on Debian; thank you for trying to find out what is wrong. Everything worked fine previously. This started after the upgrade from 240-3 to 240-4. I've rebooted the machine now and am checking whether that fixes the issue.
I thought it was a candidate too, but it seems correct. Also, please note that people report it started occurring between -3 and -4 for them, which leaves 084eeb8 as the only good candidate. My thinking is that somehow allocating this string creates an allocation pattern that causes the address space to grow.
Reboot didn't fix the issue for me; every line being logged to logfiles seems to cause a memory leak. This is visible even by just executing … With … The workaround for now is having …
I just rebuilt Debian's systemd 240-4 with 084eeb8 removed, and I can still reproduce the problem:
It appears this is purely a distribution-backport-introduced leak. Has anyone reproduced it in vanilla upstream v240? If it's only a distro problem, we don't really have anything to do here.
deb 240-4 is fairly close to the vanilla 240 tag here. I still haven't had time to bisect or test specific Debian changes yet; I will add to this bug as I gather more information.
I recompiled the current HEAD of the vanilla systemd git, running inside an nspawn container, and I am able to reproduce the problem.
@shartge any chance you have time to run a bisect? I may abandon my deb-specific efforts and focus on trying to fix HEAD here instead.
I am bisecting as I write this. 5 more steps to go.
Here are my first results:
I tried a second time on a different machine, same results.
Confirmation: using current HEAD and reverting 2d5d2e0 fixes this problem for me.
But of course I have no idea what other problems this causes or will cause. I really don't know enough C to be of any more use here.
Now that we know the exact commit that caused it, let's just tag @keszybz and wait for a bugfix. Thanks for debugging this, guys 🙂
@shartge Thank you very much for your help with the bisect. @keszybz @poettering I suspect what's going on here just looks like a leak but is actually the process metadata cache exploding in size, because get_process_cmdline() always allocates a buffer of up to _SC_ARG_MAX bytes.
Arch Linux, v240.34-3, which is pretty much upstream commit f02b547; it also cherry-picks 8ca9e9 and ee0b9e. I have a virtual machine with 1 GB of memory, and this bug crashed the system.

After this occurred, if I typed in open ssh sessions, I saw the characters echoing, but it wouldn't execute any commands. Opening new ssh sessions got to the point where it would show me my last login, but then nothing else; I let it go for an hour before hard-rebooting the VM. Trying to log in from the console (web virtual console for the VM) never got to a shell prompt either. Can't remember if it asked for my password or gave me last login info. (Also let this go for an hour, at the same time as trying a new ssh.)

Juicy part of log below, full log after the problem here: That's the end of the log. Since I gave it an hour before rebooting, it should have synced, so I'm guessing it was just done at that point. Hopefully not only the leak can be fixed, but also whatever prevented it from gracefully restarting and crashed the system.

On 12/21/2018, I upgraded to 239.370-1 (3bf819c). I don't think it had this bug, because the system stayed up until 1/16/2019, when I upgraded to 240.34-3 (f02b547). It then took the system down in about 5 days.
…line Allocate new string as a return value and free our "scratch pad" buffer that is potentially much larger than needed (up to _SC_ARG_MAX). Fixes systemd#11502
@shartge I posted a PR in an attempt to fix this bug. I'd be super grateful if you could apply it on top of HEAD and run it against your bisect script.
Will do, as soon as I have access to my testbed again. But you can easily test this yourself; the reproducer is just running … and checking whether the number in the second column grows or stays the same. (Using …)
The heap does still grow with your change, but at a much, much slower rate. It still grows with every message processed, though; on a busy system this will still cause resource exhaustion after some time.
To put things in perspective:
vanilla HEAD: after 1000 iterations the heap is up to 2 GiB in size:
vanilla + #11527: after 2000 iterations we are only up to ~13 MiB:
It will keep growing, roughly 8 MiB per 2000 iterations.
Does the growth ever stop? |
Scratch that, I was looking at the wrong terminal. The test with your first attempt seems to have stabilized itself at … So after roughly 2^14 log messages we seem to hit an equilibrium.
The cache is defined as 16*1024 entries, which is exactly 2^14. Excellent guess!
JFTR: applying #11527 on top of Debian's 240-4 also fixes Bug#920018.
Fixes #11911 Systemd-journald would leak memory when recording process info. Add patch files from upstream systemd. Note that the patch from 2d5d2e0cc5 was taken as well in order to make the needed commit apply cleanly. Bug report: systemd/systemd#11502 Accepted patch: systemd/systemd#11527 Signed-off-by: Jonah Petri <jonah@petri.us> [Peter: add bz reference, add s-o-b to patches, drop numbering] Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
…st is restored (#10033) * Run the reset.sh helper while the tests on each suite are restored * Check test invariants during the restore of each test * Remove systemd-journald restart. This is removed because the bug systemd/systemd#11502 is already closed. Also the degraded test was failing because of this: Error: 2021-03-12 11:28:01 Error executing google:debian-sid-64:tests/main/degraded (mar121119-891844) : ----- + . /home/gopath/src/github.com/snapcore/snapd/tests/lib/systemd.sh + wait_for_service multi-user.target + local service_name=multi-user.target + local state=active ++ seq 300 + for i in $(seq 300) + grep -q ActiveState=active + systemctl show -p ActiveState multi-user.target + return + case "$SPREAD_SYSTEM" in + grep 'State: [d]egraded' + systemctl status State: degraded + echo 'systemctl reports the system is in degraded mode' systemctl reports the system is in degraded mode + systemctl --failed UNIT LOAD ACTIVE SUB DESCRIPTION ● systemd-journal-flush.service loaded failed failed Flush Journal to Persistent Storage * Restore the check for invariants as part of the prepare tests section. Now the invariant clean-up and check is done as part of the prepare and restore of each test.

Debian 240-4
VSS/VSIZE growing very rapidly without bound.
#9141
https://bugzilla.redhat.com/show_bug.cgi?id=1665931
-XX says it's heap
Grows without bound until it is restarted
More importantly, you can see it was fine until I "upgraded" systemd on or around Jan 14.