
systemd-logind - memory leak on SSH connections #8015

Closed

guilhermepiccoli opened this issue Jan 26, 2018 · 24 comments
Labels: bug 🐛 (Programming errors, that need preferential fixing), login, lxc/lxd

Comments

@guilhermepiccoli

Submission type

  • Bug report

systemd version the issue has been seen with

Upstream version (at commit 6cddc79).
It's also reproducible in much older versions; tested with 204 and 229 too.

Used distribution

Ubuntu 18.04 (Bionic) - I've built the upstream version of systemd on top of it.

Bug description

The systemd-logind tool exhibits a clear memory leak on SSH connections. On each SSH connection, some memory is allocated, and that memory remains allocated even after the SSH session disconnects - the code seems to be missing a free() somewhere.

I am aware of a case where this process was OOM-killed due to the huge memory footprint systemd-logind had accumulated after a few weeks of machine uptime.

Valgrind analysis showed many potential leaks in session creation routines:
valgrind_bionic.txt

Steps to reproduce the problem

To reproduce the issue and expose the memory leak, a simple SSH loop is enough:

while true; do ssh <hostname> "whoami" 1>/dev/null; done

A 5min run led to these results:

[graph: upstream_anon - anonymous memory of systemd-logind during the 5-minute run]

The raw numbers behind the graph:
upstreamAnon.txt

We measured the anonymous pages of systemd-logind based on the process's /proc/<pid>/smaps.
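For reference, a minimal sketch of that measurement (assuming a single systemd-logind process; it sums the per-mapping "Anonymous:" fields from smaps):

# total anonymous memory (kB) of systemd-logind, summed across mappings
pid=$(pidof systemd-logind)
awk '/^Anonymous:/ { sum += $2 } END { print sum " kB" }' "/proc/$pid/smaps"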

@yuwata added the "bug 🐛 (Programming errors, that need preferential fixing)" and "login" labels on Jan 27, 2018
@poettering
Member

Hmm, so the valgrind output was generated during normal runtime, when the process was abnormally terminated by SIGINT. It shows all memory still allocated at that time, which is different from leaked memory...

@guilhermepiccoli
Author

I think I understand what you're saying, and I disagree. Correct me if I've understood it wrongly:
you're saying that valgrind is measuring all the memory still allocated during the process's lifetime, and since we terminated the process in an abnormal way (SIGINT), that memory wasn't freed. The way you put it suggests that systemd-logind would free all of that memory on a regular program termination. Is my understanding of your statement right?

Well, the reason I consider this clearly wrong behavior is simple: systemd-logind will consume all the memory of a machine if we keep making SSH connections during its lifetime. That shouldn't be acceptable, do you agree? If all applications behaved this way, we couldn't keep a machine running for 24 or 48 hours, because the applications would end up getting OOM-killed all the time.
As the graph (and data) showed, the memory consumption of logind is continuously increasing...

It is considered a leak if you allocate memory, use it, and don't free it within a reasonable time. What is the point of freeing all the memory at the end of a program if it can consume all the RAM of a machine during its lifetime? Basically it's the same as saying "this program should be restarted on a regular basis or it'll break your system" heheh

@poettering
Member

Well, I am not saying there wasn't a leak somewhere, I am just saying that the tool you used (or specifically, the way you used it) is not useful for finding it...

What does "loginctl" actually report when this happens? How many open sessions?

@guilhermepiccoli
Author

Thanks for your clarification Lennart!

I did the following experiment: I ran the "while true; ssh" loop for 1 minute, then captured the output of loginctl:

loginctl_1min.txt

Then I waited another 9 minutes and re-captured the output of loginctl - I was hoping it might clear the sessions via some delayed mechanism (something along the lines of garbage collection), but the results were the same:

loginctl_10min.txt

Cheers,

Guilherme

@boucman
Contributor

boucman commented Jan 29, 2018

I had that behaviour once; it was due to an upgrade of logind without a reboot of the machine (just saying, in case it helps diagnosis...)

@guilhermepiccoli
Author

Thanks boucman ...in my case it's consistent: you can start a machine, run the aforementioned SSH loop, and you'll observe the continuous increase in RAM usage.

BTW, I noticed that the sessions created by the SSH loop are kept in "closing" state - what prevents them from being released? It seems to me that if a timeout were triggered after a session had been in closing state for a while, and the session were then removed, we wouldn't see the memory issue.
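For anyone checking their own system, a quick sketch to count sessions stuck in "closing", using loginctl's State property:

# count sessions whose State is "closing"
for s in $(loginctl list-sessions --no-legend | awk '{print $1}'); do
    loginctl show-session "$s" --property=State
done | grep -c 'State=closing'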

@yuwata
Member

yuwata commented Jan 31, 2018

Hmm... I cannot reproduce this (with a recent snapshot of systemd on Fedora 27 x86_64)...

@poettering
Member

@guilhermepiccoli it appears you are leaking full sessions. The question is of course why. If you look into those sessions with "loginctl session-status", what do you see? Is this in some container env or so, or anything else weird? Do those sessions possibly leave processes around? If so, we won't close them.

@guilhermepiccoli
Author

yuwata, I was able to reproduce it using upstream systemd that I built myself. Maybe the distro version is a bit different and somehow does not show the issue?

Lennart: I've been testing using an LXD container, but the issue also reproduces on a bare-metal system - I just re-checked. I'm using an Ubuntu 18.04 candidate with upstream systemd.

I proposed a pull request that fixed it for me: #8062
I'm not sure how to link issues and pull requests on GitHub; feel free to do it your way.
Thanks,

Guilherme

@poettering
Member

Lennart: I've been testing using an LXD container, but the issue also reproduces on a bare-metal system - I just re-checked.

cgroup empty notifications are not reliable inside containers, hence the LXD and the bare-metal cases are actually very different. Before looking into the LXD case I'd hence focus on the bare-metal one.

@guilhermepiccoli
Author

Thanks for the hint! I'll focus on bare-metal then.

@fr33l

fr33l commented Aug 14, 2018

Any update on this?

We're experiencing the same on a bionic LXD container, and it's rather annoying since we have munin monitoring which logs in and out every 10 minutes, so we're hitting the session limit in a week or so...

@yuwata
Member

yuwata commented Aug 14, 2018

@fr33l Please provide the results of loginctl session-status.

@fr33l

fr33l commented Aug 14, 2018

loginctl session-status 194762
194762 - munin-async (109)
           Since: Tue 2018-08-14 11:55:06 UTC; 5min ago
          Leader: 2306
          Remote: 10.201.6.61
         Service: sshd; type tty; class user
           State: closing
            Unit: session-194762.scope

Aug 14 11:55:06 bionic systemd[1]: Started Session 194762 of user munin-async.

This one is for a regular user:

loginctl session-status 194767
194767 - andrii (1016)
           Since: Tue 2018-08-14 11:56:18 UTC; 3min 36s ago
          Leader: 2370
          Remote: 94.153.147.214
         Service: sshd; type tty; class user
           State: closing
            Unit: session-194767.scope

Aug 14 11:56:18 bionic systemd[1]: Started Session 194767 of user andrii.
Aug 14 11:56:29 bionic sshd[2370]: pam_unix(sshd:session): session closed for user andrii
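For reference, one way to check whether any processes are still keeping such a scope alive (unit name taken from the output above) - lingering processes in the scope's cgroup would prevent logind from closing the session:

# show the scope's status, including any processes left in its cgroup
systemctl status session-194767.scope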

@fr33l

fr33l commented Sep 27, 2018

Just in case someone finds this on Google: we were able to solve this by enabling security nesting for the LXD container:

lxc config set $HOST security.nesting true
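To verify the setting took effect (a sketch; $HOST is the container name, as above):

lxc config get $HOST security.nesting   # should print "true"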

@guilhermepiccoli
Author

Thanks @fr33l, pretty useful!

@yuwata added the "lxc/lxd" label on Sep 27, 2018
@resmo

resmo commented Nov 19, 2018

I could reproduce this on Debian 9.5 as a VM on vSphere 5.5:

systemd --version
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

with 2 parallel executions of:

while true; do ssh <hostname> "free"; done

I had to Ctrl-C and re-execute it once to see it (no LXC involved).

@rohityadavcloud

Is there a workaround or setting to limit memory usage?

@rohityadavcloud

Hi @poettering, we're seeing this with CloudStack virtual routers based on Debian 9.6. Like @resmo, we see the memory growth issue; the version is as follows:

systemd --version
systemd 232
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN

Can you help or advise any workaround?

@resmo

resmo commented Nov 20, 2018

I also tried a newer version from Debian backports and saw the same issue:

systemd 239
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid

@rfrail3

rfrail3 commented Mar 21, 2019

On Debian I fixed the memory consumption by commenting out the following line in /etc/pam.d/common-session:

#session optional pam_systemd.so

But with that change, SSH sessions don't terminate correctly. The fix for that is already available:

cp /usr/share/doc/openssh-client/examples/ssh-session-cleanup.service /etc/systemd/system/
systemctl enable ssh-session-cleanup.service

@rfrail3

rfrail3 commented Mar 25, 2019

Another alternative solution for Debian: add the following to the end of the file /etc/pam.d/systemd-user:

@include null

It doesn't actually include any file, because null doesn't exist.

With that change, running the SSH loop doesn't increase the memory usage.
Please, @resmo @rhtyd, can someone test it?

PaulAngus pushed a commit to shapeblue/cloudstack that referenced this issue Mar 26, 2019
rohityadavcloud pushed a commit to shapeblue/cloudstack that referenced this issue Jun 24, 2019
Add vm.min_free_kbytes to sysctl
periodically clear disk cache (depending on memory size)
only start guest services specific to hypervisor
use systemvm code to determine hypervisor type (not systemd)
start cloud service at end of post init rather than through systemd
reduce initial threads started for httpd
fix vmtools config file
disable all required services (do not start on boot)
start only required services during post init.

add '@include null' to /etc/pam.d/systemd-user
as per systemd/systemd#8015 (comment)

remove cloud agent service startup from VR
@pmhahn

pmhahn commented Jul 31, 2019

There is a known bug in Linux kernel cgroup handling until 5.3: pam_systemd creates a cgroup per session, which is not freed completely and slowly eats all memory.
I tried to post my report to the mailing list, but my post was rejected by the spam protection. See https://www.spinics.net/lists/cgroups/msg22853.html instead.
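For anyone who wants to check for this on their own system, a rough sketch - the exact path depends on the cgroup hierarchy in use; this assumes the systemd named hierarchy is mounted at the usual location:

# count session scopes still present in the cgroup tree...
find /sys/fs/cgroup/systemd/user.slice -type d -name 'session-*.scope' | wc -l
# ...and compare with the number of sessions logind reports
loginctl list-sessions --no-legend | wc -l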

@poettering
Member

Let's close this. There was a kernel bug involved here, and it has long been fixed. If this is reproducible on current systemd systems, please file a new bug.
