session dbus-daemon crashed (SIGABRT) in libnss-systemd #15859

Closed
pabs3 opened this issue May 20, 2020 · 24 comments · Fixed by #16041
Labels
bug 🐛 Programming errors, that need preferential fixing nss

Comments

@pabs3

pabs3 commented May 20, 2020

systemd version the issue has been seen with

245.5-2

Used distribution

Debian bullseye

Unexpected behaviour you saw

session dbus-daemon crashed (SIGABRT) in libnss-systemd

Steps to reproduce the problem

I don't know how to reproduce the problem, but from my systemd journal it appears to be associated with something running as root using the su command to switch to my user. I think this is the needrestart package using notify-send to notify me, as my user, of processes needing a restart, but I am not sure.

@poettering poettering added bug 🐛 Programming errors, that need preferential fixing nss labels Jun 2, 2020
poettering added a commit to poettering/systemd that referenced this issue Jun 2, 2020
This might fix systemd#15859, a bug which I find very puzzling.
@poettering
Member

This is very puzzling. I prepped a possible fix in #16041. I am not sure it actually fixes anything, but it's the only thing that remotely makes sense to me.

We see EBADF on fclose() of an open_memstream() FILE*, and I am not sure how that possibly could ever happen...

Does this happen regularly for you?
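
For readers following along, the pattern under discussion can be sketched roughly as below. This is a minimal illustration, not systemd's actual code: `close_stream_checked` and `format_greeting` are hypothetical names introduced here. The sketch only shows the shape of the problem, namely a memory stream that is created, written, and closed inside one function, with a close wrapper that treats EBADF as a programming error.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not systemd's actual code): close a stream and
 * treat EBADF as a programming error, since it would suggest the FILE*
 * was already closed or its state was corrupted. */
static int close_stream_checked(FILE *f) {
    int r;

    errno = 0;
    r = fclose(f);
    assert(r == 0 || errno != EBADF);
    return r == 0 ? 0 : -1;
}

/* Build a string through an in-memory stream. On success, returns a
 * malloc()ed, NUL-terminated buffer that the caller must free().
 * Returns NULL on failure. */
static char *format_greeting(const char *name) {
    char *buf = NULL;
    size_t size = 0;
    FILE *f;

    f = open_memstream(&buf, &size);
    if (!f)
        return NULL;

    fprintf(f, "hello, %s", name);

    if (close_stream_checked(f) < 0) {
        free(buf);
        return NULL;
    }
    return buf;
}
```

In a healthy process the assertion never fires, because a memory stream has no underlying file descriptor that could be invalid. That is why an EBADF at this point is so surprising.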

@poettering poettering added the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Jun 2, 2020
@pabs3
Author

pabs3 commented Jun 2, 2020 via email

@poettering
Member

do you have any special NSS setup btw? ldap or so? lots of users/groups or so?

If the issue doesn't pop up with the patch applied anymore we should probably close this and assume it fixed until it pops up again and then reopen, or so?

@pabs3
Author

pabs3 commented Jun 2, 2020 via email

poettering added a commit that referenced this issue Jun 2, 2020
This might fix #15859, a bug which I find very puzzling.
@pabs3
Author

pabs3 commented Jun 3, 2020 via email

@pabs3
Author

pabs3 commented Jul 17, 2020 via email

@mbiebl mbiebl reopened this Jul 20, 2020
@mrc0mmand mrc0mmand removed the needs-reporter-feedback ❓ There's an unanswered question, the reporter needs to answer label Jul 20, 2020
@poettering
Member

Does the version you tested include 75f6d5d?

@mbiebl
Contributor

mbiebl commented Jul 21, 2020

Does the version you tested include 75f6d5d?

I assume so, given the comment "...with the patch cherry-picked on top."

@poettering
Member

did you reboot after patching/rebuilding/installing systemd? NSS modules remain pinned in running processes... only way to update them safely is to reboot?
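
One way to see the pinning effect is to check which NSS modules a process has mapped. The sketch below is Linux-specific and rests on assumptions: it reads `/proc/self/maps` and looks for the substring `libnss_`, and the function names are hypothetical. Which modules appear, if any, depends on the local `nsswitch.conf` and glibc version.

```c
#include <assert.h>
#include <pwd.h>
#include <stdio.h>
#include <string.h>

/* Count lines in /proc/self/maps whose mapped path mentions an NSS
 * module ("libnss_"). Returns -1 if the file cannot be opened.
 * Linux-specific; the substring match is an assumption of this sketch. */
static int count_nss_mappings(void) {
    char line[4096];
    int n = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (!f)
        return -1;

    while (fgets(line, sizeof line, f))
        if (strstr(line, "libnss_"))
            n++;

    fclose(f);
    return n;
}

/* Perform a user lookup, which may cause glibc to load NSS modules,
 * then report how many NSS mappings the process holds afterwards. */
static int nss_mappings_after_lookup(const char *user) {
    assert(user != NULL);
    (void) getpwnam(user);
    return count_nss_mappings();
}
```

Modules loaded this way stay mapped for the lifetime of the process, so a long-running daemon keeps using the old module after a package upgrade until it is restarted.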

@pabs3
Author

pabs3 commented Jul 22, 2020 via email

@keszybz
Member

keszybz commented Jul 22, 2020

It is possible that the crash is caused by memory corruption in some other part of the code. I looked at the code involved and don't see anything obvious either. I guess we'll need to wait and see if other people hit this.

@keszybz
Member

keszybz commented Jul 27, 2020

@keszybz
Member

keszybz commented Jul 28, 2020

@fweimer, @codonell maybe you could take a look? The code seems correct, but when we do fclose() on the stream allocated with open_memstream(), we get EBADF.

@codonell

@keszybz The storage backing the FILE* is allocated by malloc by __open_memstream() and so is easily susceptible to buffer overflows from nearby chunks. In general it looks like you only use open_memstream_unlocked() from src/basic/fileio.c, and so any failure to coordinate by the callers could result in corruption. I looked over the code in src/basic/fd-util.c and I don't see anything immediately wrong. These cases are hard to track down :-(

@keszybz
Member

keszybz commented Jul 31, 2020

In general it looks like you only use open_memstream_unlocked() from src/basic/fileio.c, and so any failure to coordinate by the callers could result in corruption.

There is always exactly one caller: the memstream object is never passed outside of the originating function. (In the whole codebase there is one exception in dbus introspection code, but that code path is not touched here.) So there is no question of coordination, afaict.

@keszybz
Member

keszybz commented Aug 1, 2020

fclose may need to allocate space for the terminating NUL byte.

But can it return EBADF in that case? We only check that the errno we got is not EBADF.

@fweimer

fweimer commented Aug 2, 2020

No, you won't get EBADF in that case, and the allocation during fclose will not happen anyway because of the previous fflush call, which is what actually allocates.

@pabs3
Author

pabs3 commented Aug 4, 2020 via email

@keszybz
Member

keszybz commented Aug 4, 2020

I think we need to go over the glibc code with a fine-tooth comb and figure out in what circumstances it can return EBADF. Maybe EBADF is a legitimate return value for memstreams?

@fweimer

fweimer commented Sep 14, 2020

@keszybz I rather suspect this is the consequence of unrelated memory corruption (but I could be wrong).

@pabs3
Author

pabs3 commented Oct 11, 2020

FTR: I got another pair of crashes with libnss-systemd 246.6-1 from Debian bullseye. I'm assuming that the backtrace isn't going to be interesting but if it is please let me know before it is auto-deleted in a week's time.

vbatts pushed a commit to kinvolk/systemd that referenced this issue Nov 12, 2020
This might fix systemd#15859, a bug which I find very puzzling.

(cherry picked from commit 75f6d5d)
@poettering
Member

Is this still reproducible with current versions of systemd/glibc? If not, let's close this.

@pabs3
Author

pabs3 commented Jun 6, 2023 via email

@yuwata
Member

yuwata commented Jun 6, 2023

Thanks. Then, let's close this.

@yuwata yuwata closed this as completed Jun 6, 2023

8 participants