Useradd segmentation fault #628

kaizushi · 2023-01-18T05:17:03Z

Operating system: Gentoo
Kernel version: Linux 6.1.2
GCC version: Gentoo Hardened 12.2.1_p20221231 p8
Shadow version: 4.13-r1

This is a production system and a workaround would be appreciated. I suspect my kernel is too new and one of its hardening features is getting in the way, so I am rebuilding an older kernel for now. My kernel configuration might be an issue, as I started from 'make tinyconfig' to build a minimal kernel for KVM/qemu virtio with pretty much every optional security feature enabled.

I was having issues with useradd, which I discussed in the #gentoo channel on Libera IRC, and the person helping me told me to file this bug report. They walked me through the gdb debugging below.

GDB output...

# gdb --args useradd -m -G users test1298
GNU gdb (Gentoo 12.1 vanilla) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from useradd...
Reading symbols from /usr/lib/debug//usr/sbin/useradd.debug...
(gdb) run
Starting program: /usr/sbin/useradd -m -G users test1298
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

Program received signal SIGSEGV, Segmentation fault.
0x00006441cfa84319 in __sgr_dup (sgent=0x1000) at sgroupio.c:36
36      sgroupio.c: No such file or directory.
(gdb) bt
#0  0x00006441cfa84319 in __sgr_dup (sgent=0x1000) at sgroupio.c:36
#1  0x00006441cfa85c29 in commonio_open (db=db@entry=0x6441cfa92ca0 <gshadow_db>, mode=2, mode@entry=66)
    at commonio.c:694
#2  0x00006441cfa846ce in sgr_open (mode=mode@entry=66) at sgroupio.c:246
#3  0x00006441cfa7856e in open_group_files () at useradd.c:1856
#4  0x00006441cfa790fd in get_groups (list=0x7ffff616ef6d "users") at useradd.c:766
#5  process_flags (argc=argc@entry=5, argv=argv@entry=0x7ffff616ddb8) at useradd.c:1354
#6  0x00006441cfa73e46 in main (argc=5, argv=0x7ffff616ddb8) at useradd.c:2499
(gdb) p *sgent
Cannot access memory at address 0x1000


anthonyryan1 · 2023-02-17T17:15:50Z

I'm also seeing this in 4.13 but not in 4.12.3.

I've seen the same segfault in useradd and gpasswd. Will update this issue as I investigate.

thesamesam · 2023-02-19T20:31:13Z

> I'm also seeing this in 4.13 but not in 4.12.3.

Could you bisect?

anthonyryan1 · 2023-02-20T04:22:39Z

I'll aim to have a bisect ready tomorrow. What's interesting is that it seems to be tied to the existing system users/groups.

Running the same command on a couple hundred different machines, each with its own unique set of users and groups, a handful will repeatedly segfault while most will succeed.

I'm going to try and figure out what's special about the group file on the affected machines. One theory I'm still considering is that this may be some sort of subtle malformation in /etc generated by a much older version of shadow tools.

I'm also using Gentoo, and the group files on some of the machines I'm testing have been modified over more than a decade. I'm going to run pwck and grpck on a few of the affected ones to try and rule that out.

kaizushi · 2023-02-20T15:28:02Z

After opening this I found a workaround which is to run version 4.12.3.

anthonyryan1 · 2023-02-20T17:27:22Z

I did successfully bisect this, although it held some extra surprises.

It turns out the culprit commit is not in the 4.13 release: building 4.13 from git failed to reproduce the bug. The Gentoo developers backported a commit that has not yet made it into a stable release, and that commit is the culprit.

The problematic commit is:
a281f24
#595

And here you can see the Gentoo developers backporting it:
https://github.com/gentoo/gentoo/blob/master/sys-apps/shadow/files/shadow-4.13-configure-clang16.patch

I'm going to CC @fweimer-rh here as the author of that commit to see if he has any insight into what might be going on.

fweimer-rh · 2023-02-20T17:43:57Z

I backported this patch into Fedora rawhide as well, but cannot reproduce the problem there. sgent=0x1000 is certainly suspicious, but I don't know where that could be coming from.

anthonyryan1 · 2023-02-20T18:56:26Z

Recompiling with -O0 to avoid anything being optimized out. I'm getting the following from GDB:

#0  0x00007ffff79e3e86 in ?? () from /usr/lib/gcc/x86_64-pc-linux-gnu/12/libasan.so.8
#1  0x00007ffff797e25e in strdup () from /usr/lib/gcc/x86_64-pc-linux-gnu/12/libasan.so.8
#2  0x000055555559d9ed in __sgr_dup (sgent=0x5555555c1908) at sgroupio.c:62
#3  0x000055555559f32a in gshadow_dup (ent=0x5555555c1908) at sgroupio.c:116
#4  0x00005555555a58f7 in commonio_open (db=0x5555555e1620 <gshadow_db>, mode=0) at commonio.c:694
#5  0x00005555555a0318 in sgr_open (mode=0) at sgroupio.c:246
#6  0x000055555558f3b0 in open_files () at grpck.c:294
#7  0x0000555555593b44 in main (argc=2, argv=0x7fffffffdfc8) at grpck.c:835

The code looks correct to me, in that gshadow_dup is being run with the eptr of gshadow_parse, which is just a wrapper around glibc's sgetsgent. It doesn't seem likely, but is it possible we're getting faulty pointers out of sgetsgent?

It would fit with the bug. With proper detection for sgetsgent we've changed from the function in lib/gshadow.c to the function in glibc.

I've tested against glibc 2.36 and glibc 2.37 (possibly including distro backports), and still get segfaults in both after the fixed sgetsgent detection.

fweimer-rh · 2023-02-20T20:22:44Z

Well, we used to have a glibc bug in this area:

But this should be fixed in current glibc. The fix went into glibc 2.32 and has been backported widely, too.

But looking at sgetsgent_r and sgetsgent, it looks like the ERANGE protocol is not correctly implemented within glibc. Do you have a long line in /etc/gshadow?

Could you check that the sgetsgent in glibc is actually called, and not the version from shadow-utils?
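One way to approach that check is to look at the binary's dynamic symbol table: an undefined (`U`) entry for `sgetsgent` means the symbol is resolved from a shared library at load time, which would normally be glibc. The sketch below assumes the text output of `nm -D /usr/sbin/useradd` (from binutils) has already been captured; the helper name `imports_symbol` is made up for illustration and is not part of shadow or glibc.

```python
def imports_symbol(nm_output, name):
    """Return True if `nm -D` output lists `name` as an undefined (imported) symbol."""
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined symbols have no address column, so the line is "U name[@VERSION]".
        if len(fields) == 2 and fields[0] == "U":
            # Drop a symbol-version suffix (the part after "@") if one is present.
            if fields[1].split("@", 1)[0] == name:
                return True
    return False
```

A `False` result would suggest, though not prove, that the bundled implementation from lib/gshadow.c was compiled in instead. Stepping into the call under gdb is a more direct check.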

thesamesam · 2023-02-20T22:21:55Z

@anthonyryan1 Just want to add that there's nothing really unusual about the backporting part (it's not a controversial patch and distros, including us, do it all the time when it's required), I can't reproduce the bug, and there's a good reason for backporting all of this work. But I won't distract from the debugging effort here. If you want to discuss that side of it more, feel free to email me at sam@g.o though.

fweimer-rh · 2023-02-21T08:41:05Z

I fixed the glibc bug:

Of course I don't know if that's the problem you are seeing, @anthonyryan1.

anthonyryan1 · 2023-02-21T16:13:37Z

@fweimer-rh I do have one very long line in gshadow, over 1200 bytes.

I expect that line length will explain which machines segfault and which do not. I'll explore this a bit more later today.
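For anyone wanting to check many machines for the same condition, a minimal sketch of a scanner that reports long lines in a gshadow-style file follows. The function name `long_lines` is hypothetical, and the default threshold of 512 bytes is an arbitrary assumption rather than a limit taken from glibc; nothing in this thread establishes the exact cutoff.

```python
def long_lines(path, threshold=512):
    """Return (line_number, byte_length) for each line of at least `threshold` bytes."""
    hits = []
    # Read as bytes so the reported lengths do not depend on the text encoding.
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            length = len(raw.rstrip(b"\n"))  # length without the line terminator
            if length >= threshold:
                hits.append((number, length))
    return hits
```

Calling `long_lines("/etc/gshadow")` as root would list candidate lines. An empty result on a machine that still segfaults would point away from the line-length theory.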

Additionally, I agree that the patch doesn't look to be the problem. Rather I feel like it's revealed a different bug that was previously masked by using the alternate code path.

kaizushi · 2023-02-21T17:25:21Z

Interesting that the length comes into play: in the command above that I used to create this bug report, the line for the group users has a lot of entries and is 515 characters long.

hannob · 2023-02-21T18:47:50Z

I also noticed this bug, but with grpck (on Gentoo). A very simple reproducer for me is to run grpck on an empty group file and a gshadow file containing a single 1024-byte line (e.g. just "a"s). Reproducer:

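# Empty group file
touch 1
# gshadow file: one line of 1024 "a" characters, no trailing newline
for x in $(seq 1 1024); do printf a; done > 2
# Check both files; this segfaults on affected systems
grpck 1 2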

I can confirm that @fweimer-rh's glibc patch fixes this issue.
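To try other line lengths without editing the shell loop, the same two files can be generated programmatically. This is only a sketch of the reproducer described above: the function name `write_reproducer_files`, the file names, and the temporary-directory prefix are all made up for illustration.

```python
import os
import tempfile


def write_reproducer_files(line_length=1024, directory=None):
    """Create an empty group file and a gshadow file holding one line of 'a's."""
    directory = directory or tempfile.mkdtemp(prefix="grpck-repro-")
    group_path = os.path.join(directory, "group")
    gshadow_path = os.path.join(directory, "gshadow")
    # Empty group file, equivalent to `touch 1` in the shell reproducer.
    open(group_path, "wb").close()
    # Single line of 'a' bytes with no trailing newline, like the `echo -n` loop.
    with open(gshadow_path, "wb") as handle:
        handle.write(b"a" * line_length)
    return group_path, gshadow_path
```

The two returned paths can then be passed to grpck in the same order as in the shell reproducer (group file first, gshadow file second).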

anthonyryan1 · 2023-03-09T03:41:24Z

I can also confirm the glibc patch is working.

It looks like glibc merged the patch from the mailing list yesterday: https://sourceware.org/git/?p=glibc.git;a=commit;h=969e9733c7d17edf1e239a73fa172f357561f440

I think we're good to close this issue. The bug wasn't in shadow-utils, and the fix is now in glibc master. Anyone who still hits this combination can find the necessary information here in the closed issue just fine.

thesamesam · 2023-03-09T04:11:13Z

It'd be worth waiting for it to trickle down to glibc's stable branches first, especially if a new shadow release ends up being made in the interim.
