systemd-sysusers segfaults #6512

Closed
p5n opened this issue Aug 2, 2017 · 15 comments
Labels: bug 🐛 (Programming errors, that need preferential fixing), not-our-bug, sysusers

@p5n

p5n commented Aug 2, 2017

Submission type

  • Bug report

systemd version the issue has been seen with

systemd 234.11-3

Used distribution

Archlinux

In case of bug report: Expected behaviour you didn't see

do not crash

In case of bug report: Unexpected behaviour you saw

segfault

In case of bug report: Steps to reproduce the problem

just run systemd-sysusers

GDB log and description

# gdb systemd-sysusers 
...
(gdb) r
Starting program: /usr/bin/systemd-sysusers 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
Creating group kvm with gid 992.
Creating group systemd-coredump with gid 991.
Creating user systemd-coredump (systemd Core Dumper) with uid 991 and gid 991.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b613b4 in __strpbrk_sse42 () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7b613b4 in __strpbrk_sse42 () from /usr/lib/libc.so.6
#1  0x00007ffff7b4347d in __nss_valid_list_field () from /usr/lib/libc.so.6
#2  0x00007ffff7b2716a in putsgent () from /usr/lib/libc.so.6
#3  0x00005555555595de in ?? ()
#4  0x000055555555a43f in ?? ()
#5  0x000055555555685c in ?? ()
#6  0x00007ffff7a534ca in __libc_start_main () from /usr/lib/libc.so.6
#7  0x000055555555699a in ?? ()

I guess it is because I have a few users with a dot in their names ("firstname.lastname" format). Yes, I know that dots should not be used, but a SIGSEGV is not good anyway.

@poettering
Member

Ouch... any chance you can redo that with debug symbols in place?

poettering added the bug 🐛 and sysusers labels Aug 7, 2017
poettering added this to the v235 milestone Aug 7, 2017
@p5n
Author

p5n commented Aug 7, 2017

Here is the stack trace from v234-stable built with buildtype=debug:

(gdb) bt
#0  0x00007ffff76c83b4 in __strpbrk_sse42 () from /usr/lib/libc.so.6
#1  0x00007ffff76aa47d in __nss_valid_list_field () from /usr/lib/libc.so.6
#2  0x00007ffff768e16a in putsgent () from /usr/lib/libc.so.6
#3  0x00005555555576ee in putsgent_with_members (sg=0x7ffff793d6e0 <resbuf>, gshadow=0x55555576ea40)
    at ../src/sysusers/sysusers.c:341
#4  0x00005555555587e3 in write_temporary_gshadow (gshadow_path=0x55555555dc93 "/etc/gshadow", 
    tmpfile=0x7fffffffe790, tmpfile_path=0x7fffffffe7b0) at ../src/sysusers/sysusers.c:665
#5  0x00005555555590cb in write_files () at ../src/sysusers/sysusers.c:727
#6  0x000055555555d7e4 in main (argc=1, argv=0x7fffffffea48) at ../src/sysusers/sysusers.c:1845

(gdb) p *sg
$1 = {sg_namp = 0x55555576eea0 "xxx_xxx_xxxxx", sg_passwd = 0x55555576eeae "!", sg_adm = 0x55555576eec0, 
  sg_mem = 0x55555576eec8}
(gdb) p *sg->sg_adm
$2 = 0x6e696b6e656a2c6e <error: Cannot access memory at address 0x6e696b6e656a2c6e>
(gdb) p *sg->sg_mem
$3 = 0x6e6178656c612c73 <error: Cannot access memory at address 0x6e6178656c612c73>

@p5n
Author

p5n commented Aug 7, 2017

gshadow line length is:

grep xxx_xxx_xxxxx /etc/gshadow | wc -c
1018

@poettering
Member

Hmm, did I get this right? You have a gshadow line in place there that is extremely long?

Our code uses fgetsgent() to read a shadow entry, then patches in .sg_mem and writes the entry out directly via putsgent() again, without making any other changes. The only thing that might be going wrong here is that putsgent() alters the buffer fgetsgent() uses to store its data...
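
The read, patch, write pattern described here can be sketched roughly as follows. This is an illustration only, not the actual sysusers.c code: the function `copy_gshadow_with_member` and its parameters are hypothetical. Only `fgetsgent()`, `putsgent()` and `struct sgrp` come from glibc's `<gshadow.h>`.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <gshadow.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy every gshadow entry from `in` to `out`,
 * appending `extra_member` to the member list of the group `group`.
 * Returns 0 on success, -1 on error. */
static int copy_gshadow_with_member(FILE *in, FILE *out,
                                    const char *group,
                                    const char *extra_member) {
    struct sgrp *sg;

    assert(in != NULL && out != NULL && group != NULL && extra_member != NULL);

    /* fgetsgent() returns a pointer into a static buffer owned by glibc. */
    while ((sg = fgetsgent(in)) != NULL) {
        if (strcmp(sg->sg_namp, group) != 0) {
            /* Unrelated entry: write it back unchanged. */
            if (putsgent(sg, out) != 0)
                return -1;
            continue;
        }

        /* Count the existing members. */
        size_t n = 0;
        while (sg->sg_mem[n] != NULL)
            n++;

        /* New NULL-terminated array; the old strings are reused as-is. */
        char **members = calloc(n + 2, sizeof(char *));
        if (members == NULL)
            return -1;
        memcpy(members, sg->sg_mem, n * sizeof(char *));
        members[n] = (char *) extra_member;

        /* Patch only .sg_mem on a shallow copy, then write it out. */
        struct sgrp patched = *sg;
        patched.sg_mem = members;
        int r = putsgent(&patched, out);
        free(members);
        if (r != 0)
            return -1;
    }

    return 0;
}
```

The sketch does not distinguish end-of-file from a read error, which real code would need to do by checking errno after `fgetsgent()` returns NULL.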

@p5n
Author

p5n commented Aug 9, 2017

Yes, and this is the first very long line in gshadow. The others are around 200-300 characters long.

@poettering
Member

Any chance you can run this through valgrind? That should tell us enough about what we are doing wrong there...

Either this is actually a bug in glibc, or we are supposed to do a deep copy of the group entry here, which isn't entirely clear to me...
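
For illustration, a deep copy of a group shadow entry could look roughly like the sketch below. All helper names (`strv_dup`, `strv_free`, `sgrp_deep_copy`, `sgrp_free`) are hypothetical and are not taken from systemd or glibc; only `struct sgrp` is the real glibc type. Whether such a copy is actually required is exactly the open question raised here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <gshadow.h>
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated string array and all of its strings. */
static void strv_free(char **l) {
    if (l == NULL)
        return;
    for (char **p = l; *p != NULL; p++)
        free(*p);
    free(l);
}

/* Duplicate a NULL-terminated string array; NULL input yields an empty array. */
static char **strv_dup(char *const *l) {
    size_t n = 0;
    if (l != NULL)
        while (l[n] != NULL)
            n++;

    char **copy = calloc(n + 1, sizeof(char *));
    if (copy == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        copy[i] = strdup(l[i]);
        if (copy[i] == NULL) {
            strv_free(copy); /* calloc zeroed the tail, so this is safe */
            return NULL;
        }
    }
    return copy;
}

/* Free an entry created by sgrp_deep_copy(). */
static void sgrp_free(struct sgrp *sg) {
    if (sg == NULL)
        return;
    free(sg->sg_namp);
    free(sg->sg_passwd);
    strv_free(sg->sg_adm);
    strv_free(sg->sg_mem);
    free(sg);
}

/* Copy every string and array so the result no longer points into
 * the buffer that produced `src`. Returns NULL on allocation failure. */
static struct sgrp *sgrp_deep_copy(const struct sgrp *src) {
    assert(src != NULL);

    struct sgrp *dst = calloc(1, sizeof(*dst));
    if (dst == NULL)
        return NULL;

    dst->sg_namp = strdup(src->sg_namp ? src->sg_namp : "");
    dst->sg_passwd = strdup(src->sg_passwd ? src->sg_passwd : "");
    dst->sg_adm = strv_dup(src->sg_adm);
    dst->sg_mem = strv_dup(src->sg_mem);

    if (!dst->sg_namp || !dst->sg_passwd || !dst->sg_adm || !dst->sg_mem) {
        sgrp_free(dst);
        return NULL;
    }
    return dst;
}
```

A caller would copy the entry right after reading it, modify the copy, pass it to `putsgent()`, and release it with `sgrp_free()`.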

@p5n
Author

p5n commented Aug 9, 2017

# valgrind ./systemd-sysusers 
==12815== Memcheck, a memory error detector
==12815== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==12815== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==12815== Command: ./systemd-sysusers
==12815== 
Creating group kvm with gid 992.
Creating group systemd-coredump with gid 991.
Creating user systemd-coredump (systemd Core Dumper) with uid 991 and gid 991.
==12815== Invalid read of size 1
==12815==    at 0x4C33D02: strpbrk (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12815==    by 0x53E147C: __nss_valid_list_field (in /usr/lib/libc-2.25.so)
==12815==    by 0x53C5169: putsgent (in /usr/lib/libc-2.25.so)
==12815==    by 0x10B6ED: putsgent_with_members (sysusers.c:341)
==12815==    by 0x10C7E2: write_temporary_gshadow (sysusers.c:665)
==12815==    by 0x10D0CA: write_files (sysusers.c:727)
==12815==    by 0x1117E3: main (sysusers.c:1845)
==12815==  Address 0x6e696b6e656a2c6e is not stack'd, malloc'd or (recently) free'd
==12815== 
==12815== 
==12815== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==12815==  General Protection Fault
==12815==    at 0x4C33D02: strpbrk (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12815==    by 0x53E147C: __nss_valid_list_field (in /usr/lib/libc-2.25.so)
==12815==    by 0x53C5169: putsgent (in /usr/lib/libc-2.25.so)
==12815==    by 0x10B6ED: putsgent_with_members (sysusers.c:341)
==12815==    by 0x10C7E2: write_temporary_gshadow (sysusers.c:665)
==12815==    by 0x10D0CA: write_files (sysusers.c:727)
==12815==    by 0x1117E3: main (sysusers.c:1845)
==12815== 
==12815== HEAP SUMMARY:
==12815==     in use at exit: 59,262 bytes in 680 blocks
==12815==   total heap usage: 981 allocs, 301 frees, 295,151 bytes allocated
==12815== 
==12815== LEAK SUMMARY:
==12815==    definitely lost: 0 bytes in 0 blocks
==12815==    indirectly lost: 0 bytes in 0 blocks
==12815==      possibly lost: 0 bytes in 0 blocks
==12815==    still reachable: 59,262 bytes in 680 blocks
==12815==         suppressed: 0 bytes in 0 blocks
==12815== Rerun with --leak-check=full to see details of leaked memory
==12815== 
==12815== For counts of detected and suppressed errors, rerun with: -v
==12815== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)

@p5n
Author

p5n commented Aug 9, 2017

I think I have steps to reproduce for you:

$ echo "qwe.asd:x:1011:andrey.xxxxxxxxx,jenxxxx,alexander.xxxxxxx,alexander.xxxxxxxxx,ivan.xxxxxxxx,alexey.xxxxxxrov,alexandr.lxxxxxxx,alexander.voxxxxx,andrey.pexxxxxxx,andrey.ixxxx,alexey.mxxxxxx,dmitry.txxxxxxxxo,denis.axxxxx,denis.bxxxxxxx,dmitriy.kxxxxxx,dmitry.dxxxxxxx,dmitry.axxxxxov,egor.nixxxxxxxx,eugene.exxxxxxxx,evgeny.bxxxxxxxs,ivan.txxxxxxxxx,igor.vaxxxxxx,irina.lexxxxxxx,maxim.maxxxxxx,maxim.voxxxx,grigoriy.gxxxxxxx,nikolay.kxxxxx,sergej.pxxxxxx,sergey.shxxxxxxxxxx,sergey.grxxxxxx,sergey.proxxxxxxx,stas.shxxxxxxxxxx,stanislav.ixxxxxxx,kirill.nexxxxxx,valery.tixxxxxxx,vyacheslav.gxxxxx,vladimir.sxxxxxxxxxxx,dmitry.dxxxxxxxxxf,evgeny.exxxxx,elena.kxxxxxxxx,aleksey.roxxxxxx,sergey.kuxxxxxxx,roman.laxxxxxxxx,elena.mxxxxxxx,alexander.fxxxxxxx,anastasiya.txxxxxxa,vladislav.ryxxx,dmitry.voxxxxxxxx,dmitry.mxxxxx,dmitry.rexxxxxxxxxx,daria.kxxxxxxxxxx,konstantin.gxxxxxxxx,mihail.loxxxxx,evgeniy.raxxxxxv,sergey.mxxxxx,vladimir.sxxxxxxx,anastasia.pxxxxxxxxx,andrey.kxxxxxxxxxxx,alexey.xxxxxxev,vladlen.pxxxx" >>/etc/group
$ echo "qwe.asd:!::andrey.xxxxxxxxx,jenxxxx,alexander.xxxxxxx,alexander.xxxxxxxxx,ivan.xxxxxxxx,alexey.xxxxxxrov,alexandr.lxxxxxxx,alexander.voxxxxx,andrey.pexxxxxxx,andrey.ixxxx,alexey.mxxxxxx,dmitry.txxxxxxxxo,denis.axxxxx,denis.bxxxxxxx,dmitriy.kxxxxxx,dmitry.dxxxxxxx,dmitry.axxxxxov,egor.nixxxxxxxx,eugene.exxxxxxxx,evgeny.bxxxxxxxs,ivan.txxxxxxxxx,igor.vaxxxxxx,irina.lexxxxxxx,maxim.maxxxxxx,maxim.voxxxx,grigoriy.gxxxxxxx,nikolay.kxxxxx,sergej.pxxxxxx,sergey.shxxxxxxxxxx,sergey.grxxxxxx,sergey.proxxxxxxx,stas.shxxxxxxxxxx,stanislav.ixxxxxxx,kirill.nexxxxxx,valery.tixxxxxxx,vyacheslav.gxxxxx,vladimir.sxxxxxxxxxxx,dmitry.dxxxxxxxxxf,evgeny.exxxxx,elena.kxxxxxxxx,aleksey.roxxxxxx,sergey.kuxxxxxxx,roman.laxxxxxxxx,elena.mxxxxxxx,alexander.fxxxxxxx,anastasiya.txxxxxxa,vladislav.ryxxx,dmitry.voxxxxxxxx,dmitry.mxxxxx,dmitry.rexxxxxxxxxx,daria.kxxxxxxxxxx,konstantin.gxxxxxxxx,mihail.loxxxxx,evgeniy.raxxxxxv,sergey.mxxxxx,vladimir.sxxxxxxx,anastasia.pxxxxxxxxx,andrey.kxxxxxxxxxxx,alexey.xxxxxxev,vladlen.pxxxx" >>/etc/gshadow
$ systemd-sysusers

@evverx
Member

evverx commented Aug 11, 2017

The issue seems to be the same as coreos/bugs#1394, which was fixed by https://sourceware.org/ml/libc-alpha/2016-06/msg01015.html. Is there any chance you could give the patch a try?

@p5n
Author

p5n commented Aug 21, 2017

Yes, patching glibc fixes the segfault.

@poettering
Member

Excellent. Closing this hence, given that this is a bug in glibc and has already been fixed there.

@foutrelis
Contributor

It does look like a bug in glibc and it's not fixed there yet; the related glibc report has been pretty much completely ignored.

@codonell

codonell commented Jul 9, 2020

This bug has higher visibility and priority given that it's impacting more users. We don't actively try to ignore bugs, but like the kernel, we have limited resources to review and fix bugs. We need to get filtered, detailed reports like this one which indicate (a) that the bug is real, and (b) that a patched glibc with the fix for bug 20338 fixes the issue.

I've linked up the various downstream bugs:
https://sourceware.org/bugzilla/show_bug.cgi?id=20338
https://bugzilla.redhat.com/show_bug.cgi?id=1793577
We'll look at this and see if we can fix it ASAP as part of the bug fix phase for glibc 2.32 (releasing August 1st 2020).

@maltris

maltris commented Aug 6, 2020

Reported to Ubuntu/launchpad as this affects systemd-sysusers on Ubuntu 20.04: https://bugs.launchpad.net/glibc/+bug/1890535

@codonell

codonell commented Aug 7, 2020

Fixed for glibc 2.32 via:

commit 2add4235ef674988948155f9a8f60a8c7b09bcff
Author: Florian Weimer <fweimer@redhat.com>
Date: Thu Jul 16 17:31:20 2020 +0200

gshadow: Implement fgetsgent_r using __nss_fgetent_r (bug 20338)

Tested-by: Carlos O'Donell <carlos@redhat.com>
Reviewed-by: Carlos O'Donell <carlos@redhat.com>


6 participants