Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpmbuild double-free #2826

Closed
svarshavchik opened this issue Dec 28, 2023 · 2 comments · Fixed by #2843
Closed

rpmbuild double-free #2826

svarshavchik opened this issue Dec 28, 2023 · 2 comments · Fixed by #2843
Assignees
Labels

Comments

@svarshavchik
Copy link

Recent version of rpmbuild have a small chance of crashing with a double-free. Typical crash:

Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-
authlib-userdb-debuginfo-0.72.0.20231223-101.fc39.x86_64.rpm
Wrote: /__w/courier-libs/courier-libs/courier-authlib/rpm/RPMS/x86_64/courier-
authlib-devel-0.72.0.20231223-101.fc39.x86_64.rpm
double free or corruption (fasttop)

I was able to extract a backtrace:

#7  0x00007f05dd8b93de in __GI___libc_free (mem=0x7f05500df0e0)
    at malloc.c:3391
#8  0x00007f05dda984ec in rpmugUid (thisUname=0x563c250203a4 "daemon", 
    uid=0x7f05cbbf790c)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmug.c:148
#9  0x00007f05dda84255 in rpmfilesStat (fi=0x563c250b57f0, ix=3, 
    flags=flags@entry=0, sb=sb@entry=0x7f05cbbf78f0)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:824
#10 0x00007f05dda8438f in rpmfiStat (fi=fi@entry=0x7f04d0057420, 
--Type <RET> for more, q to quit, c to continue without paging--
    flags=flags@entry=0, sb=sb@entry=0x7f05cbbf78f0)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:1992
#11 0x00007f05dda84444 in rpmfiArchiveWriteHeader (fi=fi@entry=0x7f04d0057420)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:2078
#12 0x00007f05dda871c9 in iterWriteArchiveNextFile (fi=0x7f04d0057420)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:2158
#13 iterWriteArchiveNext (fi=0x7f04d0057420)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:2171
#14 0x00007f05dda811ce in rpmfiNext (fi=<optimized out>)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:873
#15 rpmfiNext (fi=fi@entry=0x7f04d0057420)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/lib/rpmfi.c:868
#16 0x00007f05ddad660c in rpmPackageFilesArchive (isSrc=<optimized out>, 
    failedFile=<synthetic pointer>, archiveSize=<synthetic pointer>, 
    dpaths=0x563c2467b490, cfd=0x7f04d00574f0, fi=<optimized out>)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/build/pack.c:32
#17 cpio_doio (pldig=0x7f05cbbf7aa8, archiveSize=<synthetic pointer>, 
    pld_algo=<optimized out>, fmodeMacro=0x7f04d0774090 "w19.zstdio", 
    pkg=0x563c244154e0, fdo=0x7f04d0061f00)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/build/pack.c:86
#18 writeRPM (pkg=0x563c244154e0, pkgidp=0x0, 
    fileName=0x7f04d0774010 "/home/mrsam/src/courier.git/courier-authlib/rpm/RPMS/x86_64/courier-authlib-ldap-0.72.0-114.fc39.x86_64.rpm", 
--Type <RET> for more, q to quit, c to continue without paging--
    cookie=<optimized out>, buildTime=<optimized out>, 
    buildHost=<optimized out>)
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/build/pack.c:523
#19 0x00007f05ddadbdfb in packageBinaries._omp_fn.1 ()
    at /usr/src/debug/rpm-4.19.0-1.fc39.x86_64/build/pack.c:721
#20 0x00007f05dd6aa759 in gomp_barrier_handle_tasks (state=state@entry=272)
    at ../../../libgomp/task.c:1650
#21 0x00007f05dd6b39e0 in gomp_team_barrier_wait_end (bar=0x563c245c3780, 
    state=272) at ../../../libgomp/config/linux/bar.c:116
#22 0x00007f05dd6b0d7e in gomp_thread_start (xdata=<optimized out>)
    at ../../../libgomp/team.c:129
#23 0x00007f05dd8a8897 in start_thread (arg=<optimized out>)
    at pthread_create.c:444
#24 0x00007f05dd92f6fc in clone3 ()

I believe that the ball starts rolling in the

#pragma omp parallel

in packageBinaries, see stack frame #19.

It keeps rolling downhill into rpmug.c (stack frame #8), which is very, very much thread-unsafe. Boom.

All usage of static variables in rpmug.c is thread-unsafe.

It should be possible to reliably reproduce this crash by sticking, say, sleep (1); just before free(lastUname); on line 148, and then building an srpm that packages a bunch of binaryrpms.\

textlive will do nicely...

rpmuguid() is not thread safe.

rpmuggid() is not thread safe.

rpmugUname() is not thread safe.

rpmugGname() is not thread safe.

@pmatilai pmatilai added the bug label Jan 2, 2024
@pmatilai
Copy link
Member

pmatilai commented Jan 2, 2024

Thanks for the report and backtrace. The rpmug thread-unsafety is of course painfully obvious, that it ends up called during the threaded packaging operation less so because the call happens from librpm side. It's curious that nobody has run into this before now. The core issue has been there since 4.15 already (so four years), it's just that in 4.19 it causes a double-free, before that I think one could've gotten silently corrupted user/group names in packages.

We'll look into it.

@pmatilai
Copy link
Member

pmatilai commented Jan 2, 2024

The height of embarrasment with this crash is that rpm has zero use for the uid/gid info when writing an archive. We could just skip the rpmug lookups entirely because the uid/gid is never written to the archive, only the names are (and even that only with cpio).

@pmatilai pmatilai self-assigned this Jan 9, 2024
pmatilai added a commit to pmatilai/rpm that referenced this issue Jan 15, 2024
This seems like a huge overkill when in 4.19.1 there's exactly one
rpmfiStat() call that unnecessarily invokes an rpmug lookup in a
threaded scenario, but then rpmfiStat() and rpmfilesStat() are public
APIs that people expect to be safe for use within the originating
thread.

Collect all the caching data into a struct and allocate on per-thread
basis, this seems like the path of least trouble in this case and
git diff --stat agrees.

It's worth noting that this also simplifies the caching, we no longer
keep separate name to id and id to name, totaling caches totaling four.
We simply cache one id/name for for user and another for group data.
Also, reset ids to 0 rather than -1, this is far more obviously a safe
value as we have special cases to handle id 0.

rpmugUname() and rpmugGname() are have no users in the current codebase,
so this was developed wrt commit c576d96
where they still are used, to avoid nasty surprises later on if somebody
finds a new case for these.

Fixes: rpm-software-management#2826
pmatilai added a commit to pmatilai/rpm that referenced this issue Jan 15, 2024
Collect all the cached data into a struct and which is allocated on
per-thread basis, this seems like the path of least trouble in this case
and git diff --stat agrees.

It's worth noting that this also simplifies the caching, we no longer
keep separate name to id and id to name caches (totaling four).
We simply cache one id/name for for user and another for group data.
Also, reset ids to 0 rather than -1, this is far more obviously a safe
value as we have special cases to handle id 0.

rpmugUname() and rpmugGname() are have no users in the current codebase,
so this was developed wrt commit c576d96
where they still are used, to avoid nasty surprises later on if somebody
finds a new case for these.

This all seems like a huge overkill when in 4.19.1 there's exactly one
rpmfiStat() call that unnecessarily invokes an rpmug lookup in a
threaded scenario, but then rpmfiStat() and rpmfilesStat() are public
APIs that people expect to be safe for use within the originating
thread.

Fixes: rpm-software-management#2826
pmatilai added a commit to pmatilai/rpm that referenced this issue Jan 15, 2024
Collect all the cached data into a struct and which is allocated on
per-thread basis, this seems like the path of least trouble in this case
and git diff --stat agrees.

It's worth noting that this also simplifies the caching, we no longer
keep separate name to id and id to name caches (totaling four).
We simply cache one id/name for for user and another for group data.
Also, reset ids to 0 rather than -1, this is far more obviously a safe
value as we have special cases to handle id 0.

rpmugUname() and rpmugGname() are have no users in the current codebase,
so this was developed wrt commit c576d96
where they still are used, to avoid nasty surprises later on if somebody
finds a new case for these.

This all seems like a huge overkill when in there's exactly one user
in a rpmfiStat() call that unnecessarily invokes an rpmug lookup in a
threaded scenario, but then rpmfiStat() and rpmfilesStat() are public
APIs that people expect to be safe for use within the originating
thread.

Fixes: rpm-software-management#2826
pmatilai added a commit to pmatilai/rpm that referenced this issue Jan 16, 2024
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: rpm-software-management#2826
pmatilai added a commit that referenced this issue Jan 22, 2024
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: #2826
dmnks pushed a commit to dmnks/rpm that referenced this issue Jan 23, 2024
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: rpm-software-management#2826
dmnks pushed a commit to dmnks/rpm that referenced this issue Feb 5, 2024
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: rpm-software-management#2826
(cherry picked from commit c86f088)
dmnks pushed a commit to dmnks/rpm that referenced this issue Feb 5, 2024
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: rpm-software-management#2826
(cherry picked from commit c86f088)
dmnks pushed a commit that referenced this issue Feb 7, 2024
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: #2826
(cherry picked from commit c86f088)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants