Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make user/group lookup caching thread-safe #2843

Merged
merged 4 commits into from Jan 22, 2024

Conversation

pmatilai
Copy link
Member

This seems like a huge overkill when in 4.19.1 there's exactly one rpmfiStat() call that unnecessarily invokes an rpmug lookup in a threaded scenario, but then rpmfiStat() and rpmfilesStat() are public APIs that people expect to be safe for use within the originating thread.

Collect all the caching data into a struct and allocate on per-thread basis, this seems like the path of least trouble in this case and git diff --stat agrees.

It's worth noting that this also simplifies the caching, we no longer keep separate name to id and id to name, totaling caches totaling four. We simply cache one id/name for for user and another for group data. Also, reset ids to 0 rather than -1, this is far more obviously a safe value as we have special cases to handle id 0.

rpmugUname() and rpmugGname() are have no users in the current codebase, so this was developed wrt commit c576d96 where they still are used, to avoid nasty surprises later on if somebody finds a new case for these.

Fixes: #2826

@pmatilai pmatilai force-pushed the rpmug-safe branch 2 times, most recently from 657079d to c320341 Compare January 15, 2024 11:26
@pmatilai
Copy link
Member Author

(commit message tweaked somewhat in the later pushes)

@ffesti
Copy link
Contributor

ffesti commented Jan 16, 2024

I wonder if this leaks memory. We create a new struct for each thread. Yes, this is freed in rpmChrootSet but that is relying a lot on the right call order.

Edit: Also it ends itself right away if there is no chroot.

@pmatilai
Copy link
Member Author

It does leak, should've mentioned that in the commit message. In the sense that each thread calling it will leave one cache around and reachable. It's far from ideal but it's basically an emergency bandaid.

What do you mean about ending itself?

@ffesti
Copy link
Contributor

ffesti commented Jan 16, 2024

OK, "ending itself" is a bit over dramatic... rpmChrootSet instantly returns if there is now chroot so rpmugFree() is never called at all.

@pmatilai
Copy link
Member Author

Yup, but this patch doesn't change that, the "leak" is already there. This only makes it per-thread.

@pmatilai
Copy link
Member Author

It wouldn't be hard to introduce per-thread locking instead. The bigger deal here is the new central struct that makes it possible to do stuff, perhaps I should actually split that into a separate commit and then consider the thread safety separately. The reason its lumped into one is basically wanting a single patch to fix the issue and people to cherry-pick, but that's clearly bending other "rules" here.

@pmatilai
Copy link
Member Author

Oh, except that rpmugGname() and rpmugUname() additionally rely on central storage of the returned values so simple mutex locking doesn't work for those. So those would have to be changed to return malloced data for a simple fix (it wont break any users because there aren't any, at the moment).

I think this was here previously but has gotten erreonously removed
at some point - we do use rpmug from librpm so we need to free it too.
Collect all the cached data into a struct for sanity.
It's worth noting that this also simplifies the caching, we no longer
keep separate name to id and id to name caches (totaling four).
We simply cache one id/name for for user and another for group data.
Also, reset ids to 0 rather than -1, this is far more obviously a safe
value as we have special cases to handle id 0.

rpmugUname() and rpmugGname() are have no users in the current codebase,
so this was developed wrt commit c576d96
where they still are used, to avoid nasty surprises later on if somebody
finds a new case for these.
Now with the caching data in a single struct from the previous commit,
we can easily make this per-thread for a minimal fix for thread-safety.

Fixes: rpm-software-management#2826
@pmatilai
Copy link
Member Author

Split to refactor + thread-safe fix, and restored that rpmugFree() on librpm exit that had gone missing at some point.

@ffesti
Copy link
Contributor

ffesti commented Jan 17, 2024

Looks good to me.

lib/rpmug.c Show resolved Hide resolved
@dmnks
Copy link
Contributor

dmnks commented Jan 18, 2024

It seems like the commit hash mentioned in the commit message isn't correct (it's the "Bump CI" commit which doesn't seem to have anything to do with this).

Copy link
Contributor

@dmnks dmnks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the commit note above, the patches look good to me.

Copy link
Member

@Conan-Kudo Conan-Kudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo note as @dmnks stated, it looks good to me.

@pmatilai
Copy link
Member Author

It seems like the commit hash mentioned in the commit message isn't correct (it's the "Bump CI" commit which doesn't seem to have anything to do with this).

It's actually explained in the message:

rpmugUname() and rpmugGname() are have no users in the current codebase,
so this was developed wrt commit c576d96
where they still are used

Ie, that commit is the one before the one that removed all rpmugUname() and rpmugGname() users, so I had to develop against that to be sure this rework actually works.

@pmatilai pmatilai merged commit 593059d into rpm-software-management:master Jan 22, 2024
1 check passed
@pmatilai pmatilai deleted the rpmug-safe branch January 22, 2024 11:09
@dmnks dmnks added the packaging Package building, SPEC files, etc. label Feb 5, 2024
@dmnks dmnks mentioned this pull request Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
packaging Package building, SPEC files, etc. REGRESSION
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rpmbuild double-free
4 participants