Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sssd not thread-safe in innetgr() #5540

Closed
sumit-bose opened this issue Mar 18, 2021 · 1 comment
Closed

sssd not thread-safe in innetgr() #5540

sumit-bose opened this issue Mar 18, 2021 · 1 comment
Assignees
Labels
Bugzilla Closed: Fixed Issue was closed as fixed.

Comments

@sumit-bose
Copy link
Contributor

Ticket was cloned from Red Hat Bugzilla (product Red Hat Enterprise Linux 8): Bug 1703436

Created attachment 1559039
example source code

Description of problem:

"man 3 innetgr" states "The function is thread-safe." On our system it normally
is but fails to be thread-safe if sssd is activated.

We hit this bug while debugging a GPFS (Spectrum Scale) issue. IBM support came
up with the attached code example that triggers the problem without gpfs being
involved.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Adapt "groups" and "hosts" arrays in the two attached source code files. The
code will check if the machine in the hosts array is member of the netgroup
mentioned in the  groups array. The host must be a member in the respective
netgroup for this test.
2. compile the first example: gcc -lpthread -o netg-lock netg-lock.c
3. compile the second example: gcc -lpthread -o netg-lock2 netg-lock2.c
4. run netg-lock. It will call innetgr() with 256 threads, protected by
pthread_mutex_lock/unlock(). This run will print out statistics that report
every single check having been successful.
5. run netg-lock2. It will call innetgr() with 256 threads, _not_ protected by
pthread_mutex_lock/unlock(). This run will print out statistics that report
many failing requests.



Actual results:

netg-lock is successful, netg-lock2 is not

Expected results:

both examples should be successful

Additional info:
Our netgroups configuration is quite big and nested. We have 66 netgroups
altogether that have 19037 host entries.

The bug is triggered for a host that is member of a nested group.

$ ldapsearch -x -b <searchbase> ObjectClass=nisNetgroup dn | grep dn: | wc -l
66

$ ldapsearch -x -b dc=rd,dc=corpintra,dc=net ObjectClass=nisNetgroup | grep
nisNetgroupTriple: | wc -l
19037
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Mar 18, 2021
The innetgr() call is expected to be thread safe but SSSD's the current
implementation isn't. In glibc innetgr() is implementend by calling the
setnetgrent(), getnetgrent(), endgrent() sequence with a private context
(struct __netgrent) with provides a member where NSS modules can store
data between the calls.

With this patch setnetgrent() will read all required data from the NSS
responder and store it in the data member of the __netgrent struct.
Upcoming getnetgrent() calls will only operate on the stored data and
not connect to the NSS responder anymore. endgrent() will free the data.
Since the netgroup data is read in a single request to the NSS responder
protected by a mutex and stored in private context of innetgr() this
call is now thread-safe.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Mar 18, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Mar 18, 2021
The innetgr() call is expected to be thread safe but SSSD's the current
implementation isn't. In glibc innetgr() is implementend by calling the
setnetgrent(), getnetgrent(), endgrent() sequence with a private context
(struct __netgrent) with provides a member where NSS modules can store
data between the calls.

With this patch setnetgrent() will open a new connection to the NSS
responder and stores the file descriptor in the data member of
__netgrent struct so that the following getnetgrent() and endgrent()
will use the same connection. Since the NSS responder stores the
netgroup lookups related data in a per connection context and a new
thread will open a new connection the implementation is thread safe.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Mar 18, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Apr 16, 2021
The innetgr() call is expected to be thread safe but SSSD's the current
implementation isn't. In glibc innetgr() is implementend by calling the
setnetgrent(), getnetgrent(), endgrent() sequence with a private context
(struct __netgrent) with provides a member where NSS modules can store
data between the calls.

With this patch setnetgrent() will read all required data from the NSS
responder and store it in the data member of the __netgrent struct.
Upcoming getnetgrent() calls will only operate on the stored data and
not connect to the NSS responder anymore. endgrent() will free the data.
Since the netgroup data is read in a single request to the NSS responder
protected by a mutex and stored in private context of innetgr() this
call is now thread-safe.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Apr 16, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Apr 16, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Apr 20, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Apr 20, 2021
The innetgr() call is expected to be thread safe but SSSD's the current
implementation isn't. In glibc innetgr() is implementend by calling the
setnetgrent(), getnetgrent(), endgrent() sequence with a private context
(struct __netgrent) with provides a member where NSS modules can store
data between the calls.

With this patch setnetgrent() will read all required data from the NSS
responder and store it in the data member of the __netgrent struct.
Upcoming getnetgrent() calls will only operate on the stored data and
not connect to the NSS responder anymore. endgrent() will free the data.
Since the netgroup data is read in a single request to the NSS responder
protected by a mutex and stored in private context of innetgr() this
call is now thread-safe.

Resolves: SSSD#5540
sumit-bose added a commit to sumit-bose/sssd that referenced this issue Apr 20, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540
pbrezina pushed a commit that referenced this issue Apr 21, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: #5540

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
@pbrezina
Copy link
Member

Pushed PR: #5541

  • master
    • 29abf94 - intg test: test is innetgr() is thread-safe
    • 88eec1c - nss client: make innetgr() thread safe

@pbrezina pbrezina added the Closed: Fixed Issue was closed as fixed. label Apr 21, 2021
akuster pushed a commit to akuster/sssd that referenced this issue May 18, 2021
The innetgr() call is expected to be thread safe but SSSD's the current
implementation isn't. In glibc innetgr() is implementend by calling the
setnetgrent(), getnetgrent(), endgrent() sequence with a private context
(struct __netgrent) with provides a member where NSS modules can store
data between the calls.

With this patch setnetgrent() will read all required data from the NSS
responder and store it in the data member of the __netgrent struct.
Upcoming getnetgrent() calls will only operate on the stored data and
not connect to the NSS responder anymore. endgrent() will free the data.
Since the netgroup data is read in a single request to the NSS responder
protected by a mutex and stored in private context of innetgr() this
call is now thread-safe.

Resolves: SSSD#5540

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
akuster pushed a commit to akuster/sssd that referenced this issue May 18, 2021
This integration test adds 2 large netgroups in LDAP and runs a program
with 2 threads looking up those netgroups in parallel.

Resolves: SSSD#5540

Reviewed-by: Alexey Tikhonov <atikhono@redhat.com>
Reviewed-by: Pavel Březina <pbrezina@redhat.com>
jakub-vavra-cz added a commit to jakub-vavra-cz/sssd that referenced this issue Jun 4, 2021
jakub-vavra-cz added a commit to jakub-vavra-cz/sssd that referenced this issue Jun 4, 2021
jakub-vavra-cz added a commit to jakub-vavra-cz/sssd that referenced this issue Jun 4, 2021
jakub-vavra-cz added a commit to jakub-vavra-cz/sssd that referenced this issue Jun 7, 2021
jakub-vavra-cz added a commit to jakub-vavra-cz/sssd that referenced this issue Jun 7, 2021
jakub-vavra-cz added a commit to jakub-vavra-cz/sssd that referenced this issue Jun 11, 2021
pbrezina pushed a commit that referenced this issue Jun 17, 2021
Verifies
Issue: #5540
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1703436

Reviewed-by: Steeve Goveas <sgoveas@redhat.com>
pbrezina pushed a commit that referenced this issue Jun 17, 2021
Verifies
Issue: #5540
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1703436

Reviewed-by: Steeve Goveas <sgoveas@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bugzilla Closed: Fixed Issue was closed as fixed.
Projects
None yet
3 participants