Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix group renaming issue when "id_provider = ldap" is set #128

Closed
wants to merge 9 commits into from

Conversation

fidencio
Copy link
Contributor

Those two patches fix https://bugzilla.redhat.com/show_bug.cgi?id=1401241

The sssd.conf used in order to reproduce this issue looks like:

[sssd]
config_file_version = 2
services = nss, pam
domains = ad.fidencio.lan

[nss]

[pam]

[domain/ad.fidencio.lan]
ad_domain = ad.fidencio.lan
krb5_realm = AD.FIDENCIO.LAN
realmd_tags = manages-system joined-with-adcli 
cache_credentials = True
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
ldap_referrals = false
enumerate = false
id_provider = ldap
#id_provider = ad
auth_provider = krb5
chpass_provider = krb5
access_provider = ldap
ldap_sasl_mech = GSSAPI
ldap_schema = ad
ldap_user_object_class = user
ldap_user_home_directory = unixHomeDirectory
ldap_user_principal = userPrincipalName
ldap_group_object_class = group
ldap_access_order = expire
ldap_account_expire_policy = ad
ldap_force_upper_case_realm = true

The reproducer can be found in the bug report.

@fidencio
Copy link
Contributor Author

fidencio commented Jan 18, 2017

CI: http://sssd-ci.duckdns.org/logs/job/60/62/summary.html

Debian testing failure is unrelated.

@jhrozek
Copy link
Contributor

jhrozek commented Jan 20, 2017

I haven't read the patches, I just realized we might want to have a ticket and not just a downstream bugzilla.

@jhrozek
Copy link
Contributor

jhrozek commented Jan 20, 2017

@fidencio
Copy link
Contributor Author

Updating the patch set with a test for this fix ...

@fidencio fidencio force-pushed the wip/group_renaming branch 2 times, most recently from 990b813 to 984cf33 Compare January 23, 2017 17:59
@jhrozek
Copy link
Contributor

jhrozek commented Feb 13, 2017

The code looks good and manual tests went fine so far. Also the CI run passed: http://sssd-ci.duckdns.org/logs/job/62/55/summary.html

@jhrozek jhrozek self-assigned this Feb 13, 2017
@jhrozek
Copy link
Contributor

jhrozek commented Feb 14, 2017

The tests didn't find any regressions, ACK

@lslebodn
Copy link
Contributor

lslebodn commented Feb 14, 2017 via email

@lslebodn lslebodn removed the Accepted label Feb 14, 2017
@fidencio
Copy link
Contributor Author

@lslebodn:
Firstly, my answer may be incomplete due to the lack of knowledge, but let's try ...

  1. As far as I understand SSSD does not deal properly with multiple groups having the same GID and I'm saying that based on both AD's and LDAP's code, where the search is done by the GID and we expect only one result;
  2. We already have at least one bug opened for this situation (https://fedorahosted.org/sssd/ticket/2982) and in case we decide to deal properly with this my feeling is that it will have to be done in all different parts of the code.

I understand why you're worried and I see we can hit this situation. But we can hit this situation even without my fix. So I'd like to propose to fix this situation when someone has time to work on this and in a better way than just "don't deal with group renaming".

Does this make sense for you?

@lslebodn
Copy link
Contributor

lslebodn commented Feb 14, 2017 via email

@fidencio
Copy link
Contributor Author

fidencio commented Feb 14, 2017 via email

@jhrozek
Copy link
Contributor

jhrozek commented Feb 14, 2017 via email

@lslebodn
Copy link
Contributor

lslebodn commented Feb 14, 2017 via email

@fidencio
Copy link
Contributor Author

So, as far as I remember, the conclusion about this patch is that we should also have a really loud debug message saying that the group has been renamed.

Is everyone here in agreement about this? @lslebodn, @jhrozek, @sumit-bose

@fidencio
Copy link
Contributor Author

Just updated the patches adding the loud debug message saying the group has been renamed.

@jhrozek
Copy link
Contributor

jhrozek commented Aug 10, 2017

So I was thinking about this PR a bit more and I'm no longer sure we can rename groups at will. I think we can only support that if we support multiple objects with the same ID (which we currently don't and which would be a big task) or if we implement some heuristics to see that it's indeed the same, just renamed group and not a conflict.

So, I suggest that we, before renaming the group check that at least one of these conditions applies:

  • both objects have the same SID
  • both objects have the same UUID
  • both objects have the same originalDN

if these don't apply, then we can't know if the object is the same or different and we thrown an error. What do you think?

@fidencio
Copy link
Contributor Author

@jhrozek, your suggestion is good.

I've updated the patchset and I'm waiting for CI's results.

@fidencio
Copy link
Contributor Author

same_original_dn = true;
} else {
same_original_dn = false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid copy and paste here.
Have you considered for loop?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't. I'll re-work the code and re-submit the PR.

@fidencio
Copy link
Contributor Author

fidencio commented Apr 3, 2018

@jhrozek, @pbrezina, @sumit-bose, @mzidek-rh ... Please, could someone take a look on #128 (comment) ?

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

So I don't like passing the provider from everywhere. But because this PR was going on for so long, I actually prepared alternative version where the data provider is part of sdap_options and passed this way. Can you check my branch called "group_rename" ? This was we could get rid of all the "SDAP: Pass struct data_provider to XYZ" patches.

@fidencio
Copy link
Contributor Author

fidencio commented Apr 9, 2018

Your version seems okay, please, force-push it to my branch.

Just a note: Although I do appreciate you took some time to come up with a better solution, for the next time, if it's possible, I'd strongly prefer if you could give me your suggestion and then I'd rework the patch ... that would help get a better understanding of the parts that I'm still lacking knowledge in the project.

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

OK, I'll prettify the patches and force push.

About just nacking the patch, I was both not sure if'd you'd appreciate 10th time someone tells you to rework the PR and at the same time, I wasn't sure if the approach is viable..

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

...viable until I actually sat down and coded the approach myself.

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

Hmm, I'm getting permission rejected from pushing to your repository, any chance you can just fetch my branch and push into yours?

@fidencio
Copy link
Contributor Author

fidencio commented Apr 9, 2018

I've squashed your patches into mines and updated the PR.

My ask for help still stand: #128 (comment) and on #128 (comment)

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

Have you considered changing the ldb expiration timestamps with pyldb or shelling out to ldbmodify?

@fidencio
Copy link
Contributor Author

fidencio commented Apr 9, 2018

Have you considered changing the ldb expiration timestamps with pyldb or shelling out to ldbmodify?

I don't understand how it would help, Jakub. How doing changing the ldb expiration timestamps with pyldb would trigger sdap_handle_id_collision_for_incomplete_groups()?

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

Maybe I don't understand the problem. What is it you're trying to test, the memcache invalidation?

@fidencio
Copy link
Contributor Author

fidencio commented Apr 9, 2018

I really failed to understand how to write a test without calling sss_cache to invalidate the cache as done in https://github.com/SSSD/sssd/pull/128/files#diff-cb96335bb2dbee3face9f5ba04f18ce0R1526 ... That was @lslebodn's suggestion.

When I reworked the patches I didn't find an easy wai to have sdap_handle_id_collision_for_incomplete_groups() triggered (sorry, I can't give you more details from the top of my head) and the test was not passing when removing the sss_cache commands.

So, seems that there are a lot of things to be understood in this part ... but that I'd really need someone's help to do so. If you have the time and the willing to do this, please, let me know. We can do this over IRC ... just gimme an "one day head's up" so I could try to remember the last state of the patch set and then we can talk.

@jhrozek
Copy link
Contributor

jhrozek commented Apr 9, 2018

I would suggest sometimes this week because currently I do have the patch in my head :)

@fidencio
Copy link
Contributor Author

fidencio commented Apr 9, 2018

Here's the CI results for the current version of this patch set: http://vm-031.${abc}/logs/job/87/73/summary.html

fidencio and others added 8 commits April 16, 2018 16:08
There are some situations where, from the backend, the NSS responder
will have to be notified to invalidate a group.

In order to achieve this in a clean way, let's add the
InvalidateGroupById handler and make use of it later in this very same
series.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
This function will be called from the data provider to the NSS
responder, which will invalidate a group in the memcache.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
This new error will be returned from sysdb_add_incomplete_group()
when renaming a group which will case gid collision.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
In order to be able to use the Data Provider methods from the SDAP code
to e.g. invalidate memcache when needed, add a new field to the
sdap_options structure with the data_provider structure pointer.

Fill the pointer value for all LDAP-based providers.
This newly added function is a helper to properly hadle group
id-collisions when renaming incomplete groups and it does:
- Deletes the group from sysdb
- Adds the new incomplete group
- Notifies the NSS responder that the entry also has to be deleted from
  the memory cache

This function will be called from
sdap_ad_save_group_membership_with_idmapping() and from
sdap_add_incomplete_groups().

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
This situation can be hit when renaming a group. For now, let's just
error this out so the caller can handle it properly on its own layer.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
…initgroups

As we implemented the group renaming heuristics to rename only if we can
use another "hint" like the original DN or the SID to know the group is
the same, this patch adds two tests (positive and negative) to make sure
a group with a totally different RDN and hence different originalDN
cannot be renamed but a group whose name changed but the RDN stays the
same can be renamed.

Related:
https://pagure.io/SSSD/sssd/issue/3282

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
@jhrozek
Copy link
Contributor

jhrozek commented Apr 16, 2018

I pushed the patch that detects the group rename vs. GID duplicate to https://github.com/jhrozek/sssd/tree/group_renaming

The rdn_same test now fails when I revert your patches. Thank you for spotting the test was not hitting the codepath at all.

I haven't ran any downstream tests yet, just the integration tests and unit tests. But I would appreciate code review.

… GID

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
@fidencio
Copy link
Contributor Author

@jhrozek, I've updated the branch. Your tests are solid now.
I've added "Reviewed-by: Fabiano Fidêncio ..." in your patches.

Now we're just waiting for the results of downstream tests to be sure that the latest patch in this series is not going to break something else.

@jhrozek
Copy link
Contributor

jhrozek commented Apr 24, 2018

I'm sorry about the delay. I had to spend some time to triage the downstream tests, because they were failing and at the same time, our test engine internally at red hat had some scheduling issues.

Nonetheless, the tests seem to be passing -- the most relevant tests are LDAP provider (job ID 2432961) and the AD provider test (job ID 2441955), in both tests there were either some known failures or intermittent failures that were gone during a manual re-run.

ACK

@fidencio
Copy link
Contributor Author

@fidencio fidencio closed this Apr 25, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants