Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix group renaming issue when "id_provider = ldap" is set #128

Closed
wants to merge 9 commits into from

Conversation

Projects
None yet
5 participants
@fidencio
Copy link
Contributor

commented Jan 18, 2017

Those two patches fix https://bugzilla.redhat.com/show_bug.cgi?id=1401241

The sssd.conf used in order to reproduce this issue looks like:

[sssd]
config_file_version = 2
services = nss, pam
domains = ad.fidencio.lan

[nss]

[pam]

[domain/ad.fidencio.lan]
ad_domain = ad.fidencio.lan
krb5_realm = AD.FIDENCIO.LAN
realmd_tags = manages-system joined-with-adcli 
cache_credentials = True
krb5_store_password_if_offline = True
default_shell = /bin/bash
ldap_id_mapping = True
use_fully_qualified_names = True
fallback_homedir = /home/%u@%d
ldap_referrals = false
enumerate = false
id_provider = ldap
#id_provider = ad
auth_provider = krb5
chpass_provider = krb5
access_provider = ldap
ldap_sasl_mech = GSSAPI
ldap_schema = ad
ldap_user_object_class = user
ldap_user_home_directory = unixHomeDirectory
ldap_user_principal = userPrincipalName
ldap_group_object_class = group
ldap_access_order = expire
ldap_account_expire_policy = ad
ldap_force_upper_case_realm = true

The reproducer can be found in the bug report.

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Jan 18, 2017

CI: http://sssd-ci.duckdns.org/logs/job/60/62/summary.html

Debian testing failure is unrelated.

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Jan 20, 2017

I haven't read the patches, I just realized we might want to have a ticket and not just a downstream bugzilla.

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Jan 20, 2017

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch from 940caf7 to bec6586 Jan 21, 2017

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Jan 23, 2017

Updating the patch set with a test for this fix ...

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch 2 times, most recently from 990b813 to 984cf33 Jan 23, 2017

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Feb 13, 2017

The code looks good and manual tests went fine so far. Also the CI run passed: http://sssd-ci.duckdns.org/logs/job/62/55/summary.html

@jhrozek jhrozek self-assigned this Feb 13, 2017

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2017

The tests didn't find any regressions, ACK

@jhrozek jhrozek added the Accepted label Feb 14, 2017

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2017

@lslebodn lslebodn removed the Accepted label Feb 14, 2017

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Feb 14, 2017

@lslebodn:
Firstly, my answer may be incomplete due to the lack of knowledge, but let's try ...

  1. As far as I understand SSSD does not deal properly with multiple groups having the same GID and I'm saying that based on both AD's and LDAP's code, where the search is done by the GID and we expect only one result;
  2. We already have at least one bug opened for this situation (https://fedorahosted.org/sssd/ticket/2982) and in case we decide to deal properly with this my feeling is that it will have to be done in all different parts of the code.

I understand why you're worried and I see we can hit this situation. But we can hit this situation even without my fix. So I'd like to propose to fix this situation when someone has time to work on this and in a better way than just "don't deal with group renaming".

Does this make sense for you?

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2017

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Feb 14, 2017

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2017

@lslebodn

This comment has been minimized.

Copy link
Contributor

commented Feb 14, 2017

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Feb 24, 2017

So, as far as I remember, the conclusion about this patch is that we should also have a really loud debug message saying that the group has been renamed.

Is everyone here in agreement about this? @lslebodn, @jhrozek, @sumit-bose

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch from 984cf33 to 6b927e3 Feb 25, 2017

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Feb 25, 2017

Just updated the patches adding the loud debug message saying the group has been renamed.

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch from 6b927e3 to eef7fba Mar 5, 2017

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Aug 10, 2017

So I was thinking about this PR a bit more and I'm no longer sure we can rename groups at will. I think we can only support that if we support multiple objects with the same ID (which we currently don't and which would be a big task) or if we implement some heuristics to see that it's indeed the same, just renamed group and not a conflict.

So, I suggest that we, before renaming the group check that at least one of these conditions applies:

  • both objects have the same SID
  • both objects have the same UUID
  • both objects have the same originalDN

if these don't apply, then we can't know if the object is the same or different and we thrown an error. What do you think?

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Aug 10, 2017

@jhrozek, your suggestion is good.

I've updated the patchset and I'm waiting for CI's results.

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch from eef7fba to 24e5381 Aug 10, 2017

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Aug 10, 2017

same_original_dn = true;
} else {
same_original_dn = false;
}

This comment has been minimized.

Copy link
@lslebodn

lslebodn Aug 22, 2017

Contributor

Please avoid copy and paste here.
Have you considered for loop?

This comment has been minimized.

Copy link
@fidencio

fidencio Aug 22, 2017

Author Contributor

I haven't. I'll re-work the code and re-submit the PR.

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 3, 2018

@jhrozek, @pbrezina, @sumit-bose, @mzidek-rh ... Please, could someone take a look on #128 (comment) ?

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

So I don't like passing the provider from everywhere. But because this PR was going on for so long, I actually prepared alternative version where the data provider is part of sdap_options and passed this way. Can you check my branch called "group_rename" ? This was we could get rid of all the "SDAP: Pass struct data_provider to XYZ" patches.

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

Your version seems okay, please, force-push it to my branch.

Just a note: Although I do appreciate you took some time to come up with a better solution, for the next time, if it's possible, I'd strongly prefer if you could give me your suggestion and then I'd rework the patch ... that would help get a better understanding of the parts that I'm still lacking knowledge in the project.

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

OK, I'll prettify the patches and force push.

About just nacking the patch, I was both not sure if'd you'd appreciate 10th time someone tells you to rework the PR and at the same time, I wasn't sure if the approach is viable..

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

...viable until I actually sat down and coded the approach myself.

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

Hmm, I'm getting permission rejected from pushing to your repository, any chance you can just fetch my branch and push into yours?

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch from f5432ea to 7cd45e6 Apr 9, 2018

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

I've squashed your patches into mines and updated the PR.

My ask for help still stand: #128 (comment) and on #128 (comment)

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

Have you considered changing the ldb expiration timestamps with pyldb or shelling out to ldbmodify?

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

Have you considered changing the ldb expiration timestamps with pyldb or shelling out to ldbmodify?

I don't understand how it would help, Jakub. How doing changing the ldb expiration timestamps with pyldb would trigger sdap_handle_id_collision_for_incomplete_groups()?

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

Maybe I don't understand the problem. What is it you're trying to test, the memcache invalidation?

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

I really failed to understand how to write a test without calling sss_cache to invalidate the cache as done in https://github.com/SSSD/sssd/pull/128/files#diff-cb96335bb2dbee3face9f5ba04f18ce0R1526 ... That was @lslebodn's suggestion.

When I reworked the patches I didn't find an easy wai to have sdap_handle_id_collision_for_incomplete_groups() triggered (sorry, I can't give you more details from the top of my head) and the test was not passing when removing the sss_cache commands.

So, seems that there are a lot of things to be understood in this part ... but that I'd really need someone's help to do so. If you have the time and the willing to do this, please, let me know. We can do this over IRC ... just gimme an "one day head's up" so I could try to remember the last state of the patch set and then we can talk.

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 9, 2018

I would suggest sometimes this week because currently I do have the patch in my head :)

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 9, 2018

Here's the CI results for the current version of this patch set: http://vm-031.${abc}/logs/job/87/73/summary.html

fidencio and others added some commits Feb 16, 2018

NSS: Add InvalidateGroupById handler
There are some situations where, from the backend, the NSS responder
will have to be notified to invalidate a group.

In order to achieve this in a clean way, let's add the
InvalidateGroupById handler and make use of it later in this very same
series.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
ERRORS: Add ERR_GID_DUPLICATED
This new error will be returned from sysdb_add_incomplete_group()
when renaming a group which will case gid collision.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
DP: Add dp_sbus_invalidate_group_memcache()
This function will be called from the data provider to the NSS
responder, which will invalidate a group in the memcache.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
SYSDB_OPS: Error out on id-collision when adding an incomplete group
This situation can be hit when renaming a group. For now, let's just
error this out so the caller can handle it properly on its own layer.

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
SDAP: Add sdap_handle_id_collision_for_incomplete_groups()
This newly added function is a helper to properly hadle group
id-collisions when renaming incomplete groups and it does:
- Deletes the group from sysdb
- Adds the new incomplete group
- Notifies the NSS responder that the entry also has to be deleted from
  the memory cache

This function will be called from
sdap_ad_save_group_membership_with_idmapping() and from
sdap_add_incomplete_groups().

Related:
https://pagure.io/SSSD/sssd/issue/2653

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
LDAP: Augment the sdap_opts structure with a data provider pointer
In order to be able to use the Data Provider methods from the SDAP code
to e.g. invalidate memcache when needed, add a new field to the
sdap_options structure with the data_provider structure pointer.

Fill the pointer value for all LDAP-based providers.
TESTS: Add an integration test for renaming incomplete groups during …
…initgroups

As we implemented the group renaming heuristics to rename only if we can
use another "hint" like the original DN or the SID to know the group is
the same, this patch adds two tests (positive and negative) to make sure
a group with a totally different RDN and hence different originalDN
cannot be renamed but a group whose name changed but the RDN stays the
same can be renamed.

Related:
https://pagure.io/SSSD/sssd/issue/3282

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>
@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 16, 2018

I pushed the patch that detects the group rename vs. GID duplicate to https://github.com/jhrozek/sssd/tree/group_renaming

The rdn_same test now fails when I revert your patches. Thank you for spotting the test was not hitting the codepath at all.

I haven't ran any downstream tests yet, just the integration tests and unit tests. But I would appreciate code review.

SYSDB: sysdb_add_incomplete_group now returns EEXIST with a duplicate…
… GID

Reviewed-by: Fabiano Fidêncio <fidencio@redhat.com>

@fidencio fidencio force-pushed the fidencio:wip/group_renaming branch from 7cd45e6 to d58f84b Apr 17, 2018

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 17, 2018

@jhrozek, I've updated the branch. Your tests are solid now.
I've added "Reviewed-by: Fabiano Fidêncio ..." in your patches.

Now we're just waiting for the results of downstream tests to be sure that the latest patch in this series is not going to break something else.

@jhrozek

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2018

I'm sorry about the delay. I had to spend some time to triage the downstream tests, because they were failing and at the same time, our test engine internally at red hat had some scheduling issues.

Nonetheless, the tests seem to be passing -- the most relevant tests are LDAP provider (job ID 2432961) and the AD provider test (job ID 2441955), in both tests there were either some known failures or intermittent failures that were gone during a manual re-run.

ACK

@jhrozek jhrozek added the Accepted label Apr 24, 2018

@fidencio

This comment has been minimized.

Copy link
Contributor Author

commented Apr 25, 2018

@fidencio fidencio closed this Apr 25, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.