Multi-tenant configuration while using JIT provisioning #215

Closed
jeremylivingston opened this issue Nov 8, 2023 · 13 comments
Labels
question Further information is requested

Comments

@jeremylivingston

I am currently running into limitations with the existing triggers in a scenario where a new user attempts to authenticate via an IDP-initiated flow with multiple tenants/IDPs on the backend.

Details of my scenario:

  • Account administrator sets up an SSO configuration for their app. We save the metadata for this configuration.
  • A user on that team attempts to sign in via the IDP app. The user has not yet signed in and therefore has no account.
  • The GET_USER_ID_FROM_SAML_RESPONSE trigger cannot find a user since they do not exist.
  • We're unable to pass a user_id to GET_METADATA_AUTO_CONF_URLS to pull the organization's specific metadata file.

One option would be to return a different ID (the team ID instead of the user ID) from the GET_USER_ID_FROM_SAML_RESPONSE call, then use that downstream, but it seems like this could have unintended side effects.

Is there a recommended approach for how to handle this situation? Should we consider adding another trigger type or even passing the SAML response into GET_METADATA_AUTO_CONF_URLS so it can pull the proper configuration file?

@mostafa
Member

mostafa commented Nov 8, 2023

Hey @jeremylivingston,

As I also mentioned in this #56 (comment), the hook you mentioned was not created for this purpose. It exists to help select a specific metadata URL for the user based on their organization (or team, in your case), rather than returning all the metadata URLs at once. Returning all of them is possible, BTW: the signature will be checked against all the certificates passed. You can do it with the GET_METADATA_AUTO_CONF_URLS hook function, which should return a list of URLs like this: [{"url": "metadata url 1"}, {"url": "metadata url 2"}, ...]. The downside is that the time it takes to finish checking the signature grows linearly as you add more orgs/teams (more metadata URLs).
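A minimal sketch of the "return every metadata URL" variant described above. The in-memory dictionary, the org names, and the example URLs are hypothetical placeholders; only the return shape (a list of {"url": ...} dicts) comes from the comment itself.

```python
from typing import Dict, List, Optional

# Hypothetical store of metadata URLs per org; a real app would likely
# read these from its own database.
METADATA_URLS_BY_ORG: Dict[str, str] = {
    "org-a": "https://idp-a.example.com/metadata",
    "org-b": "https://idp-b.example.com/metadata",
}


def get_all_metadata_urls(user_id: Optional[str] = None) -> List[Dict[str, str]]:
    """Return every known metadata URL, ignoring user_id.

    Every extra entry is one more certificate to check the signature
    against, so the cost grows with the number of orgs.
    """
    return [{"url": url} for url in METADATA_URLS_BY_ORG.values()]
```

This works without knowing the user up front, at the price of the linear slowdown mentioned above.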

What you can do is this:

  1. Set CREATE_USER. This way, the user will be created if it doesn't exist.
  2. Next, create a user setup flow in a static function and pass its path to TRIGGER.BEFORE_LOGIN. This function helps ensure the user is correctly mapped to your internal models for organization, team, project and whatnot. The before login hook/trigger will be called right after user creation.
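The two steps above could look roughly like the sketch below. It assumes the settings live in a SAML2_AUTH dictionary with a nested TRIGGER key, and that the before-login hook receives a dict containing an "email" key; both are assumptions to verify against the library's README. The module path, the team names, and the in-memory tables are hypothetical stand-ins for real models.

```python
from collections import defaultdict
from typing import Any, DefaultDict, Dict, Set

# Assumed settings layout; the dotted path is a hypothetical module.
SAML2_AUTH: Dict[str, Any] = {
    "CREATE_USER": True,  # create the user if it doesn't exist
    "TRIGGER": {"BEFORE_LOGIN": "myapp.saml_hooks.before_login"},
}

# Hypothetical stand-ins for real team and membership tables.
DEFAULT_TEAM = "default"
TEAM_BY_EMAIL: Dict[str, str] = {"lead@example.com": "team-a"}
TEAM_MEMBERS: DefaultDict[str, Set[str]] = defaultdict(set)


def before_login(user: Dict[str, Any]) -> None:
    """Map the user to a team, falling back to a default team."""
    email = user.get("email")
    if not email:
        return  # nothing to map without an identifier
    team = TEAM_BY_EMAIL.get(email, DEFAULT_TEAM)
    TEAM_MEMBERS[team].add(email)
```

Calling before_login({"email": "new@example.com"}) would place that address in the default team, from which an admin could later move it.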

I created an entire mapping structure for teams, projects, users and orgs to handle the mapping, but there are other ways.

@jeremylivingston
Author

@mostafa Thank you for the quick reply! Much appreciated.

Just to make sure I'm understanding, it sounds like you're suggesting that I use the CREATE_USER configuration to create the user if they don't exist, then add the needed organization associations in the BEFORE_LOGIN trigger.

This makes sense, but does that imply that there's no way to filter down the metadata configuration file to be used in a scenario where the user does not yet exist? Ideally, I'd like to have a single config that I am using for authentication requests to mitigate the performance issues that you mentioned. If the user does not yet exist, it seems that there may not be a way to filter down the list of metadata config files. Is that correct?

@mostafa
Member

mostafa commented Nov 8, 2023

@jeremylivingston Glad I could help.

Use TRIGGER.GET_USER_ID_FROM_SAML_RESPONSE to extract the user ID and then pass it to TRIGGER.GET_METADATA_AUTO_CONF_URLS to filter out the org.

@jeremylivingston
Author

@mostafa yes, I understand that part. My issue is that in an IDP-initiated flow, these triggers happen after the metadata configuration is retrieved. So since the user has not yet been created, there's no way for me to filter down the metadata configs to the correct one for the organization. Is there a way to accomplish this?

@mostafa
Member

mostafa commented Nov 8, 2023

@jeremylivingston
When the ACS endpoint is called, the decode_saml_response function is called. This calls the get_saml_client function, which eventually calls the hook function I told you about. The decoded saml_response (and an empty user_id) will be available to the hook function you provide, so you get to see what's inside right before the get_metadata function is called. This lets you look into the user-provided info and find the correct org (and metadata URL) before the message is verified.

@jeremylivingston
Author

@mostafa Thanks for writing that up!

So in my case where there is no user_id yet (IDP-initiated flow where the user isn't yet created), are you saying that I should return something other than the user_id from the GET_USER_ID_FROM_SAML_RESPONSE hook? Then this would be passed to get_metadata, which could isolate the configuration file using another variable?

In my case, I should be able to read the Entity ID from the saml_response and determine which organization it belongs to. However, it feels like an anti-pattern to use the GET_USER_ID_FROM_SAML_RESPONSE hook for this purpose, since it's intended to be for a user, not an organization.

@jeremylivingston
Author

jeremylivingston commented Nov 8, 2023

Here is some example code that illustrates the issue:

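from typing import Optional
from xml.etree import ElementTree

from django.core.exceptions import ObjectDoesNotExist

# `User` and `SamlConfig` are the application's own Django models.

# Configured for `TRIGGER.GET_USER_ID_FROM_SAML_RESPONSE`
def get_user_id_from_saml_response(saml_response: str, user_id: Optional[str]) -> Optional[str]: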
    # At this point we are in an IDP-initiated flow for a new user. There is no way to retrieve the user
    # since they don't exist in our system. We *could* retrieve the organization by its entity ID in the
    # SAML response, but that seems to go against the spirit of this trigger.
    try:
        email = None
        root = ElementTree.fromstring(saml_response)
        # Get the email address from the provided SAML response
        for tag in root.iter(tag='{urn:oasis:names:tc:SAML:2.0:assertion}Attribute'):
            if 'email' in tag.attrib['Name']:
                for attribute_value in tag.iter(tag='{urn:oasis:names:tc:SAML:2.0:assertion}AttributeValue'):
                    email = attribute_value.text
                    break

        # We didn't find an email address in the SAML response
        if not email:
            return None

        return User.objects.get(email=email).id
    except ObjectDoesNotExist:
        # We don't have this user in our system
        return None

# Configured for `TRIGGER.GET_METADATA_AUTO_CONF_URLS`
def get_saml_url_by_user_id(user_id: Optional[str] = None):
    # Since we are only able to provide a user ID to this function, we only have one way to look up a 
    # metadata config URL. If we had the ability to pass different fields into this trigger (or even the 
    # SAML response), we could look up the configuration by other elements like the entity ID or 
    # organization ID
    try:
        # We don't have a user_id since the user is not created, so we have no way to narrow down the config
        if not user_id:
            raise Exception("No user provided, so we are unable to retrieve the metadata config URL")

        # Get the SAML config by the user_id
        saml_config = SamlConfig.objects.get(user_id=user_id)

        return [{"url": saml_config.url, "should_sign_response": True}]
    except ObjectDoesNotExist as exc:
        # Chain the original error so the traceback shows the failed lookup
        raise Exception("No SAML config found for the provided user_id.") from exc

As you can see, if we had access to the full SAML response in the GET_METADATA_AUTO_CONF_URLS trigger, this would give us more flexibility to look up the URL by other variables.

@mostafa
Member

mostafa commented Nov 9, 2023

@jeremylivingston

I see where the confusion is. In this context, the user ID (the user identifier) doesn't mean the USER_MODEL.ID, but rather a username or (preferably) an email address, hence an optional string. After figuring out the user ID (aka. email), you can return it from the get_user_id_from_saml_response function, so that the rest of the code can use it for creating a user (if it doesn't exist) or finding an org (and subsequently a metadata URL), based on the domain in the user's email address. If you return the email address itself instead of User.objects.get(email=email).id, and get rid of the try/except (since we no longer need ObjectDoesNotExist), you can then extract the domain name (example.com in user@example.com) from the user_id in the get_saml_url_by_user_id function and match it against your existing orgs (considering that you have that piece of info).
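A sketch of that suggestion, reusing the function names from the example code earlier in the thread. The domain-to-URL dictionary and its example entry are hypothetical, and the attribute lookup assumes the IdP sends the address in an Attribute whose Name contains "email".

```python
from typing import Dict, List, Optional
from xml.etree import ElementTree

ASSERTION_NS = "{urn:oasis:names:tc:SAML:2.0:assertion}"

# Hypothetical mapping from email domain to an org's metadata URL.
METADATA_URL_BY_DOMAIN: Dict[str, str] = {
    "example.com": "https://idp.example.com/metadata",
}


def get_user_id_from_saml_response(saml_response: str, user_id: Optional[str]) -> Optional[str]:
    """Return the email address itself as the user identifier."""
    root = ElementTree.fromstring(saml_response)
    for attribute in root.iter(f"{ASSERTION_NS}Attribute"):
        if "email" in attribute.attrib.get("Name", ""):
            value = attribute.find(f"{ASSERTION_NS}AttributeValue")
            if value is not None and value.text:
                return value.text.strip()
    return None  # no email attribute in the response


def get_saml_url_by_user_id(user_id: Optional[str] = None) -> List[Dict[str, str]]:
    """Pick the metadata URL from the domain part of the email."""
    if not user_id or "@" not in user_id:
        raise ValueError("An email address is required to select a metadata URL")
    domain = user_id.rsplit("@", 1)[1].lower()
    url = METADATA_URL_BY_DOMAIN.get(domain)
    if url is None:
        raise LookupError(f"No SAML config found for domain {domain!r}")
    return [{"url": url}]
```

No database lookup for the user is needed here, so the hooks also work for users who do not exist yet, as long as one domain maps to exactly one org.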

@jeremylivingston
Author

@mostafa Ah, that's helpful. Thanks!

Unfortunately, this still doesn't fully solve the problem for me. In my case, I'm not able to narrow down an organization/team based on a user's email domain. We often have multiple teams with the same company or email domain that have different tenants and SAML configurations.

I think that my best path forward would be to return the entity ID from the get_user_id_from_saml_response function and then use that to look up the associated team in my system. I don't see another way to accomplish what I'm trying to do (unless you have other ideas!).
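A rough sketch of that entity ID approach, assuming the IdP's entity ID appears in the response's Issuer element (the usual place in SAML 2.0). The function names, the entity ID, and the in-memory mapping are hypothetical; in practice the first function would be registered for TRIGGER.GET_USER_ID_FROM_SAML_RESPONSE and the second for TRIGGER.GET_METADATA_AUTO_CONF_URLS.

```python
from typing import Dict, List, Optional
from xml.etree import ElementTree

ISSUER_TAG = "{urn:oasis:names:tc:SAML:2.0:assertion}Issuer"

# Hypothetical mapping from IdP entity ID to a team's metadata URL.
METADATA_URL_BY_ENTITY_ID: Dict[str, str] = {
    "https://idp.team-a.example.com/entity": "https://idp.team-a.example.com/metadata",
}


def get_entity_id_from_saml_response(saml_response: str, user_id: Optional[str]) -> Optional[str]:
    """Return the first Issuer value found in the decoded response."""
    root = ElementTree.fromstring(saml_response)
    issuer = next(root.iter(ISSUER_TAG), None)
    if issuer is None or not issuer.text:
        return None
    return issuer.text.strip()


def get_saml_url_by_entity_id(entity_id: Optional[str] = None) -> List[Dict[str, str]]:
    """Look up the single metadata URL registered for this entity ID."""
    if not entity_id:
        raise ValueError("No entity ID provided, cannot select a metadata URL")
    url = METADATA_URL_BY_ENTITY_ID.get(entity_id)
    if url is None:
        raise LookupError(f"No SAML config found for entity ID {entity_id!r}")
    return [{"url": url}]
```

Since the issuer is read before the message is verified, it is probably safest to use it only for choosing which metadata to verify against. It is also worth checking where else the library uses the returned identifier, given the side effects raised at the start of this thread.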

Thanks again for all of your help.

@mostafa
Member

mostafa commented Nov 9, 2023

@jeremylivingston
I solved this by creating a mapping table (and related admin page and all) that is populated by all orgs/teams/projects upon SAML SSO setup (or later), and upon login and creation of the user object, the user will be mapped to the correct model based on that mapping table. I have a set of defaults, which all the users end up in, and then the admin (customer) can choose to add each person to their team. Alternatively, if the user exists, and the group attribute statements are set, the user will end up in their correct team/project.

@jeremylivingston
Author

@mostafa Yes, unfortunately the multiple teams in our setup have no semblance of a shared organization configuration. If one part of the organization has an account (with SSO) and the other one does too, they never communicate or manage each other's users. For this reason, I think we need to stick with using the entity ID to determine which organization's config to use.

@mostafa
Member

mostafa commented Nov 9, 2023

Fair enough!

I suppose this is resolved. Feel free to re-open it if you have further questions related to this issue.

@mostafa mostafa closed this as completed Nov 9, 2023
@jeremylivingston
Author

Thank you, @mostafa!
