add: logging for SAML IdP initiated auth#137
Conversation
1320d58 to
017714b
Compare
There was a problem hiding this comment.
Pull request overview
This PR adds comprehensive logging to debug SAML Identity Provider (IdP) initiated authentication flows in the third-party authentication pipeline. The logging statements capture key decision points and state information during the authentication process to help diagnose issues with SAML IdP-initiated login requests.
Changes:
- Added logging to capture provider information, authentication state, and user existence checks
- Extracted function call results into temporary variables to enable logging without changing logic
- Added logging at critical control flow decision points (user exists, force account creation, dispatch routing)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| logger.info( | ||
| '[THIRD_PARTY_AUTH] ensure_user_information: auth_entry=%s backend=%s is_provider_saml=%s ' | ||
| 'current_provider=%s skip_email_verification=%s send_to_registration_first=%s ' | ||
| 'email=%s kwargs_response_keys=%s', | ||
| auth_entry, | ||
| current_partial.backend, | ||
| _is_saml, | ||
| _provider_obj.provider_id if _provider_obj else None, | ||
| _provider_obj.skip_email_verification if _provider_obj else None, | ||
| _provider_obj.send_to_registration_first if _provider_obj else None, | ||
| details.get('email') if details else None, | ||
| list((kwargs.get('response') or {}).keys()), | ||
| ) |
There was a problem hiding this comment.
Logging email addresses exposes Personally Identifiable Information (PII) in application logs. This creates potential privacy and compliance risks (GDPR, CCPA, etc.) as logs may be retained longer than necessary, shared with third parties, or stored insecurely. While this pattern exists elsewhere in the codebase (e.g., lines 854, 872, 885), consider whether the email is essential for debugging SAML IdP initiated auth. If debugging requires identifying users, consider logging a hashed version of the email or just the domain (e.g., @example.com) instead of the full email address. If the full email is necessary, ensure your logging infrastructure has appropriate PII protection controls in place.
| logger.info( | ||
| '[THIRD_PARTY_AUTH] ensure_user_information: user_exists=%s user_details_email=%s', | ||
| _user_exists, | ||
| (user_details or {}).get('email'), | ||
| ) |
There was a problem hiding this comment.
This log statement also logs email addresses which exposes PII. The same privacy and compliance concerns apply here as in the previous logging statement. Consider whether the email is essential for debugging or if alternative identifiers could be used.
| _is_saml = is_provider_saml() | ||
| _provider_obj = provider.Registry.get_from_pipeline({'backend': current_partial.backend, 'kwargs': kwargs}) |
There was a problem hiding this comment.
The provider object is being fetched multiple times via provider.Registry.get_from_pipeline(). It's called once here (line 619), once inside is_provider_saml() (line 618, which internally calls it at line 607), and again inside should_force_account_creation() (line 654, which internally calls it at line 601). Consider refactoring to fetch the provider object once and pass it to these helper functions, or refactor the helper functions to accept an optional provider parameter. This would reduce redundant lookups and improve performance.
Logging to debug SAML initiated login requests