GitHub OAuth Security Enhancement#76
Conversation
…etsentry#67876) We're adding one more step in the GitHub integration installation pipeline, namely GitHub OAuth2 authorize. This is transparent from the UX perspective as the data exchange happens without user interaction. The pipeline will now fail in these cases: - If there is a mismatch between currently authenticated GitHub user (derived from OAuth2 authorize step) and the user who installed the GitHub app (https://github.com/apps/sentry-io) - If there is a mismatch between `state` parameter supplied by user and pipeline signature - If GitHub could not generate correct `access_token` from the `code` (wrong or attempt of re-use of `code`). In all those cases, this error is shown: 
|
Tenki Code Review - Complete Files Reviewed: 3 By Severity:
This PR adds a two-step OAuth login flow to the GitHub integration setup, enabling user identity verification and improving security. However, it introduces 3 high-severity issues: unguarded dict access that can crash the pipeline, attacker-controlled installation_id parameters in pipeline state, and redundant/inconsistent database queries with status filtering gaps. Files Reviewed (3 files) |
There was a problem hiding this comment.
Overview
PR #76 refactors the GitHub integration setup flow to add an intermediate OAuth step (OAuthLoginView) that authenticates the user with GitHub and verifies the OAuth user matches the original app installer (sender). The changes consolidate error handling with new helper functions (error(), get_document_origin()) and simplify pipeline forwarding logic in pipeline_advancer.py.
Key Changes
- New
OAuthLoginViewclass (lines 89-139): Handles OAuth flow including state validation, token exchange, and user info retrieval - New helper functions (lines 47-71):
error()andget_document_origin()consolidate duplicate error response logic (3 instances merged into 1) - GitHubInstallation.dispatch updates (lines 148-241): Now extracts installation_id from both GET and pipeline state; validates authenticated user against stored installation's sender
- Test coverage (test_integration.py): New OAuth callback assertions, user mismatch scenario test
Critical Issues
🔴 High Severity: Attacker-Controlled installation_id Parameter (sec-001, lines 397–449)
The installation_id is read from request.GET and stored in pipeline state before state validation in OAuthLoginView. An attacker can supply ?installation_id=<victim_id> in the OAuth callback to override the authenticated value. For new integrations (when Integration.DoesNotExist), the security check is skipped entirely, allowing the attacker to register an integration they don't own.
Impact: Privilege escalation; attacker can hijack GitHub integration setup for another org's installation.
🔴 High Severity: Unguarded integration.metadata['sender']['login'] Access (cq-001, sec-002, lines 501–504)
The code performs a bare dictionary key access without guards. If an existing integration lacks the 'sender' key in its metadata (possible for older records or certain creation paths), a KeyError is raised, causing an unhandled 500 error instead of the graceful error response.
Impact: Pipeline crashes; security check is silently bypassed instead of being properly enforced.
🔴 High Severity: Redundant Query with Status Filter Gap (cq-002, lines 493–498)
A first query confirms the Integration exists; a second query re-fetches it with status=ObjectStatus.ACTIVE, which silently rejects non-ACTIVE integrations (DISABLED, PENDING_DELETION) and returns a generic error instead of a descriptive response.
Impact: Inconsistent user experience and potential for logic gaps when integration state changes.
Additional Issues
🟡 Medium Severity: Broad Exception Handling (cq-003, lines 425–429)
All exceptions from OAuth token exchange are silently swallowed with no logging. Network outages and decode failures look identical to auth failures from the user's perspective, and operators have no visibility.
🟡 Medium Severity: Installation_id Preservation Across OAuth Redirect (cq-004)
The installation_id may be lost if not present in GitHub's OAuth callback URL, creating a broken setup loop if pipeline state is not properly persisted.
🔵 Low Severity: Missing Type Annotations (cq-005, lines 129–134)
New helper functions lack type annotations on request and org parameters, inconsistent with the file's style and reducing IDE support.
Recommendations
- Fix attacker-controlled parameters: Only accept
installation_idfrom GET during initial redirect; use pipeline state exclusively on callback. - Guard metadata access: Use
.get()with fallbacks to prevent KeyError crashes. - Consolidate queries: Reuse the first Integration lookup or check status explicitly on the already-fetched object.
- Add logging: Log OAuth token exchange failures so operators can distinguish auth errors from infrastructure issues.
- Add type annotations: Annotate
error()andget_document_origin()for consistency.
Files Reviewed
src/sentry/integrations/github/integration.py— OAuth flow, user verification, error handlingsrc/sentry/web/frontend/pipeline_advancer.py— Pipeline forwarding logic (minor refactoring)tests/sentry/integrations/github/test_integration.py— OAuth callback assertions, user mismatch test
| installation_id = request.GET.get("installation_id") | ||
| if installation_id: | ||
| pipeline.bind_state("installation_id", installation_id) |
There was a problem hiding this comment.
🔴 Attacker-controlled installation_id accepted after OAuth callback before state verification, enabling pipeline state manipulation (Auth & Access)
In OAuthLoginView.dispatch (lines 397-399), installation_id is read from request.GET and unconditionally stored into the pipeline state via pipeline.bind_state — even when the request is the OAuth callback (i.e., when a state parameter is also present). The state integrity check does not run until line 412. This means an attacker can craft an OAuth callback URL like ?code=<valid_code>&state=<valid_state>&installation_id=<victim_installation_id> and override any previously stored installation_id with an arbitrary value before authentication of the code completes.
In GitHubInstallation.dispatch (lines 448-449), the same pattern recurs: installation_id = request.GET.get('installation_id', pipeline.fetch_state('installation_id')) — meaning the installation ID can again be overridden by a URL parameter at the second pipeline step.
While the sender-login check at lines 501-505 partially mitigates this for existing integrations (it will reject a mismatch between the OAuth'd user and the integration's stored sender), for the Integration.DoesNotExist branch (line 481-482, where the code calls pipeline.next_step() with no sender check at all), an attacker who authenticates with GitHub OAuth and injects any unregistered installation_id value would pass straight through to build_integration, which then calls the GitHub API for that injected installation ID and may create an integration the attacker doesn't own.
💡 Suggestion: Only read installation_id from request.GET before the OAuth redirect (when no state is present). On the callback (when state is present), use only the value already bound in pipeline state. Similarly, in GitHubInstallation.dispatch, do not accept installation_id from request.GET — always use the value from pipeline state.
📋 Prompt for AI Agents
In src/sentry/integrations/github/integration.py:
-
In
OAuthLoginView.dispatch(lines 397-399), move theinstallation_idGET parameter read inside theif not request.GET.get('state'):block (i.e., only store installation_id from GET when this is the initial redirect, not on the OAuth callback). On the callback branch (whenstateis present), do NOT accept or re-bindinstallation_idfrom the request. -
In
GitHubInstallation.dispatch(lines 448-450), change theinstallation_idresolution to ONLY use the pipeline state value, removing therequest.GET.get('installation_id', ...)override:installation_id = pipeline.fetch_state('installation_id'). This ensures that the installation ID authenticated in OAuthLoginView cannot be replaced by an attacker-supplied URL parameter in subsequent pipeline steps.
| if ( | ||
| pipeline.fetch_state("github_authenticated_user") | ||
| != integration.metadata["sender"]["login"] | ||
| ): |
There was a problem hiding this comment.
🔴 Unguarded dict access on integration.metadata may raise KeyError (bug)
In GitHubInstallation.dispatch, after looking up the existing integration, the code accesses integration.metadata["sender"]["login"] directly (lines 501-503). If the metadata dict is missing the "sender" key (possible for integrations created before this field was stored, or if GitHub changes its payload), a KeyError will propagate as an unhandled 500 error instead of returning the graceful error response. The try/except block on lines 493-498 only catches Integration.DoesNotExist — not dict access failures.
💡 Suggestion: Use .get() with a fallback (or a nested get) to safely access the nested keys, and treat a missing value the same as a mismatch.
| if ( | |
| pipeline.fetch_state("github_authenticated_user") | |
| != integration.metadata["sender"]["login"] | |
| ): | |
| if ( | |
| pipeline.fetch_state("github_authenticated_user") | |
| != integration.metadata.get("sender", {}).get("login") | |
| ): |
📋 Prompt for AI Agents
In src/sentry/integrations/github/integration.py around lines 501-504, the expression integration.metadata["sender"]["login"] will raise a KeyError if the 'sender' or 'login' key is absent from an existing integration's metadata (e.g., for older records). Replace the direct dict access with safe .get() calls: change integration.metadata["sender"]["login"] to integration.metadata.get("sender", {}).get("login"). This way, a missing key is treated the same as a mismatch, returning the error page rather than a 500.
| try: | ||
| integration = Integration.objects.get( | ||
| external_id=installation_id, status=ObjectStatus.ACTIVE | ||
| ) | ||
| except Integration.DoesNotExist: | ||
| return error(request, self.active_organization) |
There was a problem hiding this comment.
🔴 Second Integration.objects.get with status filter can silently fail after first query succeeded (bug)
In GitHubInstallation.dispatch, the first lookup (line 477-479) already confirms the integration exists. The second query on lines 493-496 re-fetches using status=ObjectStatus.ACTIVE, which means an integration with any other status (e.g., PENDING_DELETION at the integration level, DISABLED) will raise DoesNotExist and return the generic 'Invalid installation request' error. The comment says 'OrganizationIntegration does not exist, but Integration does exist', implying the next step should be taken — but non-ACTIVE integrations will hit the error branch, silently swallowing the real reason.
💡 Suggestion: Reuse the integration object already retrieved in the first query — store it in a variable at line 477-479 to avoid the second query and the status-filtering gap. If the intent is to block non-ACTIVE integrations, document this explicitly and consider returning a more specific error.
📋 Prompt for AI Agents
In src/sentry/integrations/github/integration.py around lines 475-498, the first query (line 477-479) with Integration.objects.get(external_id=installation_id) already confirms the integration exists — store its result in a variable (e.g., integration = Integration.objects.get(...)). Then remove the second Integration.objects.get(external_id=installation_id, status=ObjectStatus.ACTIVE) query (lines 493-496) and instead use the already-fetched integration object. If filtering by ACTIVE status is intentional, check integration.status == ObjectStatus.ACTIVE on the already-fetched object and return a more descriptive error if it is not active. This eliminates the redundant DB query and the inconsistency where an existing-but-not-ACTIVE integration yields a generic 'Invalid installation request' error.
| if ( | ||
| pipeline.fetch_state("github_authenticated_user") | ||
| != integration.metadata["sender"]["login"] | ||
| ): | ||
| return error(request, self.active_organization) |
There was a problem hiding this comment.
🔴 Unguarded integration.metadata['sender'] key access causes KeyError crash, bypassing the user-identity check (Auth & Access)
At GitHubInstallation.dispatch line 503, the code executes integration.metadata['sender']['login'] to verify that the OAuth-authenticated user matches the original installer. However, build_integration (line 376-377) only conditionally sets sender in the metadata dict:
if state.get('sender'):
integration['metadata']['sender'] = state['sender']Integrations created via the webhook (InstallationEventWebhook.__call__, webhook.py line 202-210) do include sender. However, if an integration was created without a sender (e.g., via an older code path, or if the webhook payload lacked a sender), then integration.metadata will not have the 'sender' key and line 503 will throw a KeyError.
This KeyError is unhandled, causing an unhandled exception (500 error) instead of the intended error(...) response. Depending on how the Django error handler is configured, this may leak internal details, and more importantly the security check is silently bypassed in this scenario — neither blocking nor allowing the install in a controlled way.
💡 Suggestion: Wrap the integration.metadata['sender']['login'] access with .get() calls and treat a missing sender as a check failure (return error). This ensures that if sender is absent from metadata, the flow is blocked safely.
📋 Prompt for AI Agents
In src/sentry/integrations/github/integration.py, at lines 500-505, replace the bare dictionary access integration.metadata['sender']['login'] with safe .get() calls to prevent a KeyError crash when sender is absent from integration metadata. Specifically, change the condition to use integration.metadata.get('sender', {}).get('login') so that a missing sender key evaluates to None, which will not match any authenticated user login and will correctly trigger return error(request, self.active_organization) instead of raising an unhandled exception.
| try: | ||
| body = safe_urlread(req).decode("utf-8") | ||
| payload = dict(parse_qsl(body)) | ||
| except Exception: | ||
| payload = {} |
There was a problem hiding this comment.
🟠 Broad except Exception silently swallows token-exchange errors without logging (bug)
In OAuthLoginView.dispatch, lines 425-429 catch all exceptions from safe_urlread, decode, and parse_qsl with except Exception: payload = {}. This means genuine network errors, timeouts, or decode failures are silently discarded. No logging (logger.exception/logger.warning) is performed, so operators have no visibility when the GitHub OAuth token exchange fails for non-auth reasons (e.g., GitHub API outage). The user sees the same generic error as for a real auth mismatch.
💡 Suggestion: Log the exception before falling through to the empty-payload path, so failures are observable in Sentry/logs. Ideally also distinguish between network errors and parse errors.
📋 Prompt for AI Agents
In src/sentry/integrations/github/integration.py around lines 425-429, the except Exception: payload = {} block silently discards all errors from the GitHub token exchange. Add a logger.exception("Failed to exchange GitHub OAuth token") (or logger.warning) call inside the except block before setting payload = {}. Ensure logger is imported/defined at the module level (it likely already is in this file). This allows operators to distinguish between a legitimate auth failure and a transient infrastructure error.
| def error( | ||
| request, | ||
| org, | ||
| error_short="Invalid installation request.", | ||
| error_long=ERR_INTEGRATION_INVALID_INSTALLATION_REQUEST, | ||
| ): |
There was a problem hiding this comment.
🟡 Missing type annotations on new module-level helper functions (style)
The error(request, org, ...) function (lines 129-134) and get_document_origin(org) (lines 149-152) do not have type annotations for request or org. The rest of the module uses Request and typed parameters consistently. Without annotations, type checkers and IDEs cannot validate callers or the return type of error().
💡 Suggestion: Add type annotations: request: HttpRequest (or Request from DRF), org: RpcUserOrganizationContext | None, and -> HttpResponse on error(), and org: RpcUserOrganizationContext | None on get_document_origin().
📋 Prompt for AI Agents
In src/sentry/integrations/github/integration.py at lines 129-134 and 149-152, add type annotations to the error and get_document_origin module-level functions. For error: annotate request as HttpRequest or Request (whichever is already imported), org as the appropriate organization context type (likely RpcUserOrganizationContext | None based on self.active_organization usage), and add -> HttpResponse return type. For get_document_origin: annotate org with the same type and confirm -> str is already present.
| if ( | ||
| pipeline.fetch_state("github_authenticated_user") | ||
| != integration.metadata["sender"]["login"] | ||
| ): | ||
| return error(request, self.active_organization) |
There was a problem hiding this comment.
🔴 KeyError on integration.metadata["sender"] when sender is missing from metadata
At line 503, integration.metadata["sender"]["login"] is accessed without verifying that "sender" exists in the metadata. After a successful first installation, ensure_integration (src/sentry/integrations/pipeline.py:32-44) is called from finish_pipeline with the result of build_integration (src/sentry/integrations/github/integration.py:351-379), which only includes sender if state.get("sender") is truthy. Since the pipeline state never contains sender (only installation_id and github_authenticated_user), ensure_integration overwrites the Integration's metadata and strips the sender key that was originally set by the webhook handler (src/sentry/integrations/github/webhook.py:204-206). The test at test_integration.py:303 confirms the metadata after installation has no sender. If the OrganizationIntegration is later deleted (but the Integration persists), a subsequent installation attempt reaches line 503 and raises an unhandled KeyError, resulting in a 500 error.
Comparable safe access in installation.py
The existing GitHubIntegrationsInstallationEndpoint at src/sentry/integrations/github/installation.py:44 properly guards against this with if "sender" not in integration.metadata: return HttpResponse(status=404), but the new code does not.
| if ( | |
| pipeline.fetch_state("github_authenticated_user") | |
| != integration.metadata["sender"]["login"] | |
| ): | |
| return error(request, self.active_organization) | |
| if "sender" not in integration.metadata or ( | |
| pipeline.fetch_state("github_authenticated_user") | |
| != integration.metadata["sender"]["login"] | |
| ): | |
| return error(request, self.active_organization) |
Was this helpful? React with 👍 or 👎 to provide feedback.
Uh oh!
There was an error while loading. Please reload this page.