ref(scm): Update GitHub Provider with Access to Raw Response Instance by cmanallen · Pull Request #111192 · getsentry/sentry

cmanallen · 2026-03-20T14:13:17Z

Refactors the GitHub SCM provider layer to make HTTP requests directly via _request and work with raw requests.Response objects instead of delegating to high-level methods on GitHubApiClient.

Why: The provider needs access to response headers (ETag, Last-Modified) to support conditional requests and pagination. The existing GitHubApiClient methods return parsed JSON dicts, discarding all response metadata. Rather than modifying every client method to optionally return headers, this moves the provider to use raw responses directly.

What changed:

- Introduced GitHubProviderApiClient, a thin wrapper around GitHubApiClient._request that returns raw requests.Response objects. It handles pagination params, conditional request headers (If-None-Match, If-Modified-Since), and error translation (ApiError → SCMProviderException) in one place.
- All GitHubProvider methods now construct their own API paths and call through GitHubProviderApiClient instead of delegating to named methods on GitHubApiClient (e.g. get_branch, create_git_ref, list_pull_requests, etc.).
- map_action and the new map_paginated_action helper extract ResponseMeta (etag, last_modified) from response headers and compute next_cursor for pagination.
- Removed ~30 methods from GitHubBaseClient that were only used by the SCM provider (GraphQL queries, git ref/tree/blob CRUD, PR management, comment deletion, etc.). These are now inlined in the provider.
- Moved the MINIMIZE_COMMENT_MUTATION GraphQL query to the provider since it's the only consumer.
- Added a force_raise_for_status parameter to BaseApiClient._request so raw responses still get status checks.
- Rewrote unit tests to mock at the GitHubProviderApiClient request boundary instead of asserting on GitHubApiClient method delegation. Tests now verify the HTTP method, path, and payload sent to the API rather than which client method was called.

No behavior changes for callers of the SCM provider interface.
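To make the header handling described above concrete, here is a minimal sketch, not the code from this PR. `FakeResponse`, `conditional_headers`, `extract_meta`, and `map_paginated` are hypothetical names, and deriving `next_cursor` from the `Link` header is an assumption; the PR text only says that a cursor is computed. The stub uses a plain dict for headers, so lookups are case-sensitive here, unlike a real requests.Response.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional


@dataclass
class FakeResponse:
    """Hypothetical stand-in for requests.Response with only what the sketch needs."""

    status_code: int
    headers: Mapping[str, str]
    body: Any

    def json(self) -> Any:
        return self.body


def conditional_headers(
    etag: Optional[str] = None, last_modified: Optional[str] = None
) -> dict[str, str]:
    """Build If-None-Match / If-Modified-Since headers from cached metadata."""
    headers: dict[str, str] = {}
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return headers


def extract_meta(response: FakeResponse) -> dict[str, Optional[str]]:
    """Pull the cache validators out of the response headers."""
    return {
        "etag": response.headers.get("ETag"),
        "last_modified": response.headers.get("Last-Modified"),
    }


def map_paginated(page: int, response: FakeResponse, fn: Callable[[Any], Any]) -> dict:
    """Map the parsed body and attach metadata plus a next-page cursor.

    Assumption: a rel="next" entry in the Link header means another page exists.
    """
    has_next = 'rel="next"' in response.headers.get("Link", "")
    meta = {**extract_meta(response), "next_cursor": str(page + 1) if has_next else None}
    return {"data": fn(response.json()), "meta": meta}
```

A caller would keep the returned `etag` and pass it back through `conditional_headers` on the next request, which is the kind of round trip that parsed JSON dicts alone cannot support.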

This reverts commit ec30a79.

src/sentry/scm/private/providers/github.py

tests/sentry/scm/integration/test_github_provider_integration.py

sentry · 2026-03-20T14:16:20Z

src/sentry/scm/private/providers/github.py

+        if not isinstance(response, dict) or ("data" not in response and "errors" not in response):
+            raise SCMProviderException("GraphQL response is not in expected format")

-def catch_provider_exception(fn):
-    @functools.wraps(fn)
-    def wrapper(*args, **kwargs):
-        try:
-            return fn(*args, **kwargs)
-        except ApiError as e:
-            raise SCMProviderException(str(e)) from e
+        errors = response.get("errors", [])
+        if errors and not response.get("data"):
+            err_message = "\n".join(e.get("message", "") for e in errors)
+            raise SCMProviderException(err_message)

-    return wrapper
+        return response.get("data", {})


Bug: The graphql() method always raises an exception because it checks isinstance(response, dict) on a requests.Response object without first parsing it as JSON.
Severity: CRITICAL

Suggested Fix

The response object should be parsed into a dictionary by calling response.json() before it is used. The type and content checks should be performed on this new dictionary variable.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: src/sentry/scm/private/providers/github.py#L231-L239 Potential issue: The `graphql()` method in `GitHubProviderApiClient` receives a `requests.Response` object from `self.post()`, but then immediately checks if the response is a dictionary with `isinstance(response, dict)`. Since `self.post()` is configured to return a raw `requests.Response` object, this check will always fail. As a result, the method will unconditionally raise an `SCMProviderException`, making it and any feature that relies on it, such as `minimize_comment()`, non-functional at runtime.
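The suggested fix can be sketched as follows. This is an illustration under stated assumptions rather than the code that was eventually committed: `parse_graphql_response` and `FakeResponse` are hypothetical names, and `SCMProviderException` is redefined locally so the block runs on its own. The only change relative to the diff above is that the body is parsed with `.json()` before the type and content checks run.

```python
from typing import Any


class SCMProviderException(Exception):
    """Local stand-in for the exception class named in the diff."""


class FakeResponse:
    """Hypothetical stand-in for requests.Response; exposes only .json()."""

    def __init__(self, payload: Any) -> None:
        self._payload = payload

    def json(self) -> Any:
        return self._payload


def parse_graphql_response(response: FakeResponse) -> Any:
    # Parse first: the raw response object is never a dict, so checking it
    # directly would always fail.
    payload = response.json()

    if not isinstance(payload, dict) or ("data" not in payload and "errors" not in payload):
        raise SCMProviderException("GraphQL response is not in expected format")

    # Raise only when there are errors and no usable data.
    errors = payload.get("errors", [])
    if errors and not payload.get("data"):
        raise SCMProviderException("\n".join(e.get("message", "") for e in errors))

    return payload.get("data", {})
```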


github-actions · 2026-03-20T14:21:57Z

Backend Test Failures

Failures on 103d9ac in this run:

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_api_403_error_raises_scm_provider_exception

— log

tests/sentry/scm/integration/test_github_provider_integration.py:445: in test_api_403_error_raises_scm_provider_exception
    self.provider.get_pull_request("1")
src/sentry/scm/private/providers/github.py:294: in get_pull_request
    return map_action(response, map_pull_request)
src/sentry/scm/private/providers/github.py:1039: in map_action
    "data": fn(raw),
src/sentry/scm/private/providers/github.py:1022: in map_pull_request
    id=str(raw["id"]),
E   KeyError: 'id'

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_api_422_error_raises_scm_provider_exception

— log

tests/sentry/scm/integration/test_github_provider_integration.py:458: in test_api_422_error_raises_scm_provider_exception
    self.provider.create_pull_request(
src/sentry/scm/private/providers/github.py:671: in create_pull_request
    return map_action(response, map_pull_request)
src/sentry/scm/private/providers/github.py:1039: in map_action
    "data": fn(raw),
src/sentry/scm/private/providers/github.py:1022: in map_pull_request
    id=str(raw["id"]),
E   KeyError: 'id'

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_api_500_error_raises_scm_provider_exception

— log

tests/sentry/scm/integration/test_github_provider_integration.py:432: in test_api_500_error_raises_scm_provider_exception
    self.provider.get_issue_comments("42")
src/sentry/scm/private/providers/github.py:273: in get_issue_comments
    return map_paginated_action(pagination, response, lambda r: [map_comment(c) for c in r])
src/sentry/scm/private/providers/github.py:1057: in map_paginated_action
    "data": fn(raw),
src/sentry/scm/private/providers/github.py:273: in <lambda>
    return map_paginated_action(pagination, response, lambda r: [map_comment(c) for c in r])
src/sentry/scm/private/providers/github.py:870: in map_comment
    id=str(raw["id"]),
E   TypeError: string indices must be integers, not 'str'

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_api_error_raises_scm_provider_exception

— log

tests/sentry/scm/integration/test_github_provider_integration.py:419: in test_api_error_raises_scm_provider_exception
    self.provider.get_issue_comments("42")
src/sentry/scm/private/providers/github.py:273: in get_issue_comments
    return map_paginated_action(pagination, response, lambda r: [map_comment(c) for c in r])
src/sentry/scm/private/providers/github.py:1057: in map_paginated_action
    "data": fn(raw),
src/sentry/scm/private/providers/github.py:273: in <lambda>
    return map_paginated_action(pagination, response, lambda r: [map_comment(c) for c in r])
src/sentry/scm/private/providers/github.py:870: in map_comment
    id=str(raw["id"]),
E   TypeError: string indices must be integers, not 'str'

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_get_issue_comment_reactions

— log

src/sentry/shared_integrations/client/base.py:264: in _request
    resp: Response = session.send(finalized_request, **session_settings)
.venv/lib/python3.13/site-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
.venv/lib/python3.13/site-packages/responses/__init__.py:1104: in unbound_on_send
    return self._on_request(adapter, request, *a, **kwargs)
.venv/lib/python3.13/site-packages/responses/__init__.py:1046: in _on_request
    raise response
E   requests.exceptions.ConnectionError: Connection refused by Responses - the call doesn't match any registered mock.
E   
E   Request: 
E   - GET https://api.github.com/repos/test-org/test-repo/issues/comments/42/reactions?per_page=50&page=1
E   
E   Available matches:
E   - GET https://api.github.com/repos/test-org/test-repo/issues/comments/42/reactions?per_page=100 Query string doesn't match. {page: 1, per_page: 50} doesn't match {per_page: 100}

The above exception was the direct cause of the following exception:
tests/sentry/scm/integration/test_github_provider_integration.py:288: in test_get_issue_comment_reactions
    reactions = self.provider.get_issue_comment_reactions("1", "42")
src/sentry/scm/private/providers/github.py:326: in get_issue_comment_reactions
    response = self.client.get(
src/sentry/scm/private/providers/github.py:200: in get
    return self.request("GET", path=path, params=params, headers=headers)
src/sentry/scm/private/providers/github.py:185: in request
    return self.client._request(
src/sentry/shared_integrations/client/base.py:273: in _request
    raise ApiHostError.from_exception(e) from e
E   sentry.shared_integrations.exceptions.ApiHostError: Unable to reach host: api.github.com

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_get_issue_reactions

— log

src/sentry/shared_integrations/client/base.py:264: in _request
    resp: Response = session.send(finalized_request, **session_settings)
.venv/lib/python3.13/site-packages/requests/sessions.py:703: in send
    r = adapter.send(request, **kwargs)
.venv/lib/python3.13/site-packages/responses/__init__.py:1104: in unbound_on_send
    return self._on_request(adapter, request, *a, **kwargs)
.venv/lib/python3.13/site-packages/responses/__init__.py:1046: in _on_request
    raise response
E   requests.exceptions.ConnectionError: Connection refused by Responses - the call doesn't match any registered mock.
E   
E   Request: 
E   - GET https://api.github.com/repos/test-org/test-repo/issues/42/reactions?per_page=50&page=1
E   
E   Available matches:
E   - GET https://api.github.com/repos/test-org/test-repo/issues/42/reactions?per_page=100 Query string doesn't match. {page: 1, per_page: 50} doesn't match {per_page: 100}

The above exception was the direct cause of the following exception:
tests/sentry/scm/integration/test_github_provider_integration.py:364: in test_get_issue_reactions
    reactions = self.provider.get_issue_reactions("42")
src/sentry/scm/private/providers/github.py:376: in get_issue_reactions
    response = self.client.get(
src/sentry/scm/private/providers/github.py:200: in get
    return self.request("GET", path=path, params=params, headers=headers)
src/sentry/scm/private/providers/github.py:185: in request
    return self.client._request(
src/sentry/shared_integrations/client/base.py:273: in _request
    raise ApiHostError.from_exception(e) from e
E   sentry.shared_integrations.exceptions.ApiHostError: Unable to reach host: api.github.com

tests/sentry/scm/integration/test_github_provider_integration.py::TestGitHubProviderIntegration::test_get_pull_request_uses_conditional_request_headers

— log

tests/sentry/scm/integration/test_github_provider_integration.py:142: in test_get_pull_request_uses_conditional_request_headers
    assert responses.calls[0].request.headers["If-None-Match"] == '"etag-123"'
.venv/lib/python3.13/site-packages/requests/structures.py:52: in __getitem__
    return self._store[key.lower()][1]
E   KeyError: 'if-none-match'
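The first four failures share a cause visible in the tracebacks: an error response body reaches the mapping function, which then fails on a missing "id" key. A minimal sketch of guarding against this is shown below, assuming the status code is checked before the body is transformed. `FakeResponse` is a hypothetical stub, `SCMProviderException` is redefined locally, and this `map_action` is a simplified illustration, not the helper from the PR. It treats every non-2xx status as an error; a real implementation would also need to decide how to handle 304 for conditional requests.

```python
from typing import Any, Callable


class SCMProviderException(Exception):
    """Local stand-in for the provider exception named in the PR."""


class FakeResponse:
    """Hypothetical stand-in for requests.Response."""

    def __init__(self, status_code: int, body: Any) -> None:
        self.status_code = status_code
        self._body = body

    def json(self) -> Any:
        return self._body


def map_action(response: FakeResponse, fn: Callable[[Any], Any]) -> dict[str, Any]:
    # Check the status first so error payloads never reach the mapper.
    if not 200 <= response.status_code < 300:
        raise SCMProviderException(f"Unexpected status code: {response.status_code}")
    return {"data": fn(response.json())}
```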

src/sentry/scm/private/providers/github.py

### Summary

- Introduces a dynamic, per-org rate limiter (DynamicRateLimiter + RedisRateLimitProvider) backed by Redis that reads GitHub's x-ratelimit-limit, x-ratelimit-used, and x-ratelimit-reset response headers to eagerly throttle requests before hitting provider limits.
- Adds a referrer allocation system where specific referrers (e.g. emerge) get a dedicated percentage of the rate-limit quota, with remaining capacity shared across all other callers.
- Removes dead code: old ratelimits.backend-based is_rate_limited/is_rate_limited_with_allocation_policy helpers, unused REACTION_MAP, catch_provider_exception decorator, and stale encode_ratelimit_key utilities.

### How it works

1. GitHubProviderApiClient.request() makes the HTTP call and, on every response containing rate-limit headers, calls DynamicRateLimiter.update_rate_limit_meta() to sync Sentry's Redis state with GitHub's reported capacity/usage.
2. Before each request, GitHubProviderApiClient.is_rate_limited(referrer) checks the referrer's dedicated allocation first, then falls back to the shared pool. Both checks call DynamicRateLimiter.is_rate_limited(), which atomically increments usage counters in Redis via a pipeline.
3. The rate limiter fails open: if no limit has been cached yet (first request for an org), the request proceeds and the limit is populated from the response.

### Test plan

- Unit tests for DynamicRateLimiter covering: allocated/shared quota exhaustion, fail-open on missing limits, capacity caching, update_rate_limit_meta with matching/mismatched windows, shared usage floor at zero.
- Integration tests for RedisRateLimitProvider verifying Redis pipeline behavior: get_and_set_rate_limit (INCR + TTL), get_accounted_usage (multi-GET sum), set_key_values (SET with/without expiration).
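The allocation and fail-open behavior described above can be illustrated with a small in-memory sketch. It is a simplification under stated assumptions: `InMemoryRateLimiter`, `update_limit`, and `quota_for` are hypothetical names, a plain dict replaces Redis, the reset window is ignored, and allocations are expressed as fractions of the limit. It is not the DynamicRateLimiter from this PR.

```python
from typing import Optional


class InMemoryRateLimiter:
    """Hypothetical, single-process stand-in for a Redis-backed limiter."""

    def __init__(self, referrer_allocation: dict[str, float]) -> None:
        # Fraction of the total limit reserved for each named referrer.
        self.referrer_allocation = referrer_allocation
        self.limit: Optional[int] = None  # unknown until a response reports it
        self.usage: dict[str, int] = {}

    def update_limit(self, limit: int) -> None:
        """Record the capacity reported by the provider (e.g. x-ratelimit-limit)."""
        self.limit = limit

    def quota_for(self, key: str) -> int:
        assert self.limit is not None
        if key in self.referrer_allocation:
            return int(self.limit * self.referrer_allocation[key])
        # Everything not reserved for a referrer belongs to the shared pool.
        reserved = sum(self.referrer_allocation.values())
        return int(self.limit * (1 - reserved))

    def is_rate_limited(self, key: str) -> bool:
        if self.limit is None:
            return False  # fail open: no limit cached yet
        self.usage[key] = self.usage.get(key, 0) + 1
        return self.usage[key] > self.quota_for(key)


def is_rate_limited(limiter: InMemoryRateLimiter, referrer: str) -> bool:
    """Try the referrer's dedicated quota first, then fall back to the shared pool."""
    if referrer in limiter.referrer_allocation and not limiter.is_rate_limited(referrer):
        return False
    return limiter.is_rate_limited("shared")
```

With a limit of 10 and `{"emerge": 0.2}`, the emerge referrer gets two dedicated requests and then draws from the shared pool of eight, while every other referrer only ever consumes shared quota.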

src/sentry/scm/private/rate_limit.py

sentry · 2026-03-25T18:27:16Z

src/sentry/scm/private/providers/github.py

+    def is_rate_limited(self, referrer: Referrer) -> bool:
+        """Return true if access to the resource has been blocked."""
+        # If the referrer has allocated quota and that quota has not been exhausted we eagerly
+        # exit by returning false. Otherwise we consume from the shared quota pool.
+        if (
+            referrer in self.rate_limiter.referrer_allocation
+            and not self.rate_limiter.is_rate_limited(referrer)
+        ):
+            return False
+        else:
+            return self.rate_limiter.is_rate_limited("shared")


Bug: The graphql method incorrectly checks the type of the response object before parsing it as JSON, causing it to always raise an exception and fail.
Severity: CRITICAL

Suggested Fix

Move the response.json() call to before the type check. The check should be performed on the parsed JSON data, not the requests.Response object. For example: response_data = response.json() followed by if not isinstance(response_data, dict) or ....

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: src/sentry/scm/private/providers/github.py#L171-L181 Potential issue: In the `graphql` method, the `self.post()` call returns a `requests.Response` object, not a dictionary. The subsequent check `isinstance(response, dict)` will always evaluate to `False`, causing the condition `if not isinstance(response, dict) or ...` to always be true. This unconditionally raises an `SCMProviderException` with the message "GraphQL response is not in expected format". As a result, the logic to parse the JSON response and handle GraphQL errors is never reached, and any call to this method, such as from `minimize_comment`, will fail.

src/sentry/scm/private/rate_limit.py


src/sentry/scm/private/providers/github.py

github-actions · 2026-03-25T20:35:24Z

Backend Test Failures

Failures on dab6c3c in this run:

tests/sentry/taskworker/test_config.py::test_all_instrumented_tasks_registered — log

tests/sentry/taskworker/test_config.py:120: in test_all_instrumented_tasks_registered
    raise AssertionError(
E   AssertionError: Found 1 module(s) with @instrumented_task that are NOT registered in TASKWORKER_IMPORTS.
E   These tasks will not be discovered by the taskworker in production!
E   
E   Missing modules:
E     - sentry.workflow_engine.tasks.cleanup
E   
E   Add these to TASKWORKER_IMPORTS in src/sentry/conf/server.py

jacquev6

🚢

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.


src/sentry/scm/private/providers/github.py

cmanallen added 14 commits March 17, 2026 12:49

Checkpoint

363897d

Another checkpoint

9f6a02e

API client refactor

b0c955a

Complete refactor to new raw response type handling

4dc4fb3

Simplify API

2a0be41

Minor code quality improvements

7154bd0

Pagination params are sent in the params not the headers

f750631

Comment headers

5ff99d3

Test rewrite

d936a19

Add initial rate-limiting behavior

ec30a79

Merge branch 'master' into cmanallen/scm-rate-limits

2347d9e

Translate get_archive_link to new format

52e27cf

Update test coverage

4d5003e

Revert "Add initial rate-limiting behavior"

274c5bd

This reverts commit ec30a79.

github-actions bot added the Scope: Backend label Mar 20, 2026

vercel bot deployed to Preview March 20, 2026 14:13

cursor bot reviewed Mar 20, 2026

src/sentry/scm/private/providers/github.py Outdated

tests/sentry/scm/integration/test_github_provider_integration.py Outdated

sentry bot reviewed Mar 20, 2026

sentry-warden bot reviewed Mar 20, 2026

src/sentry/scm/private/providers/github.py

cmanallen added 2 commits March 20, 2026 09:54

Force raise for status

d548aaf

Remove weird AI artifacts

4c2b7c3

cmanallen requested a review from a team as a code owner March 20, 2026 15:01

vercel bot deployed to Preview March 20, 2026 15:04

cursor bot reviewed Mar 20, 2026

src/sentry/scm/private/providers/github.py

Fix typing

0011f36

cmanallen requested a review from a team as a code owner March 20, 2026 15:12

vercel bot deployed to Preview March 20, 2026 15:15

sentry bot reviewed Mar 20, 2026

src/sentry/scm/private/providers/github.py

cmanallen mentioned this pull request Mar 20, 2026

Pull Reviews and Updating the SCM Client to Match #111247

Open

cmanallen added 5 commits March 25, 2026 11:17

Update codeowners

c03875f

Add DynamicRateLimiter docstring

f93d7d1

Remove scm test module

d4709a1

More docstrings

2875d1e

cmanallen requested review from a team as code owners March 25, 2026 18:10

evanpurkhiser reviewed Mar 25, 2026

src/sentry/scm/private/rate_limit.py

vercel bot deployed to Preview March 25, 2026 18:13

Merge branch 'master' into cmanallen/scm-rate-limits

c15fecd

vercel bot deployed to Preview March 25, 2026 18:18

Catch ApiError as well

310ade5

sentry-warden bot reviewed Mar 25, 2026

src/sentry/scm/private/rate_limit.py Outdated

sentry bot reviewed Mar 25, 2026

cmanallen added 4 commits March 25, 2026 13:59

Add Redis failure handling

4eea75d

Fix graphql handling

b96afc6

Add graphql transform coverage

b82c7e9

Merge branch 'cmanallen/scm-github-dynamic-rate-limits' into cmanalle…

6a5eb8e

…n/scm-rate-limits

vercel bot deployed to Preview March 25, 2026 20:18

cursor bot reviewed Mar 25, 2026

src/sentry/scm/private/providers/github.py

src/sentry/scm/private/providers/github.py

jacquev6 approved these changes Mar 26, 2026

cmanallen added 4 commits March 26, 2026 08:07

Check status code and headers before transforming

33fa804

Add status_code handling

05b7ba7

Fix typing

8c2d26d

Merge branch 'master' into cmanallen/scm-rate-limits

da19da5

vercel bot deployed to Preview March 26, 2026 13:26

cursor bot reviewed Mar 26, 2026

src/sentry/scm/private/providers/github.py

cmanallen merged commit 15ae4f1 into master Mar 26, 2026
106 checks passed

cmanallen deleted the cmanallen/scm-rate-limits branch March 26, 2026 14:22

cmanallen merged 62 commits into master from cmanallen/scm-rate-limits


3 participants
