fix(integrations): Cache missing GitHub repo tree lookups#113113
Open
Cache GitHub repo tree 404 responses as empty results to avoid repeated failing requests from auto source code config. Keep non-404 API errors raising normally and cover both behaviors with tests. Fixes SENTRY-5K7G Co-Authored-By: Codex <noreply@openai.com> Made-with: Cursor
Apply shifted cache expiry to not-found repo tree responses so negative cache entries do not expire simultaneously across repositories. Refs SENTRY-5K7G Co-Authored-By: Codex <noreply@openai.com> Made-with: Cursor
Apply a single empty-result cache path for repo tree fetches so 409 and 404 outcomes are both cached with staggered expiry. Update GitHub integration tests to expect 404 repos as empty trees. Fixes SENTRY-5K7G Co-Authored-By: Codex <noreply@openai.com> Made-with: Cursor
Switch patch targets to the imported GitHubBaseClient symbol so mypy passes when github integration tests are part of the push hooks. Refs SENTRY-5K7G Co-Authored-By: Codex <noreply@openai.com> Made-with: Cursor
armenzg commented Apr 16, 2026
    repo_full_name: e.g. getsentry/sentry
    tree_sha: A branch or a commit sha
    shifted_seconds: Staggers cache expiration times across repositories
The caller spreads cache expiry across roughly 24 hourly buckets (per repo) so the entries don't all expire within the same one-hour window.
            # being acceptable to sometimes not having everything cached
            cache.set(key, repo_files, self.CACHE_SECONDS + shifted_seconds)
        else:
            cache.set(key, [], self.CACHE_SECONDS + shifted_seconds)
Empty list as a negative cache.
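One subtlety with an empty list as the negative entry is that a cached `[]` has to be told apart from a cache miss. A minimal sketch, assuming a cache whose `get` returns `None` on a miss; the in-memory `_store` and the helper names `cache_get`, `cache_set`, and `lookup` are hypothetical stand-ins, not the integration's code:

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical in-memory stand-in for a cache backend (no TTL handling here).
_store: Dict[str, List[str]] = {}


def cache_get(key: str) -> Optional[List[str]]:
    """Return the cached value, or None when nothing is stored for `key`."""
    return _store.get(key)


def cache_set(key: str, value: List[str]) -> None:
    _store[key] = value


def lookup(key: str) -> Tuple[bool, List[str]]:
    """Return (hit, files); an empty list is a valid cached value."""
    cached = cache_get(key)
    if cached is None:  # genuine miss: the caller should fetch from the API
        return False, []
    return True, cached  # hit, even when the list is empty (negative entry)
```

Checking `is None` rather than truthiness matters here: `if not cached` would treat the negative entry as a miss and trigger the request again.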
    def _is_not_found_error(error: ApiError) -> bool:
        if error.code == 404:
In all honesty, the 404 is likely real, and even if we only fetch once a day we would still get a 404. But who knows, GitHub may occasionally send the wrong status code.
        )

    -   with patch.object(sentry.integrations.github.client.GitHubBaseClient, "page_size", 1):
    +   with patch.object(GitHubBaseClient, "page_size", 1):
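To illustrate the patch-target change, here is a self-contained sketch of `patch.object` applied to a class referenced by its imported name. The `GitHubBaseClient` below is a hypothetical stand-in with an assumed default `page_size` of 100, not the real client, and `page_size_while_patched` is an illustrative helper:

```python
from unittest.mock import patch


class GitHubBaseClient:
    """Hypothetical stand-in for the real client class."""

    page_size = 100  # assumed default, for illustration only


def page_size_while_patched() -> int:
    # Patch the attribute on the class object itself, referenced by its
    # directly imported name rather than a dotted module path.
    with patch.object(GitHubBaseClient, "page_size", 1):
        return GitHubBaseClient.page_size
```

`patch.object` restores the original attribute when the `with` block exits, so the override stays scoped to the test body.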
Cache missing GitHub repo tree lookups as empty results so that repeat lookups are served from the cache.
Auto source code config was repeatedly requesting missing repos/refs and creating
high-volume 404 failures. This changes repo tree fetching to treat not-found outcomes
as empty results, cache them, and return empty
`RepoTree` entries instead of dropping those repositories.
Cache expiry remains staggered using
`shifted_seconds` so empty-cache refreshes do not thunder at the same time across many repositories.
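As a rough illustration of the staggering, the sketch below derives a per-repository offset from the repository's position in a list. The one-day base TTL, the 24 hourly buckets, and the names `shifted_seconds_for` and `ttl_for` are assumptions made for this example, not values taken from the diff:

```python
CACHE_SECONDS = 24 * 60 * 60  # assumed base TTL of one day
BUCKETS = 24  # assumed number of hourly buckets


def shifted_seconds_for(index: int) -> int:
    """Offset for the repo at position `index`, cycling through hourly buckets."""
    return 3600 * (index % BUCKETS)


def ttl_for(index: int) -> int:
    """Total cache timeout: base TTL plus the per-repo shift."""
    return CACHE_SECONDS + shifted_seconds_for(index)
```

Under these assumptions, neighbouring repositories expire an hour apart, and the same offset applies whether the cached value is a populated tree or an empty negative entry.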
I considered keeping 404 behavior separate from 409 handling, but using one empty-result
cache path makes behavior and expiry consistent for both outcomes while preserving normal
error raising for non-not-found API failures.
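The single empty-result path could look roughly like the sketch below. It is not the PR's implementation: `ApiError` is a hypothetical stand-in for the integration's error type, `fetch_tree` and `EMPTY_RESULT_CODES` are illustrative names, and a plain dict replaces the real cache, so TTLs are omitted:

```python
from typing import Callable, Dict, List


class ApiError(Exception):
    """Hypothetical stand-in for the integration's API error type."""

    def __init__(self, message: str, code: int) -> None:
        super().__init__(message)
        self.code = code


# Status codes treated as "no tree available" and cached as empty results.
EMPTY_RESULT_CODES = {404, 409}


def fetch_tree(
    fetch: Callable[[], List[str]], cache: Dict[str, List[str]], key: str
) -> List[str]:
    """Return the repo files for `key`, caching both real and empty results."""
    if key in cache:
        return cache[key]
    try:
        files = fetch()
    except ApiError as error:
        if error.code not in EMPTY_RESULT_CODES:
            raise  # other API failures keep raising and are never cached
        files = []
    cache[key] = files
    return files
```

Routing both codes through one branch means a missing repository costs one request per cache lifetime, while a 500 or a rate-limit error still surfaces to the caller.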
Tests add coverage for 404 caching, staggered TTL behavior, and non-404 error handling,
and update GitHub integration expectations now that 404 repos are represented as empty
trees.
Fixes SENTRY-5K7G