ref(commits): Unify repository provider commit fetch interfaces#113281
ref(commits): Unify repository provider commit fetch interfaces#113281
Conversation
Replace commit task branching on integration vs plugin repository providers with a shared provider API that supports recent and compare-range fetches. This keeps behavior consistent while making the commit fetching path easier to evolve. Add type annotations for the provider base classes and include them in the mypy bypass allowlist so stricter checks remain enforced for these modules. Co-Authored-By: Codex <noreply@openai.com> Made-with: Cursor
Widen integration repository provider base method annotations to match concrete SCM providers and satisfy mypy override checks during pre-push validation. Co-Authored-By: Codex <noreply@openai.com> Made-with: Cursor
Update VSTS and Perforce repository provider method signatures to match the current IntegrationRepositoryProvider contract, so mypy accepts these overrides in CI backend typing. Co-Authored-By: Codex 5.3 <noreply@openai.com> Made-with: Cursor
ad42a97 to
83a57b2
Compare
| start_sha: str, | ||
| end_sha: str, | ||
| *, | ||
| actor: Any | None = None, |
There was a problem hiding this comment.
actor was added to keep GitHubRepositoryProvider.fetch_commits_for_compare_range aligned with the unified provider interface.
- In
IntegrationRepositoryProvider, both commit-fetch methods now accept*, actor: Any | None = Noneso callers can pass user context consistently. fetch_commitsinsrc/sentry/tasks/commits.pynow always calls providers withactor=userfor both recent and compare-range paths.- GitHub’s implementation doesn’t currently need that value, so it’s unused there for now, but the parameter is present for interface compatibility and future provider-specific auth/rate-limit behavior.
- Without it, mypy would flag signature mismatch on override (same class of issue we just fixed for other providers).
Once the plugins logic is removed the parameter would be unnecessary.
| def repository_external_slug(self, repo: Repository) -> str | None: | ||
| return repo.external_id | ||
| def repository_external_slug(self, repo: Repository) -> str: | ||
| return repo.external_id or repo.name |
There was a problem hiding this comment.
Q: could returning repo.name break anything?
Potentially, yes — but only in the edge case where repo.external_id is missing.
For VSTS specifically:
identifierfromget_repositories()is a repo ID (GUID string), not the repo name.repository_external_slug()is expected to match that identifier shape.- If we fall back to
repo.name, that invariant is broken for those repos, and any UI/backend logic that comparesidentifiertoexternalSlugcan mis-match (e.g. “already installed” checks).
For normal VSTS repos this usually won’t happen, because external_id should be set at creation, so behavior is unchanged.
So: not likely to break healthy repos, but semantically it is a weaker fallback for legacy/bad rows.
| cache_enabled: bool, | ||
| repo: Repository, | ||
| provider: Any, | ||
| is_integration_repo_provider: bool, |
There was a problem hiding this comment.
Not needed anymore since we can call fetch_commits_for_compare_range on all.
|
|
||
| def fetch_recent_commits(self, repo: Repository, end_sha: str) -> Sequence[Mapping[str, Any]]: | ||
| def fetch_recent_commits( | ||
| self, repo: Repository, end_sha: str, *, actor: Any | None = None |
There was a problem hiding this comment.
Adding actor to all the functions to have the same signature.
| return repo, provider, provider_key | ||
|
|
||
|
|
||
| def get_start_sha_for_ref( |
There was a problem hiding this comment.
Some minor extraction from the main loop.
| repository_id=repo.id, | ||
| ).values_list("commit__key", flat=True)[0] | ||
| except IndexError: | ||
| pass |
There was a problem hiding this comment.
Minor refactor: moving this block into the function get_start_sha_for_ref.

Unify commit-fetching behavior in
fetch_commitsby moving provider-family differences behind a shared provider API.Before this change, the task branched on
is_integration_repo_providerto decide how to resolve provider bindings, which method signatures to call, and which provider key to emit for SCM lifecycle metrics. That made the commit-fetch path harder to evolve and required task-level conditionals for integration vs plugin providers.This change adds compatibility shims on both provider base classes (
fetch_recent_commits,fetch_commits_for_compare_range, andget_scm_provider_key) and updates the task to call the unified methods directly. It also removes the now-redundantis_integration_providerhelper and adds a regression test to ensure plugin providers still receive actor context.A small follow-up commit aligns integration provider type signatures with concrete overrides to satisfy mypy checks enforced at pre-push time.
Made with Cursor