Skip to content

ref(commits): Unify repository provider commit fetch interfaces#113281

Open
armenzg wants to merge 5 commits intomasterfrom
44410-armenzg/ref/unify-commit-provider-api
Open

ref(commits): Unify repository provider commit fetch interfaces#113281
armenzg wants to merge 5 commits intomasterfrom
44410-armenzg/ref/unify-commit-provider-api

Conversation

@armenzg
Copy link
Copy Markdown
Member

@armenzg armenzg commented Apr 17, 2026

Unify commit-fetching behavior in fetch_commits by moving provider-family differences behind a shared provider API.

Before this change, the task branched on is_integration_repo_provider to decide how to resolve provider bindings, which method signatures to call, and which provider key to emit for SCM lifecycle metrics. That made the commit-fetch path harder to evolve and required task-level conditionals for integration vs plugin providers.

This change adds compatibility shims on both provider base classes (fetch_recent_commits, fetch_commits_for_compare_range, and get_scm_provider_key) and updates the task to call the unified methods directly. It also removes the now-redundant is_integration_provider helper and adds a regression test to ensure plugin providers still receive actor context.

A small follow-up commit aligns integration provider type signatures with concrete overrides to satisfy mypy checks enforced at pre-push time.

Made with Cursor

@armenzg armenzg self-assigned this Apr 17, 2026
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 17, 2026
armenzg and others added 5 commits April 17, 2026 10:00
Replace commit task branching on integration vs plugin repository providers with
a shared provider API that supports recent and compare-range fetches. This keeps
behavior consistent while making the commit fetching path easier to evolve.

Add type annotations for the provider base classes and include them in the
mypy bypass allowlist so stricter checks remain enforced for these modules.

Co-Authored-By: Codex <noreply@openai.com>
Made-with: Cursor
Widen integration repository provider base method annotations to match concrete
SCM providers and satisfy mypy override checks during pre-push validation.

Co-Authored-By: Codex <noreply@openai.com>
Made-with: Cursor
Update VSTS and Perforce repository provider method signatures to match the
current IntegrationRepositoryProvider contract, so mypy accepts these overrides
in CI backend typing.

Co-Authored-By: Codex 5.3 <noreply@openai.com>
Made-with: Cursor
@armenzg armenzg force-pushed the 44410-armenzg/ref/unify-commit-provider-api branch from ad42a97 to 83a57b2 Compare April 17, 2026 14:05
start_sha: str,
end_sha: str,
*,
actor: Any | None = None,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actor was added to keep GitHubRepositoryProvider.fetch_commits_for_compare_range aligned with the unified provider interface.

  • In IntegrationRepositoryProvider, both commit-fetch methods now accept *, actor: Any | None = None so callers can pass user context consistently.
  • fetch_commits in src/sentry/tasks/commits.py now always calls providers with actor=user for both recent and compare-range paths.
  • GitHub’s implementation doesn’t currently need that value, so it’s unused there for now, but the parameter is present for interface compatibility and future provider-specific auth/rate-limit behavior.
  • Without it, mypy would flag signature mismatch on override (same class of issue we just fixed for other providers).

See caller:
Image

See matching signature:
Image

Once the plugins logic is removed the parameter would be unnecessary.

def repository_external_slug(self, repo: Repository) -> str | None:
return repo.external_id
def repository_external_slug(self, repo: Repository) -> str:
return repo.external_id or repo.name
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: could returning repo.name break anything?

Potentially, yes — but only in the edge case where repo.external_id is missing.

For VSTS specifically:

  • identifier from get_repositories() is a repo ID (GUID string), not the repo name.
  • repository_external_slug() is expected to match that identifier shape.
  • If we fall back to repo.name, that invariant is broken for those repos, and any UI/backend logic that compares identifier to externalSlug can mis-match (e.g. “already installed” checks).

For normal VSTS repos this usually won’t happen, because external_id should be set at creation, so behavior is unchanged.

So: not likely to break healthy repos, but semantically it is a weaker fallback for legacy/bad rows.

cache_enabled: bool,
repo: Repository,
provider: Any,
is_integration_repo_provider: bool,
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed anymore since we can call fetch_commits_for_compare_range on all.


def fetch_recent_commits(self, repo: Repository, end_sha: str) -> Sequence[Mapping[str, Any]]:
def fetch_recent_commits(
self, repo: Repository, end_sha: str, *, actor: Any | None = None
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding actor to all the functions to have the same signature.

return repo, provider, provider_key


def get_start_sha_for_ref(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor extraction from the main loop.

repository_id=repo.id,
).values_list("commit__key", flat=True)[0]
except IndexError:
pass
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor refactor: moving this block into the function get_start_sha_for_ref.

@armenzg armenzg marked this pull request as ready for review April 17, 2026 14:08
@armenzg armenzg requested review from a team as code owners April 17, 2026 14:08
@armenzg armenzg requested a review from GabeVillalobos April 17, 2026 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant