Skip to content

ref(commits): Refactor fetch_commits task#113293

Open
armenzg wants to merge 2 commits intomasterfrom
44410-armenzg/ref/extract-fetch-commits-helpers
Open

ref(commits): Refactor fetch_commits task#113293
armenzg wants to merge 2 commits intomasterfrom
44410-armenzg/ref/extract-fetch-commits-helpers

Conversation

@armenzg
Copy link
Copy Markdown
Member

@armenzg armenzg commented Apr 17, 2026

Refactor the release commit-fetching task by extracting focused helpers for repository lookup, provider resolution, start SHA selection, and lifecycle-wrapped commit fetching.

This keeps fetch_commits easier to follow and reason about as a linear orchestration flow while preserving the existing behavior and error handling paths.

I considered only extracting the lifecycle block, but splitting repo lookup and provider resolution as separate helpers makes each step explicit and aligns with the recent follow-up refactors requested during review.

Additional context: logging now uses shared per-loop context and still records commit counts without changing commit association semantics.

Made with Cursor

Split repository lookup, provider resolution, start SHA selection, and
SCM lifecycle fetch handling into focused helpers. This keeps the main
fetch_commits loop linear and easier to reason about without changing
behavior.

Made-with: Cursor
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 17, 2026
Comment thread src/sentry/tasks/commits.py Outdated
Copy link
Copy Markdown
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Duplicate "fetch_commits.start" log entry per invocation
    • Removed the duplicate logger.info('fetch_commits.start', ...) call at line 332, keeping only the new enhanced version at line 353 with richer context.

Create PR

Or push these changes by commenting:

@cursor push 0fb9386369
Preview (0fb9386369)
diff --git a/src/sentry/tasks/commits.py b/src/sentry/tasks/commits.py
--- a/src/sentry/tasks/commits.py
+++ b/src/sentry/tasks/commits.py
@@ -329,7 +329,6 @@
     commit_list: list[dict[str, Any]] = []
 
     release = Release.objects.get(id=release_id)
-    logger.info("fetch_commits.start", extra={"organization_slug": release.organization.slug})
     set_tag("organization.slug", release.organization.slug)
     # TODO: Need a better way to error handle no user_id. We need the SDK to be able to call this without user context
     # to autoassociate commits to releases

You can send follow-ups to the cloud agent here.

Reviewed by Cursor Bugbot for commit 54e4b99. Configure here.

Comment thread src/sentry/tasks/commits.py Outdated
Guard span status updates when no active span is available so the
original exception handling path can send notifications and record
lifecycle state. Rename the second start-phase log event to avoid
duplicate fetch_commits.start entries per invocation.

Made-with: Cursor


def get_repo_and_provider_for_ref(
def get_repo_for_ref(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decoupling this function into two.

This first one will only return the repo.

return repo


def get_provider_for_repo(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This second one only focuses on getting the provider.

return provider, is_integration_repo_provider, provider_key


def get_start_sha_for_ref(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is extracted from the main loop.

return None


def fetch_commits_for_ref_with_lifecycle(
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main logic which happened within the for loop.
The else part of the block is not ported:

else:
logger.info(
"fetch_commits.complete",
extra={
"organization_id": repo.organization_id,
"user_id": user_id,
"repository": repo.name,
"end_sha": end_sha,
"start_sha": start_sha,
"num_commits": len(repo_commits or []),
},
)
commit_list.extend(repo_commits)

}
logger.info("fetch_commits.config", extra=extra)

for ref in refs:
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the changes in this PR are about making this for loop easier to read by moving a lot of the code blocks into their own functions.

logger.info("fetch_commits.config", extra=extra)

for ref in refs:
resolved = get_repo_and_provider_for_ref(release=release, ref=ref, user_id=user_id)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're splitting this function into two to keep their logic simpler (even if we added a few more lines here).


end_sha = ref["commit"]
extra = extra | {"repository": repo.name, "end_sha": end_sha, "start_sha": start_sha}
logger.info("fetch_commits.loop.start", extra=extra)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New logging line.

@armenzg armenzg marked this pull request as ready for review April 17, 2026 14:59
@armenzg armenzg requested a review from a team as a code owner April 17, 2026 14:59
@armenzg armenzg requested a review from a team April 17, 2026 15:10
@armenzg armenzg changed the title ref(commits): Extract fetch_commits helper responsibilities ref(commits): Refactor fetch_commits task Apr 17, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant