ref(commits): Simplify main loop, extra logging & CODEOWNERS update #113418
Fold the per-ref resolution steps (repo lookup, provider binding, start sha derivation, end sha extraction) into a single resolve_ref helper returning a ResolvedRef NamedTuple. The task loop becomes resolve → skip-or-fetch → collect, and refs that can never succeed still short-circuit before an SCM lifecycle event is opened.

Push the compare-commits cache gating from the task body down into fetch_commits_for_compare_range where it is actually consumed. The feature flag flows through untouched; the provider/feature-flag check lives next to the cache read/write and the compare_commits_cache_enabled span tag, so the three concerns move together.

Move the fetch_commits.loop.start / fetch_commits.loop.complete logs into fetch_commits_for_ref_with_lifecycle. The helper owns its own loop_extra construction from a task-level task_extra mapping, and the complete log is emitted from a finally so it still fires on every swallowed-error path.

No behavior change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Made-with: Cursor
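For readers skimming the thread, the resolve → skip-or-fetch → collect shape can be sketched roughly as below. This is only an illustration: the ResolvedRef fields, the two-argument resolve_ref signature, the KNOWN_REPOS table, and the ref keys are assumptions made for the example and may not match the real helper in src/sentry/tasks/commits.py.

```python
from typing import NamedTuple, Optional


class ResolvedRef(NamedTuple):
    """Hypothetical field set; the real ResolvedRef is not shown in this thread."""

    repo_name: str
    provider: Optional[str]
    start_sha: Optional[str]
    end_sha: str


# Stand-in for the real repo lookup and provider binding (assumption).
KNOWN_REPOS = {"org/app": "github", "org/legacy": None}


def resolve_ref(ref: dict, prev_shas: dict) -> Optional[ResolvedRef]:
    """Bundle every per-ref lookup into one value, or None if the repo is unknown."""
    name = ref["repository"]
    if name not in KNOWN_REPOS:
        return None
    # Prefer an explicit previous commit, else fall back to the previous release.
    start_sha = ref.get("previousCommit") or prev_shas.get(name)
    return ResolvedRef(name, KNOWN_REPOS[name], start_sha, ref["commit"])


def collect(refs: list, prev_shas: dict) -> list:
    collected = []
    for ref in refs:
        resolved = resolve_ref(ref, prev_shas)  # resolve
        if resolved is None or resolved.provider is None:
            continue  # skip: this ref can never succeed, so nothing is opened
        # fetch + collect (a string stands in for the real commit fetch)
        collected.append(f"{resolved.start_sha}..{resolved.end_sha}")
    return collected
```

Returning a single NamedTuple keeps the loop body to three steps, and the early `continue` is where refs that cannot succeed drop out before any fetch work starts.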
| end_sha: str, | ||
| user: RpcUser | None, | ||
| ) -> list[dict[str, Any]]: | ||
| cache_enabled = ( |
This is part of cleaning up the for loop in the main task.
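A minimal sketch of what "gating next to the cache read/write" can look like, assuming a dict-backed cache, a hypothetical CACHEABLE_PROVIDERS set, and a plain `tags` dict standing in for the span tag. None of these names or the parameter list are taken from the actual fetch_commits_for_compare_range.

```python
from typing import Callable

# Stand-in for a real cache backend (assumption for this sketch).
_CACHE: dict[str, list[str]] = {}

# Hypothetical provider allow-list; the real provider check is not shown here.
CACHEABLE_PROVIDERS = {"github"}


def fetch_commits_for_compare_range(
    provider: str,
    start_sha: str,
    end_sha: str,
    cache_feature_enabled: bool,
    compare: Callable[[str, str], list[str]],
    tags: dict[str, bool],
) -> list[str]:
    # The gate, the tag, and the cache read/write all live in one place.
    cache_enabled = cache_feature_enabled and provider in CACHEABLE_PROVIDERS
    tags["compare_commits_cache_enabled"] = cache_enabled

    key = f"{provider}:{start_sha}:{end_sha}"
    if cache_enabled and key in _CACHE:
        return _CACHE[key]

    commits = compare(start_sha, end_sha)
    if cache_enabled:
        _CACHE[key] = commits
    return commits
```

The caller only forwards the flag; it never needs to know which providers are cacheable or how the cache key is built.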
| release=release, | ||
| repo=repo, | ||
| prev_release=prev_release, | ||
| ) |
All of the above logic comes from the task's main for loop.
| ) | ||
| def fetch_commits_for_ref_with_lifecycle( |
If you hide whitespace changes in the diff view, it is much clearer what is happening in this function.
Switching to the split view also helps.
| logger.info( | ||
| "fetch_commits.loop.complete", | ||
| extra={**loop_extra, "num_commits": len(repo_commits or [])}, | ||
| ) |
This logging line also comes from the main loop.
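The reason for emitting the complete log from a finally can be shown with a small sketch. The signature below (a `fetch` callable plus `repo_name`) and the "repository" key are assumptions for illustration; only the two log names, the loop_extra / task_extra naming, and the num_commits expression come from the diff above.

```python
import logging
from typing import Any, Callable, Mapping, Optional

logger = logging.getLogger("fetch_commits_sketch")


def fetch_commits_for_ref_with_lifecycle(
    fetch: Callable[[], list],
    task_extra: Mapping[str, Any],
    repo_name: str,
) -> Optional[list]:
    # The helper derives its own per-iteration context from the task-level mapping.
    loop_extra = {**task_extra, "repository": repo_name}
    repo_commits: Optional[list] = None
    logger.info("fetch_commits.loop.start", extra=loop_extra)
    try:
        repo_commits = fetch()
    except NotImplementedError:
        pass  # swallowed: provider does not support comparing commits
    except Exception:
        pass  # swallowed: the real handler's reporting is elided in this sketch
    finally:
        # Runs on success and on every swallowed-error path alike.
        logger.info(
            "fetch_commits.loop.complete",
            extra={**loop_extra, "num_commits": len(repo_commits or [])},
        )
    return repo_commits
```

Because repo_commits starts as None, the `len(repo_commits or [])` expression reports 0 commits whenever the fetch failed.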
| release = Release.objects.get(id=release_id) | ||
| logger.info("fetch_commits.start", extra={"organization_slug": release.organization.slug}) | ||
| set_tag("organization.slug", release.organization.slug) |
I'm moving some of this logging a few lines further down to include more context.
| "user_id": user_id, | ||
| "refs": refs, | ||
| "num_refs": len(refs), | ||
| "prev_release_id": prev_release_id, |
Including a bit more context to fetch_commits.start to help with debugging.
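As a rough illustration of the shared context, the sketch below builds one mapping and reuses it for the start log. The build_task_extra and log_start helpers are hypothetical (the task builds the dict inline), and release_id is included on the assumption that the Bugbot autofix previewed later in this thread is applied.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger("fetch_commits_sketch")


def build_task_extra(
    release_id: int,
    organization_id: int,
    user_id: Optional[int],
    refs: list,
    prev_release_id: Optional[int],
) -> dict[str, Any]:
    # One mapping shared by every log line the task emits.
    return {
        "release_id": release_id,
        "organization_id": organization_id,
        "user_id": user_id,
        "refs": refs,
        "num_refs": len(refs),
        "prev_release_id": prev_release_id,
    }


def log_start(extra: dict[str, Any]) -> None:
    logger.info("fetch_commits.start", extra=extra)
```

Building the mapping once means a later log line cannot silently lose a key that an earlier one carried.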
| provider_values = get_provider_for_repo(repo=repo) | ||
| if provider_values is None: | ||
| resolved = resolve_ref(ref=ref, release=release, prev_release=prev_release, user_id=user_id) |
This last refactoring makes the for loop short enough.
| "user_id": user_id, | ||
| }, | ||
| ) | ||
| logger.info("fetch_commits.duplicate", extra=extra) |
This logging line now has enough info to help me see which orgs are seeing the most impact.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Dropped release_id from shared logging extra dict
  - Added release_id to the shared extra dict at line 413, ensuring all log statements now include this critical identifier for debugging.
Or push these changes by commenting:
@cursor push cc3dca3fe7
Preview (cc3dca3fe7)
diff --git a/src/sentry/tasks/commits.py b/src/sentry/tasks/commits.py
--- a/src/sentry/tasks/commits.py
+++ b/src/sentry/tasks/commits.py
@@ -410,6 +410,7 @@
)
set_tag("organization.slug", release.organization.slug)
extra = {
+ "release_id": release.id,
"organization_id": release.organization_id,
"user_id": user_id,
"refs": refs,
Reviewed by Cursor Bugbot for commit 15d8dc3.
| "organization_id": release.organization_id, | ||
| "user_id": user_id, | ||
| }, | ||
| ) |
Dropped release_id from shared logging extra dict
Low Severity
The shared extra dict is missing release_id (release.id). The old fetch_commits.duplicate log explicitly included "release_id": release.id, but the new extra dict that replaced it doesn't contain this key. Since release_id is the primary identifier for the task's target entity, its absence from all log lines (fetch_commits.start, fetch_commits.duplicate, fetch_commits.complete, and the loop logs via task_extra) reduces debuggability.



Follow-up refactor to #113293.
It adds missing context to several of the log lines and makes the main for loop shorter and easier to read.
It also assigns ownership of the task to @getsentry/coding-workflows's backend team.
Reshapes the fetch_commits task body around two ideas:
- A resolve_ref helper that folds the per-ref resolution steps into a single call.
- Compare-commits cache gating now lives in fetch_commits_for_compare_range, next to the cache.get / cache.set and the compare_commits_cache_enabled span tag. The task just passes the flag through.

The fetch_commits.loop.start / fetch_commits.loop.complete logs moved into fetch_commits_for_ref_with_lifecycle, which now owns its loop_extra construction from a task-level task_extra mapping. The complete log runs in a finally so it still fires on every swallowed-error path (NotImplementedError, IntegrationResourceNotFoundError, the handled Exception branches).

No behavior change. All existing tests in tests/sentry/tasks/test_commits.py pass unmodified.

Made with Cursor