Skip to content

ref(code_review): Split transform_webhook_to_codegen_request#106398

Merged
armenzg merged 14 commits intomasterfrom
1_15/split_transformation_function/code_review/armenzg
Jan 23, 2026
Merged

ref(code_review): Split transform_webhook_to_codegen_request#106398
armenzg merged 14 commits intomasterfrom
1_15/split_transformation_function/code_review/armenzg

Conversation

@armenzg
Copy link
Member

@armenzg armenzg commented Jan 15, 2026

Currently, transform_webhook_to_codegen_request is executed for both issue_comment and pull_request. This decouples the two event types to help comprehension. I also want all data manipulations to happen in the same location rather than throughout different stages (this currently happens during pre-scheduling task & after task scheduling).

This also fixes this issue:

  • Prevent orphaned state in Seer by allowing CLOSED events through even for draft PRs

Improvements:

  • Non-PR comments filtered earlier in the flow, thus, we won't unnecesarily schedule them
  • Clean separation of concerns with dedicated transform functions for each webhook type
  • DRAFT_PR removed from error types since it's a filter condition, not an error

@armenzg armenzg self-assigned this Jan 15, 2026
@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Jan 15, 2026
Currently, `transform_webhook_to_codegen_request` is executed as part of the task, however, we want to execute prior to scheduling the task.

Splitting this function helps with comprehension and will help each GitHub event have its own data transformation path.
@armenzg armenzg force-pushed the 1_15/split_transformation_function/code_review/armenzg branch from def541b to b930edb Compare January 16, 2026 21:00
Copy link
Contributor

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

@armenzg
Copy link
Member Author

armenzg commented Jan 20, 2026

@sentry review

MISSING_PULL_REQUEST = "github.webhook.pull_request.missing-pull-request"
MISSING_ACTION = "github.webhook.pull_request.missing-action"
UNSUPPORTED_ACTION = "github.webhook.pull_request.unsupported-action"
DRAFT_PR = "github.webhook.pull_request.draft-pr"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really don't care to track this since we don't run on drafts.

if pull_request.get("draft") is True:
# Skip draft check for CLOSED actions to ensure Seer receives cleanup notifications
# even if the PR was converted to draft before closing
if action != PullRequestAction.CLOSED and pull_request.get("draft") is True:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We recently added support for closed, however, we need to report to Seer even if it's a draft.

}


def _get_trigger_metadata(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed.

self.mock_reaction.assert_not_called()
self.mock_seer.assert_not_called()

def test_skips_regular_issue_comments(self) -> None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tests are basically some of the tests of test_utils.py.
These convert them from unit tests to integration tests.

assert payload["data"]["config"]["trigger_comment_id"] is None
assert payload["data"]["config"]["trigger_comment_type"] is None

def test_pull_request_closed_draft_still_sends_to_seer(self) -> None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the test case missing from yesterday's close pull request support.

comment_body = comment.get("body")

issue = event.get("issue", {})
if not issue.get("pull_request"):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have been unnecessarily scheduling tasks for GitHub comments that are not part of a pull request. This will drop the numbered of scheduled tasks.

except OrganizationContributors.DoesNotExist:
metrics.incr(
"seer.code_review.error.contributor_not_found",
tags={"organization_id": organization_id},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our metrics module should raise errors about forbidden tags, however, it only shows up in the output. I noticed while running tests.

@@ -171,7 +158,7 @@ def _get_target_commit_sha(
# XXX: Refactor this function to handle it at the handler level rather than during task execution
def transform_webhook_to_codegen_request(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewing this section is easier with a split view:

Image

trigger_user = event_payload.get("sender", {}).get("login") or event_payload.get(
"pull_request", {}
).get("user", {}).get("login")
# Prioritize sender (person who triggered the action) over PR author
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment and code changes make it clearer what we were doing in the original code:

Image

@armenzg armenzg marked this pull request as ready for review January 20, 2026 13:40
@armenzg armenzg requested a review from a team as a code owner January 20, 2026 13:40
@armenzg
Copy link
Member Author

armenzg commented Jan 21, 2026

@cursoragent resolve the conflicts

@cursor
Copy link
Contributor

cursor bot commented Jan 21, 2026

Perfect! I've successfully resolved the merge conflicts and pushed the changes. Here's what I did:

Conflict Resolution:

  • Found a merge conflict in src/sentry/seer/code_review/utils.py
  • The conflict was in the _build_repo_definition function where master had added integration_id handling
  • Integrated the integration_id logic from master while maintaining your refactored structure with separate transform_issue_comment_to_codegen_request and transform_pull_request_to_codegen_request functions

Changes Made:

  • Updated _build_repo_definition to build the repo definition dict first, then conditionally add integration_id if present (used for product metrics dashboarding in pr_closed_step)
  • Fixed the organization_id reference to use repo.organization_id since organization is not a parameter
  • Committed and pushed the merge to your branch

The branch is now clean and up to date with master, with all conflicts resolved while preserving your PR's refactored code structure.

Open in Cursor Open in Web

cursoragent and others added 2 commits January 21, 2026 18:53
- Integrated integration_id handling from master into _build_repo_definition
- Maintained refactored structure with separate transform functions

Co-authored-by: armenzg <armenzg@sentry.io>

def _build_repo_definition(repo: Repository, target_commit_sha: str) -> dict[str, Any]:
"""
Build the repository definition for the CodecovTaskRequest.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really called CodecovTaskRequest? We should rename that 😅

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some renaming happening here:
#106810

@armenzg armenzg merged commit c59bf9b into master Jan 23, 2026
66 checks passed
@armenzg armenzg deleted the 1_15/split_transformation_function/code_review/armenzg branch January 23, 2026 11:32
JonasBa pushed a commit that referenced this pull request Jan 27, 2026
Currently, `transform_webhook_to_codegen_request` is executed for both
`issue_comment` and `pull_request`. This decouples the two event types
to help comprehension. I also want all data manipulations to happen in
the same location rather than throughout different stages (this
currently happens during pre-scheduling task & after task scheduling).

This also fixes this issue:
* Prevent orphaned state in Seer by allowing CLOSED events through even
for draft PRs

Improvements:
* Non-PR comments filtered earlier in the flow, thus, we won't
unnecesarily schedule them
* Clean separation of concerns with dedicated transform functions for
each webhook type
* DRAFT_PR removed from error types since it's a filter condition, not
an error

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: getsantry[bot] <66042841+getsantry[bot]@users.noreply.github.com>
@github-actions github-actions bot locked and limited conversation to collaborators Feb 7, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants