-
-
Notifications
You must be signed in to change notification settings - Fork 4.5k
fix(preprod): Use consistent external_id for sibling artifacts to prevent duplicate status checks (EME-610) #103521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
…vent duplicate status checks (EME-610) When multiple sibling artifacts (e.g., iOS, Android, Web in a monorepo) complete processing for the same commit, each artifact was using its own ID as the external_id when creating GitHub check runs. This caused GitHub to create separate check runs instead of updating the same one, resulting in duplicate "Size Analysis" checks on PRs. The fix ensures all sibling artifacts use the first sibling's ID as the external_id. Since siblings are deterministically ordered by app_id, this guarantees consistency: regardless of which artifact triggers the task, they all use the same external_id. GitHub recognizes this and updates the same check run instead of creating duplicates. Changes: - Modified create_preprod_status_check_task to use first sibling's ID as external_id instead of the triggering artifact's ID - Added test_create_preprod_status_check_task_external_id_consistency to verify all siblings use the same external_id
221024c to
76a195a
Compare
…icate rows in status checks (EME-610) This commit addresses the second issue in EME-610: duplicate rows appearing in status checks after reprocessing/retriggering CI pipelines. Problem: When CI is retriggered, new artifacts are created with the same app_id as existing artifacts. The previous implementation returned ALL artifacts for a commit, causing status checks to show duplicate rows (e.g., 4 rows after reprocessing instead of 2). Solution: Modified `get_sibling_artifacts_for_commit()` to deduplicate by app_id, matching the behavior used in the Emerge codebase: - Groups all artifacts by app_id - For each app_id group: - If the calling artifact is in that group, use it - Otherwise, use the earliest (oldest) artifact for that app_id - Result: Max 1 artifact per app_id, preventing duplicate rows This ensures status checks consistently show one row per app, even after multiple reprocessing runs. Changes: - Modified PreprodArtifact.get_sibling_artifacts_for_commit() to deduplicate - Updated create_preprod_status_check_task() to use consistent external_id - Added update_check_run() method to GitHub client for future use - Added comprehensive tests for both deduplication scenarios
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #103521 +/- ##
===========================================
+ Coverage 78.03% 80.60% +2.56%
===========================================
Files 9245 9250 +5
Lines 394973 395098 +125
Branches 25190 25190
===========================================
+ Hits 308215 318461 +10246
+ Misses 86327 76206 -10121
Partials 431 431 |
runningcode
added a commit
that referenced
this pull request
Nov 20, 2025
…icate rows in status checks (EME-610) (#103528) ## Summary Fixes EME-610: Duplicate rows appearing in GitHub status checks after reprocessing/retriggering CI pipelines. This PR addresses **only** the duplicate rows issue (not the duplicate check runs issue which is in PR #103521). ## Problem: Duplicate Rows After Reprocessing ### Issue When CI is retriggered or artifacts are reprocessed, new artifacts are created with the same `app_id` as existing artifacts. The status check would then show ALL artifacts, resulting in duplicate rows. **Example**: - Initial upload: iOS-old, Android-old → **2 rows** in status check ✓ - Reprocess: iOS-new, Android-new created (4 total artifacts for same commit) - Status check now shows: iOS-old, iOS-new, Android-old, Android-new → **4 rows** ❌ ### Root Cause `get_sibling_artifacts_for_commit()` returned ALL artifacts for a commit without deduplication: ```python return PreprodArtifact.objects.filter( commit_comparison=self.commit_comparison, project__organization_id=self.project.organization_id, ).order_by("app_id") ``` ### Solution Deduplicate siblings by `app_id`: 1. Group all artifacts by `app_id` 2. For each `app_id` group: - If the **calling artifact** is in that group → use it - Otherwise → use the **earliest** (oldest) artifact for that app_id 3. Result: Max 1 artifact per `app_id` **Example with fix**: - When iOS-new triggers: Returns `[iOS-new, Android-old]` → **2 rows** ✓ - When Android-new triggers: Returns `[iOS-old, Android-new]` → **2 rows** ✓ - Never shows all 4 artifacts --- ## Testing Added comprehensive test `test_sibling_deduplication_after_reprocessing`: - Creates initial artifacts: iOS-old, Android-old - Simulates reprocessing: iOS-new, Android-new (4 total) - Verifies deduplication: - `ios_new.get_sibling_artifacts_for_commit()` → Returns [iOS-new, Android-old] - `android_new.get_sibling_artifacts_for_commit()` → Returns [iOS-old, Android-new] - Never returns all 4 artifacts
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Summary
Fixes EME-610: Two related issues causing duplicate status checks and duplicate rows in GitHub check runs for preprod artifacts.
This PR addresses both problems described by Trevor in EME-610:
Problem 1: Duplicate Check Runs on GitHub
Issue
When multiple sibling artifacts (e.g., iOS, Android, Web in a monorepo) completed processing for the same commit, each artifact would create its own separate GitHub check run because they used different
external_idvalues.Result: 3 separate "Size Analysis" checks on GitHub instead of 1 unified check.
Root Cause
Each artifact was using its own ID as the
external_id:GitHub identifies check runs by the tuple:
(repo, sha, check_name, external_id)Different
external_idvalues → GitHub treats them as separate checksSolution
Use the first sibling's ID consistently as
external_idacross all sibling artifacts:Now all sibling artifacts share the same
external_id, so GitHub recognizes them as updates to the same check run.Files Changed:
src/sentry/preprod/vcs/status_checks/size/tasks.py- Use consistent external_idtests/.../test_status_checks_tasks.py- Added testtest_create_preprod_status_check_task_external_id_consistencyProblem 2: Duplicate Rows After Reprocessing
Issue
When CI is retriggered or artifacts are reprocessed, new artifacts are created with the same
app_idas existing artifacts. The status check would then show ALL artifacts, resulting in duplicate rows.Trevor's quote: "The status check itself isn't duplicated...The issue is the duplicated content inside the status check, after re-running it went from 2x builds -> 4x builds."
Example:
Root Cause
get_sibling_artifacts_for_commit()returned ALL artifacts for a commit without deduplication:Solution
Deduplicate siblings by
app_id, matching the behavior in the Emerge codebase (BaseBuildUtils.ts:findSiblingBuilds):app_idapp_idgroup:app_idExample with fix:
[iOS-new, Android-old]→ 2 rows ✓[iOS-old, Android-new]→ 2 rows ✓Files Changed:
src/sentry/preprod/models.py- Added deduplication logic toget_sibling_artifacts_for_commit()tests/.../test_status_checks_tasks.py- Added testtest_sibling_deduplication_after_reprocessingAdditional Changes
src/sentry/integrations/github/client.py: Addedupdate_check_run()method for future flexibility (not currently used but documents GitHub API capability)Testing
Added comprehensive tests:
test_create_preprod_status_check_task_external_id_consistency: Verifies all sibling artifacts use the sameexternal_idtest_sibling_deduplication_after_reprocessing: Verifies deduplication prevents showing all 4 artifacts after reprocessingAll 18 existing tests still pass ✅
Verification
To verify the fix works:
bin/preprod/reproduce_duplicate_checks_bugmultiple times