chore(preprod): Optimize the status check task#107169
Conversation
7eeb9f5 to
184832b
Compare
trevor-e
left a comment
There was a problem hiding this comment.
I suspect most of the improvement will come from fixing the N+1 queries in 1 + 2. 3) and 4) theoretically help but our N is so small that I'm not so sure, at least relative to the tradeoff of how much more verbose the code has become. We might want to split this file up in the near future.
| def _fetch_base_size_metrics( | ||
| artifacts: list[PreprodArtifact], | ||
| ) -> dict[int, PreprodArtifactSizeMetrics]: | ||
| """Fetch base artifact size metrics for head artifacts.""" |
There was a problem hiding this comment.
I'm a bit confused why we need two maps here? Couldn't this be something like dict[int, tuple[PreprodArtifact, PreprodArtifactSizeMetrics]] so that one lookup gives both things?
There was a problem hiding this comment.
Although after removing metrics_artifact_type that might need to also be a list[PreprodArtifactSizeMetrics]?
There was a problem hiding this comment.
One artifact can have multiple size metrics (MAIN_ARTIFACT, WATCH_APP, etc. with different identifiers). So the relationship is:
1 head artifact → 1 base artifact
1 head artifact → N base metrics
If you combined as dict[int, tuple[PreprodArtifact, list[PreprodArtifactSizeMetrics]]], the filtering would change from O(1) lookup to O(n) filtering
I agree with you though that the N is so small in this file...
| version_string = _format_version_string(artifact, default=str(_("Unknown"))) | ||
|
|
||
| base_artifact = None | ||
| base_artifact = base_artifact_map.get(artifact.id) if base_artifact_map else None |
There was a problem hiding this comment.
nit: seems like base_artifact_map and base_size_metrics_map should be required params, not optional
src/sentry/preprod/models.py
Outdated
| # Build lookup dict once - O(m) | ||
| # Newest base artifact wins due to -date_added ordering, so first match per key wins | ||
| base_artifacts = list(base_artifacts_qs) | ||
| base_by_key: dict[tuple[str | None, int | None, int | None], PreprodArtifact] = {} |
There was a problem hiding this comment.
nit: I had explicitly combined this into a single loop since its easier to read IMO
There was a problem hiding this comment.
got it, i rolled this back
184832b to
979e822
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 4 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.
tests/sentry/preprod/vcs/status_checks/size/test_status_check_filters.py
Outdated
Show resolved
Hide resolved
tests/sentry/preprod/vcs/status_checks/size/test_status_check_filters.py
Outdated
Show resolved
Hide resolved
tests/sentry/preprod/vcs/status_checks/size/test_status_check_filters.py
Outdated
Show resolved
Hide resolved
tests/sentry/preprod/vcs/status_checks/size/test_status_check_filters.py
Outdated
Show resolved
Hide resolved
Performance Optimizations - Added `select_related` for `commit_comparison`, `mobile_app_info` on initial artifact fetch - Added `select_related` for `preprod_artifact` on size metrics fetch - Eliminated N+1 queries in template formatting by pre-fetching and passing base artifact/metrics data - Single-pass dict grouping for base metrics by artifact ID Result: ~N+1 queries → ~7 constant queries (artifact, size metrics, rules, approvals, base commit comparison, base artifacts, base size metrics)
Performance Optimizations
select_relatedforcommit_comparison,mobile_app_infoon initial artifact fetchselect_relatedforpreprod_artifacton size metrics fetchResult: ~N+1 queries → ~7 constant queries (artifact, size metrics, rules, approvals, base commit comparison, base artifacts, base size metrics)