perf(snapshots): Batch image fetches and add timeouts for snapshot download #116076

Merged
NicoHinderling merged 2 commits into master from nico/perf-snapshot-download-batching
May 22, 2026

@NicoHinderling
Contributor

Summary

  • Batches objectstore image fetches in groups of 200 (was: all at once) to cap memory usage and keep data flowing steadily to the client during streaming ZIP downloads
  • Adds 30s per-image fetch timeout to prevent a single slow objectstore read from stalling the entire pipeline
  • Adds 90s gateway timeout override for the snapshot download endpoint (matching other large-download endpoints)

Problem

Downloading large snapshots (40K+ images) via GET /api/0/organizations/{org}/preprodartifacts/snapshots/{id}/download/ fails with HTTP/2 stream errors. The endpoint submits all image-fetch futures at once, causing memory spikes and gaps in data flow that trigger connection timeouts.

Fix

  • Batched fetching: Process images in groups of FETCH_BATCH_SIZE=200 instead of submitting all futures at once. Each batch completes before the next starts, keeping a steady flow of ZIP data to the client.
  • Per-fetch timeout: future.result(timeout=30) prevents indefinite blocking on slow objectstore reads. Timed-out images are skipped (logged as failures), same as other fetch errors.
  • Gateway timeout: Added sentry-api-0-organization-preprod-snapshots-download: 90.0 to both sync and async gateway ENDPOINT_TIMEOUT_OVERRIDE dicts.
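
The batching and per-fetch timeout described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `fetch_in_batches`, `batched`, `fetch_one`, and the `failures` list are hypothetical names, and only `FETCH_BATCH_SIZE=200` and the 30s `future.result` timeout come from the description.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError
from itertools import islice

FETCH_BATCH_SIZE = 200      # batch size named in the PR description
FETCH_TIMEOUT_SECONDS = 30  # per-image timeout named in the PR description


def batched(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def fetch_in_batches(keys, fetch_one, executor, failures,
                     batch_size=FETCH_BATCH_SIZE,
                     timeout=FETCH_TIMEOUT_SECONDS):
    """Yield (key, data) per fetched image; record skipped keys in `failures`.

    `fetch_one` is a hypothetical stand-in for the objectstore read.
    """
    for batch in batched(keys, batch_size):
        # Submit one batch only, so at most `batch_size` futures are pending.
        futures = [(key, executor.submit(fetch_one, key)) for key in batch]
        for key, future in futures:
            try:
                # Wait at most `timeout` seconds, counted from this call.
                data = future.result(timeout=timeout)
            except FutureTimeoutError:
                failures.append((key, "timeout"))
                continue
            except Exception as exc:  # other fetch errors are skipped too
                failures.append((key, repr(exc)))
                continue
            # Hand the image to the caller (e.g. a ZIP writer) right away.
            yield key, data
```

A caller would iterate the generator and write each image into the streaming ZIP as it arrives. One caveat worth keeping in mind: `future.result(timeout=...)` only bounds how long the caller waits. It does not cancel the underlying task, so a timed-out read keeps occupying its worker thread until it finishes on its own.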

Test plan

  • Existing preprod snapshot API tests pass (37/37)
  • Manual test: download a large snapshot (40K images) and verify it completes
  • Manual test: download a normal snapshot (400 images) and verify no regression

🤖 Generated with Claude Code

@NicoHinderling NicoHinderling requested review from a team as code owners May 22, 2026 00:32
@github-actions github-actions Bot added the Scope: Backend label May 22, 2026
Comment thread src/sentry/preprod/api/endpoints/snapshots/preprod_artifact_snapshot_download.py Outdated
Comment thread src/sentry/preprod/api/endpoints/snapshots/preprod_artifact_snapshot_download.py Outdated
Comment thread src/sentry/preprod/api/endpoints/snapshots/preprod_artifact_snapshot_download.py Outdated
@NicoHinderling NicoHinderling force-pushed the nico/perf-snapshot-download-batching branch 2 times, most recently from 0394ef7 to a1a6070 Compare May 22, 2026 01:14
Contributor

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 55e4647.

Comment thread src/sentry/preprod/api/endpoints/snapshots/preprod_artifact_snapshot_download.py Outdated
…wnload

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@NicoHinderling NicoHinderling force-pushed the nico/perf-snapshot-download-batching branch from 55e4647 to fad22c6 Compare May 22, 2026 01:29
@NicoHinderling NicoHinderling merged commit d9540f4 into master May 22, 2026
62 checks passed
@NicoHinderling NicoHinderling deleted the nico/perf-snapshot-download-batching branch May 22, 2026 02:00
NicoHinderling added a commit that referenced this pull request May 22, 2026
…gateway (#116078)

## Summary

The emmett API gateway (`src/apigw/proxy.py`) has its own path-based
`TIMEOUT_OVERRIDES` dict, separate from the Django gateway's
URL-name-based overrides that were added in #116076. Without this entry,
snapshot downloads fall back to the httpx client default of `read=60s`,
causing HTTP/2 stream errors on large snapshots (40K+ images).

Adds `(["/preprodartifacts/snapshots/", "/download/"], 90.0)` to match
the path pattern.
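
As a rough illustration of how a path-based override like this might be resolved, here is a small sketch. The actual lookup in `src/apigw/proxy.py` is not shown in this commit message, so the matching rule (every fragment must appear in the request path), the list-of-tuples layout, and the `timeout_for_path` helper are all assumptions; only the override entry and the 60s fallback come from the text above.

```python
# Fallback read timeout cited in the commit message.
DEFAULT_READ_TIMEOUT = 60.0

# Assumed layout: (path fragments, timeout in seconds) pairs.
TIMEOUT_OVERRIDES = [
    (["/preprodartifacts/snapshots/", "/download/"], 90.0),
]


def timeout_for_path(path, overrides=TIMEOUT_OVERRIDES,
                     default=DEFAULT_READ_TIMEOUT):
    """Return the first override whose fragments all occur in `path`.

    Hypothetical helper: the substring-match rule is an assumption.
    """
    for fragments, timeout in overrides:
        if all(fragment in path for fragment in fragments):
            return timeout
    return default
```

Under these assumptions, a snapshot download path picks up the 90s override, while any path missing either fragment keeps the 60s default.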

## Test plan

- [ ] Download a large snapshot (40K images) via `sentry-cli snapshots
download` and confirm it no longer fails with HTTP/2 stream errors after
~60s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Labels

Scope: Backend Automatically applied to PRs that change backend components

2 participants