fix(testutils): Make _snuba_pool lazy for xdist per-worker URL isolation #112797

Draft
joshuarli wants to merge 1 commit into master from fix/snuba-xdist-lazy-pool

@joshuarli
Member

Problem

In shuffled xdist runs, each worker gets its own Snuba container on a different port (1230, 1231, 1232). The module-level _snuba_pool in snuba.py was created eagerly at import time, before pytest_configure ran and set settings.SENTRY_SNUBA to the per-worker URL. All workers therefore shared the default pool pointing at port 1218, which caused Connection refused errors under load.

A secondary failure mode: override_settings() restores the original settings.SENTRY_SNUBA value when its context exits, silently reverting any URL override made during configure_for_worker.

Changes

src/sentry/utils/snuba.py

  • Replace the eager module-level _snuba_pool (a urllib3 HTTPConnectionPool) with a _SnubaPool lazy proxy class.
  • The proxy reads os.environ["_SENTRY_SNUBA_POOL_URL"] (if set) or settings.SENTRY_SNUBA on first use, and rebuilds the inner pool whenever the URL changes.
  • Public interface (urlopen) is unchanged; no callers updated.
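The lazy-proxy idea described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `LazyPool`, `FakePool`, and `DEFAULT_URL` are hypothetical names, `FakePool` stands in for a urllib3 connection pool so the example runs without dependencies, and the default URL stands in for settings.SENTRY_SNUBA.

```python
import os
from typing import Any, Callable, Optional

DEFAULT_URL = "http://127.0.0.1:1218"  # assumption: stands in for settings.SENTRY_SNUBA


class FakePool:
    """Hypothetical stand-in for a urllib3 connection pool bound to one URL."""

    def __init__(self, url: str) -> None:
        self.url = url

    def urlopen(self, method: str, path: str, **kwargs: Any) -> str:
        # A real pool would perform the request; this just reports the target.
        return f"{method} {self.url}{path}"


class LazyPool:
    """Builds the inner pool on first use and rebuilds it when the URL changes."""

    def __init__(self, factory: Callable[[str], Any], get_default: Callable[[], str]) -> None:
        self._factory = factory
        self._get_default = get_default
        self._url: Optional[str] = None
        self._pool: Any = None

    def _current_url(self) -> str:
        # The env var, when set, takes precedence over the settings value.
        return os.environ.get("_SENTRY_SNUBA_POOL_URL") or self._get_default()

    def _get_pool(self) -> Any:
        url = self._current_url()
        if self._pool is None or url != self._url:
            self._pool = self._factory(url)
            self._url = url
        return self._pool

    def urlopen(self, method: str, path: str, **kwargs: Any) -> Any:
        # Same call shape as the pool it wraps, so callers need no changes.
        return self._get_pool().urlopen(method, path, **kwargs)


# Creating the proxy at import time is cheap: no pool exists until first use.
pool = LazyPool(FakePool, lambda: DEFAULT_URL)
```

Because the URL is resolved on every call, a worker that sets the env var after import still gets a pool bound to its own port.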

src/sentry/testutils/pytest/xdist.py

  • Rewrite get_snuba_url() to read PYTEST_XDIST_WORKER and SNUBA_PORT_BASE at call time rather than import time, so the correct URL is returned after the worker's env is set up.
  • Switch Redis DB allocation from a hard cap of 7 workers to round-robin across 7 slots, so runs with more than 7 workers no longer crash.
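As a rough sketch of the two changes above, the helpers below read the environment at call time and map a worker index onto one of 7 slots. The host, the default port base of 1230, the `None` return outside xdist, and the helper names `worker_index` and `redis_db_slot` are assumptions for illustration; how a slot maps to an actual Redis DB number is not shown here. The `gwN` format is the worker id that pytest-xdist exports in PYTEST_XDIST_WORKER.

```python
import os
import re
from typing import Optional

REDIS_DB_SLOTS = 7  # slot count taken from the description above


def worker_index() -> Optional[int]:
    """Return N for an xdist worker id like "gwN", or None outside xdist."""
    worker_id = os.environ.get("PYTEST_XDIST_WORKER", "")
    match = re.fullmatch(r"gw(\d+)", worker_id)
    return int(match.group(1)) if match else None


def get_snuba_url() -> Optional[str]:
    """Compute the per-worker URL from the environment at call time."""
    index = worker_index()
    if index is None:
        return None  # assumption: the caller falls back to its default URL
    base = int(os.environ.get("SNUBA_PORT_BASE", "1230"))  # default is an assumption
    return f"http://127.0.0.1:{base + index}"  # host is an assumption


def redis_db_slot(index: int) -> int:
    """Round-robin: workers beyond the slot count wrap around instead of failing."""
    return index % REDIS_DB_SLOTS
```

Nothing is cached at import time, so the result follows whatever the worker's environment contains at the moment of the call.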

src/sentry/testutils/pytest/sentry.py

  • In configure_for_worker (xdist) and pytest_runtest_setup, write the per-worker Snuba URL into both settings.SENTRY_SNUBA and os.environ["_SENTRY_SNUBA_POOL_URL"]. The env var survives override_settings() resets that would otherwise revert the URL.
  • Force a pool rebuild when the URL doesn't match the proxy's cached URL.
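To illustrate why the URL is written to both places, here is a small simulation. It does not use Django: `settings` is a plain namespace, and `override` is a simplified, hypothetical stand-in that snapshots all values on entry and restores them on exit, which is enough to reproduce the revert described above. Django's real override_settings differs in detail.

```python
import os
from contextlib import contextmanager
from types import SimpleNamespace

# Stand-in for Django settings, starting at the default URL.
settings = SimpleNamespace(SENTRY_SNUBA="http://127.0.0.1:1218")


@contextmanager
def override(**overrides):
    """Simplified stand-in: restore every setting to its entry value on exit."""
    saved = dict(vars(settings))
    vars(settings).update(overrides)
    try:
        yield
    finally:
        vars(settings).clear()
        vars(settings).update(saved)


def configure_worker(url: str) -> None:
    # Write the per-worker URL to both the settings object and the env var.
    settings.SENTRY_SNUBA = url
    os.environ["_SENTRY_SNUBA_POOL_URL"] = url


def effective_url() -> str:
    # Same precedence as the proxy: env var first, then settings.
    return os.environ.get("_SENTRY_SNUBA_POOL_URL") or settings.SENTRY_SNUBA


# The worker URL is set while an override is active, then the override exits.
with override(DEBUG=True):
    configure_worker("http://127.0.0.1:1231")

reverted = settings.SENTRY_SNUBA  # back to the default after the context exits
resolved = effective_url()        # still the per-worker URL, via the env var
```

In this simulation the settings value is silently restored, while the env var keeps the per-worker URL available to anything that resolves the URL through it.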

Test plan

Verified by running the shuffle-tests-across-shards workflow with 16 shards × 3 xdist workers: Snuba connection errors disappeared after this change.

In xdist, each worker starts its own Snuba container on a different port.
The module-level _snuba_pool was created eagerly at import time — before
pytest_configure ran and set the per-worker SENTRY_SNUBA URL — so all
workers shared the default pool pointing at the wrong port.

Changes:
- snuba.py: Replace the eager module-level pool with a _SnubaPool proxy
  that rebuilds the pool on first use (or when the URL changes). Reads
  os.environ["_SENTRY_SNUBA_POOL_URL"] so it survives Django
  override_settings() resets that restore settings.SENTRY_SNUBA.
- pytest/xdist.py: Rewrite get_snuba_url() to read env vars at call time
  (not import time) and parse PYTEST_XDIST_WORKER dynamically. Also
  switch Redis DB allocation to round-robin so > 7 workers don't crash.
- pytest/sentry.py: Write the per-worker URL into both settings and
  os.environ["_SENTRY_SNUBA_POOL_URL"] in pytest_configure and
  pytest_runtest_setup, and force a pool rebuild when the URL changes.
@github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) on Apr 13, 2026