feat(replays): Add memory-efficient project and project option query caches by cmanallen · Pull Request #101606 · getsentry/sentry

cmanallen · 2025-10-16T16:53:40Z

get_from_cache caches results in a thread-local. For our current configuration, this means eight copies of the same project model (and project-option model). Additionally, get_from_cache is very coarse: it fetches fields we don't need. By caching just the boolean values we care about, we can minimize the footprint of each query on our overall memory usage. Since we need to cache a lot of projects and project-options, this is important for maintaining stable memory usage.

Total memory should be reduced by O(8n), where n is the sum of two deltas: the size of a project model versus a boolean, and the size of a project-option model versus a tuple of booleans.

A word on the AutoCache. AutoCache is safe but not logically atomic. We defer to last-writer-wins and potentially duplicate work. This could be improved, but we don't expect the results of the fn argument to produce an effect or be non-deterministic, at least for our current case. However, it might be wise to implement better locking behavior in AutoCache.__getitem__ so we don't unnecessarily compute a project or project-option query multiple times. This is easy enough to do, but I think we've done enough already for this pull, so we can address it in a follow-up!
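To make the trade-off concrete, here is a minimal sketch of the two behaviors described above. It is not the AutoCache from this pull request: the class names `LastWriterWinsCache` and `LockedCache` are hypothetical, and the only detail taken from the description is that a missing key is computed by calling an `fn` argument inside `__getitem__`.

```python
import threading
from collections.abc import Callable
from typing import Generic, TypeVar

K = TypeVar("K")
V = TypeVar("V")


class LastWriterWinsCache(Generic[K, V]):
    """Hypothetical stand-in for AutoCache: misses are filled by calling fn."""

    def __init__(self, fn: Callable[[K], V]) -> None:
        self._fn = fn
        self._data: dict[K, V] = {}

    def __getitem__(self, key: K) -> V:
        try:
            return self._data[key]
        except KeyError:
            # Two threads can both miss and both call fn here. The later
            # assignment overwrites the earlier one, which is harmless as
            # long as fn is deterministic and has no side effects.
            value = self._fn(key)
            self._data[key] = value
            return value


class LockedCache(LastWriterWinsCache[K, V]):
    """Variant that computes each missing key at most once."""

    def __init__(self, fn: Callable[[K], V]) -> None:
        super().__init__(fn)
        self._lock = threading.Lock()

    def __getitem__(self, key: K) -> V:
        try:
            return self._data[key]  # fast path, no lock on a hit
        except KeyError:
            pass
        with self._lock:
            # Re-check under the lock: another thread may have filled the
            # key while this one was waiting.
            if key not in self._data:
                self._data[key] = self._fn(key)
            return self._data[key]
```

A single lock serializes every miss, including misses for unrelated keys. Per-key locks would avoid that at the cost of more bookkeeping, which may or may not be worth it for the follow-up.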

A context object is now being passed around the consumer. This is better for testing than using globals. There are more effects that could be moved out of the processing logic and into the context object. This would make our consumer significantly easier to test and require fewer mocks.
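As an illustration of why a context object is easier to test than globals, here is a small sketch. The `ProcessingContext` class, its two fields, and `describe_project` are hypothetical names chosen for this example and are not taken from the pull request; the only assumption carried over is that the caches return a boolean per project and a tuple of booleans per project-option lookup.

```python
from collections.abc import Callable
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessingContext:
    # Hypothetical fields: each is a plain callable keyed by project id,
    # so a test can supply a lambda instead of mocking a module global.
    has_sent_replays: Callable[[int], bool]
    options: Callable[[int], tuple[bool, bool]]


def describe_project(ctx: ProcessingContext, project_id: int) -> dict[str, bool]:
    """Hypothetical processing step that reads only from the context."""
    option_a, option_b = ctx.options(project_id)
    return {
        "has_sent_replays": ctx.has_sent_replays(project_id),
        "option_a": option_a,
        "option_b": option_b,
    }


# In a test, the context is built from in-memory fakes; no patching needed.
fake_ctx = ProcessingContext(
    has_sent_replays=lambda project_id: project_id == 1,
    options=lambda project_id: (True, False),
)
result = describe_project(fake_ctx, 1)
```

Because the processing function receives everything it needs as an argument, swapping a real cache for a fake is a constructor call rather than a mock.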

codecov · 2025-10-16T17:12:43Z

❌ 2 Tests Failed:

Tests completed	Failed	Passed	Skipped
41108	2	41106	251

View the top 2 failed test(s) by shortest run time

tests.sentry.replays.integration.consumers.test_recording::test_recording_consumer_invalid_message

Stack Traces | 0.046s run time

.../integration/consumers/test_recording.py:27: in consumer
    ).create_with_partitions(lambda x, force=False: None, {})
.../replays/consumers/recording.py:73: in create_with_partitions
    if options.get("replay.consumer.enable_new_query_caching_system"):
.../sentry/options/manager.py:312: in get
    result = self.store.get(opt, silent=silent)
.../sentry/options/store.py:115: in get
    result = self.get_store(key, silent=silent)
.../sentry/options/store.py:215: in get_store
    value = self.model.objects.get(key=key.name).value
.venv/lib/python3.13.../db/models/manager.py:87: in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
.venv/lib/python3.13.../db/models/query.py:629: in get
    num = len(clone)
.venv/lib/python3.13.../db/models/query.py:366: in __len__
    self._fetch_all()
.venv/lib/python3.13.../db/models/query.py:1945: in _fetch_all
    self._result_cache = list(self._iterable_class(self))
.venv/lib/python3.13.../db/models/query.py:91: in __iter__
    results = compiler.execute_sql(
.venv/lib/python3.13.../models/sql/compiler.py:1621: in execute_sql
    cursor = self.connection.cursor()
.venv/lib/python3.13.../django/utils/asyncio.py:26: in inner
    return func(*args, **kwargs)
.venv/lib/python3.13.../backends/base/base.py:320: in cursor
    return self._cursor()
.../db/postgres/decorators.py:38: in inner
    return func(self, *args, **kwargs)
.../db/postgres/base.py:114: in _cursor
    return super()._cursor()
.venv/lib/python3.13.../backends/base/base.py:296: in _cursor
    self.ensure_connection()
E   RuntimeError: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.

tests.sentry.replays.integration.consumers.test_recording::test_recording_consumer

Stack Traces | 0.048s run time

.../integration/consumers/test_recording.py:27: in consumer
    ).create_with_partitions(lambda x, force=False: None, {})
.../replays/consumers/recording.py:73: in create_with_partitions
    if options.get("replay.consumer.enable_new_query_caching_system"):
.../sentry/options/manager.py:312: in get
    result = self.store.get(opt, silent=silent)
.../sentry/options/store.py:115: in get
    result = self.get_store(key, silent=silent)
.../sentry/options/store.py:215: in get_store
    value = self.model.objects.get(key=key.name).value
.venv/lib/python3.13.../db/models/manager.py:87: in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
.venv/lib/python3.13.../db/models/query.py:629: in get
    num = len(clone)
.venv/lib/python3.13.../db/models/query.py:366: in __len__
    self._fetch_all()
.venv/lib/python3.13.../db/models/query.py:1945: in _fetch_all
    self._result_cache = list(self._iterable_class(self))
.venv/lib/python3.13.../db/models/query.py:91: in __iter__
    results = compiler.execute_sql(
.venv/lib/python3.13.../models/sql/compiler.py:1621: in execute_sql
    cursor = self.connection.cursor()
.venv/lib/python3.13.../django/utils/asyncio.py:26: in inner
    return func(*args, **kwargs)
.venv/lib/python3.13.../backends/base/base.py:320: in cursor
    return self._cursor()
.../db/postgres/decorators.py:38: in inner
    return func(self, *args, **kwargs)
.../db/postgres/base.py:114: in _cursor
    return super()._cursor()
.venv/lib/python3.13.../backends/base/base.py:296: in _cursor
    self.ensure_connection()
E   RuntimeError: Database access not allowed, use the "django_db" mark, or the "db" or "transactional_db" fixtures to enable it.

srest2021

I think these two tests are failing with db access errors because options.get("replay.consumer.enable_new_query_caching_system") in ProcessReplayRecordingStrategyFactory.create_with_partitions() isn't being correctly patched by the existing options_get mock:

tests/sentry/replays/integration/consumers/test_recording.py::test_recording_consumer_invalid_message

tests/sentry/replays/integration/consumers/test_recording.py::test_recording_consumer

You can patch options.get() in the consumer fixture to make sure the tests pass.

Approving b/c I patched it locally to return True and the tests passed!
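A rough sketch of the kind of patch described above, using stand-ins so it runs on its own. The `options` object and `create_with_partitions` function here are simplified placeholders for `sentry.options` and the factory method, and `build_consumer_with_patched_option` is a hypothetical helper; in the real test this would live in the `consumer` pytest fixture, and the exact patch target is an assumption.

```python
from types import SimpleNamespace
from unittest import mock


def _db_backed_get(key):
    # Stand-in for the real lookup, which reaches the database and fails
    # in tests that do not request database access.
    raise RuntimeError("Database access not allowed")


# Placeholder for the options module used by the consumer.
options = SimpleNamespace(get=_db_backed_get)


def create_with_partitions():
    # Placeholder for the factory method that reads the feature option.
    if options.get("replay.consumer.enable_new_query_caching_system"):
        return "new-cache-path"
    return "legacy-path"


def build_consumer_with_patched_option():
    # Patch the lookup only for the duration of consumer construction, so
    # the option read never touches the database.
    with mock.patch.object(options, "get", return_value=True):
        return create_with_partitions()
```

Returning `True` from the patch exercises the new caching path, which matches what was tried locally. The patch is undone when the `with` block exits, so other tests still see the original lookup.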

srest2021 · 2025-10-17T06:37:10Z

+
+    # We're intentionally looking up the options manually. This avoids the project-options
+    # local cache that exists on the preferred interface methods.
+    options = ProjectOption.objects.filter(


This will return false for both options if the project doesn't exist. _has_replays_lookup() will raise before we get here, but it might be good to add the same raise DropEvent() behavior here to make the expectation that the project exists explicit.

That's intentional. I'm not sure if an absence of the project-options implies the absence of the project, and I don't want to query to find out (another motivation for this PR is to reduce the number of times PgBouncer rejects our queries, and an extra query would generally be bad for throughput).

In practice, Relay drops events from deleted projects, so what we would be catching here are poorly timed deletions. Raising here would prevent an unnecessary publish to the issues platform, but it's not a big deal if we occasionally push bad data. They validate it regardless.
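The ambiguity discussed in this thread can be sketched as follows. The option keys and the `flags_from_rows` helper are hypothetical, and the rows stand in for whatever the filtered ProjectOption query returns; the point is only that an empty result maps to two false values whether the project is missing or simply has no options set.

```python
# Hypothetical option keys; the real keys are not shown in this thread.
KEY_A = "hypothetical:option_a"
KEY_B = "hypothetical:option_b"


def flags_from_rows(rows: list[tuple[str, bool]]) -> tuple[bool, bool]:
    """Collapse (key, value) rows into a tuple of booleans.

    Missing keys default to False, so a deleted project and a project with
    no options set produce the same result without an extra query.
    """
    values = dict(rows)
    return (bool(values.get(KEY_A, False)), bool(values.get(KEY_B, False)))


no_rows = flags_from_rows([])
one_row = flags_from_rows([(KEY_A, True)])
```

Distinguishing the two cases would need a second query against the project table, which is the cost this thread decides not to pay.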

cmanallen · 2025-10-17T13:28:44Z

Thank you @srest2021 for the notes on patching the fixture!

sentry · 2025-10-18T08:56:50Z

Issues attributed to commits in this pull request

This pull request was merged and Sentry observed the following issues:

‼️ BadGateway: POST https://storage.googleapis.com/up... in prod
‼️ InternalServerError: POST https://storage.googleap... in prod
‼️ TransportError: Failed to retrieve http://metadata... in prod
‼️ Forbidden: POST https://storage.googleapis.com/upl... in prod

cmanallen added 4 commits October 15, 2025 20:12

Only query for project if segment 0 and search for project options directly

34f4297

Add new cache facades

1153acc

Add cache implementation

d67e741

Fix impl

e83369f

cmanallen requested a review from a team as a code owner October 16, 2025 16:53

github-actions Bot added the Scope: Backend Automatically applied to PRs that change backend components label Oct 16, 2025

seer-by-sentry Bot reviewed Oct 16, 2025

Comment thread src/sentry/replays/usecases/ingest/cache.py Outdated

Comment thread src/sentry/replays/usecases/ingest/cache.py Outdated

Comment thread src/sentry/replays/usecases/ingest/__init__.py Outdated

Comment thread src/sentry/replays/usecases/ingest/event_logger.py Outdated

cmanallen added 3 commits October 16, 2025 12:56

Add test coverage

f43adfb

Fix types

05a786c

Merge branch 'cmanallen/replays-refactor-project-query' into cmanallen/replays-project-query-cache

fa9c192

vercel Bot deployed to Preview October 16, 2025 19:19 View deployment

cmanallen added 2 commits October 16, 2025 14:20

Use project_id

57e399a

Specify id as a kwarg

43d7163

vercel Bot deployed to Preview October 16, 2025 19:26 View deployment

Add option cache coverage and fix defects

77b76ed

vercel Bot deployed to Preview October 16, 2025 19:33 View deployment

cmanallen added 2 commits October 16, 2025 14:43

Add test coverage and fix implementation

75ea406

Merge branch 'master' into cmanallen/replays-project-query-cache

479a270

vercel Bot deployed to Preview October 16, 2025 19:47 View deployment

Add E2E test with some assertions on behavior

87bc650

vercel Bot deployed to Preview October 16, 2025 20:12 View deployment

Add missing param

c4d57e8

vercel Bot deployed to Preview October 16, 2025 20:15 View deployment

Add type hints

a9422c1

vercel Bot deployed to Preview October 16, 2025 20:21 View deployment

Add explicit coverage for options_cache

5c70df4

vercel Bot deployed to Preview October 16, 2025 20:37 View deployment

Pass caches as arguments to the consumer instead of globals

9c32e71

vercel Bot deployed to Preview October 17, 2025 00:52 View deployment

Make it clearer what is being done

665dff5

srest2021 approved these changes Oct 17, 2025

Patch option in consumer fixture

7f2024a

vercel Bot deployed to Preview October 17, 2025 13:30 View deployment

Add type hint

4383f49

vercel Bot deployed to Preview October 17, 2025 15:01 View deployment

cmanallen merged commit 808f085 into master Oct 17, 2025
69 checks passed

cmanallen deleted the cmanallen/replays-project-query-cache branch October 17, 2025 15:24

srest2021 added a commit that referenced this pull request Oct 20, 2025

Revert "feat(replays): Add memory-efficient project and project option query caches (#101606)"

eff6b54

This reverts commit 808f085.

srest2021 mentioned this pull request Oct 20, 2025

fix(replay): revert project options cache & project options direct querying #101826

Closed

github-actions Bot locked and limited conversation to collaborators Nov 11, 2025
