feat(upsampling) - Support upsampled error count with performance optimizations #3
base: master
Conversation
…(#94376) Part of the Error Upsampling project: https://www.notion.so/sentry/Tech-Spec-Error-Up-Sampling-1e58b10e4b5d80af855cf3b992f75894?source=copy_link

The events-stats API will now check whether all projects in the query are allowlisted for upsampling and, if so, convert the count query to a sum over `sample_weight` in Snuba. This is done by defining a new SnQL function, `upsampled_count()`. I noticed there are also eps() and epm() functions in use in this endpoint. I considered (and even started work on) swapping eps() and epm() as well, since for correctness they should probably also use `sample_weight` rather than counting naively, but this caused some complications. Because they are only used by specific dashboard widgets and are not available in Discover, I decided to defer changing them until we know it is needed.
- Add a 60-second cache for upsampling eligibility checks to improve performance
- Separate the upsampling eligibility check from query transformation for better optimization
- Remove unnecessary null checks in the upsampled_count() function per schema requirements
- Add cache invalidation utilities for configuration management

This improves performance during high-traffic periods by avoiding repeated, expensive allowlist lookups while maintaining data consistency.
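For context, a minimal standalone sketch of what `upsampled_count()` compiles to in SnQL, using the `snuba_sdk` expression types and mirroring the dataset definition reviewed further down (the wrapper function here is illustrative only):

```python
from snuba_sdk import Column, Function

# Sketch: upsampled_count() compiles to a sum over sample_weight, cast to
# Int64 (see the dataset definition and the truncation discussion below).
def upsampled_count_snql(alias: str) -> Function:
    return Function(
        "toInt64",
        [Function("sum", [Column("sample_weight")])],
        alias,
    )
```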
Walkthrough

This change introduces error upsampling support for event statistics queries in Sentry. It adds helper functions for upsampling eligibility, query column transformation, and cache management, and integrates these into the query logic. The Discover dataset gains a new `upsampled_count()` SnQL function.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant APIEndpoint
    participant ErrorUpsamplingHelper
    participant DiscoverDataset
    Client->>APIEndpoint: Request event stats (with query)
    APIEndpoint->>ErrorUpsamplingHelper: Check if upsampling applies (project, dataset, request)
    ErrorUpsamplingHelper->>ErrorUpsamplingHelper: Check cache/allowlist, dataset, query
    ErrorUpsamplingHelper-->>APIEndpoint: Return upsampling eligibility
    APIEndpoint->>ErrorUpsamplingHelper: Transform query columns if eligible
    APIEndpoint->>DiscoverDataset: Execute query (with transformed columns)
    DiscoverDataset-->>APIEndpoint: Return results
    APIEndpoint-->>Client: Respond with stats (upsampled if eligible)
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 2
🧹 Nitpick comments (3)
src/sentry/testutils/factories.py (1)
344-358: Improve exception handling as suggested by static analysis.

The function correctly implements the sample rate extraction logic, but the exception handling can be improved.
Apply this diff to use `contextlib.suppress` as suggested by static analysis:

```diff
 def _set_sample_rate_from_error_sampling(normalized_data: MutableMapping[str, Any]) -> None:
     """Set 'sample_rate' on normalized_data if contexts.error_sampling.client_sample_rate is present and valid."""
     client_sample_rate = None
-    try:
+    with contextlib.suppress(Exception):
         client_sample_rate = (
             normalized_data.get("contexts", {}).get("error_sampling", {}).get("client_sample_rate")
         )
-    except Exception:
-        pass
     if client_sample_rate:
-        try:
+        with contextlib.suppress(Exception):
             normalized_data["sample_rate"] = float(client_sample_rate)
-        except Exception:
-            pass
```

This follows the static analysis suggestion and makes the code more concise while maintaining the same behavior.
src/sentry/api/helpers/error_upsampling.py (2)
67-75: Consider extracting cache key generation to avoid duplication.

The cache invalidation logic is correct, but the cache key generation is duplicated between this function and `is_errors_query_for_error_upsampled_projects`.

Extract cache key generation to a helper function:

```diff
+def _get_upsampling_cache_key(organization_id: int, project_ids: Sequence[int]) -> str:
+    """Generate cache key for upsampling eligibility."""
+    return f"error_upsampling_eligible:{organization_id}:{hash(tuple(sorted(project_ids)))}"
+
 def invalidate_upsampling_cache(organization_id: int, project_ids: Sequence[int]) -> None:
     """
     Invalidate the upsampling eligibility cache for the given organization and projects.
     This should be called when the allowlist configuration changes to ensure cache
     consistency across the system.
     """
-    cache_key = f"error_upsampling_eligible:{organization_id}:{hash(tuple(sorted(project_ids)))}"
+    cache_key = _get_upsampling_cache_key(organization_id, project_ids)
     cache.delete(cache_key)
```

Then update the main function to use the helper as well.
137-140: Simplify the return logic.

The conditional can be simplified by returning the boolean expression directly.

```diff
-    if "event.type:error" in query:
-        return True
-
-    return False
+    return "event.type:error" in query
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)

- pyproject.toml (2 hunks)
- sentry-repo (1 hunks)
- src/sentry/api/endpoints/organization_events_stats.py (6 hunks)
- src/sentry/api/helpers/error_upsampling.py (1 hunks)
- src/sentry/search/events/datasets/discover.py (1 hunks)
- src/sentry/testutils/factories.py (3 hunks)
- tests/sentry/api/helpers/test_error_upsampling.py (1 hunks)
- tests/snuba/api/endpoints/test_organization_events_stats.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
src/sentry/testutils/factories.py

347-352: Use `contextlib.suppress(Exception)` instead of `try`-`except`-`pass`. Replace with `contextlib.suppress(Exception)` (SIM105)

354-357: Use `contextlib.suppress(Exception)` instead of `try`-`except`-`pass`. Replace with `contextlib.suppress(Exception)` (SIM105)

src/sentry/api/helpers/error_upsampling.py

137-140: Return the condition `"event.type:error" in query` directly. Replace with `return "event.type:error" in query` (SIM103)
🔇 Additional comments (24)
sentry-repo (1)

1-1: Submodule pointer update – ensure the referenced commit is accessible

The file just pins the submodule to commit `a5d290951def84afdcc4c88d2f1f20023fc36e2a`. Double-check that:

- The commit exists in the submodule's remote repo.
- The CI/CD runner has access rights to fetch that commit (e.g., correct SSH keys / tokens are configured).

A missing or private commit will break checkouts on fresh clones.
pyproject.toml (1)

176-176: Addition looks correct

Including the new helper module in the "stronger typing" section keeps type-checking strict for the newly-added code. No concerns.
src/sentry/testutils/factories.py (2)

11-11: LGTM!

The import addition is correctly placed and necessary for the type annotation in the new helper function.

1048-1050: LGTM!

The integration is well-placed in the event storage workflow. Calling the helper function after normalization and before saving ensures that the sample rate from error sampling contexts is properly set on stored events for downstream upsampling logic.
src/sentry/api/endpoints/organization_events_stats.py (4)

14-17: LGTM!

The import additions are necessary for the error upsampling feature and follow proper naming conventions.

124-124: LGTM!

Good correction to improve the grammar and clarity of the error message.

218-228: LGTM!

The upsampling eligibility logic is well-implemented, with good performance optimization through early caching. The separation between eligibility checking and transformation application is clean, and the comments clearly explain the rationale.

230-234: LGTM!

The upsampling transformation is consistently applied across all query execution paths (top events, RPC timeseries, and standard timeseries), as sketched below. The late transformation approach ensures current schema assumptions are used, and the uniform pattern makes the code maintainable.

Also applies to: 239-239, 254-254, 275-278, 282-282, 294-297, 299-299
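As a rough sketch of that pattern (helper names come from this PR; the surrounding endpoint code is simplified and hypothetical):

```python
from sentry.api.helpers.error_upsampling import (  # helper module added in this PR
    is_errors_query_for_error_upsampled_projects,
    transform_query_columns_for_error_upsampling,
)

# Eligibility is checked once up front; the transform is then applied
# uniformly on each execution path (top events, RPC, standard timeseries).
def get_event_stats(request, organization, projects, query_columns):
    should_upsample = is_errors_query_for_error_upsampled_projects(
        organization.id, [p.id for p in projects], request
    )
    if should_upsample:
        # Late transformation: count() -> "upsampled_count() as count"
        query_columns = transform_query_columns_for_error_upsampling(query_columns)
    return run_timeseries_query(query_columns)  # hypothetical executor
```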
tests/sentry/api/helpers/test_error_upsampling.py (6)
1-16: LGTM! Clean and comprehensive imports.

The imports are well-organized and include all necessary modules for testing the error upsampling helper functions.
19-36: Well-structured test setup.

The test class setup properly initializes all necessary test fixtures, including organization, projects, and mock request objects. The use of `TestCase` and proper fixture creation follows Django testing best practices.
37-53: Excellent test coverage for project allowlist checking.

The test method comprehensively covers all scenarios for `_are_all_projects_error_upsampled`:

- All projects allowlisted (positive case)
- Partial allowlist (negative case)
- Empty allowlist (negative case)
- Empty project list (edge case)

The use of mocking and the clear test structure make this very maintainable.
54-76: Thorough testing of column transformation logic.

The test method effectively validates the `transform_query_columns_for_error_upsampling` function with excellent edge case coverage:

- Basic transformation: `count()` → `upsampled_count() as count`
- Case insensitivity handling
- Whitespace normalization
- Preservation of non-count columns

This ensures robust handling of various input formats.
77-89: Good coverage of query type detection.

The test method properly validates `_is_error_focused_query` with appropriate test cases:

- Explicit error type query (positive case)
- Explicit transaction type query (negative case)
- Empty query (default case)

The use of `QueryDict` accurately simulates HTTP request parameters.
90-102: Comprehensive testing of sample weight transform logic.

The test method effectively validates `_should_apply_sample_weight_transform` across different scenarios:

- Errors dataset: always applies the transform
- Transactions dataset: never applies the transform
- Discover dataset: applies the transform based on query type (error vs. transaction)

The test logic correctly captures the conditional behavior for different dataset types; a sketch of that gate follows this list.
tests/snuba/api/endpoints/test_organization_events_stats.py (5)
3554-3556: LGTM! Clear test class definition for the error upsampling feature.

The test class follows the established pattern in the file, with proper inheritance and endpoint definition.
3557-3603: Well-structured test setup with comprehensive event creation.

The setup method properly creates the necessary test data, including:

- Two projects and users for testing cross-project scenarios
- Error events with an `error_sampling` context containing `client_sample_rate: 0.1`
- Proper event storage and synchronization with `wait_for_event_count`

The test data setup aligns well with the upsampling feature requirements.
3629-3653: Good test coverage for the partial allowlist scenario.

This test correctly verifies that upsampling is only applied when ALL projects in the query are allowlisted. When only one project is allowlisted but the query includes both projects, it falls back to regular counting.

The assertion comments accurately reflect the expected behavior.
3654-3698: Excellent test for transaction event filtering.

This test properly verifies that upsampling is only applied to error events, not transaction events, even when projects are allowlisted. The transaction event creation includes all required fields for a valid transaction.

The test correctly expects regular count behavior for transaction queries.
3699-3723: Complete test coverage for the empty allowlist scenario.

This test ensures that when no projects are allowlisted, regular counting is used regardless of the presence of error sampling context. Good edge case coverage.
src/sentry/api/helpers/error_upsampling.py (5)
1-11: LGTM - Clean and appropriate imports.

The imports are well-organized and all necessary for the module's functionality.
13-41: Well-implemented caching strategy for performance optimization.

The function correctly implements a caching layer to avoid expensive repeated allowlist lookups. The cache key design, using the organization ID and hashed project IDs, is sound and deterministic.
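A minimal sketch of that caching layer, assuming Django's cache framework and the 60-second TTL from the PR description. The key format mirrors the review diff above, and `_are_all_projects_error_upsampled` is the allowlist helper named in the tests; the dataset and query checks that also feed eligibility are omitted here:

```python
from collections.abc import Sequence
from django.core.cache import cache

def _cached_upsampling_eligibility(organization_id: int, project_ids: Sequence[int]) -> bool:
    cache_key = f"error_upsampling_eligible:{organization_id}:{hash(tuple(sorted(project_ids)))}"
    cached = cache.get(cache_key)
    if cached is not None:
        return cached  # both True and False results are cached
    eligible = _are_all_projects_error_upsampled(project_ids)  # expensive allowlist lookup
    cache.set(cache_key, eligible, timeout=60)  # 60s TTL per the PR description
    return eligible
```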
43-65: LGTM - Robust allowlist validation logic.

The function correctly handles edge cases (empty inputs) and uses an efficient `all()` check to ensure every project is allowlisted. The documentation about configuration changes is helpful.
77-101: Conservative and safe query transformation logic.

The function correctly transforms only exact `count()` matches to `upsampled_count()` while preserving other columns. The case-insensitive comparison with `strip()` handles common variations safely.
103-128: Clear dataset-specific logic with appropriate safeguards.

The function correctly handles different dataset types with sensible defaults. Local imports are appropriate to avoid circular dependencies, and the fallback to `False` for unknown datasets is a safe choice.
```python
SnQLFunction(
    "upsampled_count",
    required_args=[],
    # Optimized aggregation for error upsampling - assumes sample_weight
    # exists for all events in allowlisted projects as per schema design
    snql_aggregate=lambda args, alias: Function(
        "toInt64",
        [Function("sum", [Column("sample_weight")])],
        alias,
    ),
    default_result_type="number",
),
```
`toInt64(sum(sample_weight))` risks silent truncation of fractional weights

`sample_weight` is typically a floating value when upsampling by an inverse sampling rate (e.g. 0.1, 2.0). Casting `sum(sample_weight)` to `Int64` drops the fractional part and under-counts events whenever weights are not whole numbers.
```diff
-        snql_aggregate=lambda args, alias: Function(
-            "toInt64",
-            [Function("sum", [Column("sample_weight")])],
-            alias,
-        ),
-        default_result_type="number",
+        # Keep the aggregation in floating precision to avoid truncation.
+        snql_aggregate=lambda _args, alias: Function(
+            "sum",
+            [Column("sample_weight")],
+            alias,
+        ),
+        # Result can be fractional, keep as "number" (float) or adopt
+        # "float" / "decimal" result_type if the type helper supports it.
+        default_result_type="number",
```
If consumers need an integer, consider `round()` or an explicit `toUInt64` after verifying the weights are always integral.
📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test and benchmark the code to ensure it meets the requirements.

```python
SnQLFunction(
    "upsampled_count",
    required_args=[],
    # Optimized aggregation for error upsampling - assumes sample_weight
    # exists for all events in allowlisted projects as per schema design
    # Keep the aggregation in floating precision to avoid truncation.
    snql_aggregate=lambda _args, alias: Function(
        "sum",
        [Column("sample_weight")],
        alias,
    ),
    # Result can be fractional, keep as "number" (float) or adopt
    # "float" / "decimal" result_type if the type helper supports it.
    default_result_type="number",
),
```
🤖 Prompt for AI Agents
In src/sentry/search/events/datasets/discover.py around lines 1041 to 1052, the
current use of toInt64(sum(sample_weight)) truncates fractional parts of
sample_weight, causing under-counting when weights are non-integers. To fix
this, replace toInt64 with a rounding function like round() applied to
sum(sample_weight) before casting, or use toUInt64 after confirming weights are
integral, ensuring fractional weights are properly accounted for without silent
truncation.
```python
@mock.patch("sentry.api.helpers.error_upsampling.options")
def test_error_upsampling_with_allowlisted_projects(self, mock_options):
    # Set up allowlisted projects
    mock_options.get.return_value = [self.project.id, self.project2.id]

    # Test with count() aggregation
    response = self.client.get(
        self.url,
        data={
            "start": self.day_ago.isoformat(),
            "end": (self.day_ago + timedelta(hours=2)).isoformat(),
            "interval": "1h",
            "yAxis": "count()",
            "query": "event.type:error",
            "project": [self.project.id, self.project2.id],
        },
        format="json",
    )

    assert response.status_code == 200, response.content
    data = response.data["data"]
    assert len(data) == 2  # Two time buckets
    assert data[0][1][0]["count"] == 10  # First bucket has 1 event
    assert data[1][1][0]["count"] == 10  # Second bucket has 1 event
```
💡 Verification agent
🧩 Analysis chain
Verify the upsampling calculation logic.
The test expects counts of 10 for both time buckets, which implies a 10x multiplier (1/0.1 = 10). However, the assertion comments suggest there's 1 event in each bucket, but the expected count is 10.
Please verify that:
- The upsampling multiplier calculation (1/client_sample_rate = 1/0.1 = 10) is correct
- The event distribution across time buckets is as expected
🏁 Script executed:

```bash
#!/bin/bash
# Verify upsampling logic implementation
echo "Searching for upsampling calculation logic..."
rg -A 10 -B 5 "client_sample_rate|upsampled_count" --type py
```
Length of output: 82975
Clarify upsampling in test assertions

The upsampling logic is correct: each raw event is weighted by 1/client_sample_rate = 1/0.1 = 10, and there is one event in each 1-hour bucket, yielding a count of 10 per bucket. To avoid confusion, update the inline assertion comments to reflect the upsampled count rather than the raw event count:
• tests/snuba/api/endpoints/test_organization_events_stats.py, around lines 3608–3613

```diff
   assert len(data) == 2  # Two time buckets
-  assert data[0][1][0]["count"] == 10  # First bucket has 1 event
+  assert data[0][1][0]["count"] == 10  # First bucket upsampled count: 1 raw event × 10
-  assert data[1][1][0]["count"] == 10  # Second bucket has 1 event
+  assert data[1][1][0]["count"] == 10  # Second bucket upsampled count: 1 raw event × 10
```
No changes to the upsampling logic itself are required.
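The arithmetic behind the expected assertion values, as a quick check:

```python
client_sample_rate = 0.1
sample_weight = 1 / client_sample_rate      # 10.0 per stored event

raw_events_per_bucket = 1                   # one error event per 1-hour bucket
upsampled_count = raw_events_per_bucket * sample_weight
assert upsampled_count == 10                # matches the expected count per bucket
```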
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
@mock.patch("sentry.api.helpers.error_upsampling.options")
def test_error_upsampling_with_allowlisted_projects(self, mock_options):
    # Set up allowlisted projects
    mock_options.get.return_value = [self.project.id, self.project2.id]

    # Test with count() aggregation
    response = self.client.get(
        self.url,
        data={
            "start": self.day_ago.isoformat(),
            "end": (self.day_ago + timedelta(hours=2)).isoformat(),
            "interval": "1h",
            "yAxis": "count()",
            "query": "event.type:error",
            "project": [self.project.id, self.project2.id],
        },
        format="json",
    )

    assert response.status_code == 200, response.content
    data = response.data["data"]
    assert len(data) == 2  # Two time buckets
    assert data[0][1][0]["count"] == 10  # First bucket upsampled count: 1 raw event × 10
    assert data[1][1][0]["count"] == 10  # Second bucket upsampled count: 1 raw event × 10
```
🤖 Prompt for AI Agents
In tests/snuba/api/endpoints/test_organization_events_stats.py around lines 3608
to 3613, update the inline comments on the assertions to clarify that the counts
represent the upsampled event counts (count of 10) rather than raw event counts.
Change comments to indicate that each bucket has 1 raw event upsampled to 10 due
to the client_sample_rate weighting, without modifying the test logic or
assertion values.
Test 3
Summary by CodeRabbit

New Features

Bug Fixes
- `topEvents` parameter input.

Tests

Chores