feat(dynamic-sampling): add recalibration logic to the new per-org pipeline#115786
feat(dynamic-sampling): add recalibration logic to the new per-org pipeline#115786shellmayr wants to merge 9 commits into
Conversation
| @property | ||
| def needs_recalibration(self) -> bool: | ||
| return False | ||
|
|
There was a problem hiding this comment.
We could just use needs_recalibration: bool = False if this never depends on dynamic data
| def should_log(location: str) -> bool: | ||
| if location not in (options.get(LOGGING_LOCATIONS_OPTION) or []): | ||
| return False | ||
|
|
||
| sample_rate = float(options.get(LOGGING_SAMPLE_RATE_OPTION)) | ||
| if sample_rate <= 0.0: | ||
| return False | ||
| if sample_rate >= 1.0: | ||
| return True | ||
| return random.random() < sample_rate |
There was a problem hiding this comment.
I think this is a bit convoluted, we could separate our concerns a bit better by using a logging filter. We could re-use or adapt the SamplingFilter
| if org_volume_1_hour is None: | ||
| return DynamicSamplingStatus.NO_VOLUME | ||
|
|
||
| calculate_recalibration_factor(config, org_volume_5_minutes) |
There was a problem hiding this comment.
Bug: The result of calculate_recalibration_factor is discarded, so the computed recalibration factor is never stored in the cache, rendering the feature ineffective.
Severity: HIGH
Suggested Fix
Store the return value of calculate_recalibration_factor in the cache. This will likely require adding a cache-writing function to the new cache.py module and then calling it from scheduler.py with the computed factor.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/sentry/dynamic_sampling/per_org/tasks/scheduler.py#L122
Potential issue: The function `calculate_recalibration_factor` is called, but its return
value is discarded and never stored in the cache. Consequently, any subsequent attempt
to retrieve the factor, such as with
`per_org_recalibration_cache.get_adjusted_factor()`, will always receive the default
value of `1.0`. This renders the per-organization recalibration feature ineffective and
makes the associated diagnostic logging meaningless, as it will always compare the old
factor to a 'new' factor of `1.0`.
Did we get this right? 👍 / 👎 to inform future reviews.
| if org_volume.indexed is None or org_volume.total == 0: | ||
| raise DynamicSamplingInvalidOrgVolumes |
There was a problem hiding this comment.
Bug: The check to raise DynamicSamplingInvalidOrgVolumes is unreachable because the conditions are already guaranteed to be false by the preceding is_valid_for_recalibration() call, making it dead code.
Severity: LOW
Suggested Fix
Remove the unreachable code block that raises DynamicSamplingInvalidOrgVolumes. The preceding is_valid_for_recalibration() check is sufficient. The DynamicSamplingInvalidOrgVolumes exception class, which becomes unused, can also be removed.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: src/sentry/dynamic_sampling/per_org/calculations.py#L43-L44
Potential issue: In the `calculate_recalibration_factor` function, a check for
`org_volume.is_valid_for_recalibration()` is performed. If this check passes, it
guarantees that `org_volume.total > 0` and `org_volume.indexed` is not `None`. However,
the subsequent lines of code check for `org_volume.indexed is None or org_volume.total
== 0` and attempt to raise a `DynamicSamplingInvalidOrgVolumes` exception. Due to the
prior validation, these conditions can never be met, rendering the exception-raising
code unreachable. This dead code fails to implement the intended safeguard against
impossible data states.
Backend Test FailuresFailures on
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit ef0371a. Configure here.
| if org_volume is None or not org_volume.is_valid_for_recalibration(): | ||
| return None | ||
| if org_volume.indexed is None or org_volume.total == 0: | ||
| raise DynamicSamplingInvalidOrgVolumes |
There was a problem hiding this comment.
Unreachable exception makes guard logic dead code
Low Severity
The check if org_volume.indexed is None or org_volume.total == 0 on line 43 is unreachable. The preceding guard on line 41 calls org_volume.is_valid_for_recalibration(), which returns self.total > 0 and self.indexed is not None and self.indexed > 0. If that method returns True (required to reach line 43), then indexed is guaranteed non-None and total is guaranteed positive. The DynamicSamplingInvalidOrgVolumes exception can therefore never be raised, and the exception class itself is unused dead code.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit ef0371a. Configure here.


Closes TET-2233