Skip to content

Conversation

@ceorourke
Copy link
Member

@ceorourke ceorourke commented Apr 17, 2024

Adds a processor for rules with "slow" conditions that are evaluated up to a minute after the event comes in. This PR only goes as far as getting the rules that we need to fire, the actual firing will happen in a follow up PR.

Addresses part of https://github.com/getsentry/team-core-product-foundations/issues/242

@github-actions github-actions bot added the Scope: Backend Automatically applied to PRs that change backend components label Apr 17, 2024
@codecov
Copy link

codecov bot commented Apr 17, 2024

Codecov Report

Attention: Patch coverage is 88.28125% with 15 lines in your changes are missing coverage. Please review.

Project coverage is 79.65%. Comparing base (781586e) to head (6fa2e8c).
Report is 2 commits behind head on master.

❗ Current head 6fa2e8c differs from pull request most recent head 1ad7747. Consider uploading reports for the commit 1ad7747 to get more accurate results

Additional details and impacted files
@@             Coverage Diff             @@
##           master   #69167       +/-   ##
===========================================
+ Coverage   53.51%   79.65%   +26.14%     
===========================================
  Files        6291     6479      +188     
  Lines      276821   287758    +10937     
  Branches    47739    49596     +1857     
===========================================
+ Hits       148137   229212    +81075     
+ Misses     128196    58109    -70087     
+ Partials      488      437       -51     
Files Coverage Δ
src/sentry/rules/processing/processor.py 91.97% <78.57%> (+70.01%) ⬆️
src/sentry/rules/processing/delayed_processing.py 90.55% <89.47%> (ø)

... and 2853 files with indirect coverage changes


@instrumented_task(
name="sentry.delayed_processing.tasks.apply_delayed",
queue="dynamicsampling",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to use this queue? If not, how can we choose/create a new/different one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the impact of this 😬 -- cc @wedamija

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either specify no queue, or make a new one, not use an existing unrelated one. I always forget what's needed when creating a new queue, ask in ops if we need to do anything to have this queue be assigned workers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By specifying no queue do you mean just not pass it for now?

Copy link
Member

@wedamija wedamija Apr 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, you can just not pass queue at all and it'll go to default

ceorourke added a commit that referenced this pull request Apr 22, 2024
Remove a metric that was put in [4 years
ago](#69167 (comment))
to track snuba query usage that we no longer need.
@ceorourke ceorourke force-pushed the ceorourke/apply_delayed_rules branch from 4fc44b6 to 088b9c7 Compare April 22, 2024 17:05
@ceorourke ceorourke force-pushed the ceorourke/apply_delayed_rules branch from 088b9c7 to 6fa2e8c Compare April 23, 2024 18:36
@ceorourke ceorourke marked this pull request as ready for review April 24, 2024 18:52
@ceorourke ceorourke requested a review from saponifi3d April 24, 2024 18:53
Copy link
Contributor

@saponifi3d saponifi3d left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just the question about the task queue for Dan. Since we're not executing the task / using the queue, we shouldn't need to worry about it as a blocker.

time_limit=60, # 1 minute
silo_mode=SiloMode.REGION,
)
def apply_delayed(project: Project, buffer: RedisBuffer) -> DefaultDict[Rule, set[int]] | None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is sooooo much nicer now 🎉 thanks for all the cleanup.


@instrumented_task(
name="sentry.delayed_processing.tasks.apply_delayed",
queue="dynamicsampling",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure the impact of this 😬 -- cc @wedamija

return rule_type


def split_conditions_and_filters(rule_condition_list):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 this is a great idea to pull it out and make this a composable function.

assert self.rule1 in rules
assert diff_env_rule in rules

def test_apply_delayed_two_rules_one_fires(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i really like how you have each of these these structured 🎉 it's really clear the cases you're testing for and the intended result.

Copy link
Member

@wedamija wedamija left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good, great job on untangling all of this and structuring it to be easy to follow.

Feel free to merge as is and address my feedback in a separate pr


@instrumented_task(
name="sentry.delayed_processing.tasks.apply_delayed",
queue="dynamicsampling",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should either specify no queue, or make a new one, not use an existing unrelated one. I always forget what's needed when creating a new queue, ask in ops if we need to do anything to have this queue be assigned workers.

Comment on lines +151 to +161
for slow_condition in slow_conditions:
if slow_condition:
condition_id = slow_condition.get("id")
condition_interval = slow_condition.get("interval")
target_value = int(str(slow_condition.get("value")))
for condition_data, results in condition_group_results.items():
if (
alert_rule.environment_id == condition_data.environment_id
and condition_id == condition_data.cls_id
and condition_interval == condition_data.interval
):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to take into account whether the rule is ANY or ALL here. When ALL, we should only add to rules_to_fire if all slow conditions pass. This logic works fine as is for ANY.

Might be easier as two pass - so iterate through all the rules and conditions, and find out which ones actually fired. Then for each of those rules, add the groups

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 I think this I'll address in a separate PR since it'll need more test coverage

time_limit=60, # 1 minute
silo_mode=SiloMode.REGION,
)
def apply_delayed(project: Project, buffer: RedisBuffer) -> DefaultDict[Rule, set[int]] | None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we have a return value here? Just for tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is just temporary for now since it's not actually firing the rules yet and makes it easier to test. Once they fire I can check RuleFireHistory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw I think it's fine to return here for testability, it doesn't cause problems with the task or anything

MichaelSun48 pushed a commit that referenced this pull request Apr 25, 2024
Remove a metric that was put in [4 years
ago](#69167 (comment))
to track snuba query usage that we no longer need.
@ceorourke ceorourke merged commit 2bdf2a7 into master Apr 25, 2024
@ceorourke ceorourke deleted the ceorourke/apply_delayed_rules branch April 25, 2024 16:36
ceorourke added a commit that referenced this pull request Apr 26, 2024
Follow up to comments on #69167
to not pass the Redis buffer to `apply_delayed`, and also a minor
comment to rename a function.
ceorourke added a commit that referenced this pull request Apr 29, 2024
Another follow up to #69167,
specifically the [comment about handling any and all action
matches](#69167 (comment))
to check if the rule is for "any" or "all" conditions and only add to
the list to be fired if it matches accordingly.
ceorourke added a commit that referenced this pull request May 2, 2024
@github-actions github-actions bot locked and limited conversation to collaborators May 16, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Scope: Backend Automatically applied to PRs that change backend components

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants