Ability run an alert immediately #50215

Closed
mikecote opened this issue Nov 11, 2019 · 16 comments · Fixed by #139848
Labels
Feature:Alerting R&D Research and development ticket (not meant to produce code, but to make a decision) Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@mikecote
Contributor

mikecote commented Nov 11, 2019

Depends on #50214.

Maybe create an execute API similar to actions.

@elasticmachine
Contributor

Pinging @elastic/kibana-stack-services (Team:Stack Services)

@pmuellr
Member

pmuellr commented Nov 13, 2019

Not sure we even need to use TM to do this - we could in theory just run the code in the Kibana process that requested it be run. I think that's what action execution via http request does. We should probably be consistent between these two, in any case.

@mikecote
Contributor Author

The only challenge with doing that under the current structure may be the alert / alert instance state that is stored in the task manager: code running outside of TM wouldn't have access to it. Though there may be plans to move some of it outside of TM.

@gmmorris
Contributor

Is the thinking that this would basically be like scheduling with a runAt of Date.now()?
Because it would be simple enough to do and would mean that, no matter what, the way in which actions are executed is the same in all cases, which reduces the chance of variance in behavior.... I wouldn't expect there to be a difference, but just in case 🤷‍♀️

@pmuellr
Member

pmuellr commented Nov 13, 2019

Ya, we need to think through this - I think this would be very similar to the "test" scenario as well, so may come with caveats like "you have no state". A different thing we could do would be to change the runAt as Gidi notes, which presumably WOULD give you state. And if we change the runAt, then the subsequent intervals would be relative to that new runAt - might be what the user wants, or not ...
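
To make the trade-off concrete, here is a minimal sketch of what moving `runAt` would do to later runs. The `ScheduledTask` shape and the `runNowByMovingRunAt` and `nextRuns` helpers are hypothetical, not Task Manager's actual types, and the sketch assumes each subsequent run is scheduled one interval after the previous `runAt`.

```typescript
// Hypothetical, simplified model of a recurring task. Not Task Manager's real types.
interface ScheduledTask {
  id: string;
  runAt: Date;        // next time the task is due
  intervalMs: number; // spacing between runs
}

// "Run now" implemented by moving runAt to the current time.
function runNowByMovingRunAt(task: ScheduledTask, now: Date): ScheduledTask {
  return { ...task, runAt: new Date(now.getTime()) };
}

// Assumption: each run is scheduled one interval after the previous runAt.
function nextRuns(task: ScheduledTask, count: number): Date[] {
  const runs: Date[] = [];
  for (let i = 0; i < count; i++) {
    runs.push(new Date(task.runAt.getTime() + i * task.intervalMs));
  }
  return runs;
}

const hour = 60 * 60 * 1000;
const original: ScheduledTask = {
  id: "alert-task-1",
  runAt: new Date("2019-11-13T12:00:00Z"),
  intervalMs: hour,
};

// The user asks for an immediate run at 11:20, 40 minutes before the task is due.
const moved = runNowByMovingRunAt(original, new Date("2019-11-13T11:20:00Z"));

const before = nextRuns(original, 3).map((d) => d.toISOString()); // 12:00, 13:00, 14:00
const after = nextRuns(moved, 3).map((d) => d.toISOString());     // 11:20, 12:20, 13:20
```

In this model the later runs land at 20 past the hour instead of on the hour, which is the shift in subsequent intervals mentioned above.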

@peterschretlen
Contributor

In addition to the 'now' case, the ability to run the alert at some time in the past for testing purposes has been requested:

> ...were looking for a way to test the actual alert against their historic data set to see what they'd get back. We've talked about testing alerts, but more in the context of the connection. This would be pretty valuable, particularly for user defined alerts.

So ideally we could run the alert 'now' but also pass a timestamp parameter, so that the alert condition is checked relative to that time.
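
As a rough illustration of the timestamp idea, the sketch below evaluates a threshold condition relative to an explicit evaluation time rather than the wall clock. The `MetricEvent` and `ThresholdRule` shapes and the `evaluateAt` function are hypothetical and exist only for this example; they are not part of the alerting framework.

```typescript
// Hypothetical event and rule shapes, used only for illustration.
interface MetricEvent {
  timestamp: Date;
  value: number;
}

interface ThresholdRule {
  lookbackMs: number; // how far back from the evaluation time to look
  threshold: number;  // the rule fires when the window average exceeds this
}

// Evaluates the rule as if "now" were evaluationTime.
// Passing a past timestamp replays the condition against historic data.
function evaluateAt(rule: ThresholdRule, events: MetricEvent[], evaluationTime: Date): boolean {
  const end = evaluationTime.getTime();
  const start = end - rule.lookbackMs;
  const inWindow = events.filter((e) => {
    const t = e.timestamp.getTime();
    return t > start && t <= end;
  });
  if (inWindow.length === 0) return false;
  const average = inWindow.reduce((sum, e) => sum + e.value, 0) / inWindow.length;
  return average > rule.threshold;
}

const sampleEvents: MetricEvent[] = [
  { timestamp: new Date("2019-11-01T10:01:00Z"), value: 95 },
  { timestamp: new Date("2019-11-01T10:04:00Z"), value: 97 },
  { timestamp: new Date("2019-11-20T09:58:00Z"), value: 12 },
];
const cpuRule: ThresholdRule = { lookbackMs: 5 * 60 * 1000, threshold: 90 };

// Same rule, same data, two different evaluation times.
const firesInPast = evaluateAt(cpuRule, sampleEvents, new Date("2019-11-01T10:05:00Z")); // true
const firesLater = evaluateAt(cpuRule, sampleEvents, new Date("2019-11-20T10:00:00Z"));  // false
```

The only design point is that the evaluation time is an input, so the "run now" case is just the call where that input happens to be the current time.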

@peterschretlen
Contributor

Somewhat related to #49411

@tsullivan
Member

tsullivan commented Nov 21, 2019

> we could in theory just run the code in the Kibana process that requested it be run

Note this solution would have a gotcha: these kinds of tasks would not be able to benefit from the distributed nature of multiple instances working the same task queue.

I think that gotcha gets a bit nasty when Kibana administrators want fine-grained controls: "lightweight" instances that serve a UI to multiple users and can ONLY schedule tasks (not claim them), and "heavyweight" instances that are configured only for claiming tasks and not for serving the UI to end-users.

I don't know if it's possible today with Task Manager to configure multiple instances for dedicated roles in tasks - but it would be a reasonable enhancement for tackling scaling a Kibana deployment.
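
A small sketch of the dedicated-roles idea, under the assumption that each instance carries two flags. `InstanceRole`, `SharedTaskQueue`, and the `canSchedule` / `canClaim` flags are hypothetical names invented for this example and are not existing Task Manager settings.

```typescript
// Hypothetical role flags; not an existing Task Manager setting.
interface InstanceRole {
  name: string;
  canSchedule: boolean; // may add tasks to the shared queue
  canClaim: boolean;    // may pick tasks up and execute them
}

interface QueuedTask {
  id: string;
  claimedBy?: string;
}

class SharedTaskQueue {
  private tasks: QueuedTask[] = [];

  schedule(instance: InstanceRole, taskId: string): void {
    if (!instance.canSchedule) {
      throw new Error(`${instance.name} is not allowed to schedule tasks`);
    }
    this.tasks.push({ id: taskId });
  }

  // Returns the claimed task, or undefined if the instance may not claim
  // or nothing is waiting.
  claimNext(instance: InstanceRole): QueuedTask | undefined {
    if (!instance.canClaim) return undefined;
    const task = this.tasks.find((t) => t.claimedBy === undefined);
    if (task) task.claimedBy = instance.name;
    return task;
  }
}

const uiNode: InstanceRole = { name: "ui-1", canSchedule: true, canClaim: false };
const workerNode: InstanceRole = { name: "worker-1", canSchedule: false, canClaim: true };

const queue = new SharedTaskQueue();
queue.schedule(uiNode, "run-alert-42");              // a "run now" request coming from the UI
const claimedByUi = queue.claimNext(uiNode);         // undefined: UI nodes never execute
const claimedByWorker = queue.claimNext(workerNode); // picked up by worker-1
```

If "run now" executed in the requesting process instead of going through the queue, the `ui-1` instance in this model would end up doing the work itself, which is the gotcha described above.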

@bmcconaghy bmcconaghy added Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) and removed Team:Stack Services labels Dec 12, 2019
@mikecote
Contributor Author

mikecote commented Feb 4, 2021

Moving from 7.x - Candidates to 8.x - Candidates (Backlog) after the latest 7.x planning session.

@ymao1
Contributor

ymao1 commented Mar 18, 2021

Closing as task manager has a run now API that can be used for this.

@ymao1 ymao1 closed this as completed Mar 18, 2021
@kobelb kobelb added the needs-team Issues missing a team label label Jan 31, 2022
@botelastic botelastic bot removed the needs-team Issues missing a team label label Jan 31, 2022
@peluja1012
Contributor

Hi @ymao1, we are revisiting adding a "run now" functionality in Security Solution. Do you know if the alerting framework exposes this capability via the Alerting client?

@ymao1
Contributor

ymao1 commented Feb 1, 2022

@peluja1012 The alerting rules client calls the task manager runNow function during rule updates (but it looks like only if the schedule interval changes) but does not expose any wrapper around it.

We could reopen this issue and update the description to add a wrapper around the task manager runNow in the alerting rules client. Alternatively, it looks like taskManager is already a required plugin for security solutions so the security solutions plugin could directly call runNow, which just takes the scheduledTaskId as an argument. Here is where it is used in the rules client:

    this.taskManager
      .runNow(updateResult.scheduledTaskId)
      .then(() => {
        this.logger.debug(
          `Alert update has rescheduled the underlying task: ${updateResult.scheduledTaskId}`
        );
      })
      .catch((err: Error) => {
        this.logger.error(
          `Alert update failed to run its underlying task. TaskManager runNow failed with Error: ${err.message}`
        );
      });
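
For discussion, a wrapper in the rules client might look roughly like the sketch below. Only the `runNow(taskId)` call is taken from the snippet above; `TaskManagerLike`, `RuleLike`, `RunNowResult`, and `runRuleNow` are hypothetical names, and the rule lookup is passed in as a plain function rather than modelling the real client.

```typescript
// Minimal stand-ins for the pieces the wrapper needs. Only runNow(taskId) comes
// from the snippet above; every other name here is hypothetical.
interface TaskManagerLike {
  runNow(taskId: string): Promise<unknown>;
}

interface RuleLike {
  id: string;
  scheduledTaskId?: string;
}

type RunNowResult = { ok: true } | { ok: false; reason: string };

// Hypothetical wrapper: look up the rule, then ask Task Manager to run its task.
async function runRuleNow(
  taskManager: TaskManagerLike,
  getRule: (ruleId: string) => Promise<RuleLike | undefined>,
  ruleId: string
): Promise<RunNowResult> {
  const rule = await getRule(ruleId);
  if (!rule) return { ok: false, reason: `Rule ${ruleId} not found` };
  if (!rule.scheduledTaskId) {
    return { ok: false, reason: `Rule ${ruleId} has no scheduled task` };
  }
  try {
    await taskManager.runNow(rule.scheduledTaskId);
    return { ok: true };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, reason: `TaskManager runNow failed with Error: ${message}` };
  }
}

// In-memory fakes to exercise the wrapper.
const requestedTaskIds: string[] = [];
const fakeTaskManager: TaskManagerLike = {
  runNow: async (taskId) => {
    requestedTaskIds.push(taskId);
  },
};
const fakeRules: RuleLike[] = [{ id: "rule-1", scheduledTaskId: "task-1" }, { id: "rule-2" }];
const findRule = async (ruleId: string) => fakeRules.find((r) => r.id === ruleId);
```

Calling `runRuleNow(fakeTaskManager, findRule, "rule-1")` resolves to `{ ok: true }` and records `task-1`, while a rule without a `scheduledTaskId` yields a failure result instead of throwing. Returning a result object rather than only logging is a design choice of this sketch, not something the thread has settled.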

@peluja1012
Contributor

Thanks, @ymao1! We'll review our use case in more detail and let you know if calling the taskManager's runNow directly would be sufficient.

@mikecote mikecote reopened this Jul 25, 2022
@mikecote
Contributor Author

From #134400.

Describe the feature:
It would be fantastic to have a button on the Rule and Connectors UI ("Rules" tab) that allows running a rule ad hoc.

Describe a specific use case for the feature:
A user has a Rule that is active, they make a change to the Rule that should cause it to no longer be active, but the Rule doesn't run again for several hours. Unless they modify the frequency, test it, then set the frequency back to what it was, they have to remember to come back to the Rule once it naturally runs again to make sure it is no longer active. If there was some sort of a "run now" button, it would allow much faster ad hoc testing/running of a Rule.

This could also be useful if a Rule runs infrequently, and was Active last time it ran, but before the user investigates further they want to run it ad hoc to verify if it is still active.

@mikecote
Contributor Author

This feature should leverage the task manager's new runSoon API, which will allow any Kibana instance to pick up this task. The run soon API sets new Date() as the next runAt, which will work for this use case. We should look out for rule types that use the rule's interval value, since running such a rule more frequently may make it behave differently. (We should also look into discouraging the use of interval.)
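
One way the interval concern could show up, sketched under assumptions: a rule type that derives its query window from the configured interval implicitly assumes the previous run happened exactly one interval ago. `ExecutionContext` and both window helpers below are hypothetical and only illustrate the difference.

```typescript
// Hypothetical executor inputs; the field names are illustrative only.
interface ExecutionContext {
  startedAt: Date;          // when this run began
  previousStartedAt?: Date; // when the previous run began, if any
  intervalMs: number;       // the rule's configured interval
}

// Variant A: assumes the previous run happened exactly one interval ago.
function windowFromInterval(ctx: ExecutionContext): [number, number] {
  return [ctx.startedAt.getTime() - ctx.intervalMs, ctx.startedAt.getTime()];
}

// Variant B: uses the actual previous run time, falling back to the interval
// on the very first run.
function windowFromPreviousRun(ctx: ExecutionContext): [number, number] {
  const end = ctx.startedAt.getTime();
  const start = ctx.previousStartedAt
    ? ctx.previousStartedAt.getTime()
    : end - ctx.intervalMs;
  return [start, end];
}

const minute = 60 * 1000;
// A rule with a 60 minute interval is run ad hoc 10 minutes after its last scheduled run.
const adHocRun: ExecutionContext = {
  startedAt: new Date("2022-07-25T10:10:00Z"),
  previousStartedAt: new Date("2022-07-25T10:00:00Z"),
  intervalMs: 60 * minute,
};

const [startA, endA] = windowFromInterval(adHocRun);
const [startB, endB] = windowFromPreviousRun(adHocRun);
const minutesCoveredA = (endA - startA) / minute; // 60: re-reads 50 minutes already evaluated
const minutesCoveredB = (endB - startB) / minute; // 10: only the time since the previous run
```

Whether the overlap in variant A matters depends on the rule type; for rules that count events per window it could lead to the same events being evaluated twice.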

@mikecote
Contributor Author

mikecote commented Aug 8, 2022

Linking with #138124. If the framework can run a rule ad hoc at any time, it should also be able to retry automatically when it encounters a socket hang up error (or other retry-able errors).
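
A minimal sketch of what such an auto retry could look like, assuming errors are classified by message. `isRetryable` and `runWithRetry` are hypothetical helpers written for this example, and matching on the text "socket hang up" is an assumption; real classification would likely be more involved.

```typescript
// Assumption: errors whose message mentions "socket hang up" are retry-able.
function isRetryable(err: unknown): boolean {
  return err instanceof Error && /socket hang up/i.test(err.message);
}

// Runs `execute` and retries it up to maxRetries times on retry-able errors.
// `wait` is injectable so callers control the back-off; the default does not wait.
async function runWithRetry<T>(
  execute: () => Promise<T>,
  maxRetries: number,
  wait: (attempt: number) => Promise<void> = async () => {}
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await execute();
    } catch (err) {
      // Give up when retries are exhausted or the error is not retry-able.
      if (attempt >= maxRetries || !isRetryable(err)) throw err;
      await wait(attempt + 1);
    }
  }
}

// A fake rule execution that fails twice with a socket hang up, then succeeds.
let attempts = 0;
const flakyRule = async (): Promise<string> => {
  attempts++;
  if (attempts < 3) throw new Error("socket hang up");
  return "rule executed";
};
```

With these definitions, `runWithRetry(flakyRule, 3)` resolves on the third attempt, while an error that is not classified as retry-able is rethrown immediately without further attempts.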

@mikecote mikecote added the R&D Research and development ticket (not meant to produce code, but to make a decision) label Aug 29, 2022
@doakalexi doakalexi self-assigned this Aug 30, 2022
@doakalexi doakalexi moved this from Todo to In Progress in AppEx: ResponseOps - Execution & Connectors Aug 30, 2022
@doakalexi doakalexi moved this from In Progress to In Review in AppEx: ResponseOps - Execution & Connectors Sep 1, 2022
Repository owner moved this from In Review to Done in AppEx: ResponseOps - Execution & Connectors Sep 14, 2022