-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: implement the rest of alerts #1228
Conversation
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
I've also made an excel sheet trying to visualise the different use cases I implemented for the conditions that trigger an alarm action along with its desired state - Scenarios for Alert execution |
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the code looks pretty good. The only two things were that naming of the GetType/GetParentType were a bit confusing and I think we should consider exposing an interface over exposing the evalParams struct to the engine..
return alert.actionType | ||
} | ||
|
||
// SubType returns the action subtype of the remediation engine | ||
func (_ *Alert) SubType() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions: 1. since we're returning the module constant, why not just use the module constant in the code? 2. Why is the method named SubType? (we have ParentType and SubType..but no Type?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree I didn't liked the naming that much too. I'll think of something that makes more sense 😃
The reason for parent and sub-type(should have been child I guess?) is the following. In the actions engine we have a list of actions and their type can be remediate or alerts. Then all of them can have types on their own - different variations like rest, pull_request, advisory and so on, so thus why parent and sub-type.
The idea of having a sub-type method in the actions interface is to be able to act on it, i.e. do not alert if action of type remediation and sub-type pull_request was successful and it seemed better to have that as part of the actions interface.
// Type returns the action type of the security-advisory engine | ||
func (alert *Alert) Type() interfaces.ActionType { | ||
// ParentType returns the action type of the security-advisory engine | ||
func (alert *Alert) ParentType() interfaces.ActionType { | ||
return alert.actionType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the methods be called something like ActionType and AlertType/AlertEngineType?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I addressed this in the comment above 👍
@@ -242,10 +240,6 @@ func (r *RuleTypeEngine) Eval(ctx context.Context, ent protoreflect.ProtoMessage | |||
result, err := r.rdi.Ingest(ctx, ent, ruleParams) | |||
if err != nil { | |||
evalParams.EvalErr = fmt.Errorf("error ingesting data: %w", err) | |||
evalParams.ActionsErr = evalerrors.ActionsError{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this block was moved to createEvalStatusParams right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or getDefaultResult ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, getDefaultResult. The idea is that the actionsErr value should come only from the DoActions() method by the Actions engine.
// any existing meta entry since we don't upsert in case of conflict while skipping | ||
m, err := json.Marshal(&map[string]any{}) | ||
if err != nil { | ||
logger.Error().Err(err).Msg("error marshaling empty json.RawMessage") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do we store on error? Should we explicitly set m to an empty message? (Or are we relying on the db default?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was considering that since it's not a variable that we try to marshal it is safe to assume that it shouldn't fail unless there's a problem with the json.Marshal. As for the default value, I guess I had to make the alert tables column accepting nil values so I don't have to ensure this.
@@ -258,12 +252,6 @@ func (r *RuleTypeEngine) Actions( | |||
ruleDef, ruleParams map[string]any, | |||
evalParams *EvalStatusParams, | |||
) { | |||
// Skip actions in case ingesting failed during evaluation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what handles this case now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be handled by the actions engine now.
And it was actually not correct, mainly because of the way evaluations work now - the end status of an evaluation can come from both the ingest part and the eval part.
This is a bit implicit as it's hidden behind errors, some generic, some custom. For example, depending on the ingest type, ingest can err out due to a failed ingesting (which makes sense for actions skipping), but also with ErrEvaluationFailed which has to be processed by the actions.
I was thinking to address that by having ingesting do ingestions only and the eval part be the one that sets the custom Error type but it's out of the PR scope and it implies updating big portions of the code.
This change makes evaluation more explicit and also sets clear boundaries - eval is for eval and shouldn't decide if the action engine should be skipped. With this we channel every rule evaluation result (either coming from ingest or eval) to the actions and let the actions engine decide if what needs to be skipped - whether it's a specific action or even all actions.
metadata = sqlc.arg(metadata)::jsonb, | ||
status = CASE WHEN $2 != 'skipped' THEN $2 ELSE rule_details_alert.status END, | ||
details = CASE WHEN $2 != 'skipped' THEN $3 ELSE rule_details_alert.details END, | ||
metadata = CASE WHEN $2 != 'skipped' THEN sqlc.arg(metadata)::jsonb ELSE rule_details_alert.metadata END, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this the SQL part that you asked for some extra eyes on? It looks good to me (tests would be nice)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is this one, yes 👍 I haven't seen any issue so far, but it's good to hear that from someone else too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense to me at least.
internal/engine/executor.go
Outdated
@@ -18,7 +18,6 @@ import ( | |||
"context" | |||
"errors" | |||
"fmt" | |||
engif "github.com/stacklok/mediator/internal/engine/interfaces" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -175,7 +175,7 @@ func (_ *Alert) SubType() string { | |||
|
|||
// GetOnOffState returns the alert action state read from the profile | |||
func (_ *Alert) GetOnOffState(p *pb.Profile) interfaces.ActionOpt { | |||
return interfaces.ActionOptFromString(p.Alert) | |||
return interfaces.ActionOptFromString(p.Alert, interfaces.ActionOptOn) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great defaults.
logger := zerolog.Ctx(ctx) | ||
params := ¶msSA{} | ||
_ = evalParams | ||
switch entity := entity.(type) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely not for this PR, but this sounds like something that should be extracted into an interface so you wouldn't have to code the typeswich here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added that as an item in our profile design improvement document 👍
} | ||
params.Summary = summaryStr.String() | ||
|
||
var descriptionStr strings.Builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, nice for some reason I never realised that strings.Builder implements raw io.Writer and wasn't using it for template expansion even though I could..
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
Signed-off-by: Radoslav Dimitrov <radoslav@stacklok.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for renaming the interface. As long as you plan to address exposing the struct to the actions in a subsequent PR, I'm happy with this version.
Also, great work overall getting this feature from design, to database changes to implementation. It's really nice to see it all come together.
The following PR includes:
Top of my mind topics that need feedback and/or are still draft:
skipped
status. I've tried one approach but it was failing withno rows found
when there was such a conflict.Marking as WIP until I have tested this properly and addressed the pending issues.
Fixes: #1213
Fixes: #1214
Screenshots of the alert in action. It was closed automatically after the remediation PR was merged.