Add remediation status to the rule_evaluation_status table #1094

jhrozek · 2023-10-03T21:32:59Z

Adds the remediation status to the rule_evaluation_status table. Even though
the table might change and be split soon when we add support for the alerts, I
think it still makes sense to include the result in the meantime so we can
iterate faster and have a working code to refactor already.

Adds the same fields to the RuleEvaluationStatus protobuf message to be sent
to the clients and fills those fields as appropriate based on errors received
from the remediator.

I'm not happy with the unit test coverage, I would like to extend the test for
the handleEntityEvent method, but at the same time I'd prefer to do it after we
at least agree that this patch is going in the right direction.

Fixes: #1075

jhrozek · 2023-10-03T21:33:25Z

going to point out some places where I could use some feedback as comments.

jhrozek · 2023-10-03T21:34:22Z

database/migrations/000001_init.up.sql

@@ -201,6 +201,8 @@ CREATE TABLE entity_policies (

 create type eval_status_types as enum ('success', 'failure', 'error', 'skipped', 'pending');

+create type remediation_status_types as enum ('success', 'failure', 'error', 'skipped', 'not_supported');


Do these states make sense? Success, failure and error are hopefully self-explanatory, not_supported might also be quite clear. Skipped is supposed to mean that the remediation wasn't run because the evaluation passed in the first place.

Initially I thought the remediation skipped could mean one of the following:

The rule is actually not applicable e.g. "skipped because the check was also skipped" or "this rule applies to repositories with golang codebases and this applies to npm".

The rule was marked by the user as something that should be skipped (future functionality).

Seeing "Skipped" for a rule that already passed is actually confusing for me. Perhaps we should have yet another state "no remediation needed" or simply "" (since the user needs no visual queue since the rule is indeed passing).

I also wonder about the "not_supported" status. We only pass that via the noop remediator. Not supported would tell me that a remediation was tried but it's of the wrong driver or the wrong provider. Maybe it should be not_available instead? Indicating that there simply is not a remediation avaiable.

I also wonder if we need a pending state for remediations?

I like the idea of an empty string.
About the pending remediation, currently I don't see a use-case for this, a policy can be pending, but only if there are no rule status rows. As soon as they are, at least one remediation status other than pending will be available.

database/query/policy_status.sql

jhrozek · 2023-10-03T21:36:51Z

database/query/policy_status.sql

-    last_updated = NOW()
+    eval_details = $7,
+    remediation_status = $8,
+    remediation_details = COALESCE($9, rule_evaluation_status.remediation_details), -- don't overwrite details already set with NULL


actually, this might be a bug, I need to check if we overwrite an error back to nothing if the remediation first fails and then succeeds..sorry, I'll check this case tomorrow, this might not need to be coalesced.

Make me think that details shouldn't be nullable. If there are no details we simply should set it to an empty string.

yes, so this is a remainder of how I refactored the code, initially I didn't have remediation_status at all as I was trying to minimise the amount of new columns, but that proved confusing. I just never reverted the nullable details.

I agree, I'll switch them to non-nullable.

internal/controlplane/handlers_policy.go

jhrozek · 2023-10-03T21:39:47Z

cmd/cli/app/policy_status/table_render.go

+	case errorStatus:
+		return "❌ Error"
+	case skippedStatus:
+		return "✅ Skipped" // success mark because we didn't need to remediate


currently skipped and success are very visually similar, I'll be glad to get some help on how we can distinguish them better? Suggestions on unicode character or colours welcome.

I actually don't like the "⚠️ Skipped" that I used above in the getEvalStatusText section. In reality, that status comes up for quite valid reasons:

The rule is actually not applicable

The rule was marked by the user as something that should be skipped (future functionality).

So the visual queue should be smooth and something that a user actually also skips.

Perhaps "⬜ Skipped" or "▫️ Skipped" is better.

What do you think about ⏭️ for skipping?

For skipped remediation I currently use an empty string, I agree with @JAORMX that a skipped remediation on evaluation success should just not be reported. I'm fine changing the evaluation skipped icon to forward (currently I used the stop-button)

JAORMX · 2023-10-04T06:05:01Z

cmd/cli/app/policy_status/table_render.go

+	case errorStatus:
+		return "❌ Error"
+	case skippedStatus:
+		return "✅ Skipped" // success mark because we didn't need to remediate


I actually don't like the "⚠️ Skipped" that I used above in the getEvalStatusText section. In reality, that status comes up for quite valid reasons:

The rule is actually not applicable

The rule was marked by the user as something that should be skipped (future functionality).

So the visual queue should be smooth and something that a user actually also skips.

Perhaps "⬜ Skipped" or "▫️ Skipped" is better.

JAORMX · 2023-10-04T06:09:49Z

database/migrations/000001_init.up.sql

@@ -201,6 +201,8 @@ CREATE TABLE entity_policies (

 create type eval_status_types as enum ('success', 'failure', 'error', 'skipped', 'pending');

+create type remediation_status_types as enum ('success', 'failure', 'error', 'skipped', 'not_supported');


Initially I thought the remediation skipped could mean one of the following:

The rule is actually not applicable e.g. "skipped because the check was also skipped" or "this rule applies to repositories with golang codebases and this applies to npm".

The rule was marked by the user as something that should be skipped (future functionality).

Seeing "Skipped" for a rule that already passed is actually confusing for me. Perhaps we should have yet another state "no remediation needed" or simply "" (since the user needs no visual queue since the rule is indeed passing).

I also wonder about the "not_supported" status. We only pass that via the noop remediator. Not supported would tell me that a remediation was tried but it's of the wrong driver or the wrong provider. Maybe it should be not_available instead? Indicating that there simply is not a remediation avaiable.

I also wonder if we need a pending state for remediations?

JAORMX · 2023-10-04T06:11:31Z

database/query/policy_status.sql

-    last_updated = NOW()
+    eval_details = $7,
+    remediation_status = $8,
+    remediation_details = COALESCE($9, rule_evaluation_status.remediation_details), -- don't overwrite details already set with NULL


Make me think that details shouldn't be nullable. If there are no details we simply should set it to an empty string.

database/query/policy_status.sql

internal/controlplane/handlers_policy.go

JAORMX · 2023-10-04T06:16:26Z

internal/engine/eval_status.go

+	} else if err != nil {
+		return db.RemediationStatusTypesError
+	}
+	return db.RemediationStatusTypesSuccess


nit: style wise it could look more readable in the following format:

switch err != nil { case errors.Is(err, evalerrors.ErrRemediateFailed): return db.RemediationStatusTypesFailure case errors.Is(err, evalerrors.ErrRemediationSkipped): return db.RemediationStatusTypesSkipped case errors.Is(err, evalerrors.ErrRemediationNotSupported): return db.RemediationStatusTypesNotSupported default: return db.RemediationStatusTypesError } } return db.RemediationStatusTypesSuccess

suggestion taken (with a modification because this way the success is actually unreachable which surprised me as well)

After the evaluation and potentail remediation, we store the remediation result into the database. There are three fields: - remediation_status - a summary - remediation_details - pretty much err.String() - remediation_last_updated - timestamp

To be able to send the remediation detail to the clients, we extend the RuleEvaluationStatus protobuf message with the status, details and timestamp.

Extends the Eval() function so that in addition to the remediation error, the evaluation error is also returned so that it can be stored in the database. New errors are also added to be able to distinguish between errors that must be reported in detail to the user and errors that don't have to be (remediation is not supported for this rule type and/or was not even attempted)

Extends the CLI tool to display the remediation status in a new column

jhrozek · 2023-10-04T11:19:26Z

@JAORMX thank you for the quick review, the current version just resolves a conflict in the protobuf files. There are no other changes.

jhrozek commented Oct 3, 2023

View reviewed changes

database/query/policy_status.sql Show resolved Hide resolved

jhrozek commented Oct 3, 2023

View reviewed changes

internal/controlplane/handlers_policy.go Show resolved Hide resolved

jhrozek commented Oct 3, 2023

View reviewed changes

jhrozek force-pushed the remediate branch from 4421db9 to fed1381 Compare October 3, 2023 21:40

JAORMX requested changes Oct 4, 2023

View reviewed changes

jhrozek force-pushed the remediate branch from fed1381 to d971bde Compare October 4, 2023 09:40

JAORMX previously approved these changes Oct 4, 2023

View reviewed changes

jhrozek added 4 commits October 4, 2023 13:18

Add remediation details to the RuleEvaluationStatus protobuf struct

9db8638

To be able to send the remediation detail to the clients, we extend the RuleEvaluationStatus protobuf message with the status, details and timestamp.

Display the remediation errors to the user in policy_status list

a10231f

Extends the CLI tool to display the remediation status in a new column

jhrozek dismissed JAORMX’s stale review via a10231f October 4, 2023 11:18

jhrozek force-pushed the remediate branch from d971bde to a10231f Compare October 4, 2023 11:18

JAORMX approved these changes Oct 4, 2023

View reviewed changes

jhrozek merged commit f3249bf into mindersec:main Oct 4, 2023
13 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add remediation status to the rule_evaluation_status table #1094

Add remediation status to the rule_evaluation_status table #1094

jhrozek commented Oct 3, 2023

jhrozek commented Oct 3, 2023

jhrozek Oct 3, 2023

JAORMX Oct 4, 2023

jhrozek Oct 4, 2023

jhrozek Oct 3, 2023

JAORMX Oct 4, 2023

jhrozek Oct 4, 2023

jhrozek Oct 4, 2023

jhrozek Oct 3, 2023

JAORMX Oct 4, 2023

rdimitrov Oct 4, 2023

jhrozek Oct 4, 2023

JAORMX Oct 4, 2023

JAORMX Oct 4, 2023

JAORMX Oct 4, 2023

JAORMX Oct 4, 2023

jhrozek Oct 4, 2023

jhrozek commented Oct 4, 2023

		@@ -201,6 +201,8 @@ CREATE TABLE entity_policies (

		create type eval_status_types as enum ('success', 'failure', 'error', 'skipped', 'pending');

		create type remediation_status_types as enum ('success', 'failure', 'error', 'skipped', 'not_supported');

Add remediation status to the rule_evaluation_status table #1094

Add remediation status to the rule_evaluation_status table #1094

Conversation

jhrozek commented Oct 3, 2023

jhrozek commented Oct 3, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jhrozek commented Oct 4, 2023