Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Version task-manager task state #155764

Open
kobelb opened this issue Apr 25, 2023 · 2 comments
Open

Version task-manager task state #155764

kobelb opened this issue Apr 25, 2023 · 2 comments
Assignees
Labels
Feature:Task Manager Meta Project:Serverless MVP Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@kobelb
Copy link
Contributor

kobelb commented Apr 25, 2023

Feature Description

To enable zero downtime upgrades, we need to version the task manager's task state. This will allow us to ensure that if an older Kibana node runs an alerting rule after a newer Kibana node has stored additional task state, the unknown state properties will be dropped. Additionally, this will prompt developers to be aware that adding required fields to the task state is a breaking change that must be handled with care.

Business Value

Facilitates zero downtime rolling upgrades and rollbacks, allowing us to roll-out new features to our users more quickly while they continue to use the system without disruption.

Definition of Done

  • Task Manager consumers can declare multiple versions for the task state
  • Unit tests passed
@kobelb kobelb added Feature:Task Manager Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) Project:Serverless MVP labels Apr 25, 2023
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@mikecote mikecote self-assigned this Apr 27, 2023
mikecote added a commit that referenced this issue May 5, 2023
Related issue #155764.

In this POC, I'm adding an `extendsDeep` function to the schema object.
This feature allows you to create a copy of an existing schema
definition and recursively modify options without mutating them. With
`extendsDeep`, you can specify whether unknown attributes on objects
should be allowed, forbidden or ignored.

This new function is particularly useful for alerting scenarios where we
need to drop unknown fields when reading from Elasticsearch without
modifying the schema object. Since we don't control the schema
definition in some areas, `extendsDeep` provides a convenient way to set
the `unknowns` option to all objects recursively. By doing so, we can
validate and drop unknown properties using the same defined schema, just
with `unknowns: forbid` extension.

Usage:
```
// Single, shared type definition
const type = schema.object({ foo: schema.string() });

// Drop unknown fields (bar in this case)
const savedObject = { foo: 'test', bar: 'test' };
const ignoreSchema = type.extendsDeep({ unknowns: 'ignore' });
ignoreSchema.validate(savedObject);

// Prevent unknown fields (bar in this case)
const soToUpdate = { foo: 'test', bar: 'test' };
const forbidSchema = type.extendsDeep({ unknowns: 'forbid' });
forbidSchema.validate(soToUpdate);
```

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Jun 23, 2023
…ate objects (#159048)

Part of #155764.

In this PR, I'm modifying task manager to allow task types to report a
versioned schema for the `state` object. When defining
`stateSchemaByVersion`, the following will happen:
- The `state` returned from the task runner will get validated against
the latest version and throw an error if ever it is invalid (to capture
mismatches at development and testing time)
- When task manager reads a task, it will migrate the task state to the
latest version (if necessary) and validate against the latest schema,
dropping any unknown fields (in the scenario of a downgrade).

By default, Task Manager will validate the state on write once a
versioned schema is provided, however the following config must be
enabled for errors to be thrown on read:
`xpack.task_manager.allow_reading_invalid_state: true`. We plan to
enable this in serverless by default but be cautious on existing
deployments and wait for telemetry to show no issues.

I've onboarded the `alerts_invalidate_api_keys` task type which can be
used as an example to onboard others. See [this
commit](214bae3).

### How to configure a task type to version and validate
The structure is defined as:
```
taskManager.registerTaskDefinitions({
  ...
  stateSchemaByVersion: {
    1: {
      // All existing tasks don't have a version so will get `up` migrated to 1
      up: (state: Record<string, unknown>) => ({
        runs: state.runs || 0,
        total_invalidated: state.total_invalidated || 0,
      }),
      schema: schema.object({
        runs: schema.number(),
        total_invalidated: schema.number(),
      }),
    },
  },
  ...
});
```

However, look at [this
commit](214bae3)
for an example that you can leverage type safety from the schema.

### Follow up issues
- Onboard non-alerting task types to have a versioned state schema
(#159342)
- Onboard alerting task types to have a versioned state schema for the
framework fields (#159343)
- Onboard alerting task types to have a versioned rule and alert state
schema within the task state
(#159344)
- Telemetry on the validation failures
(#159345)
- Remove feature flag so `allow_reading_invalid_state` is always `false`
(#159346)
- Force validation on all tasks using state by removing the exemption
code (#159347)
- Release tasks when encountering a validation failure after run
(#159964)

### To Verify

NOTE: I have the following verification scenarios in a jest integration
test as well =>
https://github.com/elastic/kibana/pull/159048/files#diff-5f06228df58fa74d5a0f2722c30f1f4bee2ee9df7a14e0700b9aa9bc3864a858.

You will need to log the state when the task runs to observe what the
task runner receives in different scenarios.

```
diff --git a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
index 1e624bcd807..4aa4c2c7805 100644
--- a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
+++ b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
@@ -140,6 +140,7 @@ function taskRunner(
 ) {
   return ({ taskInstance }: RunContext) => {
     const state = taskInstance.state as LatestTaskStateSchema;
+    console.log('*** Running task with the following state:', JSON.stringify(state));
     return {
       async run() {
         let totalInvalidated = 0;
```

#### Scenario 1: Adding an unknown field to the task saved-object gets
dropped
1. Startup a fresh Kibana instance
2. Make the following call to Elasticsearch (I used postman). This call
adds an unknown property (`foo`) to the task state and makes the task
run right away.
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z",
      "state": "{\"runs\":1,\"total_invalidated\":0,\"foo\":true}"
    }
  }
}
```
3. Observe the task run log message, with state not containing `foo`.

#### Scenario 2: Task running returning an unknown property causes the
task to fail to update
1. Apply the following changes to the code (and ignore TypeScript
issues)
```
diff --git a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
index 1e624bcd807..b15d4a4f478 100644
--- a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
+++ b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
@@ -183,6 +183,7 @@ function taskRunner(
 
           const updatedState: LatestTaskStateSchema = {
             runs: (state.runs || 0) + 1,
+            foo: true,
             total_invalidated: totalInvalidated,
           };
           return {
```
2. Make the task run right away by calling Elasticsearch with the
following
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z"
    }
  }
}
```
3. Notice the validation errors logged as debug
```
[ERROR][plugins.taskManager] Task alerts_invalidate_api_keys "Alerts-alerts_invalidate_api_keys" failed: Error: [foo]: definition for this key is missing
```

#### Scenario 3: Task state gets migrated
1. Apply the following code change
```
diff --git a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
index 1e624bcd807..338f21bed5b 100644
--- a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
+++ b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
@@ -41,6 +41,18 @@ const stateSchemaByVersion = {
       total_invalidated: schema.number(),
     }),
   },
+  2: {
+    up: (state: Record<string, unknown>) => ({
+      runs: state.runs,
+      total_invalidated: state.total_invalidated,
+      foo: true,
+    }),
+    schema: schema.object({
+      runs: schema.number(),
+      total_invalidated: schema.number(),
+      foo: schema.boolean(),
+    }),
+  },
 };
 
 const latestSchema = stateSchemaByVersion[1].schema;
```

2. Make the task run right away by calling Elasticsearch with the
following
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z"
    }
  }
}
```
3. Observe the state now contains `foo` property when the task runs.

#### Scenario 4: Reading invalid state causes debug logs
1. Run the following request to Elasticsearch
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z",
      "state": "{}"
    }
  }
}
```
2. Observe the Kibana debug log mentioning the validation failure while
letting the task through
```
[DEBUG][plugins.taskManager] [alerts_invalidate_api_keys][Alerts-alerts_invalidate_api_keys] Failed to validate the task's state. Allowing read operation to proceed because allow_reading_invalid_state is true. Error: [runs]: expected value of type [number] but got [undefined]
```

#### Scenario 5: Reading invalid state when setting
`allow_reading_invalid_state: false` causes tasks to fail to run
1. Set `xpack.task_manager.allow_reading_invalid_state: false` in your
kibana.yml settings
2. Run the following request to Elasticsearch
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z",
      "state": "{}"
    }
  }
}
```
3. Observe the Kibana error log mentioning the validation failure
```
[ERROR][plugins.taskManager] Failed to poll for work: Error: [runs]: expected value of type [number] but got [undefined]
```

NOTE: While corrupting the task directly is rare, we plan to re-queue
the tasks that failed to read, leveraging work from
#159302 in a future PR (hence
why the yml config is enabled by default, allowing invalid reads).

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Ying Mao <ying.mao@elastic.co>
kibanamachine pushed a commit to kibanamachine/kibana that referenced this issue Jun 23, 2023
…ate objects (elastic#159048)

Part of elastic#155764.

In this PR, I'm modifying task manager to allow task types to report a
versioned schema for the `state` object. When defining
`stateSchemaByVersion`, the following will happen:
- The `state` returned from the task runner will get validated against
the latest version and throw an error if ever it is invalid (to capture
mismatches at development and testing time)
- When task manager reads a task, it will migrate the task state to the
latest version (if necessary) and validate against the latest schema,
dropping any unknown fields (in the scenario of a downgrade).

By default, Task Manager will validate the state on write once a
versioned schema is provided, however the following config must be
enabled for errors to be thrown on read:
`xpack.task_manager.allow_reading_invalid_state: true`. We plan to
enable this in serverless by default but be cautious on existing
deployments and wait for telemetry to show no issues.

I've onboarded the `alerts_invalidate_api_keys` task type which can be
used as an example to onboard others. See [this
commit](elastic@214bae3).

### How to configure a task type to version and validate
The structure is defined as:
```
taskManager.registerTaskDefinitions({
  ...
  stateSchemaByVersion: {
    1: {
      // All existing tasks don't have a version so will get `up` migrated to 1
      up: (state: Record<string, unknown>) => ({
        runs: state.runs || 0,
        total_invalidated: state.total_invalidated || 0,
      }),
      schema: schema.object({
        runs: schema.number(),
        total_invalidated: schema.number(),
      }),
    },
  },
  ...
});
```

However, look at [this
commit](elastic@214bae3)
for an example that you can leverage type safety from the schema.

### Follow up issues
- Onboard non-alerting task types to have a versioned state schema
(elastic#159342)
- Onboard alerting task types to have a versioned state schema for the
framework fields (elastic#159343)
- Onboard alerting task types to have a versioned rule and alert state
schema within the task state
(elastic#159344)
- Telemetry on the validation failures
(elastic#159345)
- Remove feature flag so `allow_reading_invalid_state` is always `false`
(elastic#159346)
- Force validation on all tasks using state by removing the exemption
code (elastic#159347)
- Release tasks when encountering a validation failure after run
(elastic#159964)

### To Verify

NOTE: I have the following verification scenarios in a jest integration
test as well =>
https://github.com/elastic/kibana/pull/159048/files#diff-5f06228df58fa74d5a0f2722c30f1f4bee2ee9df7a14e0700b9aa9bc3864a858.

You will need to log the state when the task runs to observe what the
task runner receives in different scenarios.

```
diff --git a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
index 1e624bcd807..4aa4c2c7805 100644
--- a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
+++ b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
@@ -140,6 +140,7 @@ function taskRunner(
 ) {
   return ({ taskInstance }: RunContext) => {
     const state = taskInstance.state as LatestTaskStateSchema;
+    console.log('*** Running task with the following state:', JSON.stringify(state));
     return {
       async run() {
         let totalInvalidated = 0;
```

#### Scenario 1: Adding an unknown field to the task saved-object gets
dropped
1. Startup a fresh Kibana instance
2. Make the following call to Elasticsearch (I used postman). This call
adds an unknown property (`foo`) to the task state and makes the task
run right away.
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z",
      "state": "{\"runs\":1,\"total_invalidated\":0,\"foo\":true}"
    }
  }
}
```
3. Observe the task run log message, with state not containing `foo`.

#### Scenario 2: Task running returning an unknown property causes the
task to fail to update
1. Apply the following changes to the code (and ignore TypeScript
issues)
```
diff --git a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
index 1e624bcd807..b15d4a4f478 100644
--- a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
+++ b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
@@ -183,6 +183,7 @@ function taskRunner(

           const updatedState: LatestTaskStateSchema = {
             runs: (state.runs || 0) + 1,
+            foo: true,
             total_invalidated: totalInvalidated,
           };
           return {
```
2. Make the task run right away by calling Elasticsearch with the
following
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z"
    }
  }
}
```
3. Notice the validation errors logged as debug
```
[ERROR][plugins.taskManager] Task alerts_invalidate_api_keys "Alerts-alerts_invalidate_api_keys" failed: Error: [foo]: definition for this key is missing
```

#### Scenario 3: Task state gets migrated
1. Apply the following code change
```
diff --git a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
index 1e624bcd807..338f21bed5b 100644
--- a/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
+++ b/x-pack/plugins/alerting/server/invalidate_pending_api_keys/task.ts
@@ -41,6 +41,18 @@ const stateSchemaByVersion = {
       total_invalidated: schema.number(),
     }),
   },
+  2: {
+    up: (state: Record<string, unknown>) => ({
+      runs: state.runs,
+      total_invalidated: state.total_invalidated,
+      foo: true,
+    }),
+    schema: schema.object({
+      runs: schema.number(),
+      total_invalidated: schema.number(),
+      foo: schema.boolean(),
+    }),
+  },
 };

 const latestSchema = stateSchemaByVersion[1].schema;
```

2. Make the task run right away by calling Elasticsearch with the
following
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z"
    }
  }
}
```
3. Observe the state now contains `foo` property when the task runs.

#### Scenario 4: Reading invalid state causes debug logs
1. Run the following request to Elasticsearch
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z",
      "state": "{}"
    }
  }
}
```
2. Observe the Kibana debug log mentioning the validation failure while
letting the task through
```
[DEBUG][plugins.taskManager] [alerts_invalidate_api_keys][Alerts-alerts_invalidate_api_keys] Failed to validate the task's state. Allowing read operation to proceed because allow_reading_invalid_state is true. Error: [runs]: expected value of type [number] but got [undefined]
```

#### Scenario 5: Reading invalid state when setting
`allow_reading_invalid_state: false` causes tasks to fail to run
1. Set `xpack.task_manager.allow_reading_invalid_state: false` in your
kibana.yml settings
2. Run the following request to Elasticsearch
```
POST http://kibana_system:changeme@localhost:9200/.kibana_task_manager/_update/task:Alerts-alerts_invalidate_api_keys
{
  "doc": {
    "task": {
      "runAt": "2023-06-08T00:00:00.000Z",
      "state": "{}"
    }
  }
}
```
3. Observe the Kibana error log mentioning the validation failure
```
[ERROR][plugins.taskManager] Failed to poll for work: Error: [runs]: expected value of type [number] but got [undefined]
```

NOTE: While corrupting the task directly is rare, we plan to re-queue
the tasks that failed to read, leveraging work from
elastic#159302 in a future PR (hence
why the yml config is enabled by default, allowing invalid reads).

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
Co-authored-by: Ying Mao <ying.mao@elastic.co>
(cherry picked from commit 40c2afd)
mikecote added a commit that referenced this issue Jul 13, 2023
… validation (#161581)

Part of #159342.

In this PR, I'm preparing the non-alerting (rule types) response ops
task types for serverless by defining an explicit task state schema.
This schema is used to validate the task's state before saving but also
when reading. In the scenario an older Kibana node runs a task after a
newer Kibana node has stored additional task state, the unknown state
properties will be dropped. Additionally, this will prompt developers to
be aware that adding required fields to the task state is a breaking
change that must be handled with care. (see
#155764).

For more information on how to use `stateSchemaByVersion`, see
#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Jul 17, 2023
…sk state validation (#161584)

Part of #159342.

In this PR, I'm preparing the `ML:saved-objects-sync-task` for
serverless by defining an explicit task state schema. This schema is
used to validate the task's state before saving the task but also when
reading the task. In the scenario an older Kibana node runs a task after
a newer Kibana node has stored additional task state, the unknown state
properties will be dropped. Additionally, this will prompt developers to
be aware that adding required fields to the task state is a breaking
change that must be handled with care. (see
#155764).

For more information on how to use `stateSchemaByVersion`, see
#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: James Gowdy <jgowdy@elastic.co>
Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Jul 20, 2023
…or task state validation (#161747)

Part of #159342.

In this PR, I'm preparing the security solution related tasks for
serverless by defining an explicit task state schema. This schema is
used to validate the task's state before saving the task but also when
reading the task. In the scenario an older Kibana node runs a task after
a newer Kibana node has stored additional task state, the unknown state
properties will be dropped. Additionally, this will prompt developers to
be aware that adding required fields to the task state is a breaking
change that must be handled with care. (see
#155764).

For more information on how to use `stateSchemaByVersion`, see
#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Jul 20, 2023
… task state validation (#161855)

Part of #159342.

In this PR, I'm preparing the `dashboard_telemetry` for serverless by
defining an explicit task state schema. This schema is used to validate
the task's state before saving the task but also when reading the task.
In the scenario an older Kibana node runs a task after a newer Kibana
node has stored additional task state, the unknown state properties will
be dropped. Additionally, this will prompt developers to be aware that
adding required fields to the task state is a breaking change that must
be handled with care. (see
#155764).

For more information on how to use `stateSchemaByVersion`, see
#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Jul 20, 2023
…sion for task state validation (#161740)

Part of #159342.

In this PR, I'm preparing the `cloud_security_posture-findings_stats`
task type for serverless by defining an explicit task state schema. This
schema is used to validate the task's state before saving the task but
also when reading the task. In the scenario an older Kibana node runs a
task after a newer Kibana node has stored additional task state, the
unknown state properties will be dropped. Additionally, this will prompt
developers to be aware that adding required fields to the task state is
a breaking change that must be handled with care. (see
#155764).

For more information on how to use `stateSchemaByVersion`, see
#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
ThomThomson pushed a commit to ThomThomson/kibana that referenced this issue Aug 1, 2023
…or task state validation (elastic#161747)

Part of elastic#159342.

In this PR, I'm preparing the security solution related tasks for
serverless by defining an explicit task state schema. This schema is
used to validate the task's state before saving the task but also when
reading the task. In the scenario an older Kibana node runs a task after
a newer Kibana node has stored additional task state, the unknown state
properties will be dropped. Additionally, this will prompt developers to
be aware that adding required fields to the task state is a breaking
change that must be handled with care. (see
elastic#155764).

For more information on how to use `stateSchemaByVersion`, see
elastic#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
ThomThomson pushed a commit to ThomThomson/kibana that referenced this issue Aug 1, 2023
… task state validation (elastic#161855)

Part of elastic#159342.

In this PR, I'm preparing the `dashboard_telemetry` for serverless by
defining an explicit task state schema. This schema is used to validate
the task's state before saving the task but also when reading the task.
In the scenario an older Kibana node runs a task after a newer Kibana
node has stored additional task state, the unknown state properties will
be dropped. Additionally, this will prompt developers to be aware that
adding required fields to the task state is a breaking change that must
be handled with care. (see
elastic#155764).

For more information on how to use `stateSchemaByVersion`, see
elastic#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
ThomThomson pushed a commit to ThomThomson/kibana that referenced this issue Aug 1, 2023
…sion for task state validation (elastic#161740)

Part of elastic#159342.

In this PR, I'm preparing the `cloud_security_posture-findings_stats`
task type for serverless by defining an explicit task state schema. This
schema is used to validate the task's state before saving the task but
also when reading the task. In the scenario an older Kibana node runs a
task after a newer Kibana node has stored additional task state, the
unknown state properties will be dropped. Additionally, this will prompt
developers to be aware that adding required fields to the task state is
a breaking change that must be handled with care. (see
elastic#155764).

For more information on how to use `stateSchemaByVersion`, see
elastic#159048 and
https://github.com/elastic/kibana/blob/main/x-pack/plugins/task_manager/README.md.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Aug 16, 2023
…63743)

Part of #155764.

In this PR, I'm setting the
`xpack.task_manager.allow_reading_invalid_state` serverless setting to
`false` so Kibana doesn't allow reading invalid state when loading
tasks.

I'm also doing the same for the functional tests to ensure valid task
state is always read.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Aug 29, 2023
…idation of the framework fields (#162425)

Resolves #159343

In this PR, I'm preparing the alerting rule task types for serverless by
defining an explicit task state schema for the framework level fields.
This schema is used to validate the task's state before saving but also
when reading. In the scenario an older Kibana node runs a task after a
newer Kibana node has stored additional task state, the unknown state
properties will be dropped. Additionally, this will prompt developers to
be aware that adding required fields to the task state is a breaking
change that must be handled with care. (see
#155764).

The PR also includes the following changes:
- Modifying the `@kbn/alerting-state-types` package so the types are
generated from `config-schema`, instead of `io-ts`. The schema is
re-used for exporting a new `stateSchemaByVersion` property.
- Removing `DateFromString` in favour of strings everywhere
(config-schema doesn't support this conversion)
- Add a v1 `up` migration that will ensure the types match the
TypeScript interface on any existing alerting task. The migration
assumes any type of data could exist and dropping parts that don't match
the type expectation. The TypeScript interface uses `schema.maybe()` in
a lot of places (as `io-ts` did), so safe if ever data gets dropped.
- Cleanup the `alerting/common/**` exports to reduce bundle size.
Because the `@kbn/alerting-state-types` package grew.
- Since the new TypeScript interfaces / types are `ReadOnly<...>`, I
created some `Mutable...` types for places that needed it (in order to
avoid code refactoring).

## To verify

Stack Monitoring:
- To make TypeScript happy with the new ReadOnly `RawAlertInstance`
type, I removed some redundant code and solved the issue.

Security Solution:
- Changes to the `alertInstanceFactoryStub` set the alert's date to a
`string` instead of a `Date` value. Note: The HTTP API response
converted `Date` objects to `string`, so the HTTP API response will look
the same with this change.

Response Ops:
- In a fresh Kibana install, create alerting rules and ensure they run.
- In a 8.9 version, create some rules, upgrade to this branch and ensure
they still run.
- Compare the `io-ts` definition with the new `config-schema`. They
should match 1:1.
- Look for ways the migration code could fail.

Note: The main changes are within the following areas:
- `x-pack/plugins/alerting/server/rule_type_registry.ts`
- `x-pack/packages/kbn-alerting-state-types/*`

---------

Co-authored-by: kibanamachine <42973632+kibanamachine@users.noreply.github.com>
mikecote added a commit that referenced this issue Sep 26, 2023
Resolves #159347
Part of #155764

In this PR, I'm forcing validation on any tasks using state by ensuring
a state schema is defined by the task type. Without this schema and once
this PR merges, Task Manager will now fail validation by throwing
`[TaskValidator] stateSchemaByVersion not defined for task type: XYZ`
errors. This forces an explicit schema to be defined so we can properly
handle state objects in ZDT environments.

---------

Co-authored-by: Kibana Machine <42973632+kibanamachine@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature:Task Manager Meta Project:Serverless MVP Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants