Add list to map processor (opensearch-project#3806)
* Add list to map processor.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Tweak one last file

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix typo

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update mutate-event.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Add Chris' feedback

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* A couple more wording tweaks

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Nathan Bower <nbower@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Nathan Bower <nbower@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
4 people committed Apr 20, 2023
1 parent 586cb3d commit e1a1f44
Showing 26 changed files with 657 additions and 428 deletions.
61 changes: 49 additions & 12 deletions _data-prepper/pipelines/configuration/processors/add-entries.md
@@ -1,25 +1,62 @@
---
layout: default
title: Add entries processor
parent: Processors
grand_parent: Pipelines
nav_order: 40
---

# add_entries

The `add_entries` processor adds entries to an event.

## Configuration

You can configure the `add_entries` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `entries` | Yes | A list of entries to add to an event. |
| `key` | Yes | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. |
| `value` | Yes | The value of the new entry to be added. You can use the following data types: strings, Booleans, numbers, null, nested objects, and arrays. |
| `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - add_entries:
        entries:
        - key: "newMessage"
          value: 3
          overwrite_if_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}


Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `add_entries` processor, if the `logs_json.log` file contains the following event record:

```json
{"message": "hello"}
```

When you run the `add_entries` processor using the previous configuration, it adds the new entry `{"newMessage": 3}` to the existing event `{"message": "hello"}` so that the final output contains two entries:

```json
{"message": "hello", "newMessage": 3}
```

> If `newMessage` already exists, its existing value is overwritten with a value of `3`.
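
The table above lists `object/sub_Key` as an example key. As a sketch of that form (assuming that slash-delimited keys create nested objects, and using a hypothetical `object/sub_key` entry):

```yaml
processor:
  - add_entries:
      entries:
      - key: "object/sub_key"   # hypothetical slash-delimited key
        value: "nested"
```

With an input event of `{"message": "hello"}`, this configuration would be expected to produce `{"message": "hello", "object": {"sub_key": "nested"}}`.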
11 changes: 7 additions & 4 deletions _data-prepper/pipelines/configuration/processors/aggregate.md
@@ -1,16 +1,19 @@
---
layout: default
title: Aggregate processor
parent: Processors
grand_parent: Pipelines
nav_order: 41
---

# aggregate

The `aggregate` processor groups events based on the keys provided and performs an action on each group.

## Configuration

The following table describes the options you can use to configure the `aggregate` processor.

Option | Required | Type | Description
:--- | :--- | :--- | :---
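
The options table is truncated in this view. As a usage sketch, the following pipeline groups events by two hypothetical fields and drops duplicates within each group; the `identification_keys`, `action`, and `group_duration` option names are taken from the Data Prepper aggregate plugin and should be verified against the full table:

```yaml
aggregate-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - aggregate:
        identification_keys: ["sourceIp", "destinationIp"]  # events sharing these values form one group
        action:
          remove_duplicates:                                # keep only the first event in each group
        group_duration: "30s"                               # how long a group collects events before concluding
  sink:
    - stdout:
```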
@@ -0,0 +1,55 @@
---
layout: default
title: Convert entry type processor
parent: Processors
grand_parent: Pipelines
nav_order: 47
---

# convert_entry_type

The `convert_entry_type` processor converts the type of the value associated with the specified key in an event to the specified type. It is a casting processor that changes the types of certain fields in events. Some data must be converted to a different type, such as an integer to a double or a string to an integer, so that the events can be evaluated by condition-based processors or used for conditional routing.

## Configuration

You can configure the `convert_entry_type` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `key` | Yes | The key whose value needs to be converted to a different type. |
| `type` | No | Target type for the key-value pair. Possible values are `integer`, `double`, `string`, and `Boolean`. Default value is `integer`. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
type-conv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - convert_entry_type:
        key: "response_status"
        type: "integer"
  sink:
    - stdout:
```
{% include copy.html %}

Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `convert_entry_type` processor, if the `logs_json.log` file contains the following event record:


```json
{"message": "value", "response_status":"200"}
```

The `convert_entry_type` processor produces the following output, in which the type of the `response_status` value changes from a string to an integer:

```json
{"message":"value","response_status":200}
```
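
As an additional sketch (the `latency` field is hypothetical), the same processor can cast a string to a double:

```yaml
processor:
  - convert_entry_type:
      key: "latency"    # hypothetical field holding a numeric string
      type: "double"
```

An input of `{"latency": "0.254"}` would be expected to produce `{"latency": 0.254}`.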
60 changes: 48 additions & 12 deletions _data-prepper/pipelines/configuration/processors/copy-values.md
@@ -1,24 +1,60 @@
---
layout: default
title: Copy values processor
parent: Processors
grand_parent: Pipelines
nav_order: 48
---

# copy_values

The `copy_values` processor copies values within an event and is a [mutate event]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/mutate-event/) processor.

## Configuration

You can configure the `copy_values` processor with the following options.

| Option | Required | Description |
:--- | :--- | :---
| `entries` | Yes | A list of entries to be copied in an event. |
| `from_key` | Yes | The key of the entry to be copied. |
| `to_key` | Yes | The key of the new entry to be added. |
| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `to_key` already exists in the event. The default value is `false`. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - copy_values:
        entries:
        - from_key: "message"
          to_key: "newMessage"
          overwrite_if_to_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}

Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `copy_values` processor, if the `logs_json.log` file contains the following event record:

```json
{"message": "hello"}
```

When you run this processor, it copies the value of `message` into `newMessage`, producing the following output:

```json
{"message": "hello", "newMessage": "hello"}
```

> If `newMessage` already exists, its existing value is overwritten with the value of `message`.
10 changes: 6 additions & 4 deletions _data-prepper/pipelines/configuration/processors/csv.md
@@ -1,16 +1,18 @@
---
layout: default
title: CSV processor
parent: Processors
grand_parent: Pipelines
nav_order: 49
---

# csv

The `csv` processor parses comma-separated values (CSVs) from the event into columns.
## Configuration

The following table describes the options you can use to configure the `csv` processor.

Option | Required | Type | Description
:--- | :--- | :--- | :---
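
The options table is truncated in this view. As a usage sketch (assuming the `source` and `column_names` options from the full table; the field names are hypothetical), the following pipeline parses a CSV string into named columns:

```yaml
csv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - csv:
        source: "message"                       # the field containing raw CSV text
        column_names: ["ip", "user", "status"]  # names assigned to the parsed columns
  sink:
    - stdout:
```

An input event of `{"message": "10.0.0.1,alice,200"}` would be expected to gain the entries `{"ip": "10.0.0.1", "user": "alice", "status": "200"}`.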
11 changes: 7 additions & 4 deletions _data-prepper/pipelines/configuration/processors/date.md
@@ -1,16 +1,19 @@
---
layout: default
title: Date
parent: Processors
grand_parent: Pipelines
nav_order: 50
---

# date

The `date` processor adds a default timestamp to an event, parses timestamp fields, and converts timestamp information to the International Organization for Standardization (ISO) 8601 format. This timestamp information can be used as an event timestamp.

## Configuration

The following table describes the options you can use to configure the `date` processor.

Option | Required | Type | Description
:--- | :--- | :--- | :---
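
The options table is truncated in this view. As a usage sketch (assuming the `match` and `destination` options from the full table, with a hypothetical `timestamp` field), the following pipeline parses an Apache-style timestamp and writes the ISO 8601 result to `@timestamp`:

```yaml
date-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - date:
        match:
          - key: "timestamp"                      # the field holding the raw timestamp
            patterns: ["dd/MMM/yyyy:HH:mm:ss Z"]  # formats to try, in order
        destination: "@timestamp"                 # where the ISO 8601 timestamp is written
  sink:
    - stdout:
```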
49 changes: 41 additions & 8 deletions _data-prepper/pipelines/configuration/processors/delete-entries.md
@@ -3,19 +3,52 @@ layout: default
title: delete_entries
parent: Processors
grand_parent: Pipelines
nav_order: 51
---

# delete_entries

The `delete_entries` processor deletes entries, such as key-value pairs, from an event. You can define the keys you want to delete in the `with_keys` field following `delete_entries` in the YAML configuration file. Those keys and their values are deleted.
## Configuration

You can configure the `delete_entries` processor with the following options.

| Option | Required | Description |
:--- | :--- | :---
| `with_keys` | Yes | An array of keys for the entries to be deleted. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - delete_entries:
        with_keys: ["message"]
  sink:
    - stdout:
```
{% include copy.html %}

Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `delete_entries` processor, if the `logs_json.log` file contains the following event record:

```json
{"message": "hello", "message2": "goodbye"}
```

When you run the `delete_entries` processor, it removes the `message` key, producing the following output:

```json
{"message2": "goodbye"}
```

> If `message` does not exist in the event, then no action occurs.
@@ -1,14 +1,13 @@
---
layout: default
title: Drop events processor
parent: Processors
grand_parent: Pipelines
nav_order: 52
---

# drop_events

The `drop_events` processor drops all the events that are passed into it. The following table describes when events are dropped and how exceptions for dropping events are handled.

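
The table is truncated in this view. As a sketch of conditional dropping (assuming the `drop_when` option described in the full table and a hypothetical `response_status` field), the following configuration drops events whose status is 200:

```yaml
processor:
  - drop_events:
      drop_when: "/response_status == 200"   # a Data Prepper conditional expression
```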
9 changes: 6 additions & 3 deletions _data-prepper/pipelines/configuration/processors/grok.md
@@ -3,14 +3,17 @@ layout: default
title: grok
parent: Processors
grand_parent: Pipelines
nav_order: 53
---

# grok

The `grok` processor takes unstructured data and uses pattern matching to structure and extract important keys.

## Configuration

The following table describes the options you can use with the `grok` processor to structure your data and make it easier to query.

Option | Required | Type | Description
:--- | :--- | :--- | :---
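
The options table is truncated in this view. As a usage sketch (assuming the `match` option, with a hypothetical log line), the following pipeline extracts structured keys from an unstructured message:

```yaml
grok-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - grok:
        match:
          message: ['%{IPORHOST:clientip} %{WORD:method} %{NUMBER:status}']  # patterns applied to the message field
  sink:
    - stdout:
```

An input of `{"message": "10.0.0.1 GET 200"}` would be expected to produce `{"message": "10.0.0.1 GET 200", "clientip": "10.0.0.1", "method": "GET", "status": "200"}`.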
5 changes: 2 additions & 3 deletions _data-prepper/pipelines/configuration/processors/key-value.md
@@ -1,14 +1,13 @@
---
layout: default
title: Key value processor
parent: Processors
grand_parent: Pipelines
nav_order: 54
---

# key_value

The `key_value` processor parses a field into key-value pairs. The following table describes the `key_value` processor options that you can use to parse field information into pairs.

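
The options table is truncated in this view. As a usage sketch (assuming the `source`, `field_split_characters`, and `value_split_characters` options from the full table), the following pipeline splits a query-string-style message into pairs:

```yaml
kv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - key_value:
        source: "message"              # the field containing the raw key-value text
        field_split_characters: "&"    # separates one pair from the next
        value_split_characters: "="    # separates a key from its value
  sink:
    - stdout:
```

An input of `{"message": "key1=value1&key2=value2"}` would be expected to produce a parsed map such as `{"key1": "value1", "key2": "value2"}` in the processor's destination field.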