Add list to map processor (opensearch-project#3806)
* Add list to map processor.

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Tweak one last file

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Fix typo

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>

* Update mutate-event.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Add Chris' feedback

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* A couple more wording tweaks

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Nathan Bower <nbower@amazon.com>

* Update _data-prepper/pipelines/configuration/processors/list-to-map.md

Co-authored-by: Nathan Bower <nbower@amazon.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower <nbower@amazon.com>

---------

Signed-off-by: Naarcha-AWS <naarcha@amazon.com>
Co-authored-by: Hai Yan <8153134+oeyh@users.noreply.github.com>
Co-authored-by: Chris Moore <107723039+cwillum@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
4 people committed Apr 20, 2023
1 parent 586cb3d commit e1a1f44
Showing 26 changed files with 657 additions and 428 deletions.
61 changes: 49 additions & 12 deletions _data-prepper/pipelines/configuration/processors/add-entries.md
@@ -1,25 +1,62 @@
---
layout: default
title: Add entries processor
parent: Processors
grand_parent: Pipelines
nav_order: 40
---

# add_entries

The `add_entries` processor adds entries to an event.

## Configuration

You can configure the `add_entries` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `entries` | Yes | A list of entries to add to an event. |
| `key` | Yes | The key of the new entry to be added. Some examples of keys include `my_key`, `myKey`, and `object/sub_Key`. |
| `value` | Yes | The value of the new entry to be added. You can use the following data types: strings, Booleans, numbers, null, nested objects, and arrays. |
| `overwrite_if_key_exists` | No | When set to `true`, the existing value is overwritten if `key` already exists in the event. The default value is `false`. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - add_entries:
        entries:
        - key: "newMessage"
          value: 3
          overwrite_if_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}


Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `add_entries` processor, if the `logs_json.log` file contains the following event record:

```json
{"message": "hello"}
```

When you run the `add_entries` processor using the previous configuration, it adds the new entry `{"newMessage": 3}` to the existing event `{"message": "hello"}` so that the final output contains two entries:

```json
{"message": "hello", "newMessage": 3}
```

> If `newMessage` already exists, its existing value is overwritten with a value of `3`.
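
The table above lists `object/sub_Key` as an example key. As a sketch of that form (assuming that slash-delimited keys create nested objects, and using a hypothetical `object/sub_key` entry):

```yaml
processor:
  - add_entries:
      entries:
      - key: "object/sub_key"   # hypothetical slash-delimited key
        value: "nested"
```

With an input event of `{"message": "hello"}`, this configuration would be expected to produce `{"message": "hello", "object": {"sub_key": "nested"}}`.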
11 changes: 7 additions & 4 deletions _data-prepper/pipelines/configuration/processors/aggregate.md
@@ -1,16 +1,19 @@
---
layout: default
title: Aggregate processor
parent: Processors
grand_parent: Pipelines
nav_order: 41
---

# aggregate

The `aggregate` processor groups events based on the keys provided and performs an action on each group.

## Configuration

The following table describes the options you can use to configure the `aggregate` processor.

Option | Required | Type | Description
:--- | :--- | :--- | :---
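
The options table is truncated in this view. As a usage sketch, the following pipeline groups events by two hypothetical fields and drops duplicates within each group; the `identification_keys`, `action`, and `group_duration` option names are taken from the Data Prepper aggregate plugin and should be verified against the full table:

```yaml
aggregate-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - aggregate:
        identification_keys: ["sourceIp", "destinationIp"]  # events sharing these values form one group
        action:
          remove_duplicates:                                # keep only the first event in each group
        group_duration: "30s"                               # how long a group collects events before concluding
  sink:
    - stdout:
```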
@@ -0,0 +1,55 @@
---
layout: default
title: Convert entry type processor
parent: Processors
grand_parent: Pipelines
nav_order: 47
---

# convert_entry_type

The `convert_entry_type` processor converts the type of the value associated with the specified key in an event to the specified type. It is a casting processor that changes the types of certain fields in events. Some data must be converted to a different type, such as an integer to a double or a string to an integer, so that the events can be evaluated by condition-based processors or used for conditional routing.

## Configuration

You can configure the `convert_entry_type` processor with the following options.

| Option | Required | Description |
| :--- | :--- | :--- |
| `key` | Yes | The key whose value needs to be converted to a different type. |
| `type` | No | Target type for the key-value pair. Possible values are `integer`, `double`, `string`, and `Boolean`. Default value is `integer`. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
type-conv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - convert_entry_type:
        key: "response_status"
        type: "integer"
  sink:
    - stdout:
```
{% include copy.html %}

Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `convert_entry_type` processor, if the `logs_json.log` file contains the following event record:


```json
{"message": "value", "response_status":"200"}
```

The `convert_entry_type` processor produces the following output, in which the type of the `response_status` value changes from a string to an integer:

```json
{"message":"value","response_status":200}
```
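
As an additional sketch (the `latency` field is hypothetical), the same processor can cast a string to a double:

```yaml
processor:
  - convert_entry_type:
      key: "latency"    # hypothetical field holding a numeric string
      type: "double"
```

An input of `{"latency": "0.254"}` would be expected to produce `{"latency": 0.254}`.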
60 changes: 48 additions & 12 deletions _data-prepper/pipelines/configuration/processors/copy-values.md
@@ -1,24 +1,60 @@
---
layout: default
title: Copy values processor
parent: Processors
grand_parent: Pipelines
nav_order: 48
---

# copy_values

The `copy_values` processor copies values within an event and is a [mutate event]({{site.url}}{{site.baseurl}}/data-prepper/pipelines/configuration/processors/mutate-event/) processor.

## Configuration

You can configure the `copy_values` processor with the following options.

| Option | Required | Description |
:--- | :--- | :---
| `entries` | Yes | A list of entries to be copied in an event. |
| `from_key` | Yes | The key of the entry to be copied. |
| `to_key` | Yes | The key of the new entry to be added. |
| `overwrite_if_to_key_exists` | No | When set to `true`, the existing value is overwritten if `to_key` already exists in the event. The default value is `false`. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - copy_values:
        entries:
        - from_key: "message"
          to_key: "newMessage"
          overwrite_if_to_key_exists: true
  sink:
    - stdout:
```
{% include copy.html %}

Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `copy_values` processor, if the `logs_json.log` file contains the following event record:

```json
{"message": "hello"}
```

When you run this processor, it copies the value of `message` into `newMessage`, producing the following output:

```json
{"message": "hello", "newMessage": "hello"}
```

> If `newMessage` already exists, its existing value is overwritten with the value of `message`.
10 changes: 6 additions & 4 deletions _data-prepper/pipelines/configuration/processors/csv.md
@@ -1,16 +1,18 @@
---
layout: default
title: CSV processor
parent: Processors
grand_parent: Pipelines
nav_order: 49
---

# csv

The `csv` processor parses comma-separated values (CSVs) from the event into columns.
## Configuration

The following table describes the options you can use to configure the `csv` processor.

Option | Required | Type | Description
:--- | :--- | :--- | :---
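
The options table is truncated in this view. As a usage sketch (assuming the `source` and `column_names` options from the full table; the field names are hypothetical), the following pipeline parses a CSV string into named columns:

```yaml
csv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - csv:
        source: "message"                       # the field containing raw CSV text
        column_names: ["ip", "user", "status"]  # names assigned to the parsed columns
  sink:
    - stdout:
```

An input event of `{"message": "10.0.0.1,alice,200"}` would be expected to gain the entries `{"ip": "10.0.0.1", "user": "alice", "status": "200"}`.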
11 changes: 7 additions & 4 deletions _data-prepper/pipelines/configuration/processors/date.md
@@ -1,16 +1,19 @@
---
layout: default
title: Date
parent: Processors
grand_parent: Pipelines
nav_order: 50
---

# date

The `date` processor adds a default timestamp to an event, parses timestamp fields, and converts timestamp information to the International Organization for Standardization (ISO) 8601 format. This timestamp information can be used as an event timestamp.

## Configuration

The following table describes the options you can use to configure the `date` processor.

Option | Required | Type | Description
:--- | :--- | :--- | :---
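
The options table is truncated in this view. As a usage sketch (assuming the `match` and `destination` options from the full table, with a hypothetical `timestamp` field), the following pipeline parses an Apache-style timestamp and writes the ISO 8601 result to `@timestamp`:

```yaml
date-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - date:
        match:
          - key: "timestamp"                      # the field holding the raw timestamp
            patterns: ["dd/MMM/yyyy:HH:mm:ss Z"]  # formats to try, in order
        destination: "@timestamp"                 # where the ISO 8601 timestamp is written
  sink:
    - stdout:
```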
49 changes: 41 additions & 8 deletions _data-prepper/pipelines/configuration/processors/delete-entries.md
@@ -3,19 +3,52 @@ layout: default
title: delete_entries
parent: Processors
grand_parent: Pipelines
nav_order: 51
---

# delete_entries

The `delete_entries` processor deletes entries, such as key-value pairs, from an event. You can define the keys you want to delete in the `with_keys` field following `delete_entries` in the YAML configuration file. Those keys and their values are deleted.
## Configuration

You can configure the `delete_entries` processor with the following options.

| Option | Required | Description |
:--- | :--- | :---
| `with_keys` | Yes | An array of keys for the entries to be deleted. |

## Usage

To get started, create the following `pipeline.yaml` file:

```yaml
pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - delete_entries:
        with_keys: ["message"]
  sink:
    - stdout:
```
{% include copy.html %}

Next, create a log file named `logs_json.log` and replace the `path` in the file source of your `pipeline.yaml` file with that filepath. For more information, see [Configuring Data Prepper]({{site.url}}{{site.baseurl}}/data-prepper/getting-started/#2-configuring-data-prepper).

For example, before you run the `delete_entries` processor, if the `logs_json.log` file contains the following event record:

```json
{"message": "hello", "message2": "goodbye"}
```

When you run the `delete_entries` processor, it removes the `message` key, producing the following output:

```json
{"message2": "goodbye"}
```

> If `message` does not exist in the event, then no action occurs.
@@ -1,14 +1,13 @@
---
layout: default
title: Drop events processor
parent: Processors
grand_parent: Pipelines
nav_order: 52
---

# drop_events

The `drop_events` processor drops all the events that are passed into it. The following table describes when events are dropped and how exceptions for dropping events are handled.

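
The table is truncated in this view. As a sketch of conditional dropping (assuming the `drop_when` option described in the full table and a hypothetical `response_status` field), the following configuration drops events whose status is 200:

```yaml
processor:
  - drop_events:
      drop_when: "/response_status == 200"   # a Data Prepper conditional expression
```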
9 changes: 6 additions & 3 deletions _data-prepper/pipelines/configuration/processors/grok.md
@@ -3,14 +3,17 @@ layout: default
title: grok
parent: Processors
grand_parent: Pipelines
nav_order: 53
---

# grok

The `grok` processor takes unstructured data and uses pattern matching to structure and extract important keys.

## Configuration

The following table describes the options you can use with the `grok` processor to structure your data and make it easier to query.

Option | Required | Type | Description
:--- | :--- | :--- | :---
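
The options table is truncated in this view. As a usage sketch (assuming the `match` option, with a hypothetical log line), the following pipeline extracts structured keys from an unstructured message:

```yaml
grok-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - grok:
        match:
          message: ['%{IPORHOST:clientip} %{WORD:method} %{NUMBER:status}']  # patterns applied to the message field
  sink:
    - stdout:
```

An input of `{"message": "10.0.0.1 GET 200"}` would be expected to produce `{"message": "10.0.0.1 GET 200", "clientip": "10.0.0.1", "method": "GET", "status": "200"}`.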
5 changes: 2 additions & 3 deletions _data-prepper/pipelines/configuration/processors/key-value.md
@@ -1,14 +1,13 @@
---
layout: default
title: Key value processor
parent: Processors
grand_parent: Pipelines
nav_order: 54
---

# key_value

The `key_value` processor parses a field into key-value pairs. The following table describes the `key_value` processor options that you can use to parse field information into pairs.

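
The options table is truncated in this view. As a usage sketch (assuming the `source`, `field_split_characters`, and `value_split_characters` options from the full table), the following pipeline splits a query-string-style message into pairs:

```yaml
kv-pipeline:
  source:
    file:
      path: "/full/path/to/logs_json.log"
      record_type: "event"
      format: "json"
  processor:
    - key_value:
        source: "message"              # the field containing the raw key-value text
        field_split_characters: "&"    # separates one pair from the next
        value_split_characters: "="    # separates a key from its value
  sink:
    - stdout:
```

An input of `{"message": "key1=value1&key2=value2"}` would be expected to produce a parsed map such as `{"key1": "value1", "key2": "value2"}` in the processor's destination field.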