Add select_entries processor #4147

Merged 2 commits on Feb 25, 2024

kkondaka
Collaborator

Description

Added a select_entries processor that keeps only the configured entries of each input event and drops the rest.

Configuration

  processor:
    - select_entries:
        include_keys: ["key1", "key2"]
        select_when: '/key1 == "K1"'

If the input event is as follows:

{"key1" : "K1", "key2": "K2", "key3": "K3", "message": "new message"}

then the output event would be:

{"key1" : "K1", "key2": "K2"}
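The selection described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the processor's actual code: the class `SelectEntriesSketch`, the method `selectEntries`, and the use of a `Predicate` in place of a `select_when` expression are all hypothetical, and passing non-matching events through unchanged is an assumption about the intended behavior.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical stand-in for the processor logic; not the Data Prepper implementation.
class SelectEntriesSketch {

    // Returns a new map holding only includeKeys when the condition matches.
    // Assumption: events that do not match selectWhen pass through unchanged.
    static Map<String, Object> selectEntries(final Map<String, Object> event,
                                             final List<String> includeKeys,
                                             final Predicate<Map<String, Object>> selectWhen) {
        if (selectWhen != null && !selectWhen.test(event)) {
            return event;
        }
        final Map<String, Object> selected = new LinkedHashMap<>();
        for (final String keyToInclude : includeKeys) {
            // Keys missing from the event are skipped rather than added as null.
            if (event.containsKey(keyToInclude)) {
                selected.put(keyToInclude, event.get(keyToInclude));
            }
        }
        return selected;
    }

    public static void main(final String[] args) {
        final Map<String, Object> event = new LinkedHashMap<>();
        event.put("key1", "K1");
        event.put("key2", "K2");
        event.put("key3", "K3");
        event.put("message", "new message");

        // The predicate mirrors the select_when condition from the configuration.
        final Map<String, Object> out = selectEntries(
                event, List.of("key1", "key2"), e -> "K1".equals(e.get("key1")));
        System.out.println(out);
    }
}
```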

Issues Resolved

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • [X] New functionality includes testing.
  • New functionality has a documentation issue. Please link to it in this PR.
    • New functionality has javadoc added
  • [X] Commits are signed with a real name per the DCO

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Krishna Kondaka <krishkdk@dev-dsk-krishkdk-2c-bd29c437.us-west-2.amazon.com>
super(pluginMetrics);
this.entries = Arrays.asList(config.getIncludeKeys());
this.selectWhen = config.getSelectWhen();
this.expressionEvaluator = expressionEvaluator;

Member

Please validate the selectWhen in this block.

See for example:

if (decompressProcessorConfig.getDecompressWhen() != null
        && !expressionEvaluator.isValidExpressionStatement(decompressProcessorConfig.getDecompressWhen())) {
    throw new InvalidPluginConfigurationException(
            String.format("decompress_when value of %s is not a valid expression statement. " +
                    "See https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/ for valid expression syntax.", decompressProcessorConfig.getDecompressWhen()));
}
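
Applied to select_when, the same check might look roughly like the sketch below. The nested `ExpressionEvaluator` interface and `InvalidPluginConfigurationException` class are local stand-ins declared only so the example compiles on its own; the real types live in Data Prepper. The helper name `validateSelectWhen` and the toy evaluator in `main` are hypothetical.

```java
// Sketch of failing fast on an invalid select_when. All types here are local stand-ins.
class SelectWhenValidationSketch {

    // Stand-in for Data Prepper's ExpressionEvaluator; only the method used in the
    // decompress example above is declared.
    interface ExpressionEvaluator {
        boolean isValidExpressionStatement(String statement);
    }

    // Stand-in for Data Prepper's InvalidPluginConfigurationException.
    static class InvalidPluginConfigurationException extends RuntimeException {
        InvalidPluginConfigurationException(final String message) {
            super(message);
        }
    }

    // Hypothetical helper: throws if select_when is set but is not a valid expression.
    // A null select_when means "no condition" and is accepted.
    static void validateSelectWhen(final String selectWhen, final ExpressionEvaluator expressionEvaluator) {
        if (selectWhen != null && !expressionEvaluator.isValidExpressionStatement(selectWhen)) {
            throw new InvalidPluginConfigurationException(
                    String.format("select_when value of %s is not a valid expression statement. "
                            + "See https://opensearch.org/docs/latest/data-prepper/pipelines/expression-syntax/ "
                            + "for valid expression syntax.", selectWhen));
        }
    }

    public static void main(final String[] args) {
        // Toy evaluator for demonstration only: treats any statement containing "==" as valid.
        final ExpressionEvaluator toyEvaluator = statement -> statement.contains("==");

        validateSelectWhen("/key1 == \"K1\"", toyEvaluator);
        try {
            validateSelectWhen("/key1 = ", toyEvaluator);
        } catch (final InvalidPluginConfigurationException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Calling such a helper from the constructor would surface a bad expression when the pipeline is built rather than when the first event arrives.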

// Delete all entries from the event
Set keysToDelete = recordEvent.toMap().keySet();
Iterator iter = keysToDelete.iterator();
while (iter.hasNext()) {

Member

It might make more sense to have a clear() method. We could add that.

Collaborator Author

I thought about it. OK, I will add that to JacksonEvent.

But is this the optimal way, or is it better to just create a new event with the entries of interest?

Member

I'm not sure which way is optimal from a performance perspective. But, in terms of tracking all the metadata and keeping the handles, wouldn't clearing and setting be more straightforward?
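
A rough sketch of the clear-and-set approach discussed here, to make the trade-off concrete. `SimpleEvent` is a hypothetical, map-backed stand-in for an event, not JacksonEvent, and its `clear()` method is the proposed addition rather than an existing API. The point it illustrates is that the same event object, and whatever metadata hangs off it, survives the selection.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration of "clear, then set the selected entries" on a toy event type.
class ClearAndSetSketch {

    // Hypothetical minimal event: entry data plus metadata that must survive selection.
    static class SimpleEvent {
        private final Map<String, Object> data = new HashMap<>();
        private final Map<String, Object> metadata = new HashMap<>();

        void put(final String key, final Object value) {
            data.put(key, value);
        }

        Object get(final String key) {
            return data.get(key);
        }

        Map<String, Object> toMap() {
            return new HashMap<>(data);
        }

        Map<String, Object> getMetadata() {
            return metadata;
        }

        // Hypothetical clear(): removes every entry but leaves metadata untouched.
        void clear() {
            data.clear();
        }
    }

    static void selectInPlace(final SimpleEvent event, final List<String> includeKeys) {
        // Copy the wanted values into a temporary map first so clearing does not lose them.
        final Map<String, Object> outMap = new HashMap<>();
        for (final String keyToInclude : includeKeys) {
            final Object value = event.get(keyToInclude);
            if (value != null) {
                outMap.put(keyToInclude, value);
            }
        }
        event.clear();
        outMap.forEach(event::put);
    }

    public static void main(final String[] args) {
        final SimpleEvent event = new SimpleEvent();
        event.put("key1", "K1");
        event.put("key2", "K2");
        event.put("key3", "K3");
        event.getMetadata().put("eventType", "LOG");

        selectInPlace(event, List.of("key1", "key2"));
        System.out.println(event.toMap() + " metadata=" + event.getMetadata());
    }
}
```

Creating a new event instead would require copying the metadata and any handles across explicitly, which is the bookkeeping the comment above is concerned about.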

@NotEmpty
@NotNull
@JsonProperty("include_keys")
private String[] includeKeys;

Member

Why does this need to be String[]? Why not just make it a List?

Collaborator Author

I just copied from DeleteEntries processor code :-) I guess I can make it a list.

@DataPrepperPluginConstructor
public SelectEntriesProcessor(final PluginMetrics pluginMetrics, final SelectEntriesProcessorConfig config, final ExpressionEvaluator expressionEvaluator) {
super(pluginMetrics);
this.entries = Arrays.asList(config.getIncludeKeys());

Member

If include_keys is a List, this conversion is unnecessary.

// To handle nested case, just get the values and store
// in a temporary map.
Map<String, Object> outMap = new HashMap<>();
for (String entry: entries) {

Member

Can we rename this? For example: for (String keyToInclude : config.getIncludeKeys())

Signed-off-by: Krishna Kondaka <krishkdk@dev-dsk-krishkdk-2c-bd29c437.us-west-2.amazon.com>
@kkondaka kkondaka merged commit ea8d5f5 into opensearch-project:main Feb 25, 2024
47 checks passed
@dlvenable dlvenable added this to the v2.7 milestone Mar 19, 2024
@kkondaka kkondaka deleted the select-entries branch May 13, 2024 05:52