Add "regexp_expand" operator for "text" processor #260

Closed
DpoBoceka opened this issue Aug 21, 2019 · 3 comments · Fixed by #314
Labels
enhancement processors Any tasks or issues relating specifically to processors

Comments

@DpoBoceka
Contributor

DpoBoceka commented Aug 21, 2019

https://godoc.org/regexp#Regexp.Expand lets us use a regexp pattern with named subgroups. It would be nice to be able to create new fields dynamically, based on the name of each subgroup and the value it matched.
Use case: we have a string from which we can extract a few fields with some logic, and we don't want the original field to be replaced with just one value. At the moment we need to copy the initial field into the desired one and then match that copy against our regexp. It would also be very useful to be able to check whether the regexp matched anything at all; otherwise the field keeps its original value, which is awful.
I believe grok can do the trick, but it feels like a workaround and is a bit bulky to use, so I would rather stick to the original regexp syntax, which should work like a charm and may be optimised in future Go releases, improving the performance of the native package.

Jeffail added the enhancement and processors labels Aug 21, 2019
@DpoBoceka
Contributor Author

How should it replace the original message by default? With the JSON structure where keys are subgroup names?

@Jeffail
Member

Jeffail commented Sep 15, 2019

I would use the value field as the template and set the result as the raw output. That's the behavior I would expect based on other operators. When using this processor to extract matches into JSON fields you can use other operators and processors to work the fields into valid JSON to embed in your root document.

@Jeffail
Member

Jeffail commented Nov 14, 2019

The goal of this issue would be to add a regexp_expand operator to the text processor. The field value will be used as the template, such that given the following config:

pipeline:
  processors:
  - text:
      operator: regexp_expand
      arg: "(?m)(?P<key>\\w+):\\s+(?P<value>\\w+)$"
      value: "$key=$value\n"

And an input payload of:

# comment line
option1: value1
option2: value2

# another comment line
option3: value3

The resulting payload would be:

option1=value1
option2=value2
option3=value3
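
For reference, the behaviour described above can be reproduced with the standard library alone. The following is a rough sketch, not the eventual processor code: `expandAll` is a hypothetical helper that applies the template to every match via `Regexp.Expand` and drops unmatched text. Whether the output ends with a trailing newline depends on the template.

```go
package main

import (
	"fmt"
	"regexp"
)

// expandAll is a hypothetical helper: it expands template once per match of
// pattern in input and concatenates the results. Text outside the matches
// (such as the comment lines) is not copied to the output.
func expandAll(pattern, template string, input []byte) ([]byte, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []byte
	for _, loc := range re.FindAllSubmatchIndex(input, -1) {
		// Expand appends the template to out, replacing $name with the
		// text captured by the subgroup of that name.
		out = re.Expand(out, []byte(template), input, loc)
	}
	return out, nil
}

func main() {
	input := []byte("# comment line\n" +
		"option1: value1\n" +
		"option2: value2\n" +
		"\n" +
		"# another comment line\n" +
		"option3: value3\n")

	out, err := expandAll(`(?m)(?P<key>\w+):\s+(?P<value>\w+)$`, "$key=$value\n", input)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```

The pattern and template are the same as the `arg` and `value` fields in the config once the YAML escapes are resolved.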
