Verify that the reduce transform can handle various multi-line scenarios #4574

Closed
binarylogic opened this issue Oct 14, 2020 · 4 comments · Fixed by #4771
Assignees
Labels
transform: reduce Anything `reduce` transform related type: enhancement A value-adding code change that enhances its existing functionality.

Comments

@binarylogic
Contributor

binarylogic commented Oct 14, 2020

This is a meta issue to ensure the upcoming reduce transform changes will solve a variety of multiline scenarios.

Scenarios

1. Ruby Exceptions

Log stream:

Started GET "/" for 127.0.0.1 at 2012-03-10 14:28:14 +0100
foobar.rb:6:in `/': divided by 0 (ZeroDivisionError)
  from foobar.rb:6:in `bar'
  from foobar.rb:2:in `foo'
  from foobar.rb:9:in `<main>'
Started GET "/" for 127.0.0.1 at 2012-03-10 14:28:14 +0100

Lazy solution:

[transforms.reduce]
type = "reduce"
starts_with = 'match(.message, /^\w/)'
merge_strategies.default = "discard"
merge_strategies.fields.message = "concat"
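
The intended grouping can be sketched in Python, assuming a new event begins on any line that starts with a word character and every other line is appended to the current event. `group_by_start` is a hypothetical helper written for this illustration; it is not Vector's implementation.

```python
import re

def group_by_start(lines, starts):
    """Open a new group when starts(line) is true; otherwise append to the current group."""
    groups = []
    for line in lines:
        if starts(line) or not groups:
            groups.append([line])
        else:
            groups[-1].append(line)
    return ["\n".join(group) for group in groups]

stream = [
    'Started GET "/" for 127.0.0.1 at 2012-03-10 14:28:14 +0100',
    "foobar.rb:6:in `/': divided by 0 (ZeroDivisionError)",
    "  from foobar.rb:6:in `bar'",
    "  from foobar.rb:2:in `foo'",
    "  from foobar.rb:9:in `<main>'",
    'Started GET "/" for 127.0.0.1 at 2012-03-10 14:28:14 +0100',
]

# Indented "from ..." lines do not start with a word character,
# so they are folded into the exception line that precedes them.
events = group_by_start(stream, lambda line: re.match(r"^\w", line) is not None)
```

Under this assumption the stream yields three events: the two request lines on their own and the exception with its three backtrace lines merged into one.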

Strict solution (checking both the start and end conditions):

[transforms.reduce]
type = "reduce"
starts_with = 'match(.message, /^[a-z]*\.rb/)'
ends_when = '!match(.message, /^\w/)'
merge_strategies.default = "discard"
merge_strategies.fields.message = "concat"

2. Line Continuations

Log stream:

First-line
Second line\
more second line\
end of second line

Config:

[transforms.reduce]
type = "reduce"
ends_when = '!endsWith(.message, "\\")'
merge_strategies.default = "discard"
merge_strategies.fields.message = "concat"
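
As a rough Python sketch of an end condition, assuming the event that satisfies the condition is itself included in the group it closes: `group_by_end` is a hypothetical helper for illustration only, and flushing an unterminated trailing group is a choice made here, not something the issue specifies.

```python
def group_by_end(lines, ends):
    """Append each line to the current group; close the group when ends(line) is true."""
    groups, current = [], []
    for line in lines:
        current.append(line)
        if ends(line):
            groups.append("\n".join(current))
            current = []
    if current:
        # Assumption: emit whatever is left if the stream stops mid-group.
        groups.append("\n".join(current))
    return groups

stream = [
    "First-line",
    "Second line\\",
    "more second line\\",
    "end of second line",
]

# A line that does not end with a backslash terminates the current event.
events = group_by_end(stream, lambda line: not line.endswith("\\"))
```

Here the stream produces two events: `First-line` alone, and the three continuation lines merged together.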

3. Line Terminations

Log stream:

first line;
second line
more of the second line
end of second line;
third line;

Config:

[transforms.reduce]
type = "reduce"
ends_when = 'endsWith(.message, ";")'
merge_strategies.default = "discard"
merge_strategies.fields.message = "concat"

4. Windows Event Logs

Log stream:

<12> first line 
 more of the first line
<22> second line
<17> third line

Config:

[transforms.reduce]
type = "reduce"
starts_with = 'match(.message, /^<\d\d> /)'
merge_strategies.default = "discard"
merge_strategies.fields.message = "concat"
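
A quick way to see which lines this pattern treats as the start of an event is to run the same regular expression in Python. The last line in the list is a hypothetical addition, not part of the stream above, included to show that a one-digit prefix would not match `\d\d`.

```python
import re

# Same pattern as in the starts_with condition above.
START = re.compile(r"^<\d\d> ")

lines = [
    "<12> first line ",
    " more of the first line",
    "<22> second line",
    "<17> third line",
    "<7> hypothetical one-digit prefix",  # assumption: not in the original stream
]

# True where the line would open a new event, False where it would be merged.
starts = [bool(START.match(line)) for line in lines]
```

If prefixes with one or three digits can occur in practice, a broader pattern such as `^<\d+> ` may be worth considering; that is a suggestion, not something the issue states.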
@binarylogic binarylogic added type: enhancement A value-adding code change that enhances its existing functionality. transform: reduce Anything `reduce` transform related labels Oct 14, 2020
@binarylogic binarylogic added this to the 2020-10-12: Son of Flynn milestone Oct 14, 2020
@MOZGIII
Contributor

MOZGIII commented Oct 14, 2020

While working on the line_agg, I uncovered 4 significantly different scenarios. We can take inspiration for this issue from the line_agg test suite:

https://github.com/timberio/vector/blob/75a0c1b594622f6d2a13c7af3602983f96be8070/src/line_agg.rs#L402-L525

@MOZGIII
Contributor

MOZGIII commented Oct 14, 2020

Also, this PR might be useful as a learning point on some difficulties with the implementation that I encountered: #3262

@MOZGIII
Contributor

MOZGIII commented Oct 16, 2020

Looked through the use cases again, and it's pretty hard to come up with real-life use cases for them off the top of my head...

One of the more important ones is https://github.com/timberio/vector/blob/75a0c1b594622f6d2a13c7af3602983f96be8070/src/line_agg.rs#L570-L642

Thankfully, there is a relevant issue #3237 with some real-world examples.

Other than that - I think we just want to replicate the test cases with the new implementation and make sure they work. Should be enough to, at least, preserve the functionality.

@binarylogic do we plan to support specifying both starts_with and ends_when? We do, I see in the example.

Do we plan to support ends_before/continue_through (which are pretty much the direct and inverted predicates of the same logic)? These would test the next event and, if it matches (or, for the inverted form, does not match), finish the aggregation of the current message without merging the next message in, leaving it as-is.
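
To make the distinction concrete, here is a small Python sketch of the two behaviours as I understand them from the description above. Both function names are hypothetical and only mirror the option names under discussion; neither reflects an actual implementation.

```python
def group_ends_when(lines, pred):
    """The matching line closes the group and is included in it."""
    groups, current = [], []
    for line in lines:
        current.append(line)
        if pred(line):
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups

def group_ends_before(lines, pred):
    """The matching line closes the previous group and is NOT merged into it."""
    groups, current = [], []
    for line in lines:
        if current and pred(line):
            groups.append(current)
            current = []
        current.append(line)
    if current:
        groups.append(current)
    return groups

stream = ["a", " b", "c"]
is_unindented = lambda line: not line.startswith(" ")

when = group_ends_when(stream, is_unindented)
before = group_ends_before(stream, is_unindented)
```

With the same predicate, `group_ends_when` yields `[["a"], [" b", "c"]]` while `group_ends_before` yields `[["a", " b"], ["c"]]`, which is the look-ahead behaviour asked about here.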

@binarylogic
Contributor Author

I'll leave it to @JeanMertz to decide on that. From my understanding, it would allow the user to make the conditions stricter, but I may be wrong.

4 participants