GROK Extractor with OR does not work reliably #4773

Closed
jalogisch opened this Issue May 4, 2018 · 6 comments

@jalogisch
Member

jalogisch commented May 4, 2018

Expected Behavior

When a Grok pattern with an OR, such as `(%{INT:login}|%{WORD:login})`, is used to extract content from a field, it should match one of the alternatives.

Current Behavior

If the second option in the OR is NULL, the field is not created.

screenshot 2018-05-04 17 01 31
screenshot 2018-05-04 17 01 53
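
To make the NULL in the OR concrete, here is a minimal sketch using plain `java.util.regex` rather than Graylog's Grok engine. It assumes that each alternative of the OR is compiled to its own capture group and that both groups feed the same field; the class name, the group names `login0`/`login1`, and the simplified INT/WORD regexes are illustrative assumptions, not taken from Graylog.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OrCaptureSketch {
    // Hypothetical stand-in for (%{INT:login}|%{WORD:login}): each alternative
    // gets its own capture group, and both groups feed the field "login".
    private static final Pattern LOGIN =
            Pattern.compile("(?:(?<login0>[+-]?\\d+)|(?<login1>\\w+))");

    /** Returns the raw value of each alternative; the unmatched one is null. */
    static List<String> captureLogin(String input) {
        List<String> values = new ArrayList<>();
        Matcher m = LOGIN.matcher(input);
        if (m.matches()) {
            values.add(m.group("login0"));
            values.add(m.group("login1"));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(captureLogin("42"));    // [42, null]
        System.out.println(captureLogin("admin")); // [null, admin]
    }
}
```

Whichever alternative does not take part in the match contributes a null, so a consumer that expects a single value per field has to decide what to do with it.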

Steps to Reproduce (for bugs)

  1. Select a field to create a Grok extractor from.
  2. Create a Grok pattern that includes an OR (see the example above).
  3. Enable named_matches_only.
  4. Click "Try" to see what is matched.

Context

With multiple nested patterns it can look as if no working solution is possible: sometimes a match happens and sometimes it does not.

Your Environment

  • Graylog Version: 2.4.3

@joschi joschi added this to the 3.0.0 milestone May 28, 2018

@petracvv


petracvv commented Aug 13, 2018

I am seeing something similar using the pipeline grok() function. ORs do not seem to work as they should.

Would that be the same issue as here?

@kmerz

Member

kmerz commented Aug 14, 2018

I just tested it on 3.0.0 pre-alpha, and the result is different yet again:

screenshot_2018-08-14 graylog - new extractor for input test

@jalogisch

Member Author

jalogisch commented Dec 13, 2018

It looks like the 3.0 release will now behave like other Grok implementations in the wild. This should be verified, but I guess the issue can then be closed as resolved.

@bernd bernd added to-test S labels Jan 7, 2019

@edmundoa

Member

edmundoa commented Feb 4, 2019

I tested this against 3.0.0-rc.1 and it worked fine for me. I'm closing the ticket; @jalogisch, please reopen it if you can reproduce the original issue.

@edmundoa edmundoa closed this Feb 4, 2019

@kmerz kmerz referenced this issue Mar 6, 2019

Merged

Flatten grok pattern result when applied with OR #5749

4 of 9 tasks complete

@kmerz kmerz modified the milestones: 3.0.0, 3.0.1 Mar 6, 2019

@kmerz

Member

kmerz commented Mar 6, 2019

It was decided that the correct behavior is for the return value to contain no null:

So `DELETE` instead of `[DELETE, null]`.
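
The decided behavior can be sketched in a few lines of plain Java. This is only an illustration, assuming the captured values for one field arrive as a list with at most one non-null entry; `FlattenSketch` and `flatten` are hypothetical names, not Graylog or java-grok API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

public class FlattenSketch {
    /**
     * Drops null entries and returns the first remaining value,
     * or null if every alternative was null.
     */
    static String flatten(List<String> captured) {
        return captured.stream()
                .filter(Objects::nonNull)
                .findFirst()
                .orElse(null);
    }

    public static void main(String[] args) {
        System.out.println(flatten(Arrays.asList("DELETE", null))); // DELETE
    }
}
```

The sketch deliberately ignores the case where more than one alternative yields a value.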

@kmerz kmerz reopened this Mar 6, 2019

bernd added a commit that referenced this issue Mar 21, 2019

Add test for issue #4773
This checks that extractors actually use ".captureFlattened()".

@bernd bernd closed this in #5749 Mar 21, 2019

bernd added a commit that referenced this issue Mar 21, 2019

Flatten grok pattern result when applied with OR (#5749)
* Flatten grok pattern result when applied with OR

Prior to this change, a named grok pattern containing an OR like

`(%{INT:login}|%{WORD:login})`

would result in an array like

`[null, found]`

This change flattens the result (removes the null) and
prints only the single remaining value instead of the array-like
syntax:

`found`.

* Add test for issue #4773

This checks that extractors actually use ".captureFlattened()".

@kmerz kmerz referenced this issue Mar 21, 2019

Merged

Flatten grok pattern result when applied with OR #5790

4 of 9 tasks complete

kmerz added a commit that referenced this issue Mar 21, 2019

Flatten grok pattern result when applied with OR (#5749)

bernd added a commit that referenced this issue Mar 21, 2019

Flatten grok pattern result when applied with OR (#5749) (#5790)

@kmerz

Member

kmerz commented Mar 27, 2019

The implemented solution has an unexpected behavior:

```
2019-03-27 11:48:16,049 ERROR: org.graylog2.filters.ExtractorFilter - Could not apply extractor "UFW SHORT" (id=c4b53200-2a20-11e9-91c5-00e18cb9c35a) to message d377ab03-507d-11e9-b2bb-00e18cb9c35a
io.krakens.grok.api.exception.GrokException: key 'IPV4' has multiple non-null values, this is not allowed in flattened mode, values:'192.168.2.108', '239.255.255.250'
    at io.krakens.grok.api.Match.lambda$capture$0(Match.java:175) ~[grok-0.1.9-graylog-1.jar:?]
    at java.util.LinkedHashMap.forEach(LinkedHashMap.java:684) ~[?:1.8.0_191]
    at io.krakens.grok.api.Match.capture(Match.java:134) ~[grok-0.1.9-graylog-1.jar:?]
    at io.krakens.grok.api.Match.captureFlattened(Match.java:109) ~[grok-0.1.9-graylog-1.jar:?]
    at org.graylog2.inputs.extractors.GrokExtractor.run(GrokExtractor.java:94) ~[classes/:?]
    at org.graylog2.plugin.inputs.Extractor.runExtractor(Extractor.java:214) ~[classes/:?]
    at org.graylog2.filters.ExtractorFilter.filter(ExtractorFilter.java:77) [classes/:?]
    at org.graylog2.messageprocessors.MessageFilterChainProcessor.process(MessageFilterChainProcessor.java:100) [classes/:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.handleMessage(ProcessBufferProcessor.java:114) [classes/:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.dispatchMessage(ProcessBufferProcessor.java:100) [classes/:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:77) [classes/:?]
    at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:42) [classes/:?]
    at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [disruptor-3.4.2.jar:?]
    at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [metrics-core-4.0.3.jar:4.0.3]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_191]
```

We might need to implement our own flattening mechanism.
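
One possible shape for such a mechanism, sketched in plain Java: drop nulls per key, unwrap a single remaining value, and keep several remaining values as a list instead of throwing. This is an assumption about what a tolerant flatten could look like, not necessarily what the upgraded java-grok version does; `LenientFlattenSketch` and `flattenLeniently` are hypothetical names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

public class LenientFlattenSketch {
    /**
     * Flattens each key: nulls are dropped, a single remaining value is
     * unwrapped, and several remaining values are kept as a list instead
     * of raising an error. Keys with no non-null value are omitted.
     */
    static Map<String, Object> flattenLeniently(Map<String, List<String>> captures) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : captures.entrySet()) {
            List<String> nonNull = entry.getValue().stream()
                    .filter(Objects::nonNull)
                    .collect(Collectors.toList());
            if (nonNull.size() == 1) {
                result.put(entry.getKey(), nonNull.get(0));
            } else if (nonNull.size() > 1) {
                result.put(entry.getKey(), nonNull);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> captures = new LinkedHashMap<>();
        captures.put("IPV4", Arrays.asList("192.168.2.108", "239.255.255.250"));
        captures.put("login", Arrays.asList(null, "found"));
        // {IPV4=[192.168.2.108, 239.255.255.250], login=found}
        System.out.println(flattenLeniently(captures));
    }
}
```

With this approach, a pattern that legitimately matches the same key twice, like the two IPV4 values in the log above, still produces a field, while the OR case collapses to a single value.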

@kmerz kmerz reopened this Mar 27, 2019

kmerz added a commit that referenced this issue Mar 28, 2019

Upgrade to fixed version of java grok
and add an additional test case

Fixes #4773

@bernd bernd closed this in d41e60f Mar 28, 2019

bernd added a commit that referenced this issue Mar 28, 2019

Fix problem with captureFlattened (#5818)
* Add test for flatten with multiple non-null-values

* Upgrade to fixed version of java grok

and add an additional test case

Fixes #4773

(cherry picked from commit d41e60f)

edmundoa added a commit that referenced this issue Mar 29, 2019

Fix problem with captureFlattened (#5818) (#5820)