filters/clone/conditionals bug #1548

Closed
colinsurprenant opened this Issue Jul 17, 2014 · 14 comments

Comments

@colinsurprenant
Contributor

colinsurprenant commented Jul 17, 2014

I am not sure what the source of this issue is, but right now my brain is fried so I'll post it here for others to look at, and when my brain is unfried I'll work on it.

With this configuration:

input { stdin { } }

filter {
  clone { clones => ["clone"] }

  if [type] == "clone" {
    mutate { add_field => { "clone" => "true" } }
  } else {
    mutate { add_field => { "original" => "true" } }
  }
}

output { stdout { codec => "rubydebug" } }

Type any input, "foo" for example:

{
       "message" => "foo",
      "@version" => "1",
    "@timestamp" => "2014-07-17T23:12:29.536Z",
          "host" => "colin-mbp13r",
      "original" => "true"
}
{
       "message" => "foo",
      "@version" => "1",
    "@timestamp" => "2014-07-17T23:12:29.536Z",
          "host" => "colin-mbp13r",
          "type" => "clone",
      "original" => "true"
}

Clearly, the 2nd event, the clone, did not correctly go through the if [type] == "clone" conditional.

@jordansissel
Contributor
jordansissel commented Jul 17, 2014

I swear there was a patch floating around for this, hmm..

@colinsurprenant
Contributor
colinsurprenant commented Jul 17, 2014

I originally thought I had introduced that bug in PR #1545 while playing with the filters compiler & stuff, but no, it's unrelated.

@samuraraujo
samuraraujo commented Jul 28, 2014

I have also noticed that conditionals do not work on events that are cloned and consumed by pipeline.filter_flusher.

I do not have an example, but in general, in many of the cases I tested, cloned events and conditionals do not work together.

Any patch for this?

@colinsurprenant
Contributor
colinsurprenant commented Jul 29, 2014

Not yet, I did not get back to it...

@wiibaa
Contributor

wiibaa commented Jul 29, 2014

@colinsurprenant it's a bug since 1.2.x, so it was surely not introduced by you... If you need more context, please check LOGSTASH-1477.
Maybe your change in #1545 already covers this issue...

@jpyth

jpyth commented Jul 29, 2014
It seems like any event created by a filter can cause this; all subsequent conditionals are ignored.

This conf:

[jpyth@logstash bin]# cat bug-merge.conf 
input {
  file {
    type => bug
    path => "/tmp/bugData.log"
  }
}
filter {

  if [type] == "bug" {
    if "a" == [message] {
      noop {
        add_tag => "inA"
      }
      merge {
        order => "first"
        key => "a"
        add_tag => "aMerge"
      }
      merge {
        order => "last"
        key => "a"
        add_tag => "aMerge"
      }
    }
    if "b" == [message] {
      noop {
        add_tag => "inB"
      }
    }
    if "c" == [message] {
      noop {
        add_tag => "inC"
      }
    }
    if "d" == [message] {
      noop {
        add_tag => "inD"
      }
    }

  }

}
output {
  stdout {
    codec => rubydebug
  }
}

Dumping this data into /tmp/bugData.log

[jpyth@logstash ]# cat bugData 
a
a
[jpyth@logstash]# 

Resulted in:

Sending logstash logs to /tmp/debug.log.
{
       "message" => "a",
      "@version" => "1",
    "@timestamp" => "2014-07-29T12:55:18.513Z",
          "type" => "bug",
          "host" => "logstashHost.overthere",
          "path" => "/tmp/bugData.log",
          "tags" => [
        [0] "inA",
        [1] "aMerge"
    ]
}
{
       "message" => "a",
      "@version" => "1",
    "@timestamp" => "2014-07-29T12:55:18.514Z",
          "type" => "bug",
          "host" => "logstashHost.overthere",
          "path" => "/tmp/bugData.log",
          "tags" => [
        [0] "inA",
        [1] "aMerge"
    ]
}
{
            "@version" => "1",
          "@timestamp" => "2014-07-29T12:55:18.513Z",
              "source" => "logstashHost.overthere",
           "merge.key" => "a",
    "merge.time_delta" => 0.0,
             "message" => "a",
                "type" => "bug",
                "host" => "logstashHost.overthere",
                "path" => "/tmp/bugData.log",
                "tags" => [
        [0] "inA",
        [1] "aMerge",
        [2] "inB",
        [3] "inC",
        [4] "inD"
    ]
}
{
            "@version" => "1",
          "@timestamp" => "2014-07-29T12:55:18.514Z",
              "source" => "logstashHost.overthere",
           "merge.key" => "a",
    "merge.time_delta" => 0.0,
             "message" => "a",
                "type" => "bug",
                "host" => "logstashHost.overthere",
                "path" => "/tmp/bugData.log",
                "tags" => [
        [0] "inA",
        [1] "aMerge",
        [2] "inB",
        [3] "inC",
        [4] "inD"
    ]
}
[jpyth@logstash]# 

Same result doing it with the metric plugin, and another one I wrote, 'alert'. The events are created by the filter and picked up by 'flush'. I initially used event.append to merge two events, but then switched to just creating each field from the original events. Same end result.

I had read in the Jira that someone stated the conditionals were honoring the base event, but that is not what I am seeing. They are just being ignored altogether on the created event.

Found a comment from October 2013 suggesting using the deprecated type/tags conditionals. Confirmed these work. :)
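For reference, the deprecated per-plugin conditionals mentioned above would look roughly like this for the clone case from the original report. This is only a sketch of the pre-1.2-style type option on filters, not tested against this exact version:

```
filter {
  clone { clones => ["clone"] }

  # Deprecated per-plugin conditional: the filter itself checks the
  # event's type field, so it is honored even for filter-generated
  # events that skip the compiled if/else conditionals.
  mutate {
    type      => "clone"
    add_field => { "clone" => "true" }
  }
}
```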


@colinsurprenant
Contributor

colinsurprenant commented Aug 8, 2014

This is now fixed in #1545.
I'd really appreciate it if any of you could run tests on it!

jordansissel added a commit that referenced this issue Sep 5, 2014

Add recurse method for doing breadth-first traversal of the AST
This will be used by the filter flush compiler

Add generation of a flush lambda for each filter.

This allows filters to flush and have any generated events proceed
downward through the config as you would expect, respecting any
branches or future plugins.

Die on IOError which occurs when reading from a closed STDIN

Make filter_flusher invoke the new (and correct, I hope!) way to flush.

- On shutdown, we will also flush all filters.
- The flusher thread will terminate if we are shutting down.

Clarify the comment

Fix comment generation in the code to avoid newlines.

Add 'max_age' setting to multiline for flushing.

This setting chooses how long (in seconds) an event is considered to be
fresh before it will be automatically flushed.

This is useful for:
* slow log sources to get the 'last event' flushed,
* transaction-id-style events that have no obvious "end" event and also
  are mixed among other-id events in the same stream.

Also:
- Make filters have no teardown by default.
- Remove 'enable_flush' since it is not needed anymore; flush is always
  enabled.

refactor flush

new spool filter and specs, mainly for testing flushing

turn off unrelated test error for now

fix the flush logic, fix the flush compile code to not include the output section

synchronize cross-thread access to @pending

refactor for performance and readability

synchronize cross-thread access to @spool

unused code

input:udp removed bogus ShutdownSignal handling, morphed loop do into while true, cosmetic reformat

use transient events and not exceptions for in-flow pipeline signaling

inline flushing into filterworker

removed now unnecessary flushing thread safety

fix conditionals bug for new events generated by filters & specs

spec for issue #793

performance tweaks

simplify filter handling of events and new_events

this removes unnecessary duplication when treating the original event as
a special case (different from new_events generated by a filter).
Also, since @filter_func only outputs non-cancelled events, some checks
were also removed.

Move multiple filter specs to a filter_chains file

append events generated by a filter using unshift instead of insert

closes #793, closes #1429, closes #1431, closes #1548

@colinsurprenant colinsurprenant modified the milestones: v1.5.0, 1.4.3 Nov 6, 2014

@karsaroth

karsaroth commented Nov 13, 2014

I'm seeing a similar problem when using the elapsed filter (from contrib), and then trying to mutate the new event using a conditional. Version 1.4.2. Is this fix likely to solve that problem too?

@colinsurprenant
Contributor

Contributor

colinsurprenant commented Nov 13, 2014

Looking at the elapsed filter, I'd say it will, but I can't guarantee it without a config + sample data to reproduce.

@mchlsun

mchlsun commented Nov 17, 2014

Same problem as karsaroth. When elapsed creates a new elapsed_expired event, it goes through all the branches of the [if...else if...else...] conditional.
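The pattern being described can be sketched as follows. The elapsed options shown are from the contrib plugin's documented settings; the "elapsed_expired" tag name follows the comment above and should be verified against the plugin version in use:

```
filter {
  elapsed {
    start_tag       => "taskStarted"
    end_tag         => "taskTerminated"
    unique_id_field => "task_id"
    timeout         => 60
  }

  # The event generated on expiry is the one the conditional
  # below failed to match before the fix.
  if "elapsed_expired" in [tags] {
    mutate { add_field => { "expired" => "true" } }
  }
}
```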

ajchalk added a commit to catalyst/logstash that referenced this issue Dec 2, 2014

(Same commit series as above.)

Conflicts:
	spec/conditionals/test.rb
@oryband

oryband commented Feb 19, 2015

This bug still occurs with clone and mutate, in that order, in 1.4.2:

if some condition {
  clone {
    clones => "cloned"
  }
}

if [type] == "cloned" {
  mutate { .. }  # doesn't happen
}

@tbragin tbragin added the v1.5.0 label Jun 18, 2015

@driverpt

driverpt commented Apr 20, 2016

@colinsurprenant Has this bug been fixed (2.3.1)?

@mjscosta

mjscosta commented Apr 20, 2016

I think it has been solved! But the documentation is not clear about the semantics of the clone behavior:
whether the clone is reinserted into the queue and reprocessed by the whole filter pipeline, or whether it is added to the batch of events being filtered, so that only the transformations/conditionals following the clone configuration ("call") apply ...

@jordansissel
Contributor

jordansissel commented Apr 20, 2016

@mjscosta Indeed it's been solved. Good note about the documentation not being clear - can you file a new ticket about getting that documented?

To answer your question, your last description of the behavior is correct -- this is right:

> it is added to the batch of events being filtered, and will only apply the transformations/conditionals following the clone configuration ("call") ...
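To illustrate that semantics (the field names here are purely illustrative): a field added before the clone is copied into the cloned event, while filters after the clone run on both the original and the clone within the same filtered batch:

```
filter {
  # Runs before clone: the cloned event inherits this field.
  mutate { add_field => { "before_clone" => "true" } }

  clone { clones => ["clone"] }

  # Runs after clone: applies to the clone as part of the same
  # batch; the clone is not re-inserted into the input queue.
  if [type] == "clone" {
    mutate { add_field => { "after_clone" => "true" } }
  }
}
```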
