
filters/clone/conditionals bug #1548

Closed
colinsurprenant opened this issue Jul 17, 2014 · 14 comments
@colinsurprenant colinsurprenant commented Jul 17, 2014

I am not sure what the source of this issue is, but right now my brain is fried so I'll post it here for others to look at, and when my brain is unfried I'll work on it.

With this configuration:

input { stdin { } }

filter {
  clone { clones => ["clone"] }

  if [type] == "clone" {
    mutate { add_field => { "clone" => "true" } }
  } else {
    mutate { add_field => { "original" => "true" } }
  }
}

output { stdout { codec => "rubydebug" } }

Type any input, "foo" for example, and you get:

{
       "message" => "foo",
      "@version" => "1",
    "@timestamp" => "2014-07-17T23:12:29.536Z",
          "host" => "colin-mbp13r",
      "original" => "true"
}
{
       "message" => "foo",
      "@version" => "1",
    "@timestamp" => "2014-07-17T23:12:29.536Z",
          "host" => "colin-mbp13r",
          "type" => "clone",
      "original" => "true"
}

Clearly, the 2nd event, the clone, did not correctly go through the if [type] == "clone" conditional.
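For reference, had the conditional worked, the cloned event would be expected to take the if branch and get the clone field rather than original, roughly like this (expected output sketched from the config above; field order illustrative):

{
       "message" => "foo",
      "@version" => "1",
    "@timestamp" => "2014-07-17T23:12:29.536Z",
          "host" => "colin-mbp13r",
          "type" => "clone",
         "clone" => "true"
}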


@jordansissel jordansissel commented Jul 17, 2014

I swear there was a patch floating around for this, hmm..


@colinsurprenant colinsurprenant commented Jul 17, 2014

I originally thought I introduced that bug in PR #1545 while playing with the filters compiler & stuff but no, unrelated.


@samuraraujo samuraraujo commented Jul 28, 2014

I also noticed that conditionals do not work on events cloned and consumed by pipeline.filter_flusher.

I do not have an example, but in general, in many of the cases I tested, cloned events and conditionals did not work together.

Any patch for this?


@colinsurprenant colinsurprenant commented Jul 29, 2014

not yet. did not get back to it...


@wiibaa wiibaa commented Jul 29, 2014

@colinsurprenant it's a bug since 1.2.x, so surely not introduced by you... If you need some more context please check LOGSTASH-1477
Maybe your change in #1545 already covered this issue...


@jpyth jpyth commented Jul 29, 2014

It seems like any event created by a filter can cause this: all subsequent conditionals are ignored.

This conf:

[jpyth@logstash bin]# cat bug-merge.conf 
input {
  file {
    type => bug
    path => "/tmp/bugData.log"
  }
}
filter {

  if [type] == "bug" {
    if "a" == [message] {
      noop {
        add_tag => "inA"
      }
      merge {
        order => "first"
        key => "a"
        add_tag => "aMerge"
      }
      merge {
        order => "last"
        key => "a"
        add_tag => "aMerge"
      }
    }
    if "b" == [message] {
      noop {
        add_tag => "inB"
      }
    }
    if "c" == [message] {
      noop {
        add_tag => "inC"
      }
    }
    if "d" == [message] {
      noop {
        add_tag => "inD"
      }
    }

  }

}
output {
  stdout {
    codec => rubydebug
  }
}

Dumping this data into /tmp/bugData.log

[jpyth@logstash ]# cat bugData 
a
a
[jpyth@logstash]# 

Resulted in:

Sending logstash logs to /tmp/debug.log.
{
       "message" => "a",
      "@version" => "1",
    "@timestamp" => "2014-07-29T12:55:18.513Z",
          "type" => "bug",
          "host" => "logstashHost.overthere",
          "path" => "/tmp/bugData.log",
          "tags" => [
        [0] "inA",
        [1] "aMerge"
    ]
}
{
       "message" => "a",
      "@version" => "1",
    "@timestamp" => "2014-07-29T12:55:18.514Z",
          "type" => "bug",
          "host" => "logstashHost.overthere",
          "path" => "/tmp/bugData.log",
          "tags" => [
        [0] "inA",
        [1] "aMerge"
    ]
}
{
            "@version" => "1",
          "@timestamp" => "2014-07-29T12:55:18.513Z",
              "source" => "logstashHost.overthere",
           "merge.key" => "a",
    "merge.time_delta" => 0.0,
             "message" => "a",
                "type" => "bug",
                "host" => "logstashHost.overthere",
                "path" => "/tmp/bugData.log",
                "tags" => [
        [0] "inA",
        [1] "aMerge",
        [2] "inB",
        [3] "inC",
        [4] "inD"
    ]
}
{
            "@version" => "1",
          "@timestamp" => "2014-07-29T12:55:18.514Z",
              "source" => "logstashHost.overthere",
           "merge.key" => "a",
    "merge.time_delta" => 0.0,
             "message" => "a",
                "type" => "bug",
                "host" => "logstashHost.overthere",
                "path" => "/tmp/bugData.log",
                "tags" => [
        [0] "inA",
        [1] "aMerge",
        [2] "inB",
        [3] "inC",
        [4] "inD"
    ]
}
[jpyth@logstash]# 

Same result doing it with the metric plugin, and another one I wrote, 'alert'. The events are created by the filter and picked up by 'flush'. I initially used event.append to merge two events but then switched to just creating each field from the original events. Same end result.

I had read in the Jira that someone stated the conditionals were honoring the base event, but that is not what I am seeing. They are just being ignored altogether on the created event.

Found a comment from October 2013 suggesting the deprecated per-filter type/tags conditionals. Confirmed these work. :)
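For context, a minimal sketch of that deprecated per-filter style applied to the clone example from the top of this thread (assuming the legacy type option that 1.x filters accepted before if/else conditionals replaced it):

filter {
  clone { clones => ["clone"] }

  # deprecated style: the filter itself matches on type,
  # instead of wrapping it in an if [type] == "clone" block
  mutate {
    type      => "clone"
    add_field => { "clone" => "true" }
  }
}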


@colinsurprenant colinsurprenant commented Aug 8, 2014

This is now fixed in #1545.
I'd really appreciate it if any of you could run tests on it!

jordansissel added a commit that referenced this issue Sep 5, 2014
This will be used by the filter flush compiler

Add generation of a flush lambda for each filter.

This allows filters to flush and have any generated events proceed
downward through the config as you would expect, respecting any
branches or future plugins.

Die on IOError which occurs when reading from a closed STDIN

Make filter_flusher invoke the new (and correct, I hope!) way to flush.

- On shutdown, we will also flush all filters.
- The flusher thread will terminate if we are shutting down.

Clarify the comment

Fix comment generation in the code to avoid newlines.

Add 'max_age' setting to multiline for flushing.

This setting chooses how long (in seconds) an event is considered to be
fresh before it will be automatically flushed.

This is useful for:
* slow log sources to get the 'last event' flushed,
* transaction-id-style events that have no obvious "end" event and also
  are mixed among other-id events in the same stream.

Also:
- Make filters have no teardown by default.
- Remove 'enable_flush' since it is not needed anymore; flush is always
  enabled.

refactor flush

new spool filter and specs, mainly for testing flushing

turn off unrelated test error for now

fix the flush logic, fix the flush compiled code to not include the output section

synchronize cross-thread  access to @pending

refactor for performance and readability

synchronize cross-thread access to @spool

unused code

input:udp: removed bogus ShutdownSignal handling, morphed loop do into while true, cosmetic reformat

use transient events and not exceptions for in-flow pipeline signaling

inline flushing into  filterworker

removed now unnecessary flushing thread safety

fix conditionals bug for new events generated by filters & specs

spec for issue #793

performance tweaks

simplify filter handling of events and new_events

this removes unnecessary duplication when treating the original event as
a special case (different from new_events generated by a filter).
Also, since @filter_func only outputs non-cancelled events, some checks
were also removed.

Move multiple filter specs to a filter_chains file

append events generated by a filter using unshift instead of insert

closes #793, closes #1429, closes #1431, closes #1548
@colinsurprenant colinsurprenant modified the milestones: v1.5.0, 1.4.3 Nov 6, 2014

@karsaroth karsaroth commented Nov 13, 2014

I'm seeing a similar problem when using the elapsed filter (from contrib), and then trying to mutate the new event using a conditional. Version 1.4.2. Is this fix likely to solve that problem too?


@colinsurprenant colinsurprenant commented Nov 13, 2014

Looking at the elapsed filter I'd say it will, but I can't guarantee it without a config + sample data to reproduce.


@mchlsun mchlsun commented Nov 17, 2014

Same problem as karsaroth. When elapsed creates a new elapsed_expired event, it goes through all the conditional code in [if... else if... else...].
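For illustration, the pattern that fails here is the usual tag-based guard on the generated event, e.g. (a sketch assuming the elapsed filter's default elapsed_expired tag and its start_tag/end_tag/unique_id_field options):

filter {
  elapsed {
    start_tag       => "taskStarted"
    end_tag         => "taskEnded"
    unique_id_field => "task_id"
  }

  # pre-fix, this branch is skipped for the generated expired events
  if "elapsed_expired" in [tags] {
    mutate { add_field => { "expired" => "true" } }
  }
}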

ajchalk added a commit to catalyst/logstash that referenced this issue Dec 2, 2014

@oryband oryband commented Feb 19, 2015

This bug still occurs with clone and mutate in that order in 1.4.2:

if some condition {
  clone {
    clones => "cloned"
  }
}

if [type] == "cloned" {
  mutate { .. }  # doesn't happen
}
@tbragin tbragin added the v1.5.0 label Jun 18, 2015

@driverpt driverpt commented Apr 20, 2016

@colinsurprenant Has this bug been fixed (2.3.1) ?


@mjscosta mjscosta commented Apr 20, 2016

I think it has been solved! But the documentation is not clear about the semantics of the clone behavior:
whether the clone is reinserted into the queue and reprocessed by the whole filter pipeline, or whether it is added to the batch of events being filtered, so that only the transformations/conditionals following the clone configuration ("call") apply...


@jordansissel jordansissel commented Apr 20, 2016

@mjscosta Indeed it's been solved. Good note about the documentation not being clear - can you file a new ticket about getting that documented?

To answer your question, your last detail is correct in describing the behavior -- this is right:

if it is added the the batch of events being filtered, and will only apply, the transformations/conditionals following the clone configuration ("call") ...
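In other words, because the clone is only appended to the current filter batch, filters placed before clone run once (the clone inherits their result), while filters placed after it run separately on both events. A minimal sketch of this ordering (hypothetical fields for illustration):

filter {
  # runs once, before the clone exists; the clone inherits this field
  mutate { add_field => { "before" => "true" } }

  clone { clones => ["clone"] }

  # runs on the original and on the clone independently
  mutate { add_field => { "after" => "true" } }
}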
