New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logstash 2.3.0 - Can not reliably create tests for my grok patterns #6000

Closed
emmanuellyautomated opened this Issue Oct 5, 2016 · 10 comments

Comments

Projects
None yet
3 participants
@emmanuellyautomated

emmanuellyautomated commented Oct 5, 2016

I am currently using a custom jruby-1.7.25 install and the $USE_RUBY flag is set to 1. Strangely enough, even with the # encoding: utf-8 "magic" at the top of my spec.rb, I am still having the exact same issue as referenced https://github.com/elastic/logstash/issues/3448.

Failures:

  1) Test event "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed
     Failure/Error: insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"

     Insist::Failure:
       Expected "2015-21-01 12:01:02.003 UTC", but got nil
     # /vagrant/logstash_test_stuff/test_spec.rb:83:in `(root)'

Finished in 0.534 seconds (files took 5.99 seconds to load)
1 example, 1 failure

Failed examples:

rspec /vagrant/logstash_test_stuff/test_spec.rb:72 # Test event "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed

Randomized with seed 38731
@untergeek

This comment has been minimized.

Show comment
Hide comment
@untergeek

untergeek Oct 5, 2016

Member

Logstash 1.5 is technically not being worked on any more. Is there a reason you can't use 2.4?

Member

untergeek commented Oct 5, 2016

Logstash 1.5 is technically not being worked on any more. Is there a reason you can't use 2.4?

@emmanuellyautomated emmanuellyautomated changed the title from Logstash 1.5.0 - Can not reliably create tests for my grok patterns to Logstash 2.3.0 - Can not reliably create tests for my grok patterns Oct 5, 2016

@emmanuellyautomated

This comment has been minimized.

Show comment
Hide comment
@emmanuellyautomated

emmanuellyautomated Oct 5, 2016

My bad, @untergeek. I was actually using logstash v.2.3.0. I've updated the title of the issue.

emmanuellyautomated commented Oct 5, 2016

My bad, @untergeek. I was actually using logstash v.2.3.0. I've updated the title of the issue.

@untergeek

This comment has been minimized.

Show comment
Hide comment
@untergeek

untergeek Oct 5, 2016

Member

So what does reliably mean in this context? Does it work sometimes and not others? To me, reliably would indicate that it always passes or always fails.

Member

untergeek commented Oct 5, 2016

So what does reliably mean in this context? Does it work sometimes and not others? To me, reliably would indicate that it always passes or always fails.

@emmanuellyautomated

This comment has been minimized.

Show comment
Hide comment
@emmanuellyautomated

emmanuellyautomated Oct 5, 2016

@untergeek:

that this test (below) works:

describe "simple test" do

  config <<-CONFIG
    filter {
      grok {
         match => { "message" => "am%{GREEDYDATA:foo}" }
      }
    }
  CONFIG

  sample 'amessage' do
    insist { subject["foo"] } == "essage"
    insist { subject["message"] } == "amessage"
  end
end

...but this one does not:

describe "intermediate test" do

  config <<-CONFIG
    filter {
      grok {
            match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:word1} %{INT:int1:int} %{NUMBER:[inner][float1]:float}" ]
            tag_on_failure => [ "_grokparsefailure_test" ]
      }
    }
  CONFIG

  message = '2015-21-01 12:01:02.003 UTC Hello 42 3.14159'

  sample("message" => message, "type" => "test") do
    insist { subject["type"] } == "test"
    insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"
    insist { subject["word1"] } == "Hello"
    insist { subject["int1"] } == 42
    insist { subject["inner"]["float1"] } == 3.14159
  end
end

emmanuellyautomated commented Oct 5, 2016

@untergeek:

that this test (below) works:

describe "simple test" do

  config <<-CONFIG
    filter {
      grok {
         match => { "message" => "am%{GREEDYDATA:foo}" }
      }
    }
  CONFIG

  sample 'amessage' do
    insist { subject["foo"] } == "essage"
    insist { subject["message"] } == "amessage"
  end
end

...but this one does not:

describe "intermediate test" do

  config <<-CONFIG
    filter {
      grok {
            match => [ "message", "%{TIMESTAMP_ISO8601:timestamp} %{WORD:word1} %{INT:int1:int} %{NUMBER:[inner][float1]:float}" ]
            tag_on_failure => [ "_grokparsefailure_test" ]
      }
    }
  CONFIG

  message = '2015-21-01 12:01:02.003 UTC Hello 42 3.14159'

  sample("message" => message, "type" => "test") do
    insist { subject["type"] } == "test"
    insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"
    insist { subject["word1"] } == "Hello"
    insist { subject["int1"] } == 42
    insist { subject["inner"]["float1"] } == 3.14159
  end
end
@emmanuellyautomated

This comment has been minimized.

Show comment
Hide comment
@emmanuellyautomated

emmanuellyautomated Oct 5, 2016

@untergeek: my gut is telling me this has something to do with access to patterns. Strange though because the top of my spec.rb looks like -->

require "spec_helper"
require "logstash/filters/grok"
require "logstash/patterns/core"

emmanuellyautomated commented Oct 5, 2016

@untergeek: my gut is telling me this has something to do with access to patterns. Strange though because the top of my spec.rb looks like -->

require "spec_helper"
require "logstash/filters/grok"
require "logstash/patterns/core"
@untergeek

This comment has been minimized.

Show comment
Hide comment
@untergeek

untergeek Oct 5, 2016

Member

What's the failure in the second test?

Member

untergeek commented Oct 5, 2016

What's the failure in the second test?

@emmanuellyautomated

This comment has been minimized.

Show comment
Hide comment
@emmanuellyautomated

emmanuellyautomated Oct 5, 2016

@untergeek, the failure is as follows:

Failures:

  1) intermediate test "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed
     Failure/Error: insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"

     Insist::Failure:
       Expected "2015-21-01 12:01:02.003 UTC", but got nil
     # ./spec/test_spec.rb:63:in `(root)'

Finished in 0.33 seconds (files took 6.7 seconds to load)
1 example, 1 failure

...no match seems to have been made.

emmanuellyautomated commented Oct 5, 2016

@untergeek, the failure is as follows:

Failures:

  1) intermediate test "{"message":"2015-21-01 12:01:02.003 UTC Hello 42 3...." when processed
     Failure/Error: insist { subject["timestamp"] } == "2015-21-01 12:01:02.003 UTC"

     Insist::Failure:
       Expected "2015-21-01 12:01:02.003 UTC", but got nil
     # ./spec/test_spec.rb:63:in `(root)'

Finished in 0.33 seconds (files took 6.7 seconds to load)
1 example, 1 failure

...no match seems to have been made.

@untergeek

This comment has been minimized.

Show comment
Hide comment
@untergeek

untergeek Oct 5, 2016

Member

We're moving away from the RSpec 2.0 insist terminology, to using RSpec 3's expect. You should look at how we're doing RSpec 3.0 tests in other plugins and such and go with that, mostly because we may pull the plug on RSpec 2.0 usage at any time. Here's an example using expect from the logstash-core-patterns repository: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/spec/patterns/httpd_spec.rb

That said, I think the problem is that UTC is not a valid timezone for ISO8601.
From: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns#L69-L71

ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?

So, an appropriate ISO8601_TIMEZONE can be Z for Zulu time (which is UTC), or it can be -0700, or -07:00, or +1000 or +10:30 or anything like that, but UTC is not valid. See: https://en.wikipedia.org/wiki/ISO_8601#UTC

That seems to be why you're getting nil instead of the expected timestamp.

Member

untergeek commented Oct 5, 2016

We're moving away from the RSpec 2.0 insist terminology, to using RSpec 3's expect. You should look at how we're doing RSpec 3.0 tests in other plugins and such and go with that, mostly because we may pull the plug on RSpec 2.0 usage at any time. Here's an example using expect from the logstash-core-patterns repository: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/spec/patterns/httpd_spec.rb

That said, I think the problem is that UTC is not a valid timezone for ISO8601.
From: https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns#L69-L71

ISO8601_TIMEZONE (?:Z|[+-]%{HOUR}(?::?%{MINUTE}))
ISO8601_SECOND (?:%{SECOND}|60)
TIMESTAMP_ISO8601 %{YEAR}-%{MONTHNUM}-%{MONTHDAY}[T ]%{HOUR}:?%{MINUTE}(?::?%{SECOND})?%{ISO8601_TIMEZONE}?

So, an appropriate ISO8601_TIMEZONE can be Z for Zulu time (which is UTC), or it can be -0700, or -07:00, or +1000 or +10:30 or anything like that, but UTC is not valid. See: https://en.wikipedia.org/wiki/ISO_8601#UTC

That seems to be why you're getting nil instead of the expected timestamp.

@wiibaa

This comment has been minimized.

Show comment
Hide comment
@wiibaa

wiibaa Oct 6, 2016

Contributor

@emmanuellyautomated In addition to the good point made by @untergeek I would recommend when you design test against grok to always test that the tag does not contains the failure tag like (with old syntax) insist { subject["tags"] }.nil?.
It helps to early reconsider both your patterns/regex and your input, once you trust the grok filter enough. Because your input string is 2015-21-01, and ISO8601 is yyyy-mm-dd the regex will already fail here even before the timezone issue :D

Contributor

wiibaa commented Oct 6, 2016

@emmanuellyautomated In addition to the good point made by @untergeek I would recommend when you design test against grok to always test that the tag does not contains the failure tag like (with old syntax) insist { subject["tags"] }.nil?.
It helps to early reconsider both your patterns/regex and your input, once you trust the grok filter enough. Because your input string is 2015-21-01, and ISO8601 is yyyy-mm-dd the regex will already fail here even before the timezone issue :D

@emmanuellyautomated

This comment has been minimized.

Show comment
Hide comment
@emmanuellyautomated

emmanuellyautomated Oct 7, 2016

@untergeek; @wiibaa: I ended up using a test structure that looks like this:

# encoding: utf-8

require "spec_helper"
require "logstash/filters/grok"
require "logstash/patterns/core"


describe "cisco example -- CISCO737016" do
  let(:pattern) { "CISCOFW737016" }

  context "src_ip error message for 737016 can be returned" do
    let(:message) { '<158>Sep 01 2016 22:04:07 10.9.2.253 : %ASA-6-737016: IPAA: Freeing local pool address 172.31.7.19' }

    subject { grok_match(pattern, message) }

    it "should render a hash from matching" do
      expect(subject['src_ip']).to include('172.31.7.19')
    end
  end
end

...turns out that changing to a more explicit testing style fixed the problem. Thanks for the suggestion @untergeek!

emmanuellyautomated commented Oct 7, 2016

@untergeek; @wiibaa: I ended up using a test structure that looks like this:

# encoding: utf-8

require "spec_helper"
require "logstash/filters/grok"
require "logstash/patterns/core"


describe "cisco example -- CISCO737016" do
  let(:pattern) { "CISCOFW737016" }

  context "src_ip error message for 737016 can be returned" do
    let(:message) { '<158>Sep 01 2016 22:04:07 10.9.2.253 : %ASA-6-737016: IPAA: Freeing local pool address 172.31.7.19' }

    subject { grok_match(pattern, message) }

    it "should render a hash from matching" do
      expect(subject['src_ip']).to include('172.31.7.19')
    end
  end
end

...turns out that changing to a more explicit testing style fixed the problem. Thanks for the suggestion @untergeek!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment