Grok pattern doesn't work when there are double quotes at the beginning of the pattern #43

Closed
pranavgaikwad opened this issue Mar 13, 2018 · 4 comments

pranavgaikwad commented Mar 13, 2018

This could be a non-issue, but I am not able to get the following pattern working. Is my grok pattern wrong?

td-agent configuration

<source>
  @type beats
  metadata_as_tag
  format grok
  time_format %d/%b/%Y:%H:%M:%S %z
  grok_failure_key grokfailure
  <grok>
    pattern "%{DATA:side}" \[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{QS:agent} "%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{QS:response_code} %{QS:cache} %{NUMBER:first_byte:float} %{NUMBER:upstream_resp_time:float}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
  port 5044
  bind 0.0.0.0
</source>

When I use the above pattern, I get the error below:

/opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config/basic_parser.rb:92:in `parse_error!': expected end of line at td-agent.conf line 152,26 (Fluent::ConfigParseError)
151:   <grok>
152:     pattern "%{WORD:side}" \[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{QS:agent} "%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{QS:response_code} %{DATA:cache} %{DATA:first_byte:float} %{DATA:upstream_resp_time:float}

     --------------------------^
153:   </grok>
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config/v1_parser.rb:132:in 'parse_element'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config/v1_parser.rb:95:in 'parse_element'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config/v1_parser.rb:95:in 'parse_element'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config/v1_parser.rb:43:in 'parse!'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config/v1_parser.rb:33:in 'parse'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/config.rb:39:in 'parse'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/supervisor.rb:741:in 'read_config'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/supervisor.rb:451:in 'run_supervisor'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/lib/fluent/command/fluentd.rb:310:in '<top (required)>'
    from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in 'require'
    from /opt/td-agent/embedded/lib/ruby/site_ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in 'require'
    from /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.0.2/bin/fluentd:8:in '<top (required)>'
    from /opt/td-agent/embedded/bin/fluentd:22:in 'load'
    from /opt/td-agent/embedded/bin/fluentd:22:in '<top (required)>'
    from /usr/sbin/td-agent:7:in 'load'
    from /usr/sbin/td-agent:7:in '<main>'

But when I replace the pattern with the following, it works as expected:

  <grok>
    pattern %{QS:side} \[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{QS:agent} "%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{QS:response_code} %{QS:cache} %{NUMBER:first_byte:float} %{NUMBER:upstream_resp_time:float}
  </grok>

okkez (Collaborator) commented Mar 13, 2018

This is expected behavior.
See test code.
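
The position in the error message hints at the cause. The annotated fragment below is a sketch of how the v1 config parser appears to read the failing line, inferred from the reported position (line 152, column 26) and the `expected end of line` message; it is an illustration, not a quote from the parser source.

```conf
# A value that begins with a double quote is read as a quoted string literal,
# which ends at the next unescaped double quote:
#
#     pattern "%{WORD:side}" \[%{HTTPDATE:timestamp}\] (rest of pattern)
#             ^            ^
#             opens        closes at column 26; end of line is expected next
#
# A value that begins with any other character is read as an unquoted value
# running to the end of the line, so double quotes later in it are accepted:
#
#     pattern %{QS:side} \[%{HTTPDATE:timestamp}\] (rest of pattern)
```

This would explain why the second `pattern` line, which also contains double quotes further in, parses without complaint.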

pranavgaikwad (Author) commented

Understood, @okkez.
Is there any workaround you would suggest?

okkez (Collaborator) commented Mar 14, 2018

There are some workarounds:

  • Write a custom grok pattern for your log message
  • Fix the grok pattern as you did: start it with %{QS:side}
  • Change the Beats configuration so that the message does not start with a double quote
    • I'm not familiar with Beats, though
  • Fix the log format before Beats processes it, so that it does not start with a double quote
  • Use parser_regexp
  • Use filter_record_transformer to clean up the message so that it does not start with a double quote

I recommend the first or second.
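
The first workaround could look roughly like the sketch below. It assumes the plugin's `custom_pattern_path` parameter is available in the installed version; the file path `/etc/td-agent/patterns/access` and the pattern name `QUOTED_SIDE` are hypothetical names chosen for illustration. Because the `pattern` value no longer begins with a double quote, the config parser should read the whole line as one unquoted value.

```conf
# --- /etc/td-agent/patterns/access (hypothetical custom pattern file) ---
# One definition per line: NAME, whitespace, then the pattern body.
QUOTED_SIDE "%{DATA:side}"

# --- td-agent.conf ---
<source>
  @type beats
  metadata_as_tag
  format grok
  # Assumed parameter: points the grok parser at the custom pattern file above.
  custom_pattern_path /etc/td-agent/patterns/access
  time_format %d/%b/%Y:%H:%M:%S %z
  grok_failure_key grokfailure
  <grok>
    pattern %{QUOTED_SIDE} \[%{HTTPDATE:timestamp}\] %{IPORHOST:clientip} %{QS:agent} "%{WORD:method} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{QS:response_code} %{QS:cache} %{NUMBER:first_byte:float} %{NUMBER:upstream_resp_time:float}
  </grok>
  <grok>
    pattern %{GREEDYDATA:message}
  </grok>
  port 5044
  bind 0.0.0.0
</source>
```

One difference from the `%{QS:side}` variant is worth checking against real log lines: `QS` normally captures the surrounding quotes as part of the value, whereas this definition keeps them outside the `side` field.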

pranavgaikwad (Author) commented

@okkez Agreed. QS is a good choice when it comes to quoted strings.
Thanks a lot.
