fluent-plugin-datacounter

Component

DataCounterOutput

Counts messages whose value in a specified attribute matches any of the specified regexp patterns. It reports:

  • Counts per min/hour/day

  • Counts per second (average every min/hour/day)

  • Percentage of each pattern in total counts of messages

DataCounterOutput emits messages containing the result data, so you can route these messages (tagged 'datacount' by default) to any output you want.

output ex1 (aggregates all inputs):
  {"pattern1_count":20, "pattern1_rate":0.333, "pattern1_percentage":25.0, "pattern2_count":40, "pattern2_rate":0.666, "pattern2_percentage":50.0, "unmatched_count":20, "unmatched_rate":0.333, "unmatched_percentage":25.0}

output ex2 (aggregates per tag, here for the input tag 'test'):
  {"test_pattern1_count":20, "test_pattern1_rate":0.333, "test_pattern1_percentage":25.0, "test_pattern2_count":40, "test_pattern2_rate":0.666, "test_pattern2_percentage":50.0, "test_unmatched_count":20, "test_unmatched_rate":0.333, "test_unmatched_percentage":25.0}
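
The three fields reported per pattern can be derived from the raw counts and the interval length. The following is a minimal sketch, not the plugin's actual code: the helper name `summarize` is hypothetical, and truncating the rate to three decimals is an assumption made so that the numbers agree with ex1 (0.666 rather than 0.667).

```ruby
# Hypothetical helper (not part of the plugin): derive the _count, _rate and
# _percentage fields from integer counts and the interval length in seconds.
def summarize(counts, interval_seconds)
  total = counts.values.sum
  counts.each_with_object({}) do |(name, count), out|
    out["#{name}_count"] = count
    # Average messages per second, truncated to 3 decimals (assumption).
    out["#{name}_rate"] = (count * 1000 / interval_seconds) / 1000.0
    # Share of this pattern in the total number of counted messages.
    out["#{name}_percentage"] = total.zero? ? 0.0 : count * 100.0 / total
  end
end

# The counts from ex1, collected over one minute.
result = summarize({ "pattern1" => 20, "pattern2" => 40, "unmatched" => 20 }, 60)
puts result
```

With the counts from ex1 and a 60-second interval, the sketch yields a rate of 0.333 and a percentage of 25.0 for pattern1.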

The 'input_tag_remove_prefix' option is available if you want to remove a tag prefix from the output field names.
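
To illustrate the effect of prefix removal on field names, here is a rough sketch. It assumes field names are built as tag, pattern name and suffix joined with '_', as ex2 suggests; `strip_prefix` and `field_name` are hypothetical names, and the plugin's real handling of edge cases may differ.

```ruby
# Hypothetical sketch (not the plugin's source): drop a leading tag prefix.
# remove_prefix plays the role of the 'input_tag_remove_prefix' option.
def strip_prefix(tag, remove_prefix)
  return tag if remove_prefix.nil?
  return "" if tag == remove_prefix
  dotted = remove_prefix + "."
  tag.start_with?(dotted) ? tag[dotted.length..-1] : tag
end

# Join the (possibly shortened) tag, pattern name and suffix with '_'.
def field_name(tag, pattern_name, suffix, remove_prefix = nil)
  parts = [strip_prefix(tag, remove_prefix), pattern_name, suffix]
  parts.reject(&:empty?).join("_")
end

puts field_name("accesslog.test", "pattern1", "count")
puts field_name("accesslog.test", "pattern1", "count", "accesslog")
```

Under these assumptions, the tag 'accesslog.test' with the prefix 'accesslog' removed produces 'test_pattern1_count' instead of 'accesslog.test_pattern1_count'.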

Configuration

DataCounterOutput

Count messages whose 'referer' attribute matches 'google.com', 'yahoo.com' or 'facebook.com', aggregated over all matched messages, per minute.

<match accesslog.**>
  type datacounter
  unit minute
  aggregate all
  count_key referer
  # patternX: X(1-20)
  pattern1 google google.com
  pattern2 yahoo  yahoo.com
  pattern3 facebook facebook.com
  # note: the patterns above are unanchored and '.' is unescaped, so they also match 'this-is-facebookXcom.xxxsite.com' ...
</match>
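
The caveat in the last comment can be checked directly in Ruby. This sketch assumes each pattern string is compiled as an ordinary Ruby regular expression and matched anywhere in the attribute value; the escaped variant is only an illustration and is not taken from the plugin.

```ruby
# The pattern as written in the configuration above: '.' matches any character.
loose = Regexp.new("facebook.com")
# Illustrative variant with the dot escaped, so only a literal '.' matches.
escaped = Regexp.new('facebook\.com')

referer = "this-is-facebookXcom.xxxsite.com"

loose_hit = loose.match?(referer)     # true: '.' matched the 'X'
escaped_hit = escaped.match?(referer) # false: there is no literal 'facebook.com'

puts "loose: #{loose_hit}, escaped: #{escaped_hit}"
```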

Or, with more exact match patterns, aggregated per tag (the default, 'aggregate tag'), per hour.

<match accesslog.**>
  type datacounter
  unit hour
  count_key referer
  # patternX: X(1-20)
  pattern1 google ^http://www\.google\.com/.*
  pattern2 yahoo  ^http://www\.yahoo\.com/.*
  pattern3 twitter ^https://twitter\.com/.*
</match>
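
The difference between 'aggregate all' and 'aggregate tag' can be pictured as a choice of counter key. The sketch below is an assumption about the general idea, not the plugin's implementation; `count_by` and the sample tags are hypothetical.

```ruby
# Hypothetical sketch: count (tag, pattern name) pairs either into one shared
# bucket per pattern (:all) or into a separate bucket per tag (:tag).
def count_by(records, aggregate)
  records.each_with_object(Hash.new(0)) do |(tag, pattern_name), counts|
    key = aggregate == :all ? pattern_name : "#{tag}_#{pattern_name}"
    counts[key] += 1
  end
end

# Each record is the input tag plus the name of the pattern that matched.
records = [["web1", "google"], ["web2", "google"], ["web1", "yahoo"]]

all_counts = count_by(records, :all)
tag_counts = count_by(records, :tag)
puts all_counts
puts tag_counts
```

With ':all', both 'google' records land in one counter; with ':tag', each tag keeps its own counters, which is why the per-tag output fields carry the tag as a prefix.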

HTTP status code patterns.

<match accesslog.**>
  type datacounter
  count_interval 1m    # same as 'unit minute' and 'count_interval 60s'
                       # you can also specify '30s', '5m', '2h' ....
  count_key status
  # patternX: X(1-20)
  pattern1 2xx ^2\d\d$
  pattern2 3xx ^3\d\d$
  pattern3 404 ^404$    # we want only 404 counts...
  pattern4 4xx ^4\d\d$  # pattern4 does not count messages already matched by pattern1-3
  pattern5 5xx ^5\d\d$
</match>
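
Two details of this configuration are worth sketching: patterns are tried in order so that a message is counted only once (which is why 404 is listed before 4xx), and interval strings such as '30s', '5m' or '2h' denote a number of seconds. The code below is an illustrative sketch under those assumptions; `classify`, `interval_seconds` and `status_patterns` are hypothetical names, not the plugin's API.

```ruby
# Hypothetical sketch: the patterns from the configuration above, in order.
def status_patterns
  [
    ["2xx", /^2\d\d$/],
    ["3xx", /^3\d\d$/],
    ["404", /^404$/],
    ["4xx", /^4\d\d$/],
    ["5xx", /^5\d\d$/],
  ]
end

# Return the name of the first pattern that matches, or "unmatched".
def classify(value, patterns)
  found = patterns.find { |_name, regexp| regexp.match?(value.to_s) }
  found ? found.first : "unmatched"
end

# Convert an interval such as '30s', '5m' or '2h' into seconds.
def interval_seconds(text)
  m = /\A(\d+)([smh])\z/.match(text)
  raise ArgumentError, "unsupported interval: #{text}" unless m
  m[1].to_i * { "s" => 1, "m" => 60, "h" => 3600 }[m[2]]
end

counts = Hash.new(0)
%w[200 404 403 302 500 404 abc].each do |status|
  counts[classify(status, status_patterns)] += 1
end
puts counts
puts interval_seconds("1m")
```

In this sketch both 404 responses are counted under '404' only, while 403 falls through to '4xx' and the non-numeric value ends up as 'unmatched'.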

TODO

  • consider what to do next

  • patches welcome!

Copyright

Copyright © 2012- TAGOMORI Satoshi (tagomoris)

License

Apache License, Version 2.0