This repository has been archived by the owner on Apr 17, 2018. It is now read-only.

Codec Support #4

Closed
vevo-john-delivuk opened this issue Nov 5, 2015 · 4 comments

Comments

@vevo-john-delivuk

When using CloudWatch Logs and subscriptions, things become considerably easier with JSON-formatted data. Is it possible to have the plugin support codecs, so that it can read in a plain-text log file, parse it with grok, and emit clean JSON via the cloudwatch_logs plugin?

Thanks!

@wanghq
Contributor

wanghq commented Nov 10, 2015

Hi,

Can you use Logstash filters to solve your problem? For example, the json_encode filter: https://www.elastic.co/guide/en/logstash/current/plugins-filters-json_encode.html

  • Use file input plugin to read a file
  • Use grok filter to parse it
  • Use json_encode filter to convert fields to json format
  • Use cloudwatch_logs output plugin to send to CloudWatch Logs service
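
The steps above can be sketched in plain Ruby to show the data flow. This is not Logstash itself: the regular expression is a hypothetical stand-in for a grok pattern, the event is an ordinary Hash, and `to_json_message` is an illustrative name rather than part of any plugin.

```ruby
require "json"

# Hypothetical stand-in for a grok pattern: method, URL path, status code.
LINE_PATTERN = /\A(?<method>[A-Z]+) (?<url>\S+) (?<status>\d{3})\z/

# Mimics the filter stages: parse the raw line into named fields (grok),
# then overwrite "message" with the JSON encoding of those fields
# (json_encode), so an output that reads only "message" ships JSON.
def to_json_message(event)
  match = LINE_PATTERN.match(event["message"])
  return event if match.nil? # leave unparsed lines untouched

  fields = match.named_captures
  event.merge(fields).merge("message" => JSON.generate(fields))
end

event = to_json_message({ "message" => "GET /index.html 200" })
puts event["message"]
```

The point of the sketch is the last step: whatever ends up in the message field is what the output sends, so the JSON encoding has to be written back to that field.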

Regards!

@vevo-john-delivuk
Author

I've looked at this but haven't tried an example. It looks like we would have to run an instance of json_encode for every field we split out with grok. Additionally, it looks like the cwlogs output only reads the message field, so we'd still have to alter that field or we'd keep getting the same source message.

@vevo-john-delivuk
Author

I looked at the implementation of @codec in the stdout and s3 outputs, but because of the buffer, and because I'm not a Ruby expert by any means (not really even a novice), I couldn't get that working properly. For our needs I instead grab the event object and convert it to JSON, similar to the file output. In our experience it has been far easier to use CloudWatch Logs with JSON. I don't want to suggest making this change for everyone, but it really does make life significantly easier, and if you're already set up with grok you're in heaven.

    #:message => event[MESSAGE] })
    :message => event.to_json })
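
For anyone weighing that change, here is a minimal sketch of the difference between the two lines. It uses a plain Hash as a stand-in for the Logstash event (an assumption; the real event class is not loaded here), and the `host` and `path` values are made up for illustration.

```ruby
require "json"

# Plain Hash standing in for a Logstash event. Field names mirror the
# metadata a file input typically attaches; the values are hypothetical.
event = {
  "message"    => "GET /index.html 200",
  "@version"   => "1",
  "@timestamp" => "2015-11-03T20:44:07.254Z",
  "host"       => "web-01",
  "path"       => "/var/log/app.log",
  "status"     => "200"
}

MESSAGE = "message"

raw_payload  = event[MESSAGE] # original line: only the message string
json_payload = event.to_json  # changed line: every field, serialized as JSON

puts raw_payload
puts json_payload
puts JSON.parse(json_payload).keys.inspect
```

Serializing the whole event is the quickest way to get JSON out, but it also carries along every metadata field attached to the event.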

@vevo-john-delivuk
Author

Just some final comments.

I see why you want it to pull from message, as opposed to the whole object: otherwise you end up with a bunch of undesired fields in CloudWatch. I used a combination of json_encode and blood, sweat, and tears.

# Grok to pull the original message apart
grok {
  match => ["message", "%{DATESTAMP:timestamp} %{IPORHOST:clientip} %{QS:hostname} %{IPORHOST:hostip} %{NUMBER:hostport} %{WORD:method} %{URIPATH:url} %{NOTSPACE:querystring} %{NUMBER:status} %{NUMBER:subresponse} %{NUMBER:scstatus} %{NUMBER:timetaken} %{QS:site} (?<authbearer>-|\"[^\"]+\") (?<useragent>-|\"[^\"]+\") %{NOTSPACE:referrer} (?<country-code>-|\"[^\"]+\") (?<true-client-ip>-|\"[^\"]+\") (?<x-forwarded-for>-|\"[^\"]+\") (?<authuser>-|\"[^\"]+\") (?<logonuser>-|\"[^\"]+\") (?<remoteuser>-|\"[^\"]+\")"]
}

# Date to clean that up
date {
  match => ["timestamp", "YY-MM-dd HH:mm:ss.SSSS"]
}

# Wanted to make sure the nested value was correct for date, then we just remove unwanted quotes.
# Additionally, we create a new object to hold all of the values we want to move over.
mutate {
  update => { "timestamp" => "%{@timestamp}" }
  gsub => [
    "hostname", "\"", "",
    "site", "\"", "",
    "useragent", "\"", "",
    "referrer", "\"", "",
    "authuser", "\"", "",
    "logonuser", "\"", "",
    "remoteuser", "\"", "",
    "x-forwarded-for", "\"", "",
    "country-code", "\"", "",
    "true-client-ip", "\"", ""
  ]

  add_field => {
    "jsonmessage[timestamp]" => "%{timestamp}"
    "jsonmessage[clientip]" => "%{clientip}"
    "jsonmessage[hostname]" => "%{hostname}"
    "jsonmessage[hostip]" => "%{hostip}"
    "jsonmessage[hostport]" => "%{hostport}"
    "jsonmessage[method]" => "%{method}"
    "jsonmessage[url]" => "%{url}"
    "jsonmessage[querystring]" => "%{querystring}"
    "jsonmessage[status]" => "%{status}"
    "jsonmessage[timetaken]" => "%{timetaken}"
    "jsonmessage[site]" => "%{site}"
    "jsonmessage[authbearer]" => "%{authbearer}"
    "jsonmessage[useragent]" => "%{useragent}"
    "jsonmessage[referrer]" => "%{referrer}"
    "jsonmessage[authuser]" => "%{authuser}"
    "jsonmessage[logonuser]" => "%{logonuser}"
    "jsonmessage[remoteuser]" => "%{remoteuser}"
    "jsonmessage[headers][true-client-ip]" => "%{true-client-ip}"
    "jsonmessage[headers][x-forwarded-for]" => "%{x-forwarded-for}"
    "jsonmessage[headers][country-code]" => "%{country-code}"
  }
}

# Finally I just encode that, and let the CWLogs output pick it up.
json_encode {
  source => "jsonmessage"
  target => "message"
}

{
    "timestamp": "2015-11-03T20:44:07.254Z",
    "clientip": "OMIT",
    "hostname": "OMIT",
    "hostip": "OMIT",
    "hostport": "OMIT",
    "method": "GET",
    "url": "OMIT",
    "querystring": "token=_OMIT",
    "status": "200",
    "timetaken": "842",
    "site": "OMIT",
    "authbearer": "-",
    "useragent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36",
    "referrer": "OMIT",
    "authuser": "OMIT",
    "logonuser": "OMIT",
    "remoteuser": "OMIT",
    "headers": {
        "true-client-ip": "-",
        "x-forwarded-for": "OMIT",
        "country-code": "US"
    }
}
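
To make the nesting in that output easier to follow, here is a rough Ruby model of how bracketed field references such as `jsonmessage[headers][country-code]` map onto a nested object before it is encoded. It is only an illustration, not Logstash's implementation; `field_path` and `set_nested` are hypothetical helper names and the event is a plain Hash.

```ruby
require "json"

# Splits a Logstash-style field reference such as
# "jsonmessage[headers][country-code]" into its path segments.
def field_path(reference)
  reference.scan(/[^\[\]]+/)
end

# Writes value at the nested path, creating intermediate hashes as needed.
def set_nested(event, reference, value)
  *parents, leaf = field_path(reference)
  target = parents.reduce(event) { |node, key| node[key] ||= {} }
  target[leaf] = value
  event
end

event = {}
set_nested(event, "jsonmessage[method]", "GET")
set_nested(event, "jsonmessage[headers][country-code]", "US")

# Counterpart of json_encode with source "jsonmessage" and target "message".
event["message"] = JSON.generate(event["jsonmessage"])
puts event["message"]
```

Only the fields copied under `jsonmessage` reach the encoded message, which is how the unwanted event metadata stays out of CloudWatch.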

Thanks for your time, @wanghq. Great stuff, btw!
